Develop with Native SDK

🚀 Develop and customize new agents

This guide will walk you through creating and running your own agents for AIOS.

Agent Structure

First, let's look at how to organize your agent's files. Every agent needs three essential components:

author/
└── agent_name/
      │── entry.py        # Your agent's main logic
      │── config.json     # Configuration and metadata
      └── meta_requirements.txt  # Additional dependencies

For example, if your name is demo_author and you're building a demo_agent that searches and summarizes articles, your folder structure would look like this:

demo_author/
   └── demo_agent/
         │── entry.py
         │── config.json
         └── meta_requirements.txt

Note: If your agent needs any libraries beyond those built into AIOS, list them in meta_requirements.txt. Besides these three required files, the folder may contain any other files your agent needs.
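As a minimal sketch, the layout above can be scaffolded with a few lines of Python. The helper name scaffold_agent is hypothetical and is not part of the AIOS SDK; it only creates empty placeholder files in a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def scaffold_agent(root: Path, author: str, agent_name: str) -> Path:
    """Create root/author/agent_name/ with the three required files (hypothetical helper)."""
    agent_dir = root / author / agent_name
    agent_dir.mkdir(parents=True, exist_ok=True)

    # Empty placeholders; fill them in as described in the rest of this guide.
    (agent_dir / "entry.py").touch()
    (agent_dir / "meta_requirements.txt").touch()
    (agent_dir / "config.json").write_text(json.dumps({"name": agent_name}, indent=3))
    return agent_dir


with tempfile.TemporaryDirectory() as tmp:
    agent_dir = scaffold_agent(Path(tmp), "demo_author", "demo_agent")
    created = sorted(p.name for p in agent_dir.iterdir())
    print(created)
```

Running the sketch lists the three file names, which is a quick way to confirm the folder matches the expected structure before writing any agent logic.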

Configure the agent

Your agent needs a config.json file that describes its functionality. Here's what it should include:

{
   "name": "demo_agent",
   "description": [
      "Demo agent that can help search AIOS-related papers"
   ],
   "tools": [
      "demo_author/arxiv"
   ],
   "meta": {
      "author": "demo_author",
      "version": "0.0.1",
      "license": "CC0"
   },
   "build": {
      "entry": "entry.py",
      "module": "DemoAgent"
   }
}
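Before running an agent, it can help to check that config.json has the expected shape. The sketch below is one possible check; the function check_agent_config is hypothetical, and treating every key from the example as required is an assumption rather than a documented AIOS rule.

```python
import json

# Keys taken from the example config in this guide; treating all of them
# as required is an assumption of this sketch.
REQUIRED_KEYS = {"name", "description", "tools", "meta", "build"}
REQUIRED_BUILD_KEYS = {"entry", "module"}


def check_agent_config(config: dict) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - set(config))]
    build = config.get("build", {})
    problems += [
        f"missing build key: {k}" for k in sorted(REQUIRED_BUILD_KEYS - set(build))
    ]
    if not isinstance(config.get("tools", []), list):
        problems.append("tools must be a list")
    return problems


example = json.loads("""
{
   "name": "demo_agent",
   "description": ["Demo agent that can help search AIOS-related papers"],
   "tools": ["demo_author/arxiv"],
   "meta": {"author": "demo_author", "version": "0.0.1", "license": "CC0"},
   "build": {"entry": "entry.py", "module": "DemoAgent"}
}
""")
print(check_agent_config(example))
```

An empty list means every expected key is present; any missing key is reported by name.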

When setting up your agent, you'll need to specify which tools it will use. Below is a list of all currently available tools and how to reference them in your configuration:

Author          Name                      How to Use
example         arxiv                     example/arxiv
example         bing_search               example/bing_search
example         currency_converter        example/currency_converter
example         wolfram_alpha             example/wolfram_alpha
example         google_search             example/google_search
openai          speech_to_text            openai/speech_to_text
example         web_browser               example/web_browser
timbrooks       image_to_image            timbrooks/image_to_image
example         downloader                example/downloader
example         doc_question_answering    example/doc_question_answering
stability-ai    text_to_image             stability-ai/text_to_image
example         text_to_speech            example/text_to_speech

To use these tools in your agent, simply include their reference (from the "How to Use" column) in your agent's configuration file. For example, if you want your agent to be able to search academic papers and convert currencies, you would include both example/arxiv and example/currency_converter in the configuration of your agent.
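Each reference in the "How to Use" column has the form author/name. The sketch below shows one way to split and sanity-check such a reference before adding it to a configuration. The function split_tool_reference is hypothetical and only handles the author/name form, not filesystem paths.

```python
def split_tool_reference(reference: str) -> tuple:
    """Split 'author/name' into (author, name); raise ValueError on any other shape."""
    author, sep, name = reference.partition("/")
    if not sep or not author or not name or "/" in name:
        raise ValueError(f"expected 'author/name', got {reference!r}")
    return author, name


# Building the "tools" list for an agent that searches papers and converts currencies.
tools = ["example/arxiv", "example/currency_converter"]
parsed = [split_tool_reference(ref) for ref in tools]
print(parsed)
```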

If you would like to create your own tools, you can either implement the tool inside your agent code or follow the examples in the tool folder to develop a standalone tool. Detailed instructions are in How to develop new tools.

Let's walk through creating your agent's core functionality.

Set up the Base Agent Class

First, create your agent class by inheriting from BaseAgent:

from cerebrum.agents.base import BaseAgent
from cerebrum.llm.communication import LLMQuery
import json

class DemoAgent(BaseAgent):
    def __init__(self, agent_name, task_input, config_):
        super().__init__(agent_name, task_input, config_)

        self.plan_max_fail_times = 3
        self.tool_call_max_fail_times = 3

        self.start_time = None
        self.end_time = None
        self.request_waiting_times: list = []
        self.request_turnaround_times: list = []
        self.task_input = task_input
        self.messages = []
        self.workflow_mode = "manual"  # (manual, automatic)
        self.rounds = 0

Import Query Functions

AIOS provides several Query classes for different types of interactions. Each query returns a Response object that carries the result from the AIOS kernel.

Query Class     Arguments                                                                  Output
LLMQuery        messages: List, tools: List, action_type: str, message_return_type: str    response: Response
MemoryQuery     TBD                                                                        response: Response
StorageQuery    TBD                                                                        response: Response
ToolQuery       tool_calls: List                                                           response: Response

Here's how to import a specific query class:

from cerebrum.llm.communication import LLMQuery  # Using LLMQuery as an example

Construct system instructions

Here's how to set up your agent's system instructions. Put this method inside your agent class:

def build_system_instruction(self):
    prefix = "".join(self.config["description"])

    plan_instruction = "".join(
        [
            f"You are given the available tools from the tool list: {json.dumps(self.tool_info)} to help you solve problems. ",
            "Generate a plan with comprehensive yet minimal steps to fulfill the task. ",
            "The plan must follow the json format as below: ",
            "[",
            '{"action_type": "action_type_value", "action": "action_value", "tool_use": [tool_name1, tool_name2,...]},',
            '{"action_type": "action_type_value", "action": "action_value", "tool_use": [tool_name1, tool_name2,...]}',
            "...",
            "]",
            "In each step of the plan, identify the tools to use, or recognize that no tool is necessary. ",
            "The following are some plan examples. ",
            "[[",
            '{"action_type": "tool_use", "action": "gather information from arxiv. ", "tool_use": ["arxiv"]},',
            '{"action_type": "chat", "action": "write a summarization based on the gathered information. ", "tool_use": []}',
            "];",
            "[",
            '{"action_type": "tool_use", "action": "gather information from arxiv. ", "tool_use": ["arxiv"]},',
            '{"action_type": "chat", "action": "understand the current methods and propose ideas that can improve ", "tool_use": []}',
            "]",
            "]",
        ]
    )

    if self.workflow_mode == "manual":
        self.messages.append({"role": "system", "content": prefix})

    else:
        assert self.workflow_mode == "automatic"
        self.messages.append({"role": "system", "content": prefix})
        self.messages.append({"role": "user", "content": plan_instruction})

Create Workflows

Create a workflow that lists the steps the agent executes to complete its task. Put this method inside your agent class as well.

Manual workflow example:

def manual_workflow(self):
    workflow = [
        {
            "action_type": "tool_use",
            "action": "Search for relevant papers",
            "tool_use": ["demo_author/arxiv"],
        },
        {
            "action_type": "chat",
            "action": "Provide responses based on the user's query",
            "tool_use": [],
        },
    ]
    return workflow
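Because the run method reads action_type, action, and tool_use from every step, a malformed step only fails at run time. The sketch below is one way to check a workflow up front. The function check_workflow is hypothetical, and restricting action_type to the two values used in this guide ("tool_use" and "chat") is an assumption.

```python
# The two action types that appear in this guide's examples (assumed to be the full set).
ALLOWED_ACTION_TYPES = {"tool_use", "chat"}
STEP_KEYS = {"action_type", "action", "tool_use"}


def check_workflow(workflow: list) -> None:
    """Raise ValueError if any step lacks a key or looks inconsistent."""
    for i, step in enumerate(workflow, start=1):
        missing = STEP_KEYS - set(step)
        if missing:
            raise ValueError(f"step {i} is missing {sorted(missing)}")
        if step["action_type"] not in ALLOWED_ACTION_TYPES:
            raise ValueError(f"step {i} has unknown action_type {step['action_type']!r}")
        if step["action_type"] == "tool_use" and not step["tool_use"]:
            raise ValueError(f"step {i} is a tool_use step but lists no tools")


workflow = [
    {"action_type": "tool_use", "action": "Search for relevant papers", "tool_use": ["demo_author/arxiv"]},
    {"action_type": "chat", "action": "Provide responses based on the user's query", "tool_use": []},
]
check_workflow(workflow)  # passes silently for a well-formed workflow
```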

Implement the Run Method

Finally, implement the run method to execute your agent's workflow. Put this method inside your agent class as well.

def run(self):
    self.build_system_instruction()

    task_input = self.task_input

    self.messages.append({"role": "user", "content": task_input})

    workflow = None

    if self.workflow_mode == "automatic":
        workflow = self.automatic_workflow()
        self.messages = self.messages[:1]  # clear long context

    else:
        assert self.workflow_mode == "manual"
        workflow = self.manual_workflow()

    self.messages.append(
        {
            "role": "user",
            "content": f"[Thinking]: The workflow generated for the problem is {json.dumps(workflow)}. Follow the workflow to solve the problem step by step. ",
        }
    )

    try:
        if workflow:
            final_result = ""

            for i, step in enumerate(workflow):
                action_type = step["action_type"]
                action = step["action"]
                tool_use = step["tool_use"]

                prompt = f"At step {i + 1}, you need to: {action}. "
                self.messages.append({"role": "user", "content": prompt})

                if tool_use:
                    selected_tools = self.pre_select_tools(tool_use)

                else:
                    selected_tools = None

                response = self.send_request(
                    agent_name=self.agent_name,
                    query=LLMQuery(
                        messages=self.messages,
                        tools=selected_tools,
                        action_type=action_type,
                    ),
                )["response"]
                
                self.messages.append({"role": "assistant", "content": response.response_message})

                self.rounds += 1


            final_result = self.messages[-1]["content"]
            
            return {
                "agent_name": self.agent_name,
                "result": final_result,
                "rounds": self.rounds,
            }

        else:
            return {
                "agent_name": self.agent_name,
                "result": "Failed to generate a valid workflow in the given times.",
                "rounds": self.rounds,
            }
            
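    except Exception as e:
        # Report the failure instead of silently returning an empty dict.
        return {
            "agent_name": self.agent_name,
            "result": f"Agent failed with error: {e}",
            "rounds": self.rounds,
        }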
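To see how the message list grows during the step loop without a running AIOS kernel, the loop can be exercised in isolation. In the sketch below, fake_llm is a stand-in for the self.send_request call and simply echoes the last prompt. It is not the AIOS API, and run_steps is a hypothetical, simplified version of the loop above.

```python
def fake_llm(messages, tools, action_type):
    """Stand-in for the kernel call; echoes the last prompt (not the AIOS API)."""
    return f"({action_type}) handled: {messages[-1]['content']}"


def run_steps(workflow, task_input, llm=fake_llm):
    """Simplified version of the step loop: one prompt and one reply per step."""
    messages = [{"role": "user", "content": task_input}]
    rounds = 0
    for i, step in enumerate(workflow):
        prompt = f"At step {i + 1}, you need to: {step['action']}. "
        messages.append({"role": "user", "content": prompt})

        # Pass tools only for steps that list some, mirroring the loop above.
        tools = step["tool_use"] or None
        reply = llm(messages, tools, step["action_type"])
        messages.append({"role": "assistant", "content": reply})
        rounds += 1

    # The final result is the last assistant message, as in the run method.
    return {"result": messages[-1]["content"], "rounds": rounds, "messages": messages}


workflow = [
    {"action_type": "tool_use", "action": "Search for relevant papers", "tool_use": ["demo_author/arxiv"]},
    {"action_type": "chat", "action": "Provide responses based on the user's query", "tool_use": []},
]
outcome = run_steps(workflow, "Summarize recent work on AIOS")
print(outcome["rounds"], len(outcome["messages"]))
```

Each step adds two messages (the step prompt and the reply), so a two-step workflow ends with five messages including the initial task input.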

Run the Agent

To test your agent, use the run-agent command:

run-agent --llm_name <llm_name> --llm_backend <llm_backend> --agent_name_or_path <agent_name_or_path> --task <task_input> --aios_kernel_url <aios_kernel_url>

Alternatively, run the agent from the source code in cerebrum/example/run_agent:

python cerebrum/example/run_agent --llm_name <llm_name> --llm_backend <llm_backend> --agent_name_or_path <agent_name_or_path> --task <task_input> --aios_kernel_url <aios_kernel_url>

Replace the placeholders with your specific values:

  • <llm_name>: The name of the language model you want to use

  • <llm_backend>: The backend service for the language model

  • <agent_name_or_path>: The name of your agent or the path to your agent's folder

  • <task_input>: The task you want your agent to complete

  • <aios_kernel_url>: The URL of the running AIOS kernel
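When the task text contains spaces, quoting the command by hand is error-prone. The sketch below assembles the run-agent invocation from the flags listed above and quotes it with the standard library's shlex.join. The helper build_run_agent_command is hypothetical, and every value passed to it is a placeholder chosen for illustration.

```python
import shlex


def build_run_agent_command(llm_name, llm_backend, agent_name_or_path, task, aios_kernel_url):
    """Return (args, command): the argument list and its shell-quoted string form."""
    args = [
        "run-agent",
        "--llm_name", llm_name,
        "--llm_backend", llm_backend,
        "--agent_name_or_path", agent_name_or_path,
        "--task", task,
        "--aios_kernel_url", aios_kernel_url,
    ]
    return args, shlex.join(args)


# All values below are placeholders, not recommended settings.
args, command = build_run_agent_command(
    llm_name="my-llm",
    llm_backend="my-backend",
    agent_name_or_path="demo_author/demo_agent",
    task="Find recent papers about AIOS",
    aios_kernel_url="http://localhost:8000",
)
print(command)
```

The args list can also be passed directly to subprocess.run, which avoids shell quoting altogether.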

🔧 Develop and Customize New Tools

Tool Structure

As with agents, each tool follows a simple directory structure:

demo_author/
└── demo_tool/
    │── entry.py      # Contains your tool's main logic
    └── config.json   # Tool configuration and metadata

Setting up config.json

Your tool needs a configuration file that describes its properties. Here's an example of how to set it up:

{
    "name": "demo_tool",
    "description": [
        "The arxiv tool that can be used to search for papers on arxiv"
    ],
    "meta": {
        "author": "demo_author",
        "version": "1.0.6",
        "license": "CC0"
    },
    "build": {
        "entry": "entry.py",
        "module": "DemoTool"
    }
}

Create Tool Class

In entry.py, implement the tool class that config.json identifies. The class needs two essential methods:

  1. get_tool_call_format: Defines how LLMs should interact with your tool

  2. run: Contains your tool's main functionality

Here's an example:

class Arxiv:
    def get_tool_call_format(self):
        tool_call_format = {
            "type": "function",
            "function": {
                "name": "demo_author/arxiv",
                "description": "Query articles or topics in arxiv",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Input query that describes what to search in arxiv"
                        }
                    },
                    "required": [
                        "query"
                    ]
                }
            }
        }
        return tool_call_format

    def run(self, params: dict):
        """
        Main tool logic goes here.
        Args:
            params: Dictionary containing tool parameters
        Returns:
            Your tool's output
        """
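        # Your code here. The key must match a parameter declared in
        # get_tool_call_format ("query" in this example); do_something is a
        # placeholder for your own logic.
        result = do_something(params["query"])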
        return result
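For a version that runs end to end, here is a minimal sketch of a tool with the same two-method shape. The WordCount tool, its demo_author/word_count name, and the call_tool helper are all hypothetical. In particular, call_tool only illustrates how a caller might check required parameters before invoking run; it does not show how the AIOS kernel dispatches tool calls.

```python
class WordCount:
    """Hypothetical tool that counts the words in a piece of text."""

    def get_tool_call_format(self):
        # Same structure as the example above: a function name, a description,
        # and a JSON-schema-style parameter declaration.
        return {
            "type": "function",
            "function": {
                "name": "demo_author/word_count",
                "description": "Count the words in a piece of text",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string", "description": "Text to count words in"}
                    },
                    "required": ["text"],
                },
            },
        }

    def run(self, params: dict):
        return {"word_count": len(params["text"].split())}


def call_tool(tool, arguments: dict):
    """Check required parameters against the declared format, then run the tool."""
    required = tool.get_tool_call_format()["function"]["parameters"]["required"]
    missing = [name for name in required if name not in arguments]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return tool.run(arguments)


result = call_tool(WordCount(), {"text": "agents call tools through the kernel"})
print(result)
```

Keeping the parameter names in run identical to those declared in get_tool_call_format avoids KeyError surprises when an LLM fills in the arguments.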

Integration Tips

When integrating your tool for the agents you develop:

  • Use absolute paths to reference your tool in agent configurations

  • Example: /path/to/your/tools/example/your_tool instead of just author/tool_name
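One way to produce such an absolute reference is to resolve it with pathlib instead of typing it by hand. The sketch below assumes the tools live under a root folder laid out as author/tool_name; the helper absolute_tool_reference and the folder name "tools" are hypothetical.

```python
from pathlib import Path


def absolute_tool_reference(tools_root: str, author: str, tool_name: str) -> str:
    """Return the absolute path of tools_root/author/tool_name as a string."""
    # resolve() makes the path absolute even if the folder does not exist yet.
    return str((Path(tools_root).expanduser() / author / tool_name).resolve())


reference = absolute_tool_reference("tools", "demo_author", "demo_tool")
print(reference)
```

The resulting string can then be placed in the "tools" list of the agent's config.json.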
