Quickstart

Be sure to complete the installation instructions before continuing with this guide.

Before starting AIOS, make sure you have installed the LLM backends that you would like to run. The table below lists the supported providers, models, and backends for AIOS.

| Provider 🏢 | Model Name 🤖 | Open Source 🔓 | Model String ⌨️ | Backend ⚙️ | Required API Key |
|---|---|---|---|---|---|
| Anthropic | Claude 3.5 Sonnet | ❌ | claude-3-5-sonnet-20241022 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3.5 Haiku | ❌ | claude-3-5-haiku-20241022 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3 Opus | ❌ | claude-3-opus-20240229 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3 Sonnet | ❌ | claude-3-sonnet-20240229 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3 Haiku | ❌ | claude-3-haiku-20240307 | anthropic | ANTHROPIC_API_KEY |
| Deepseek | Deepseek-reasoner (R1) | ✅ | deepseek-reasoner | deepseek | DEEPSEEK_API_KEY |
| Deepseek | Deepseek-chat (V3) | ✅ | deepseek-chat | deepseek | DEEPSEEK_API_KEY |
| OpenAI | GPT-4 | ❌ | gpt-4 | openai | OPENAI_API_KEY |
| OpenAI | GPT-4 Turbo | ❌ | gpt-4-turbo | openai | OPENAI_API_KEY |
| OpenAI | GPT-4o | ❌ | gpt-4o | openai | OPENAI_API_KEY |
| OpenAI | GPT-4o mini | ❌ | gpt-4o-mini | openai | OPENAI_API_KEY |
| OpenAI | GPT-3.5 Turbo | ❌ | gpt-3.5-turbo | openai | OPENAI_API_KEY |
| Google | Gemini 1.5 Flash | ❌ | gemini-1.5-flash | google | GEMINI_API_KEY |
| Google | Gemini 1.5 Flash-8B | ❌ | gemini-1.5-flash-8b | google | GEMINI_API_KEY |
| Google | Gemini 1.5 Pro | ❌ | gemini-1.5-pro | google | GEMINI_API_KEY |
| Google | Gemini 1.0 Pro | ❌ | gemini-1.0-pro | google | GEMINI_API_KEY |
| Groq | Llama 3.2 90B Vision | ✅ | llama-3.2-90b-vision-preview | groq | GROQ_API_KEY |
| Groq | Llama 3.2 11B Vision | ✅ | llama-3.2-11b-vision-preview | groq | GROQ_API_KEY |
| Groq | Llama 3.1 70B | ✅ | llama-3.1-70b-versatile | groq | GROQ_API_KEY |
| Groq | Llama Guard 3 8B | ✅ | llama-guard-3-8b | groq | GROQ_API_KEY |
| Groq | Llama 3 70B | ✅ | llama3-70b-8192 | groq | GROQ_API_KEY |
| Groq | Llama 3 8B | ✅ | llama3-8b-8192 | groq | GROQ_API_KEY |
| Groq | Mixtral 8x7B | ✅ | mixtral-8x7b-32768 | groq | GROQ_API_KEY |
| Groq | Gemma 7B | ✅ | gemma-7b-it | groq | GROQ_API_KEY |
| Groq | Gemma 2 9B | ✅ | gemma2-9b-it | groq | GROQ_API_KEY |
| Groq | Llama3 Groq 70B | ✅ | llama3-groq-70b-8192-tool-use-preview | groq | GROQ_API_KEY |
| Groq | Llama3 Groq 8B | ✅ | llama3-groq-8b-8192-tool-use-preview | groq | GROQ_API_KEY |
| ollama | Any model | ✅ | model-name | ollama | - |
| vLLM | Any model | ✅ | model-name | vllm | - |
| HuggingFace | Any model | ✅ | model-name | huggingface | HF_HOME |
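
To illustrate how a table row maps onto AIOS configuration, an API-hosted model can be referenced in aios/config/config.yaml by its model string and backend. This is a minimal sketch assuming API-backed models need only these two fields; the Configuration section below shows the full file layout:

llms:
  models:
    # model string and backend taken from the table above
    - name: "gpt-4o-mini"
      backend: "openai"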

Configuration

Set up the configuration file directly (Recommended)

You need API keys for services like OpenAI, Anthropic, Groq, and HuggingFace. The simplest way to configure them is to edit aios/config/config.yaml.

[!TIP] We strongly recommend using the aios/config/config.yaml file to set up your API keys. This method is straightforward and helps avoid potential synchronization issues with environment variables.

A simple example of setting up your API keys in aios/config/config.yaml is shown below:

api_keys:
  openai: "your-openai-key"    
  gemini: "your-gemini-key"    
  groq: "your-groq-key"      
  anthropic: "your-anthropic-key" 
  huggingface:
    auth_token: "token to authorize specific models for use"  
    home: "path to store downloaded model weights"
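
If you prefer environment variables instead, the same keys can be exported in your shell under the variable names listed in the table above (a minimal sketch; the config.yaml method above remains the recommended one):

export OPENAI_API_KEY="your-openai-key"       # variable names match the table above
export GEMINI_API_KEY="your-gemini-key"
export GROQ_API_KEY="your-groq-key"
export ANTHROPIC_API_KEY="your-anthropic-key"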

To obtain these API keys:

  1. Deepseek API: https://api-docs.deepseek.com/

  2. OpenAI API: https://platform.openai.com/api-keys

  3. Google Gemini API: https://makersuite.google.com/app/apikey

  4. Groq API: https://console.groq.com/keys

  5. HuggingFace Token: https://huggingface.co/settings/tokens

  6. Anthropic API: https://console.anthropic.com/keys

Configure LLM Models

You can configure which LLM models to use in the same aios/config/config.yaml file. Here's an example configuration:

llms:
  models:
    # Ollama Models
    - name: "qwen2.5:7b"
      backend: "ollama"
      hostname: "http://localhost:11434"  # Make sure to run ollama server

    # vLLM Models
    - name: "meta-llama/Llama-3.1-8B-Instruct"
      backend: "vllm"
      hostname: "http://localhost:8091/v1"  # Make sure to run vllm server

Using Ollama Models:

  1. First, download ollama from https://ollama.com/

  2. Start the ollama server in a separate terminal:

ollama serve

  3. Pull your desired models from https://ollama.com/library:

ollama pull qwen2.5:7b  # example model

Ollama supports both CPU-only and GPU environments. For more details about ollama usage, see the ollama documentation.
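
To confirm the server is running and the model was pulled, you can query ollama's local HTTP API (a standard ollama endpoint, not AIOS-specific):

curl http://localhost:11434/api/tags  # lists the models available locally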

Using vLLM Models:

  1. Install vLLM following their installation guide

  2. Start the vLLM server in a separate terminal:

vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8091

vLLM currently supports only Linux and GPU-enabled environments. If you don't have a compatible environment, please choose another backend. To enable vLLM's tool-calling feature, refer to https://docs.vllm.ai/en/latest/features/tool_calling.html
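
Because vLLM exposes an OpenAI-compatible API, a quick way to verify that the server from step 2 is reachable (assuming port 8091 as configured above):

curl http://localhost:8091/v1/models  # should list meta-llama/Llama-3.1-8B-Instruct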

Using HuggingFace Models: You can configure HuggingFace models with specific GPU memory allocation:

- name: "meta-llama/Llama-3.1-8B-Instruct"
  backend: "huggingface"
  max_gpu_memory: {0: "24GB", 1: "24GB"}  # GPU memory allocation
  eval_device: "cuda:0"  # Device for model evaluation
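
The auth_token and home fields correspond to HuggingFace's standard access token and HF_HOME cache settings. For gated models such as meta-llama, you can alternatively authenticate once via the standard huggingface_hub CLI; this is a sketch, and the token and path values are placeholders:

huggingface-cli login --token "your-hf-token"  # authorize downloads of gated models
export HF_HOME="/path/to/store/model/weights"  # cache directory for downloaded weights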

Launch AIOS

After you have set up the required keys, run the following command to launch the AIOS kernel.

bash runtime/launch_kernel.sh

Then you can start a client, using either the Terminal or the WebUI, to interact with the AIOS kernel.
