Quickstart

Be sure to complete the installation instructions before continuing with this guide.

Before starting AIOS, make sure you have installed the LLM backends that you would like to run. The table below lists the supported providers, models, and backends for AIOS.

| Provider 🏢 | Model Name 🤖 | Open Source 🔓 | Model String ⌨️ | Backend ⚙️ | Required API Key |
|---|---|---|---|---|---|
| Anthropic | Claude 3.5 Sonnet | ❌ | claude-3-5-sonnet-20241022 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3.5 Haiku | ❌ | claude-3-5-haiku-20241022 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3 Opus | ❌ | claude-3-opus-20240229 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3 Sonnet | ❌ | claude-3-sonnet-20240229 | anthropic | ANTHROPIC_API_KEY |
| Anthropic | Claude 3 Haiku | ❌ | claude-3-haiku-20240307 | anthropic | ANTHROPIC_API_KEY |
| Deepseek | Deepseek-reasoner (R1) | ✅ | deepseek-reasoner | deepseek | DEEPSEEK_API_KEY |
| Deepseek | Deepseek-chat (V3) | ✅ | deepseek-chat | deepseek | DEEPSEEK_API_KEY |
| OpenAI | GPT-4 | ❌ | gpt-4 | openai | OPENAI_API_KEY |
| OpenAI | GPT-4 Turbo | ❌ | gpt-4-turbo | openai | OPENAI_API_KEY |
| OpenAI | GPT-4o | ❌ | gpt-4o | openai | OPENAI_API_KEY |
| OpenAI | GPT-4o mini | ❌ | gpt-4o-mini | openai | OPENAI_API_KEY |
| OpenAI | GPT-3.5 Turbo | ❌ | gpt-3.5-turbo | openai | OPENAI_API_KEY |
| Google | Gemini 1.5 Flash | ❌ | gemini-1.5-flash | google | GEMINI_API_KEY |
| Google | Gemini 1.5 Flash-8B | ❌ | gemini-1.5-flash-8b | google | GEMINI_API_KEY |
| Google | Gemini 1.5 Pro | ❌ | gemini-1.5-pro | google | GEMINI_API_KEY |
| Google | Gemini 1.0 Pro | ❌ | gemini-1.0-pro | google | GEMINI_API_KEY |
| Groq | Llama 3.2 90B Vision | ✅ | llama-3.2-90b-vision-preview | groq | GROQ_API_KEY |
| Groq | Llama 3.2 11B Vision | ✅ | llama-3.2-11b-vision-preview | groq | GROQ_API_KEY |
| Groq | Llama 3.1 70B | ✅ | llama-3.1-70b-versatile | groq | GROQ_API_KEY |
| Groq | Llama Guard 3 8B | ✅ | llama-guard-3-8b | groq | GROQ_API_KEY |
| Groq | Llama 3 70B | ✅ | llama3-70b-8192 | groq | GROQ_API_KEY |
| Groq | Llama 3 8B | ✅ | llama3-8b-8192 | groq | GROQ_API_KEY |
| Groq | Mixtral 8x7B | ✅ | mixtral-8x7b-32768 | groq | GROQ_API_KEY |
| Groq | Gemma 7B | ✅ | gemma-7b-it | groq | GROQ_API_KEY |
| Groq | Gemma 2 9B | ✅ | gemma2-9b-it | groq | GROQ_API_KEY |
| Groq | Llama3 Groq 70B | ✅ | llama3-groq-70b-8192-tool-use-preview | groq | GROQ_API_KEY |
| Groq | Llama3 Groq 8B | ✅ | llama3-groq-8b-8192-tool-use-preview | groq | GROQ_API_KEY |
| ollama | Any model | ✅ | model-name | ollama | - |
| vLLM | Any model | ✅ | model-name | vllm | - |
| HuggingFace | Any model | ✅ | model-name | huggingface | HF_HOME |
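
To illustrate how a table row maps onto AIOS configuration, an API-hosted model can be referenced in aios/config/config.yaml by its model string and backend. This is a minimal sketch assuming API-backed models need only these two fields; the Configuration section below shows the full file layout:

llms:
  models:
    # model string and backend taken from the table above
    - name: "gpt-4o-mini"
      backend: "openai"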

Configuration

Set up the configuration file directly (Recommended)

You need API keys for services like OpenAI, Anthropic, Groq, and HuggingFace. The simplest way to configure them is to edit aios/config/config.yaml.

[!TIP] We strongly recommend using the aios/config/config.yaml file to set up your API keys. This method is straightforward and helps avoid potential synchronization issues with environment variables.

A simple example of setting up your API keys in aios/config/config.yaml is shown below:

api_keys:
  openai: "your-openai-key"    
  gemini: "your-gemini-key"    
  groq: "your-groq-key"      
  anthropic: "your-anthropic-key" 
  huggingface:
    auth_token: "token to authorize specific models for use"  
    home: "path to store downloaded model weights"
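
If you prefer environment variables instead, the same keys can be exported in your shell under the variable names listed in the table above (a minimal sketch; the config.yaml method above remains the recommended one):

export OPENAI_API_KEY="your-openai-key"       # variable names match the table above
export GEMINI_API_KEY="your-gemini-key"
export GROQ_API_KEY="your-groq-key"
export ANTHROPIC_API_KEY="your-anthropic-key"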

To obtain these API keys:

  1. Deepseek API: https://api-docs.deepseek.com/

  2. OpenAI API: https://platform.openai.com/api-keys

  3. Google Gemini API: https://makersuite.google.com/app/apikey

  4. Groq API: https://console.groq.com/keys

  5. HuggingFace Token: https://huggingface.co/settings/tokens

  6. Anthropic API: https://console.anthropic.com/keys

Configure LLM Models

You can configure which LLM models to use in the same aios/config/config.yaml file. Here's an example configuration:

llms:
  models:
    # Ollama Models
    - name: "qwen2.5:7b"
      backend: "ollama"
      hostname: "http://localhost:11434"  # Make sure to run ollama server

    # vLLM Models
    - name: "meta-llama/Llama-3.1-8B-Instruct"
      backend: "vllm"
      hostname: "http://localhost:8091/v1"  # Make sure to run vllm server

Using Ollama Models:

  1. First, download ollama from https://ollama.com/

  2. Start the ollama server in a separate terminal:

ollama serve

  3. Pull your desired models from https://ollama.com/library:

ollama pull qwen2.5:7b  # example model

Ollama supports both CPU-only and GPU environments. For more details about ollama usage, see the ollama documentation.
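
To confirm the server is running and the model was pulled, you can query ollama's local HTTP API (a standard ollama endpoint, not AIOS-specific):

curl http://localhost:11434/api/tags  # lists the models available locally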

Using vLLM Models:

  1. Install vLLM following their installation guide

  2. Start the vLLM server in a separate terminal:

vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8091

vLLM currently supports only Linux and GPU-enabled environments. If you don't have a compatible environment, please choose another backend. To enable vLLM's tool-calling feature, refer to https://docs.vllm.ai/en/latest/features/tool_calling.html
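
Because vLLM exposes an OpenAI-compatible API, a quick way to verify that the server from step 2 is reachable (assuming port 8091 as configured above):

curl http://localhost:8091/v1/models  # should list meta-llama/Llama-3.1-8B-Instruct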

Using HuggingFace Models: You can configure HuggingFace models with specific GPU memory allocation:

- name: "meta-llama/Llama-3.1-8B-Instruct"
  backend: "huggingface"
  max_gpu_memory: {0: "24GB", 1: "24GB"}  # GPU memory allocation
  eval_device: "cuda:0"  # Device for model evaluation
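
The auth_token and home fields correspond to HuggingFace's standard access token and HF_HOME cache settings. For gated models such as meta-llama, you can alternatively authenticate once via the standard huggingface_hub CLI; this is a sketch, and the token and path values are placeholders:

huggingface-cli login --token "your-hf-token"  # authorize downloads of gated models
export HF_HOME="/path/to/store/model/weights"  # cache directory for downloaded weights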

Launch AIOS

After you have set up the required keys, run the following command to launch the AIOS kernel.

bash runtime/launch_kernel.sh

Then you can start a client, using either the Terminal or the WebUI, to interact with the AIOS kernel.
