# vLLM Backend

[Source code](https://github.com/agiresearch/AIOS/blob/main/aios/llm_core/adapter.py)

vLLM (and SGLang) backends are handled using the OpenAI client class directly, due to compatibility issues with LiteLLM.

These backends are initialized as OpenAI client instances with a custom base URL:

```python
case "vllm":
    self.llms.append(OpenAI(
        base_url=config.hostname,
        api_key="sk-1234"  # Dummy API key
    ))

case "sglang":
    self.llms.append(OpenAI(
        base_url=config.hostname,
        api_key="sk-1234"  # Dummy API key
    ))
```

{% hint style="warning" %}
Note that a dummy API key ("sk-1234") is still required: the OpenAI client refuses to initialize without one, even though these backends typically don't require authentication when run locally.
{% endhint %}
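
For orientation, the snippet below is a minimal standalone sketch of the same pattern. It assumes a vLLM server is already running locally and serving its OpenAI-compatible API; the hostname, port, and model name are illustrative, not values from the AIOS source:

```python
from openai import OpenAI

# Hypothetical local endpoint; in AIOS this comes from config.hostname.
# vLLM's OpenAI-compatible server exposes its API under the /v1 prefix.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-1234",  # Dummy key: the client requires one, the local server ignores it
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # Illustrative model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```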

**Standard Text Input**

For standard text input, the OpenAI client is used directly:

```python
# Forward the request to the backend's OpenAI-compatible endpoint;
# completion_kwargs carries the messages and sampling parameters
completed_response = model.chat.completions.create(
    model=model_name,
    **completion_kwargs
)
# Return the generated text along with a status flag
return completed_response.choices[0].message.content, True
```
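
The `completion_kwargs` dictionary is assembled by the adapter before this call. A plausible sketch of its shape is shown below; only `messages` is required by the chat API, and the sampling fields are illustrative:

```python
# Hedged sketch of the keyword arguments forwarded to chat.completions.create
completion_kwargs = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the AIOS architecture."},
    ],
    "temperature": 0.7,   # Optional sampling parameter
    "max_tokens": 512,    # Optional generation cap
}
```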

**Tool Calls**

When processing tool calls with OpenAI client-based backends:

```python
# Add tools to completion parameters
if tools:
    completion_kwargs["tools"] = tools

completed_response = model.chat.completions.create(
    model=model_name,
    **completion_kwargs
)

if tools:
    # Normalize the tool calls into the same structure used for LiteLLM backends
    completed_response = decode_litellm_tool_calls(completed_response)
    return completed_response, True
```

The response is processed using the same [`decode_litellm_tool_calls`](https://github.com/agiresearch/AIOS/blob/main/aios/llm_core/utils.py) function as for LiteLLM backends.
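
The `tools` list itself follows the standard OpenAI function-calling schema. Here is a minimal sketch of a single tool definition, with an illustrative function name and parameters:

```python
# One entry in the tools list, in OpenAI function-calling format
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # Hypothetical tool name
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]
```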

**JSON-Formatted Responses**

JSON output is requested by flagging the completion parameters, using the same code path as the other OpenAI client-based backends:

```python
if message_return_type == "json":
    completion_kwargs["format"] = "json"
    # Pass through a caller-supplied response_format (e.g. a JSON schema)
    if response_format:
        completion_kwargs["response_format"] = response_format
```
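
With OpenAI-compatible servers, `response_format` usually takes the standard JSON-mode shape; backends that support structured outputs may also accept a JSON schema. Both forms below are sketches, and the schema itself is illustrative:

```python
# Plain JSON mode: ask the server to emit syntactically valid JSON
response_format = {"type": "json_object"}

# Structured output, where the backend supports it (schema is illustrative)
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "answer",
        "schema": {
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "required": ["answer"],
        },
    },
}
```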
