LiteLLM Compatible Backend

LiteLLM provides a unified interface to many LLM providers, exposing a single consistent API regardless of which provider serves the model. The full list of supported providers is available at https://docs.litellm.ai/docs/providers.

LiteLLM-compatible backends are initialized with model string identifiers in the {provider}/{model} format, such as:

  • openai/gpt-4o-mini

  • anthropic/claude-3.5-sonnet

  • gemini/gemini-1.5-flash

In the code, these models are initialized and stored as string identifiers in the llms array:

# During initialization
if config.backend == "google":
    config.backend = "gemini"  # LiteLLM uses the "gemini" provider prefix, not "google"

# Build the "{provider}/{model}" identifier and store it
prefix = f"{config.backend}/"
if not config.name.startswith(prefix):
    self.llms.append(prefix + config.name)
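
As an illustration, the prefixing logic can be traced with a small standalone sketch; the SimpleNamespace below is a stand-in for the project's actual config object, not its real type:

from types import SimpleNamespace

# Stand-in for the real config object (illustrative only)
config = SimpleNamespace(backend="google", name="gemini-1.5-flash")

if config.backend == "google":
    config.backend = "gemini"

prefix = f"{config.backend}/"
model_id = config.name if config.name.startswith(prefix) else prefix + config.name
print(model_id)  # gemini/gemini-1.5-flash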

Standard Text Input

For standard text generation, LiteLLM backends process requests using the completion() function:

from litellm import completion

completion_kwargs = {
    "messages": messages,        # chat history in OpenAI message format
    "temperature": temperature,
    "max_tokens": max_tokens,
}

completed_response = completion(model=model, **completion_kwargs)
# Return the generated text along with a success flag
return completed_response.choices[0].message.content, True

The system passes the messages array, temperature, and max_tokens parameters to the completion function.
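
For reference, a minimal end-to-end call through LiteLLM itself looks like the following; the model identifier and prompt are placeholders:

from litellm import completion

messages = [{"role": "user", "content": "Explain LiteLLM in one sentence."}]

response = completion(
    model="openai/gpt-4o-mini",  # any {provider}/{model} identifier
    messages=messages,
    temperature=0.7,
    max_tokens=256,
)
print(response.choices[0].message.content)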

Tool Calls

When processing requests with tools, the tool definitions are added to the completion parameters:

if tools:
    # Some providers only allow letters, digits, "_" and "-" in tool
    # names, so any "/" in a name is rewritten as "__" before sending
    tools = slash_to_double_underscore(tools)
    completion_kwargs["tools"] = tools

completed_response = completion(model=model, **completion_kwargs)
completed_response = decode_litellm_tool_calls(completed_response)
return completed_response, True

The decode_litellm_tool_calls() function processes the raw response to extract and format the tool calls.
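
Both slash_to_double_underscore() and decode_litellm_tool_calls() are project-specific helpers, but the tool definitions themselves follow the OpenAI-style schema that LiteLLM accepts. A minimal sketch, with an illustrative get_weather tool:

from litellm import completion

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # only letters, digits, "_" and "-" are safe here
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
response = completion(model="openai/gpt-4o-mini", messages=messages, tools=tools)
# If the model decided to call a tool, the call(s) appear here:
print(response.choices[0].message.tool_calls)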

JSON-Formatted Responses

For JSON-formatted responses, the adapter sets the format parameter and, if a schema object is supplied, forwards it as response_format:

if message_return_type == "json":
    completion_kwargs["format"] = "json"  # ask the backend for JSON output
    if response_format:
        # Forward an explicit schema/format object when one is provided
        completion_kwargs["response_format"] = response_format
