Hugging Face Backend
The Hugging Face Local Backend allows running models locally using the Hugging Face Transformers library.
The HF Local Backend is initialized as a class instance:
It handles loading and running Hugging Face models locally, with options for GPU memory allocation.
Standard Text Input
For standard text requests, the backend uses the generate()
method:
Tool Calls
As huggingface models do not natively support tool calls, the adapter merges tool information into messages before generation and decodes tool calls after generation.
The merge_messages_with_tools()
function formats the tool information into the prompt, and decode_hf_tool_calls()
extracts tool calls from the text response.
JSON-Formatted Responses
JSON formatting is handled by merging the response format into the messages:
The merge_messages_with_response_format()
function likely adds instructions for the model to respond in JSON format.
Last updated