Function calling

Function calling lets the model call your code. It uses the OpenAI-spec tools API and is supported on Tomoul models that advertise `capabilities.tools: true` in the catalog.

The flow

  1. You send the user message plus a tools array.
  2. The model returns a tool_calls array instead of (or alongside) content.
  3. You run the tool.
  4. You send the tool's output back to the model as a tool role message.
  5. The model returns the final answer.
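
The five steps above can be sketched as a single loop. This is a minimal sketch rather than a reference implementation: `run_tool_loop`, `send`, and `registry` are hypothetical names, messages are assumed to be plain dicts in the OpenAI chat format, and `send` stands in for whatever function performs the actual API request.

```python
import json


def run_tool_loop(send, messages, tools, registry, max_rounds=5):
    """Repeat the request/tool cycle until the model answers without tool calls.

    send:     hypothetical callable (messages, tools) -> assistant message dict
    registry: maps tool names to the Python callables that implement them
    """
    for _ in range(max_rounds):
        message = send(messages, tools)            # steps 1 and 2
        messages.append(message)
        tool_calls = message.get("tool_calls") or []
        if not tool_calls:
            return message.get("content")          # step 5: final answer
        for call in tool_calls:
            fn = registry[call["function"]["name"]]
            args = json.loads(call["function"]["arguments"])
            result = fn(**args)                    # step 3: run the tool
            messages.append({                      # step 4: report the output
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    raise RuntimeError("model kept requesting tools after max_rounds")
```

The `max_rounds` cap is a design choice in this sketch, so a model that keeps requesting tools cannot loop forever.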

Define tools

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
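
The snippets on this page call a `get_weather` function that is not defined here. A placeholder that matches the schema above might look like the following. The return shape is an assumption made for illustration, and the values are dummy data rather than real observations.

```python
def get_weather(city: str) -> dict:
    """Placeholder tool: returns canned data instead of calling a weather service."""
    # Hypothetical return shape. Any JSON-serializable value works, because the
    # result is passed through json.dumps before it is sent back to the model.
    return {"city": city, "temperature_c": None, "conditions": "unknown"}
```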

Parse the call

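import json

# `client` is an OpenAI-compatible client already configured for this API.
resp = client.chat.completions.create(
    model="microsoft/phi-4",
    messages=[{"role": "user", "content": "Weather in Lagos?"}],
    tools=tools,
)
# tool_calls may be empty or None when the model answers directly.
# Taking the first call is fine for a demo; iterate the array in production.
call = resp.choices[0].message.tool_calls[0]
# arguments arrives as a JSON-encoded string, not a dict.
args = json.loads(call.function.arguments)
result = get_weather(**args)  # your own implementation of the tool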
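
The snippet above assumes the model always returns a tool call with well-formed arguments. A more defensive version, as a sketch: `parse_tool_call` is a hypothetical helper, and it assumes the message exposes `tool_calls`, `function.name`, `function.arguments`, and `id` as attributes, as in the code above.

```python
import json


def parse_tool_call(message):
    """Return (name, args, call_id) for the first tool call.

    Returns None if the model answered directly without calling a tool.
    """
    tool_calls = getattr(message, "tool_calls", None)
    if not tool_calls:
        return None
    call = tool_calls[0]
    try:
        args = json.loads(call.function.arguments)
    except json.JSONDecodeError as exc:
        # The arguments string is model output, so it may not be valid JSON.
        raise ValueError(f"model produced invalid JSON arguments: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("tool arguments must decode to a JSON object")
    return call.function.name, args, call.id
```

When the helper returns None, use `message.content` as the answer instead of running a tool.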

Respond with the result

final = client.chat.completions.create(
    model="microsoft/phi-4",
    messages=[
        {"role": "user", "content": "Weather in Lagos?"},
        resp.choices[0].message,
        {
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(result),
        },
    ],
    tools=tools,
)
print(final.choices[0].message.content)

Supported models

To list tool-capable models, call GET /v1/models and keep the entries with `capabilities.tools: true`. At launch, these are:

  • microsoft/phi-4
  • qwen/qwen3-30b-a3b
  • openai/gpt-oss-20b
  • openai/gpt-oss-120b
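
The filter can be applied to an already-parsed GET /v1/models response, as sketched below. The response shape is an assumption: a top-level `data` list whose entries carry `id` and a nested `capabilities.tools` flag. `tool_capable_models` is a hypothetical helper, and the second catalog entry in the usage example is made up for illustration.

```python
def tool_capable_models(models_response):
    """Return the IDs of catalog entries whose capabilities.tools flag is true."""
    return [
        model["id"]
        for model in models_response.get("data", [])
        if model.get("capabilities", {}).get("tools") is True
    ]


# Usage with a hand-written sample, not real API output.
sample = {
    "data": [
        {"id": "microsoft/phi-4", "capabilities": {"tools": True}},
        {"id": "example/no-tools-model", "capabilities": {"tools": False}},
    ]
}
tool_models = tool_capable_models(sample)
```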

Parallel tool calls

Some models can emit multiple tool_calls in one response. Always iterate the array — don't hard-code tool_calls[0] in production code.
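
Iterating the array might look like the sketch below. `run_all_tool_calls` and `registry` are hypothetical names, and returning an error payload for an unknown tool name is one possible design choice, not documented behaviour of the API.

```python
import json


def run_all_tool_calls(message, registry):
    """Run every tool call on an assistant message and return tool role messages."""
    tool_messages = []
    for call in message.tool_calls or []:
        fn = registry.get(call.function.name)
        if fn is None:
            # Report the problem to the model instead of raising.
            payload = {"error": f"unknown tool: {call.function.name}"}
        else:
            payload = fn(**json.loads(call.function.arguments))
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call.id,  # ties this result to the call that asked for it
            "content": json.dumps(payload),
        })
    return tool_messages
```

Append the assistant message first, then every returned tool message, before sending the follow-up request, so that each `tool_call_id` has a matching result.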

Last updated 13 May 2026