API Documentation

Integrate with the Avian LLM inference API. OpenAI-compatible endpoints, simple authentication, and streaming support.


Get started in seconds

The Avian API is fully OpenAI-compatible. Switch your base URL and start making requests.

cURL

```shell
curl https://api.avian.io/v1/chat/completions \
  -H "Authorization: Bearer avian-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "Hello, world!"}
    ]
  }'
```

Python

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.avian.io/v1",
    api_key="avian-YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2",
    messages=[
        {"role": "user", "content": "Hello, world!"}
    ],
)

print(response.choices[0].message.content)
```

JavaScript

```javascript
import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://api.avian.io/v1",
    apiKey: "avian-YOUR_API_KEY",
});

const response = await client.chat.completions.create({
    model: "deepseek/deepseek-v3.2",
    messages: [
        { role: "user", content: "Hello, world!" },
    ],
});

console.log(response.choices[0].message.content);
```

Base URL & Authentication

All API requests require a valid API key passed in the Authorization header.

Base URL

All API endpoints are served from a single base URL:

```
https://api.avian.io/v1
```

Authentication

Pass your API key as a Bearer token in the Authorization header:

```
Authorization: Bearer avian-...
```
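To keep the key out of source code, a common pattern is to read it from an environment variable and build the header from that. A minimal sketch (the variable name `AVIAN_API_KEY` is just an illustration, not a requirement of the API):

```python
import os

# Read the key from the environment; fall back to a placeholder for this example.
api_key = os.environ.get("AVIAN_API_KEY", "avian-YOUR_API_KEY")

# Every request carries the key as a Bearer token.
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
```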

Available Models

All models are available through the same endpoint. Specify the model ID in your request body.

| Model | Model ID | Context | Input / 1M | Output / 1M |
|---|---|---|---|---|
| DeepSeek V3.2 | deepseek/deepseek-v3.2 | 163K | $0.26 | $0.38 |
| MiniMax M2.5 | minimax/minimax-m2.5 | 196K | $0.30 | $1.10 |
| GLM-5 | z-ai/glm-5 | 205K | $0.30 | $2.55 |
| Kimi K2.5 | moonshotai/kimi-k2.5 | 262K | $0.45 | $2.20 |
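The cost of a request follows directly from the table: (prompt tokens × input price + completion tokens × output price) ÷ 1M. A quick estimator with the prices copied from the table above (an illustration, not an official billing formula):

```python
# Per-million-token prices from the table above (USD).
PRICES = {
    "deepseek/deepseek-v3.2": {"input": 0.26, "output": 0.38},
    "minimax/minimax-m2.5":   {"input": 0.30, "output": 1.10},
    "z-ai/glm-5":             {"input": 0.30, "output": 2.55},
    "moonshotai/kimi-k2.5":   {"input": 0.45, "output": 2.20},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimated USD cost of one request, using the table's per-1M rates."""
    p = PRICES[model]
    return (prompt_tokens * p["input"] + completion_tokens * p["output"]) / 1_000_000

# 1,000 prompt tokens + 500 completion tokens on DeepSeek V3.2:
cost = estimate_cost("deepseek/deepseek-v3.2", 1000, 500)
```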

Chat Completions

POST /v1/chat/completions

Creates a chat completion. Send a list of messages and receive a model-generated response.

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The model ID to use (e.g. deepseek/deepseek-v3.2) |
| messages | array | Yes | Array of message objects with role and content fields |
| temperature | number | No | Sampling temperature between 0 and 2. Defaults to 1 |
| max_tokens | integer | No | Maximum number of tokens to generate |
| stream | boolean | No | If true, returns a stream of server-sent events. Defaults to false |
| tools | array | No | List of tool/function definitions for function calling |
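Putting the parameters together, a request body that sets the optional sampling controls might look like this (the values are illustrative):

```python
payload = {
    "model": "deepseek/deepseek-v3.2",   # required
    "messages": [                         # required
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize HTTP/2 in one sentence."},
    ],
    "temperature": 0.7,   # optional: 0-2, defaults to 1
    "max_tokens": 256,    # optional: cap on generated tokens
    "stream": False,      # optional: defaults to false
}
```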

Example Response

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709000000,
  "model": "deepseek/deepseek-v3.2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```
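The fields you will usually read are `choices[0].message.content` and `usage`. Walking the example response above as plain JSON:

```python
# The relevant parts of the example response, as a parsed JSON object.
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant",
                        "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21},
}

# The generated text and the token accounting for billing.
content = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
```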

Function Calling

Define tools that the model can invoke. The model will return a tool_calls array when it decides a function should be called.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.avian.io/v1",
    api_key="avian-YOUR_API_KEY",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g. San Francisco",
                    }
                },
                "required": ["location"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2",
    messages=[{"role": "user", "content": "What's the weather in London?"}],
    tools=tools,
)

# The model may return tool_calls in the response
tool_calls = response.choices[0].message.tool_calls
```
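After the model returns `tool_calls`, your code runs the function and sends the result back as a `tool` role message so the model can produce its final answer. A sketch of that second step, using a locally stubbed `get_weather` and a hand-built tool call in the OpenAI-compatible shape (in practice the object comes from the response above):

```python
import json

def get_weather(location: str) -> dict:
    # Stub: a real implementation would call a weather service.
    return {"location": location, "temp_c": 14, "conditions": "overcast"}

# Shape of one entry in response.choices[0].message.tool_calls.
tool_call = {
    "id": "call_abc123",
    "type": "function",
    "function": {"name": "get_weather",
                 "arguments": '{"location": "London"}'},
}

# Arguments arrive as a JSON string and must be parsed before dispatch.
args = json.loads(tool_call["function"]["arguments"])
result = get_weather(**args)

# Append this to messages and call the API again for the final answer.
tool_message = {
    "role": "tool",
    "tool_call_id": tool_call["id"],
    "content": json.dumps(result),
}
```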

Streaming

Set stream: true to receive partial responses as server-sent events (SSE). Tokens are delivered incrementally as they are generated.

Python

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.avian.io/v1",
    api_key="avian-YOUR_API_KEY",
)

stream = client.chat.completions.create(
    model="deepseek/deepseek-v3.2",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

cURL

```shell
curl https://api.avian.io/v1/chat/completions \
  -H "Authorization: Bearer avian-YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "deepseek/deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "stream": true
  }'
```
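If you are consuming the raw stream rather than using a client library, each SSE event is a `data:` line holding one JSON chunk, and the stream conventionally ends with `data: [DONE]`. A decoding sketch (the sample lines are simplified; real chunks carry more fields):

```python
import json

def extract_delta(line: str):
    """Return the content delta from one SSE line, or None for non-data lines."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload.strip() == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(payload)
    return chunk["choices"][0]["delta"].get("content")

lines = [
    'data: {"choices": [{"delta": {"content": "Quantum "}}]}',
    'data: {"choices": [{"delta": {"content": "computing..."}}]}',
    "data: [DONE]",
]
text = "".join(d for d in (extract_delta(l) for l in lines) if d)
```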

Error Codes

The API returns standard HTTP status codes. Errors include a JSON body with a descriptive message.

| Status | Meaning | Description |
|---|---|---|
| 200 | OK | Request succeeded |
| 400 | Bad Request | Invalid request parameters or malformed JSON |
| 401 | Unauthorized | Missing or invalid API key |
| 402 | Payment Required | Insufficient credit balance |
| 429 | Too Many Requests | Rate limit exceeded. Retry after a brief delay |
| 500 | Internal Server Error | Something went wrong on our end. Please retry |
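For 429 and 500 responses, retrying with exponential backoff plus jitter is the usual approach. A minimal sketch of the delay schedule (the HTTP call itself is whatever client you already use):

```python
import random

def backoff_delays(retries: int = 5, base: float = 0.5, cap: float = 8.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ..., capped."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

def jittered(delay: float) -> float:
    """Randomize each delay to avoid many clients retrying in lockstep."""
    return delay * (0.5 + random.random() / 2)

# Sleep jittered(d) for each d in backoff_delays() between retry attempts,
# giving up once the list is exhausted.
```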

Ready to get started?

Create an account and get your API key in under a minute.

Get an API Key