model catalog

Production-ready models

Access frontier open-source LLMs through a single API. Pay per token, no subscription required.

New deepseek/deepseek-v4-flash

Context 1M

Max Output 384K

Input / 1M $0.105

Output / 1M $0.21

Cache / 1M $0.021

Try it

New deepseek/deepseek-v4-pro

Context 1M

Max Output 384K

Input / 1M $1.305

Output / 1M $2.61

Cache / 1M $0.10875

Try it

New deepseek/deepseek-v3.2

Context 163K

Max Output 65K

Input / 1M $0.23

Output / 1M $0.33

Cache / 1M $0.012

Try it

New minimax/minimax-m2.5

Context 196K

Max Output 131K

Input / 1M $0.27

Output / 1M $1.08

Cache / 1M $0.15

Try it

New z-ai/glm-5

Context 205K

Max Output 131K

Input / 1M $0.95

Output / 1M $2.55

Cache / 1M $0.2

Try it

New z-ai/glm-5.1

Context 202K

Max Output 202K

Input / 1M $1

Output / 1M $3.2

Cache / 1M $0.2

Try it

New moonshotai/kimi-k2.5

Context 262K

Max Output 262K

Input / 1M $0.45

Output / 1M $2.2

Cache / 1M $0.225

Try it

New moonshotai/kimi-k2.6

Context 262K

Max Output 262K

Input / 1M $0.95

Output / 1M $4

Cache / 1M $0.16

Try it

Built for AI-powered coding

489 tokens/sec means your AI assistant thinks faster. Cursor autocomplete feels instant, Claude Code edits land quicker, and coding agents iterate in seconds instead of minutes.

4x faster than OpenAI

~90% cheaper than GPT-4o

Works with

Cursor Claude Code Cline Windsurf Kilo Code 20+ more

Output speed comparison

Avian (DeepSeek V3.2)489 tok/s

OpenAI (GPT-4o)120 tok/s

Anthropic (Claude 3.5)90 tok/s

Cost per 1M output tokens

Avian (DeepSeek V3.2)$0.33

OpenAI (GPT-4o)$10.00

Anthropic (Claude 3.5)$15.00

Set up in 60 seconds

OpenAI Compatible

Drop-in replacement for the OpenAI SDK. Change one line of code to switch.

No Rate Limits

Scale without restrictions. Send as many requests as you need.

NVIDIA B200 GPUs

Latest hardware for the fastest inference available today.

Start building with Avian

Get your API key in under a minute.

Get Started Free

Production-ready models

DeepSeek V4 Flash

DeepSeek V4 Pro

DeepSeek V3.2

MiniMax M2.5

GLM-5

GLM-5.1

Kimi K2.5

Kimi K2.6

Built for AI-powered coding

OpenAI Compatible

No Rate Limits

NVIDIA B200 GPUs

Start building with Avian