for developers LLM inference API

Pay only for what you use

Simple per-token pricing with prepaid credits. No subscriptions, no commitments. OpenAI-compatible API for Kimi K2.5, DeepSeek V3.2, and more.

Get Started Free

per-token pricing

Transparent model pricing

Pay per token from your prepaid balance — no minimum spend

DeepSeek V4 Flash

Input

$0.105

Output

$0.21

Cache

$0.021

1M context · 384K max output · per M tokens

DeepSeek V4 Pro

Input

$1.305

Output

$2.61

Cache

$0.10875

1M context · 384K max output · per M tokens

DeepSeek V3.2

Input

$0.23

Output

$0.33

Cache

$0.012

163K context · 65K max output · per M tokens

MiniMax M2.5

Input

$0.27

Output

$1.08

Cache

$0.15

196K context · 131K max output · per M tokens

GLM-5

Input

$0.95

Output

$2.55

Cache

$0.2

205K context · 131K max output · per M tokens

GLM-5.1

Input

Output

$3.2

Cache

$0.2

202K context · 202K max output · per M tokens

Kimi K2.5

Input

$0.45

Output

$2.2

Cache

$0.225

262K context · 262K max output · per M tokens

Kimi K2.6

Input

$0.95

Output

Cache

$0.16

262K context · 262K max output · per M tokens

Dedicated Deployments

Need guaranteed capacity? Deploy models on dedicated NVIDIA H200 or H100 GPUs with reserved throughput and custom configurations. Contact sales for pricing.

All models run on NVIDIA B200 GPUs with speculative decoding for industry-leading speeds. Pay only for the tokens you use with prepaid credits.

Built for AI-powered coding

489 tokens/sec means your AI assistant thinks faster. Cursor autocomplete feels instant, Claude Code edits land quicker, and coding agents iterate in seconds instead of minutes.

4x faster than OpenAI

~90% cheaper than GPT-4o

Works with

Cursor Claude Code Cline Windsurf Kilo Code 20+ more

Output speed comparison

Avian (DeepSeek V3.2)489 tok/s

OpenAI (GPT-4o)120 tok/s

Anthropic (Claude 3.5)90 tok/s

Cost per 1M output tokens

Avian (DeepSeek V3.2)$0.33

OpenAI (GPT-4o)$10.00

Anthropic (Claude 3.5)$15.00

Set up in 60 seconds

prepaid credits

Add credits to get started

Top up your balance and start making API calls immediately

$50

API Credits

$100

API Credits

$150

API Credits

$250

API Credits

Frequently Asked Questions

Everything you need to know about Avian pricing

Avian offers access to Kimi K2.5, DeepSeek V3.2, MiniMax M2.5, GLM-5, and GLM-5.1. All models are available to every user — just add credits and start making requests.

Avian uses a simple prepaid credit system. Add credits to your account, then each API request deducts the token cost from your balance. No subscriptions, no monthly fees — you only pay for the tokens you use.

Yes. The Avian API follows the OpenAI Chat Completions format. Just change the base URL to https://api.avian.io/v1 in any OpenAI SDK and it works out of the box.

No. Credits never expire. Use them whenever you need — there's no time limit on your prepaid balance.

No. As long as you have credits in your account, you can make as many API requests as you need. Your balance is the only limit.

Pay only for what you use

Transparent model pricing

Built for AI-powered coding

Add credits to get started

Frequently Asked Questions

Start building with the Avian API