for developers LLM inference API

Pay only for what you use

Simple per-token pricing with prepaid credits. No subscriptions, no commitments. OpenAI-compatible API for Kimi K2.5, DeepSeek V3.2, and more.

Get Started Free
per-token pricing

Transparent model pricing

Pay per token from your prepaid balance — no minimum spend

DeepSeek V3.2
Input
$0.23
Output
$0.33
Cache
$0.012
163K context · 65K max output · per M tokens
MiniMax M2.5
Input
$0.27
Output
$1.08
Cache
$0.15
196K context · 131K max output · per M tokens
GLM-5
Input
$0.95
Output
$2.55
Cache
$0.2
205K context · 131K max output · per M tokens
Kimi K2.5
Input
$0.45
Output
$2.2
Cache
$0.225
262K context · 262K max output · per M tokens

Dedicated Deployments

Need guaranteed capacity? Deploy models on dedicated NVIDIA H200 or H100 GPUs with reserved throughput and custom configurations. Contact sales for pricing.

All models run on NVIDIA B200 GPUs with speculative decoding for industry-leading speeds. Pay only for the tokens you use with prepaid credits.

Built for AI-powered coding

489 tokens/sec means your AI assistant thinks faster. Cursor autocomplete feels instant, Claude Code edits land quicker, and coding agents iterate in seconds instead of minutes.

4x faster than OpenAI
~90% cheaper than GPT-4o
Works with
Cursor Claude Code Cline Windsurf Kilo Code 20+ more
Output speed comparison
Avian (DeepSeek V3.2)489 tok/s
OpenAI (GPT-4o)120 tok/s
Anthropic (Claude 3.5)90 tok/s
Cost per 1M output tokens
Avian (DeepSeek V3.2)$0.33
OpenAI (GPT-4o)$10.00
Anthropic (Claude 3.5)$15.00
Set up in 60 seconds
prepaid credits

Add credits to get started

Top up your balance and start making API calls immediately

$50
API Credits
$100
API Credits
$150
API Credits
$250
API Credits

Frequently Asked Questions

Everything you need to know about Avian pricing

Avian offers access to Kimi K2.5, DeepSeek V3.2, MiniMax M2.5, and GLM-5. All models are available to every user — just add credits and start making requests.

Avian uses a simple prepaid credit system. Add credits to your account, then each API request deducts the token cost from your balance. No subscriptions, no monthly fees — you only pay for the tokens you use.

Yes. The Avian API follows the OpenAI Chat Completions format. Just change the base URL to https://api.avian.io/v1 in any OpenAI SDK and it works out of the box.

No. Credits never expire. Use them whenever you need — there's no time limit on your prepaid balance.

No. As long as you have credits in your account, you can make as many API requests as you need. Your balance is the only limit.

Start building with the Avian API

Get your API key in under a minute. Pay only for what you use.

Get Started Free