for developers
LLM inference API
Simple per-token pricing with prepaid credits. No subscriptions, no commitments. OpenAI-compatible API for Kimi K2.5, DeepSeek V3.2, and more.
Get Started FreePay per token from your prepaid balance — no minimum spend
Dedicated Deployments
Need guaranteed capacity? Deploy models on dedicated NVIDIA H200 or H100 GPUs with reserved throughput and custom configurations. Contact sales for pricing.
All models run on NVIDIA B200 GPUs with speculative decoding for industry-leading speeds. Pay only for the tokens you use with prepaid credits.
489 tokens/sec means your AI assistant thinks faster. Cursor autocomplete feels instant, Claude Code edits land quicker, and coding agents iterate in seconds instead of minutes.
Top up your balance and start making API calls immediately
Everything you need to know about Avian pricing
Avian offers access to Kimi K2.5, DeepSeek V3.2, MiniMax M2.5, and GLM-5. All models are available to every user — just add credits and start making requests.
Avian uses a simple prepaid credit system. Add credits to your account, then each API request deducts the token cost from your balance. No subscriptions, no monthly fees — you only pay for the tokens you use.
Yes. The Avian API follows the OpenAI Chat Completions format. Just change the base URL to https://api.avian.io/v1 in any OpenAI SDK and it works out of the box.
No. Credits never expire. Use them whenever you need — there's no time limit on your prepaid balance.
No. As long as you have credits in your account, you can make as many API requests as you need. Your balance is the only limit.