
DeepSeek V4 Flash
DeepSeek V4 Flash is a cost-efficient model with 284B total / 13B active parameters, a 1M-token context window, and fast responses.
Context Window
1.0M
Input price / 1M tokens
$0.15 per 1M tokens
Output price / 1M tokens
$0.31 per 1M tokens
Cached input / 1M tokens
$0.0030 per 1M tokens
Max Completion
384K
Input Modalities
text
Output Modalities
text
reasoning, Function calling, Chat, JSON, Streaming, recommended, New
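The listed prices can be turned into a rough per-request cost estimate. The sketch below assumes the prices read as $0.15 (input), $0.31 (output), and $0.0030 (cached input) per 1M tokens, and that cached input tokens are billed at the cached rate instead of the regular input rate. The billing rule and the name `estimate_cost` are illustrative assumptions, not something this page specifies.

```python
# Prices in USD per 1M tokens, as read from the listing above (assumed reading).
INPUT_PRICE = 0.15
OUTPUT_PRICE = 0.31
CACHED_INPUT_PRICE = 0.0030
TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens served from cache.
    Assumption: cached tokens replace, rather than add to, regular input billing.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PRICE
        + cached_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / TOKENS_PER_PRICE_UNIT
```

Under these assumptions, a request with 200,000 input tokens (50,000 of them cached) and 4,000 output tokens comes to about $0.0239.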
Available Providers
AllToken can route requests to the providers below based on route priority and policy.
Provider | Context Length | Input Price | Output Price | Cached / M | Latency p50 | Throughput
Best For
Cost-sensitive workloads that need a 1M-token context window and fast responses.
How To Use This Model
Use the exact model ID shown below. This is the safest way to avoid call failures, variant mismatches, or incorrect route assumptions.
curl https://api.alltoken.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-flash",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
Supported Parameters
temperature, top_p, max_tokens, tools, response_format, reasoning_effort
API Key Setup
Smart Routing
Let the platform choose the best provider path automatically.
Default Model
If a request does not specify a model, the key defaults to deepseek-v4-flash.
Forced Model
Every incoming request is overridden and the key is locked to deepseek-v4-flash, regardless of the model the request names.
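For callers working from Python, the curl request under How To Use This Model can be sketched with the standard library alone. The endpoint, model ID, and parameter names come from this page; the helper names `build_payload` and `send` are hypothetical, and passing `model=None` to omit the field only makes sense if the key's Default Model setting is configured as described above.

```python
import json
import urllib.request

API_URL = "https://api.alltoken.ai/v1/chat/completions"  # endpoint from the curl example
MODEL_ID = "deepseek-v4-flash"  # exact model ID shown on this page

# Parameter names listed under Supported Parameters.
SUPPORTED_PARAMS = {
    "temperature", "top_p", "max_tokens",
    "tools", "response_format", "reasoning_effort",
}


def build_payload(prompt, model=MODEL_ID, **params):
    """Build the JSON body for a single-turn chat completion request.

    Pass model=None to leave the field out and rely on the key's
    Default Model setting (an assumption about how the key is configured).
    """
    unknown = set(params) - SUPPORTED_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    payload = {"messages": [{"role": "user", "content": prompt}]}
    if model is not None:
        payload["model"] = model
    payload.update(params)
    return payload


def send(payload, api_key):
    """POST the payload to the endpoint and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example call (needs a real key and network access):
# send(build_payload("Hello!", temperature=0.2, max_tokens=256), "sk-your-key")
```

Keeping payload construction separate from the network call makes it possible to check the request body, including the exact model ID, without sending anything.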