Model Routing
How AllToken routes requests to the best available provider.
How routing works
AllToken routes each request to the best available provider for the requested model, based on:
- Availability — is the provider healthy and responding?
- Latency — which provider has the lowest response time?
- Cost — which provider offers the best price?
- Priority — does the account have provider preferences configured?
Routing is automatic. You don't need to implement any provider selection logic on your end.
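This page does not say how AllToken weighs these four factors against each other. As a rough mental model only, the sketch below assumes one plausible ordering: drop unhealthy providers, then prefer account-preferred providers, then lower latency, then lower cost. The `Provider` type, its fields, and `pick_provider` are hypothetical names used for illustration and are not part of the AllToken API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provider:
    """Hypothetical snapshot of one provider's state for a given model."""
    name: str
    healthy: bool
    latency_ms: float
    cost_per_mtok: float      # assumed unit: price per million tokens
    preferred: bool = False   # assumed flag for an account-level preference


def pick_provider(providers: list[Provider]) -> Optional[Provider]:
    """Return the best candidate under the assumed ordering, or None."""
    # Availability: unhealthy providers are never candidates.
    candidates = [p for p in providers if p.healthy]
    if not candidates:
        return None
    # Assumed tie-break order: preference, then latency, then cost.
    # `not p.preferred` sorts preferred providers (False) first.
    return min(
        candidates,
        key=lambda p: (not p.preferred, p.latency_ms, p.cost_per_mtok),
    )


providers = [
    Provider("direct", healthy=False, latency_ms=80, cost_per_mtok=3.0),
    Provider("bedrock", healthy=True, latency_ms=120, cost_per_mtok=3.0),
    Provider("vertex", healthy=True, latency_ms=95, cost_per_mtok=3.2),
]
print(pick_provider(providers).name)  # vertex
```

A strict lexicographic ordering is used here only because it is easy to reason about. The real router may blend the factors differently.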
Routing modes
Configure routing behavior per API key in Settings → API Keys:
- Smart Routing — AllToken selects the best provider path automatically. Recommended for most use cases.
- Default Model — requests without a model field fall back to this one.
- Forced Model — overrides all incoming requests to use a specific model, regardless of what the client sends.
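The effect of the Default Model and Forced Model settings on a request's model can be sketched as a small resolution function. This is an illustration under assumptions, not AllToken's implementation: `resolve_model` and its parameters are hypothetical, the sketch assumes a forced model would win if both settings were present, and it assumes a request with no model and no default is rejected. The page does not specify either of those cases.

```python
from typing import Optional


def resolve_model(
    requested: Optional[str],
    default_model: Optional[str] = None,
    forced_model: Optional[str] = None,
) -> str:
    """Return the model a request would run on under the assumed rules."""
    # Forced Model: overrides whatever the client sent.
    if forced_model:
        return forced_model
    # Otherwise, honor the model named in the request.
    if requested:
        return requested
    # Default Model: used only when the request has no model field.
    if default_model:
        return default_model
    # Assumption: with nothing to fall back to, the request is an error.
    raise ValueError("request has no model and no default is configured")


print(resolve_model("model-a"))                          # model-a
print(resolve_model(None, default_model="model-b"))      # model-b
print(resolve_model("model-a", forced_model="model-c"))  # model-c
```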
Provider priority
When a model is available through multiple providers, AllToken assigns each a priority score based on historical performance. View the ranking on any model's detail page under "Available Providers".
Priority 1 is the preferred provider. If it's unavailable, AllToken falls back to priority 2, and so on.
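The fallback order described above amounts to walking the ranked list until one provider answers. The sketch below shows that loop in isolation. `call_with_fallback`, the `send` callback, and the use of `ConnectionError` to signal an unavailable provider are all assumptions made for the example. AllToken performs this step server-side, so client code does not need it.

```python
from typing import Callable, Optional, Sequence


def call_with_fallback(
    ranked_providers: Sequence[str],
    send: Callable[[str], str],
) -> str:
    """Try providers in priority order (index 0 is priority 1)."""
    last_error: Optional[Exception] = None
    for provider in ranked_providers:
        try:
            return send(provider)
        except ConnectionError as exc:
            # Treat this provider as unavailable and try the next priority.
            last_error = exc
    raise RuntimeError("all providers unavailable") from last_error


def fake_send(provider: str) -> str:
    """Stand-in transport: the priority 1 provider is down."""
    if provider == "direct":
        raise ConnectionError("direct is unavailable")
    return f"served by {provider}"


print(call_with_fallback(["direct", "bedrock", "vertex"], fake_send))
# served by bedrock
```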
Multi-provider models
Popular models like Claude and GPT-4o are served by multiple providers — direct API, Amazon Bedrock, Google Vertex AI, and others. AllToken routes to the healthiest, fastest option automatically.
See which providers serve a model on its detail page under the "Providers" tab.