Model Routing

How AllToken routes requests to the best available provider.

How routing works

AllToken routes each request to the best available provider for the requested model, based on:

  • Availability — is the provider healthy and responding?
  • Latency — which provider has the lowest response time?
  • Cost — which provider offers the best price?
  • Priority — does the account have provider preferences configured?

Routing is automatic: you do not need to implement any provider selection logic yourself.

Routing modes

Configure routing behavior per API key in Settings → API Keys:

  • Smart Routing — AllToken selects the best provider path automatically. Recommended for most use cases.
  • Default Model — requests without a model field fall back to this one.
  • Forced Model — overrides all incoming requests to use a specific model, regardless of what the client sends.
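The way these settings interact can be sketched as a small resolution function. This is only an illustration of the precedence described above, not AllToken's implementation: the function name, its parameters, and the error raised when no model can be determined are all assumptions made for the example.

```python
from typing import Optional


def resolve_model(
    requested: Optional[str],
    default_model: Optional[str] = None,
    forced_model: Optional[str] = None,
) -> str:
    """Return the model a request would run on under a key's settings.

    Hypothetical helper: the names and the error case are assumptions.
    """
    # Forced Model overrides whatever the client sent.
    if forced_model:
        return forced_model
    # Otherwise the model named in the request is used.
    if requested:
        return requested
    # Default Model applies only when the request has no model field.
    if default_model:
        return default_model
    # Assumed behavior: with no model and no default, nothing can be resolved.
    raise ValueError("request has no model and the key has no default model")


print(resolve_model("gpt-4o"))                                   # client's choice
print(resolve_model(None, default_model="model-a"))              # falls back to default
print(resolve_model("gpt-4o", forced_model="model-b"))           # forced model wins
```

In this sketch, a Forced Model makes the client's model field irrelevant, while a Default Model only matters when that field is missing.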

Provider priority

When a model is available through multiple providers, AllToken assigns each a priority score based on historical performance. View the ranking on any model's detail page under "Available Providers".

Priority 1 is the preferred provider. If it's unavailable, AllToken falls back to priority 2, and so on.
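The fallback order can be pictured with a minimal sketch. It assumes a hypothetical Provider record and an is_available health check supplied by the caller; the provider names are placeholders, and how AllToken actually computes priority scores or checks health is not covered here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Provider:
    """Hypothetical record for one provider of a model."""
    name: str
    priority: int  # 1 is the preferred provider


def pick_provider(
    providers: Iterable[Provider],
    is_available: Callable[[Provider], bool],
) -> Optional[Provider]:
    """Return the available provider with the lowest priority number, or None."""
    # Walk the providers from priority 1 upward and stop at the first healthy one.
    for provider in sorted(providers, key=lambda p: p.priority):
        if is_available(provider):
            return provider
    return None


providers = [
    Provider("provider-b", priority=2),
    Provider("provider-a", priority=1),
    Provider("provider-c", priority=3),
]
down = {"provider-a"}  # pretend the priority 1 provider is unavailable
chosen = pick_provider(providers, lambda p: p.name not in down)
print(chosen.name if chosen else "no provider available")  # prints provider-b
```

Because the priority 1 provider is marked unavailable in this example, the request falls through to priority 2.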

Multi-provider models

Popular models such as Claude and GPT-4o are served by multiple providers, including the direct API, Amazon Bedrock, and Google Vertex AI. AllToken automatically routes each request to the healthiest, fastest option.

See which providers serve a model on its detail page under the "Providers" tab.