Routing Strategies
Intelligently route your AI requests to optimize for cost, speed, or reliability
💰 Cost-Based Routing
Route to the cheapest provider that supports your requested model. Perfect for batch processing and non-time-sensitive tasks.
{
"model": "gpt-3.5-turbo",
"messages": [...],
"routingStrategy": "cost"
}⚡ Latency-Based Routing
Route to the fastest provider based on historical response times. Ideal for real-time chat and interactive applications.
{
"model": "gpt-4",
"messages": [...],
"routingStrategy": "latency"
}🎯 Priority Routing
Use your preferred provider with automatic fallback. Great for compliance requirements or provider-specific features.
{
"model": "claude-3-opus",
"messages": [...],
"routingStrategy": "priority",
"preferredProvider": "anthropic"
}🛡️ Fallback Routing
Maximum reliability by automatically trying multiple providers. Essential for mission-critical systems.
{
"model": "gpt-3.5-turbo",
"messages": [...],
"routingStrategy": "fallback"
}⚡ Response Caching
All routing strategies benefit from intelligent caching. Identical requests return instantly at $0 cost!
Benefits:
- 0ms latency for cached responses
- $0 cost for cached responses
- Enabled by default
- Automatic cache key generation
Strategy Comparison
| Strategy | Best For | Pros |
|---|---|---|
| Cost | Batch processing | Lowest cost |
| Latency | Real-time chat | Fastest response |
| Priority | Compliance needs | Control + fallback |
| Fallback | Critical systems | Maximum reliability |