Smart Balancers
One endpoint. A live routing board behind it.
Keep the client contract stable while a control-plane board routes easy prompts to the efficient route, hard prompts to the reasoning route, and traffic from a failed primary path to the fallback route.
Live board: support-balancer
request: customer prompt
endpoint: support-balancer
efficient route (active): qwen-cheap, for summaries, drafts, simple replies
reasoning route (active): gpt-oss-120b, for multi-step support decisions
fallback route (active): backup group, so the same endpoint keeps answering
Control plane
Move routing decisions into the control surface.
Treat Smart Balancers like a routing control surface: keep one caller contract in place, tune route priorities behind it, and decide when easy prompts take the fast lane or fallback takes over.
Request enters
01 Your app calls one stable endpoint.
The SDK keeps using support-balancer instead of selecting raw model endpoints in application code.
Route selected
02 The board chooses the right lane.
Easy prompts can stay on efficient capacity while harder prompts move to a stronger reasoning route.
Fallback stays ready
03 The backup path is already wired.
When the primary route fails, Smart Balancers keep the same caller contract and move traffic behind it.
Fallback routing
Fail over without teaching every app a second route.
When the primary path becomes unavailable, Smart Balancers keep the caller on support-balancer and switch the route in the control layer.
support-balancer: client unchanged
primary route: priority 1
health state: unavailable
fallback route: active
01 Primary
Requests enter support-balancer and try the primary group first.
02 Failure
The primary route is marked unavailable inside the control layer.
03 Fallback
The same endpoint keeps answering through the fallback group.
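The three steps above can be sketched as a priority-ordered lookup over healthy routes. This is a minimal illustration, not the product's implementation: the `Route` class, the `pick_route` helper, the name `primary_group`, and the rule that a lower priority number is tried first are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str             # hypothetical group name
    priority: int         # assumption: lower number is tried first
    healthy: bool = True  # health state tracked by the control layer


def pick_route(routes: list[Route]) -> Route:
    """Return the healthy route with the lowest priority number."""
    candidates = [r for r in routes if r.healthy]
    if not candidates:
        raise RuntimeError("no healthy route available")
    return min(candidates, key=lambda r: r.priority)


routes = [
    Route("primary_group", priority=1),
    Route("fallback_group", priority=2),
]

before = pick_route(routes).name  # 01: primary is healthy, so it is chosen
routes[0].healthy = False         # 02: primary is marked unavailable
after = pick_route(routes).name   # 03: fallback answers instead
```

The caller never appears in this sketch, which is the point: the route changes while the endpoint name stays the same.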
Prompt-based routing
Spend strong models where the prompt earns it.
Put cost-vs-capability routing behind one endpoint. Smart Balancers keep easy prompts on the efficient route and reserve stronger models for the reasoning route.
board endpoint: support-balancer
Both decisions happen behind the same endpoint, so clients keep calling the same model name while the board chooses the route.
Prompt: "Summarise this support reply."
Route: efficient route
Prompt: "Resolve a multi-step refund dispute."
Route: reasoning route
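The page does not say how the board decides that a prompt is easy or hard. Purely as an illustration of the idea, the sketch below uses a keyword list and a word-count threshold; `HARD_HINTS`, `MAX_EASY_WORDS`, and `choose_route` are hypothetical names and the heuristic itself is an assumption, not the product's classifier.

```python
# Hypothetical signals; the real scoring logic is not described in the source.
HARD_HINTS = ("dispute", "multi-step", "escalate", "investigate")
MAX_EASY_WORDS = 40


def choose_route(prompt: str) -> str:
    """Return 'reasoning route' for prompts that look hard, else 'efficient route'."""
    text = prompt.lower()
    looks_hard = any(hint in text for hint in HARD_HINTS)
    is_long = len(text.split()) > MAX_EASY_WORDS
    return "reasoning route" if looks_hard or is_long else "efficient route"


easy = choose_route("Summarise this support reply.")
hard = choose_route("Resolve a multi-step refund dispute.")
```

With the two example prompts above, the first stays on the efficient route and the second moves to the reasoning route, while the client would send both to the same model name.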
What Smart Balancers already include.
Routing, fallback, and cost control without changing the client contract.
Target groups
Group multiple models behind one balancer so the endpoint stays stable while routing logic evolves.
Fallback routing
Pair a primary group with a fallback group and keep traffic moving when the first path breaks.
Prompt-based routing
Use cheap models for easy prompts and reserve stronger models for harder prompts.
Route priorities
Define route order explicitly with priority so primary and fallback behavior stays predictable.
OpenAI-compatible endpoint
Keep existing SDK flows and point them at one balancer endpoint instead of one raw model endpoint.
Capacity pricing inheritance
Balancers inherit the capacity-first pricing of each target group instead of reintroducing per-token billing.
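One way to picture target groups and route priorities is as a small declarative structure. The sketch below is an assumption about shape only: `TargetGroup`, `Balancer`, and `ordered` are hypothetical names, `backup-model` is a placeholder, and the priority values are illustrative rather than taken from any real board configuration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TargetGroup:
    name: str
    models: tuple[str, ...]
    priority: int  # assumption: lower number comes first


@dataclass
class Balancer:
    endpoint: str
    groups: list[TargetGroup] = field(default_factory=list)

    def ordered(self) -> list[TargetGroup]:
        """Return groups in priority order, rejecting duplicate priorities."""
        priorities = [g.priority for g in self.groups]
        if len(priorities) != len(set(priorities)):
            raise ValueError("each group needs a distinct priority")
        return sorted(self.groups, key=lambda g: g.priority)


board = Balancer(
    endpoint="support-balancer",
    groups=[
        TargetGroup("fallback_group", ("backup-model",), priority=3),
        TargetGroup("efficient_group", ("qwen-cheap",), priority=1),
        TargetGroup("reasoning_group", ("gpt-oss-120b",), priority=2),
    ],
)
order = [g.name for g in board.ordered()]
```

Rejecting duplicate priorities keeps the route order explicit, which matches the goal of predictable primary and fallback behavior.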
Existing SDK flow
One balancer endpoint. Existing SDK flow.
Your caller talks to one endpoint. Smart Balancers decide which group answers behind it, in what order, and when fallback takes over.
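from openai import OpenAI

# Point the standard OpenAI SDK at the balancer endpoint.
client = OpenAI(
    base_url="https://api.qdiv0.com/v1",
    api_key="your-api-key",
)

prompt = "Summarise this support reply."

# "support-balancer" is the balancer name, not a raw model name.
response = client.chat.completions.create(
    model="support-balancer",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

efficient_group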
Route simple prompts to cheaper capacity.
reasoning_group
Reserve stronger models for complex work.
fallback_group
Keep a backup route behind the same endpoint.
Balance cost, quality, and continuity behind one endpoint.
Put routing logic in the control layer, not in every client that calls it.