Smart Balancers

One endpoint. A live routing board behind it.

Keep the client contract stable while a control-plane board sends easy prompts to the efficient route, hard prompts to the reasoning route, and traffic from a failed primary path to fallback.

Live board

support-balancer: online

request: customer prompt

endpoint: support-balancer

efficient route (active): qwen-cheap, for summaries, drafts, simple replies

reasoning route (active): gpt-oss-120b, for multi-step support decisions

fallback route (active): backup group, so the same endpoint keeps answering

Control plane

Move routing decisions into the control surface.

Treat Smart Balancers like a routing control surface: keep one caller contract in place, tune route priorities behind it, and decide when easy prompts take the fast lane or fallback takes over.

Request enters

01

Your app calls one stable endpoint.

The SDK keeps using support-balancer instead of selecting raw model endpoints in application code.

Route selected

02

The board chooses the right lane.

Easy prompts can stay on efficient capacity while harder prompts move to a stronger reasoning route.

Fallback stays ready

03

The backup path is already wired.

When the primary route fails, Smart Balancers keep the same caller contract and move traffic behind it.

Fallback routing

Fail over without teaching every app a second route.

When the primary path becomes unavailable, Smart Balancers keep the caller on support-balancer and switch the route in the control layer.

support-balancer: client unchanged

primary route: priority 1

health state: unavailable

fallback route: active

  1. Primary

    Requests enter support-balancer and try the primary group first.

  2. Failure

    The primary route is marked unavailable inside the control layer.

  3. Fallback

    The same endpoint keeps answering through the fallback group.
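
The three steps above can be sketched in a few lines of Python. This is a minimal illustration of priority-ordered failover, not the Smart Balancers implementation: the `Route` class, its fields, and `pick_route` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """Hypothetical route record; the field names are assumptions."""
    name: str
    priority: int          # lower number is tried first
    healthy: bool = True


def pick_route(routes: list[Route]) -> Route:
    """Return the healthy route with the lowest priority number."""
    for route in sorted(routes, key=lambda r: r.priority):
        if route.healthy:
            return route
    raise RuntimeError("no healthy route available")


routes = [
    Route("primary", priority=1),
    Route("fallback", priority=2),
]

# Primary step: the primary group is tried first.
first_choice = pick_route(routes).name

# Failure step: the control layer marks the primary route unavailable.
routes[0].healthy = False

# Fallback step: the same call now resolves to the fallback group.
second_choice = pick_route(routes).name

print(first_choice, second_choice)
```

The caller never sees this selection. It keeps sending requests to support-balancer, which is the point of holding the logic in the control layer.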

Prompt-based routing

Spend strong models where the prompt earns it.

Put cost-vs-capability routing behind one endpoint. Smart Balancers keep easy prompts on the efficient route and reserve stronger models for the reasoning route.

board endpoint

support-balancer

Both decisions happen behind the same endpoint, so clients keep calling the same model name while the board chooses the route.

  1. Prompt

    Summarise this support reply.

    efficient route

  2. Prompt

    Resolve a multi-step refund dispute.

    reasoning route
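
To make the idea concrete, here is a toy sketch of how a prompt might be sorted into one of the two routes. The page does not describe the actual routing criteria, so the keyword list, the word-count threshold, and the `choose_route` function are all assumptions made for illustration.

```python
# Hypothetical keyword heuristic; the real routing criteria are not documented here.
HARD_PROMPT_HINTS = ("dispute", "multi-step", "escalate", "investigate")


def choose_route(prompt: str, max_easy_words: int = 40) -> str:
    """Return 'reasoning' for prompts that look hard, else 'efficient'."""
    text = prompt.lower()
    looks_hard = any(hint in text for hint in HARD_PROMPT_HINTS)
    is_long = len(text.split()) > max_easy_words
    return "reasoning" if looks_hard or is_long else "efficient"


# The two example prompts from the list above.
easy = choose_route("Summarise this support reply.")
hard = choose_route("Resolve a multi-step refund dispute.")

print(easy, hard)
```

A production router would likely use a stronger signal than keywords, but the contract is the same: the prompt goes in, a route name comes out, and the client is not involved in the decision.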

What Smart Balancers already include.

Routing, fallback, and cost control without changing the client contract.

Target groups

Group multiple models behind one balancer so the endpoint stays stable while routing logic evolves.

Fallback routing

Pair a primary group with a fallback group and keep traffic moving when the first path breaks.

Prompt-based routing

Use cheap models for easy prompts and reserve stronger models for harder prompts.

Route priorities

Define route order explicitly with priority so primary and fallback behavior stays predictable.

OpenAI-compatible endpoint

Keep existing SDK flows and point them at one balancer endpoint instead of one raw model endpoint.

Capacity pricing inheritance

Balancers inherit the capacity-first pricing of each target group behind them, so adding a balancer does not reintroduce per-token billing.

Existing SDK flow

One balancer endpoint. Existing SDK flow.

Your caller talks to one endpoint. Smart Balancers decide which group answers behind it, in what order, and when fallback takes over.

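quickstart.py

from openai import OpenAI

# Point the standard OpenAI client at the balancer endpoint.
client = OpenAI(
    base_url="https://api.qdiv0.com/v1",
    api_key="your-api-key",
)

# Example prompt; replace with the text your app receives.
prompt = "Summarise this support reply."

# "support-balancer" is the balancer name, not a raw model name.
response = client.chat.completions.create(
    model="support-balancer",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

route_config.json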

efficient_group

Route simple prompts to cheaper capacity.

reasoning_group

Reserve stronger models for complex work.

fallback_group

Keep a backup route behind the same endpoint.
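
The three groups could be expressed in a config along the following lines. The page does not show the contents of route_config.json, so this is only a guess at its shape: every key name and priority value is an assumption, and the fallback target is a placeholder because the backup group's models are not named. The sketch builds the structure in Python and prints it as JSON.

```python
import json

# Hypothetical shape for route_config.json; every key name is an assumption.
route_config = {
    "balancer": "support-balancer",
    "groups": [
        {"name": "efficient_group", "priority": 1, "targets": ["qwen-cheap"]},
        {"name": "reasoning_group", "priority": 1, "targets": ["gpt-oss-120b"]},
        # Placeholder target: the page does not name the backup group's models.
        {"name": "fallback_group", "priority": 2, "targets": ["<backup-model>"]},
    ],
}

config_json = json.dumps(route_config, indent=2)
print(config_json)
```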

Balance cost, quality, and continuity behind one endpoint.

Put routing logic in the control layer, not in every client that calls it.