Create a Smart Balancer

The create wizard is opinionated. The defaults cover the most common case (an intent classifier with three routes and a default). Most production balancers only need a few additional routes on top of that.

Steps

  1. Pick a name and a serving name

    The name is internal, the serving name is what clients will send as the model id.

  2. Choose the workload kind

    chat or embeddings. The balancer rejects mismatched calls.

  3. Choose a routing mode

    intent_classifier for many specialised routes, ordered for failover or weighted routing.

  4. Pick a router model

    Required for intent_classifier. A small chat instance is enough.

  5. Add routes

    Each route needs a name, an intent (a short declarative sentence), and a destination. Mark one route as default.

  6. Save and test

    The platform validates the configuration and shows a sample call you can paste into the playground.

Default route is required

The wizard blocks the save action if no route is marked as default. The default is what the balancer falls back to when no intent matches, and it is the only way to guarantee a response for ambiguous prompts.

Target groups (optional)

A target group is a named pool of instances. Use it when several balancers should share the same pool, or when you want to rotate capacity without editing each balancer.

target-group.json
1{
2  "name": "tech-support-pool",
3  "display_name": "Tech support pool",
4  "description": "Pool of tech-support instances behind a single name.",
5  "workload_kind": "chat",
6  "selection_policy": "weighted_round_robin",
7  "enabled": true,
8  "members": [
9    { "instance_id": "inst_a", "weight": 100, "enabled": true, "priority": 1 },
10    { "instance_id": "inst_b", "weight": 100, "enabled": true, "priority": 2 }
11  ]
12}

Smoke testing

After the balancer is created, the detail page shows a sample call you can paste into any OpenAI-compatible client. The smoke test hits the default route by default; the playground exposes a dropdown to pick a specific route for testing.