Error handling
The chat completions API returns standard HTTP status codes. The same codes apply to Compute instances, Public Models, and Smart Balancer routes. Clients should treat 5xx responses as retryable with backoff; 4xx responses should be inspected for the underlying cause before retrying.
| Status | When |
|---|---|
| 400 | Malformed request, invalid model id, unknown firewall slug, unknown tool type, unknown vector database slug, or other client-side validation failure. |
| 401 | Missing or invalid API key. Check the Authorization header and the API key itself. |
| 402 | Account balance is below the launch threshold for new Compute instances, or below the minimum required for a Public Model or Smart Balancer call. Top up and retry. |
| 403 | Firewall blocked the request. The response body includes the matched rule(s) and, when configured, a reason from the LLM judge. |
| 404 | Unknown public model, unknown vector database slug, unknown firewall slug, or unknown serving name on the chosen backend. |
| 502 / 503 | Provider is unreachable, or all providers failed (for Public Models with multi-provider failover). The platform retries automatically on recoverable failures before returning this status. |
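The retry guidance above can be sketched as a small client-side helper. This is a minimal illustration, not a platform-provided utility: `is_retryable`, `call_with_backoff`, and the `send` callable (which is assumed to perform one request and return a `(status, body)` pair) are hypothetical names, and the attempt count and delays are arbitrary defaults.

```python
import random
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, raw response body)


def is_retryable(status: int) -> bool:
    """Only 5xx responses are retried blindly; 4xx need inspection first."""
    return 500 <= status <= 599


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it returns a non-5xx status or attempts run out.

    `send` is a hypothetical callable that issues one request. The delay
    cap doubles after each failed attempt, and the actual wait is drawn
    uniformly from [0, cap] (exponential backoff with full jitter).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if not is_retryable(status) or last_attempt:
            # Success, a 4xx to inspect, or a 5xx after the final attempt.
            return status, body
        cap = min(max_delay, base_delay * (2 ** attempt))
        sleep(random.uniform(0, cap))

    raise AssertionError("unreachable: the loop always returns")
```

A 4xx status such as 402 or 403 is returned to the caller immediately, so the caller can read the error body and decide whether a retry makes sense (for example, after topping up the account balance).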
Inspecting the response body
Every error response includes a JSON body with a message field describing the failure and, where relevant, a type and structured details object. The shape is compatible with the OpenAI error envelope, so existing OpenAI SDK error handling code (for example, openai.APIStatusError) works without changes.
For 403 responses, the details object carries the firewall decision: the rule that matched, the judge's verdict (if the firewall uses one), and whether the prompt was rewritten, blocked, or allowed with an annotation.
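As a rough sketch of reading these fields without an SDK, the helper below parses a raw error body using only the standard library. The function name `parse_error` is hypothetical. The code also assumes the fields may sit either at the top level or under an `error` key, as in the OpenAI envelope; the exact nesting and the keys inside the details object are not specified here, so the details are passed through untouched.

```python
import json
from typing import Any, Dict


def parse_error(status: int, body_text: str) -> Dict[str, Any]:
    """Extract message, type, and details from an error response body.

    Assumption: the fields live either at the top level of the JSON body
    or under an "error" key. Both layouts are accepted.
    """
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        payload = {}  # Non-JSON body, e.g. from an intermediate proxy.
    if not isinstance(payload, dict):
        payload = {}

    error = payload.get("error", payload)
    if not isinstance(error, dict):
        # Some servers send {"error": "plain string"}.
        error = {"message": str(error)}

    return {
        "status": status,
        "message": error.get("message", ""),
        "type": error.get("type"),
        # For 403 responses this holds the firewall decision; its inner
        # keys are left as-is because their names are not assumed here.
        "details": error.get("details"),
    }


# Illustrative body only: the type value and the contents of "details"
# are placeholders, not real platform output.
sample = '{"error": {"message": "Blocked by firewall", "type": "example_type", "details": {"placeholder": true}}}'
info = parse_error(403, sample)
```

Callers can branch on `info["status"]` first and consult `info["details"]` only when it is present, since not every error carries a details object.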