Launch flow

A detailed walkthrough of what happens between clicking Launch on a new instance and the model answering its first request. Use this guide to debug launches that hang, fail, or land in the wrong state.

Pre-flight

Before the launch request is sent, the platform verifies:

the model repo exists and is public, and its architecture is supported by the chosen runtime preset
the runtime image is allowed by the account policy (e.g. verified, signed)
the account balance is above the minimum threshold needed to bill at least one hour
if Smart mode is used, at least one provider matches the constraints (region, max price, capacity tier)
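
The checks above can be sketched as a single validation pass that collects every failure instead of stopping at the first one. This is only an illustration: `LaunchRequest`, its field names, and `preflight_errors` are hypothetical and do not describe the platform's actual API or data model.

```python
from dataclasses import dataclass


@dataclass
class LaunchRequest:
    # Hypothetical fields that mirror the pre-flight checks listed above.
    repo_exists: bool
    repo_public: bool
    architecture_supported: bool
    image_allowed: bool
    balance: float
    min_balance: float  # assumed threshold that covers at least one billed hour
    smart_mode: bool = False
    matching_providers: int = 0


def preflight_errors(req: LaunchRequest) -> list[str]:
    """Return every failed check; an empty list means the launch may proceed."""
    errors = []
    if not (req.repo_exists and req.repo_public):
        errors.append("model repo is missing or not public")
    elif not req.architecture_supported:
        errors.append("architecture is not supported by the runtime preset")
    if not req.image_allowed:
        errors.append("runtime image is not allowed by the account policy")
    if req.balance < req.min_balance:
        errors.append("account balance is below the launch threshold")
    # The provider check only applies when Smart mode is used.
    if req.smart_mode and req.matching_providers == 0:
        errors.append("no provider matches the Smart mode constraints")
    return errors
```

Collecting all errors at once lets a user fix several problems in one pass rather than discovering them launch by launch.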

VRAM is estimated, not measured

The launch screen shows an estimate based on the model config and quantization. Actual VRAM usage depends on the loaded weights, the runtime batcher, and the chosen context size. If the estimate is close to the GPU's memory limit, choose the next GPU tier up.
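
To see why the estimate can drift from reality, here is a rough back-of-the-envelope calculation. It is not the platform's estimator: it uses the common approximation of weight memory plus key/value cache, ignores activations and runtime overhead, and the 90% headroom cutoff is an assumed value chosen only for illustration.

```python
def estimate_vram_gib(n_params, bits_per_weight, n_layers, n_kv_heads,
                      head_dim, context_len, kv_bytes=2, batch=1):
    """Rough VRAM estimate in GiB: quantized weights plus KV cache."""
    # Weight memory: one quantized value per parameter.
    weights = n_params * bits_per_weight / 8
    # KV cache: a key and a value vector per layer, per KV head, per token.
    kv_cache = 2 * n_layers * n_kv_heads * head_dim * context_len * kv_bytes * batch
    return (weights + kv_cache) / 2**30


def needs_next_tier(estimate_gib, gpu_gib, headroom=0.9):
    """True when the estimate sits too close to the GPU limit.

    The 0.9 headroom factor is an assumption, not a documented platform value.
    """
    return estimate_gib > gpu_gib * headroom
```

Because the KV cache grows linearly with context size and batch, the same model can fit comfortably at a short context and run out of memory at a long one.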

Instance states

State	Meaning
non_bootstrap	Initial state. The instance is being registered against the provider.
pending	The provider accepted the request and is allocating capacity. Typical duration: 30s – 5min.
running	The model is loaded and the OpenAI-compatible endpoint is serving traffic.
stopped	The user stopped the instance. The provider may still hold the underlying pod until it is deleted.
failed	A non-recoverable error happened. The instance is preserved with a `failure_reason` for inspection.
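
A client that waits for a launch to settle can treat the states in the table as a small state machine. The sketch below assumes a hypothetical `get_state` callable that returns the current state string; how that value is fetched (API, CLI, or UI) is not specified here, and the 10-minute timeout is an assumed default, set to twice the typical upper bound for `pending`.

```python
import time

# States in which the launch has settled and polling can stop.
TERMINAL_STATES = {"running", "stopped", "failed"}
KNOWN_STATES = {"non_bootstrap", "pending"} | TERMINAL_STATES


def wait_for_launch(get_state, timeout_s=600.0, poll_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until the instance reaches a terminal state.

    get_state is a hypothetical callable supplied by the caller.
    Raises TimeoutError if no terminal state is seen within timeout_s.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state not in KNOWN_STATES:
            raise ValueError(f"unexpected instance state: {state!r}")
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"instance still {state!r} after {timeout_s}s")
        sleep(poll_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.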

Common failure reasons

Reason	What it means
OOM_KILLED	The runtime exceeded available VRAM. Pick a larger GPU or reduce context size.
PROVIDER_TIMEOUT	The provider did not respond within the allocation window. The scheduler will retry on the next start.
IMAGE_PULL_FAILED	The runtime image could not be pulled. Usually a transient provider issue; retry.
MODEL_LOAD_ERROR	The model files failed integrity checks. Confirm the repo and quantization are supported.
BILLING_BLOCKED	The account balance is below the launch threshold. Top up and try again.
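
For automation, the table above can be encoded as a lookup from `failure_reason` to a next step. This is a minimal sketch: the `Action` enum and `next_action` helper are hypothetical names, and only the reason codes and their remedies come from the table.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical action names; the descriptions paraphrase the table above.
    RETRY = "retry the launch"
    RESIZE = "pick a larger GPU or reduce context size"
    CHECK_MODEL = "confirm the repo and quantization are supported"
    TOP_UP = "top up the account and try again"
    ESCALATE = "reason not in the table; investigate further"


REMEDIATION = {
    "OOM_KILLED": Action.RESIZE,
    "PROVIDER_TIMEOUT": Action.RETRY,
    "IMAGE_PULL_FAILED": Action.RETRY,
    "MODEL_LOAD_ERROR": Action.CHECK_MODEL,
    "BILLING_BLOCKED": Action.TOP_UP,
}


def next_action(failure_reason: str) -> Action:
    """Map a failure_reason to a suggested action, defaulting to ESCALATE."""
    return REMEDIATION.get(failure_reason, Action.ESCALATE)
```

Defaulting unknown reasons to `ESCALATE` avoids blindly retrying a failure the table does not describe.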

Recovery patterns

If the instance is in the failed state, open it and read `failure_reason`. The platform also surfaces a one-line remediation hint.
Most provider-side failures are transient. Stop the instance and start it again; the scheduler will re-route it.
For OOM_KILLED, delete the instance and re-launch with a larger GPU or a smaller context size.
For BILLING_BLOCKED, open the Top up page and confirm the ledger entry before retrying the launch.
Still stuck? Use Troubleshooting for diagnostic flows or contact support from the instance detail page.