Quickstart
Deploy a model and make your first API call in under five minutes. The whole flow uses the OpenAI SDK, so any client you already have keeps working once you change the base URL.
No SDK to learn
1. Install the OpenAI SDK
The QDivZero API is OpenAI-compatible, so the official Python SDK is the fastest way to start. Node, Go, and curl work the same way.
```shell
pip install openai
```
2. Configure your API key
Set the key as an environment variable so it never lands in source control. The same key works for the dashboard, the CLI, and the API.
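Once the variable is exported with the command in this step, Python code can read it back and fail early with a readable message if it is missing. The sketch below is one way to do that; `load_api_key` is a hypothetical helper name introduced here for illustration, not part of the OpenAI SDK or the QDivZero API.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the QDivZero API key from the environment.

    Hypothetical helper for illustration. The `env` argument exists only
    so the lookup can be exercised without touching the real process
    environment; by default os.environ is used.
    """
    source = os.environ if env is None else env
    # Treat a missing variable and a blank one the same way.
    key = source.get("QDIV0_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "QDIV0_API_KEY is not set; export it before creating the client."
        )
    return key
```

Passing `api_key=load_api_key()` to the client instead of `os.environ["QDIV0_API_KEY"]` replaces a bare `KeyError` with a message that names the missing variable.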
```shell
export QDIV0_API_KEY="qdiv0_sk_..."
```
3. Launch an instance
Open the platform, choose any Hugging Face repo from the catalog, and click Launch. The example below uses Qwen 3.6 35B A3B MTP, a strong general model with multi-token prediction enabled.
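After launching, it can be useful to confirm that the serving name is visible before sending traffic. The sketch below assumes the API also implements the OpenAI-style `GET /v1/models` listing, which this guide does not confirm. `list_models_request`, `serving_names`, and `is_serving` are hypothetical helpers, and only the standard library is used so the request shape stays visible.

```python
import json
import urllib.request
from typing import List

BASE_URL = "https://api.qdiv0.com/v1"  # base URL used throughout this guide


def list_models_request(api_key: str) -> urllib.request.Request:
    """Build a GET request for the model listing.

    Assumption: the API exposes the OpenAI-style /models route.
    """
    return urllib.request.Request(
        f"{BASE_URL}/models",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def serving_names(payload: dict) -> List[str]:
    """Extract model ids from an OpenAI-style list response body."""
    return [item["id"] for item in payload.get("data", [])]


def is_serving(api_key: str, name: str) -> bool:
    """Return True if `name` appears in the listing (makes a network call)."""
    with urllib.request.urlopen(list_models_request(api_key), timeout=10) as resp:
        return name in serving_names(json.load(resp))
```

If the listing route is available, `is_serving(key, "qwen-3.6-35b-a3b-mtp")` would report whether the serving name used in the next step is ready to receive requests.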
4. Make your first request
Point the OpenAI client at the QDivZero base URL and call chat.completions.create. The model id is the serving name you set on the instance.
```python
import os

from openai import OpenAI

# Send requests to QDivZero instead of the default OpenAI endpoint.
client = OpenAI(
    base_url="https://api.qdiv0.com/v1",
    api_key=os.environ["QDIV0_API_KEY"],
)

response = client.chat.completions.create(
    # The serving name you set on the instance.
    model="qwen-3.6-35b-a3b-mtp",
    messages=[
        {"role": "user", "content": "Explain capacity pricing in one sentence."}
    ],
)

# Print the text of the first (and here only) completion.
print(response.choices[0].message.content)
```
5. Stop the meter
When you are done, stop the instance from the dashboard or the API. The billing meter pauses immediately and the next start reuses the same configuration.
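In a script, the stop call can be tied to the end of the work so the meter pauses even if a request raises. The sketch below mirrors the curl command in this step using only the standard library. `build_stop_request` and `stop_when_done` are hypothetical helper names, and the code assumes the endpoint accepts a POST without a body, as the curl form suggests.

```python
import urllib.request
from contextlib import contextmanager
from typing import Callable, Iterator

BASE_URL = "https://api.qdiv0.com/v1"  # base URL used throughout this guide


def build_stop_request(instance_id: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that stops an instance (no request body)."""
    return urllib.request.Request(
        f"{BASE_URL}/instances/{instance_id}/stop",
        headers={"Authorization": f"Bearer {api_key}"},
        method="POST",
    )


def _send(request: urllib.request.Request) -> int:
    """Default sender: perform the HTTP call and return the status code."""
    with urllib.request.urlopen(request, timeout=10) as resp:
        return resp.status


@contextmanager
def stop_when_done(
    instance_id: str,
    api_key: str,
    send: Callable[[urllib.request.Request], object] = _send,
) -> Iterator[None]:
    """Run the enclosed block, then always issue the stop request."""
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions alike.
        send(build_stop_request(instance_id, api_key))
```

Wrapping the inference calls in `with stop_when_done(instance_id, api_key):` keeps the stop request in one place. The `send` parameter is there so the behavior can be checked without network access.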
```shell
curl -X POST "https://api.qdiv0.com/v1/instances/$INSTANCE_ID/stop" \
  -H "Authorization: Bearer $QDIV0_API_KEY"
```
What's next?
- Smart Balancers — route traffic between multiple models with a single serving name.
- Firewalls — evaluate every prompt with an LLM judge before it reaches the model.
- Vector databases — add retrieval over your own data with multimodal embeddings.
- Billing — understand the ledger, top-up limits, and how invoices are generated.