Getting started
QDivZero is a managed inference platform. You pick a model, we pick the GPU, and the model is exposed behind an OpenAI-compatible endpoint. This guide walks you from a brand-new account to your first successful API call in roughly five minutes.
What you need
1. Create an account
Head over to the registration page and create an account. The platform creates a default workspace for you, assigns an account ID, and grants the first member the owner role. Additional members can be invited from the account settings later.
2. Top up your balance
Compute is billed per active GPU hour. Open Billing → Top up, choose an amount in EUR, and complete the secure checkout. The ledger records every movement (top-ups, debits, refunds) so you always know what your balance is doing.
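To make the ledger idea concrete, here is a minimal sketch of how a balance follows from a list of movements and how far it stretches at a given hourly rate. The entry kinds, the tuple layout, and the 2.00 EUR rate are assumptions for illustration only; they are not the platform's actual schema or pricing.

```python
from decimal import Decimal

# Hypothetical ledger entries as (kind, amount in EUR).
# The kinds and the layout are assumptions, not the platform's schema.
LEDGER = [
    ("top_up", Decimal("20.00")),
    ("debit", Decimal("3.50")),
    ("debit", Decimal("1.25")),
    ("refund", Decimal("1.25")),
]


def balance(entries):
    """Add top-ups and refunds, subtract debits."""
    total = Decimal("0")
    for kind, amount in entries:
        if kind in ("top_up", "refund"):
            total += amount
        elif kind == "debit":
            total -= amount
        else:
            raise ValueError(f"unknown ledger entry kind: {kind}")
    return total


def gpu_hours_affordable(current_balance, rate_per_gpu_hour):
    """Active GPU hours the balance covers at an assumed hourly rate."""
    return current_balance / rate_per_gpu_hour


if __name__ == "__main__":
    current = balance(LEDGER)
    print(current)
    # 2.00 EUR per GPU hour is a made-up rate for the example.
    print(gpu_hours_affordable(current, Decimal("2.00")))
```

Decimal is used instead of float so that currency amounts do not pick up binary rounding errors.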
Try it with a small balance
3. Create an API key
API keys are the only way to authenticate against the QDivZero gateway. Open API keys and create one. The full token is shown once; copy it to your secret manager before leaving the page.
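In application code, the key would typically be read from the environment rather than pasted into source. The sketch below assumes the QDIV0_API_KEY variable name that the shell snippet in this step exports; the helper names load_api_key and mask are hypothetical and not part of any QDivZero SDK.

```python
import os


def load_api_key(env=None):
    """Read the key from the environment and fail loudly if it is missing."""
    env = os.environ if env is None else env
    key = env.get("QDIV0_API_KEY", "").strip()
    if not key:
        raise RuntimeError("QDIV0_API_KEY is not set")
    return key


def mask(key, visible=4):
    """Return a log-safe form of the key that shows only its last characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging mask(key) instead of the key itself keeps the full token out of log files, which matters because the token is only displayed once.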
# Set the key for the current shell
export QDIV0_API_KEY="qdiv0_sk_..."
4. Launch your first instance
An instance is a deployed model behind a public serving name. From Compute → New instance, pick a Hugging Face model repository. QDivZero validates VRAM requirements, selects a compatible GPU, and exposes the model at the OpenAI-compatible base URL.
- Choose between Manual mode (you pick the GPU) and Smart mode (the scheduler picks for you).
- Set a serving name you will remember, e.g. qwen35-demo.
- Optionally attach a firewall to evaluate every prompt before it reaches the model.
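If you use Manual mode, a rough back-of-the-envelope check can help you choose a GPU before launching. The sketch below estimates the memory taken by model weights alone from the parameter count and data type. It is not how QDivZero validates VRAM requirements, and the 1.2 overhead factor for activations and cache is an arbitrary assumption.

```python
# Bytes needed to store one parameter in common data types.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weights_gib(num_params, dtype="float16"):
    """Memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


def fits(num_params, gpu_vram_gib, dtype="float16", overhead=1.2):
    """Rough fit check; the overhead factor is an assumed safety margin."""
    return weights_gib(num_params, dtype) * overhead <= gpu_vram_gib


if __name__ == "__main__":
    # A 7-billion-parameter model in float16 on a 24 GiB card.
    print(round(weights_gib(7_000_000_000), 2))
    print(fits(7_000_000_000, 24))
```

Real memory use also depends on context length, batch size, and the serving runtime, so treat the result as a lower bound rather than a guarantee.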
5. Call the endpoint
Once the instance transitions to running, it accepts OpenAI-compatible traffic. The base URL is https://api.qdiv0.com/v1 and the model name is the serving name you set during launch.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.qdiv0.com/v1",
    api_key="your-api-key",
)

response = client.chat.completions.create(
    model="qwen35-demo",
    messages=[{"role": "user", "content": "Hello, world!"}],
)
print(response.choices[0].message.content)
Need embeddings instead? Use the /v1/embeddings endpoint with an embeddings workload instance.
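As a sketch of what an embeddings call might look like, the example below builds the request with the Python standard library only. It assumes the endpoint follows the OpenAI embeddings request and response shapes (a JSON body with model and input, vectors under data[i].embedding) and accepts the key as a Bearer token. The serving name my-embeddings-demo is hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.qdiv0.com/v1"


def build_embeddings_request(api_key, model, texts):
    """Build a POST request for the embeddings endpoint without sending it."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={
            # Bearer authentication is assumed from the OpenAI convention.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def embed(api_key, model, texts):
    """Send the request and return one vector per input text."""
    req = build_embeddings_request(api_key, model, texts)
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    # Response layout assumed to match the OpenAI embeddings format.
    return [item["embedding"] for item in payload["data"]]


# Usage (performs a real network call, so it is left commented out):
# import os
# vectors = embed(os.environ["QDIV0_API_KEY"], "my-embeddings-demo", ["Hello, world!"])
# print(len(vectors), len(vectors[0]))
```

Separating request construction from sending makes the payload easy to inspect or unit-test without touching the network. The same call can also be made through the OpenAI client shown above via client.embeddings.create.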