Public Models

Public models are pre-deployed inference endpoints operated by QDivZero. You call them by name, with no launch wizard, no GPU selection, and no scheduler, and you pay per token consumed. Under the hood, the platform routes each request to one of the providers that host the requested model.

Public models vs. Compute instances

A Compute instance is a model you deploy against a Hugging Face repo, billed per active GPU hour, with a serving name you pick. A public model is a model QDivZero deploys, billed per token, with a fixed name from the catalog. Use Compute for private or experimental models; use public models for production traffic against the most common frontier models without managing capacity.

Catalog

The table below lists the current catalog. The platform adds models as providers expose them. Prices are in EUR per million tokens, with input and output tokens priced separately.

Model                            Workload  European  Input / M tokens  Output / M tokens
deepseek-v3.2                    chat      No        €0.35             €1.43
deepseek-v3.2-european           chat      Yes       €0.81             €2.18
gpt-oss-120b-european            chat      Yes       €0.34             €1.37
gpt-oss-safeguard-120b-european  chat      Yes       €0.34             €1.37
gpt-oss-20b-european             chat      Yes       €0.09             €0.39
gpt-oss-safeguard-20b-european   chat      Yes       €0.09             €0.39
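
Because input and output tokens are priced separately, the cost of a workload is the sum of two terms. The sketch below shows that arithmetic using the catalog prices listed above. The function name estimate_cost_eur and the dictionary PRICES_EUR_PER_M are illustrative names, not part of any QDivZero SDK, and the prices are a snapshot that may not match your invoice.

```python
from decimal import Decimal

# Snapshot of the catalog: (input, output) prices in EUR per million tokens.
# Hypothetical helper data; refresh it whenever the catalog changes.
PRICES_EUR_PER_M = {
    "deepseek-v3.2": (Decimal("0.35"), Decimal("1.43")),
    "deepseek-v3.2-european": (Decimal("0.81"), Decimal("2.18")),
    "gpt-oss-120b-european": (Decimal("0.34"), Decimal("1.37")),
    "gpt-oss-safeguard-120b-european": (Decimal("0.34"), Decimal("1.37")),
    "gpt-oss-20b-european": (Decimal("0.09"), Decimal("0.39")),
    "gpt-oss-safeguard-20b-european": (Decimal("0.09"), Decimal("0.39")),
}

MILLION = Decimal(1_000_000)


def estimate_cost_eur(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in EUR of a workload against one catalog model."""
    input_price, output_price = PRICES_EUR_PER_M[model]  # KeyError if not in catalog
    return (input_price * input_tokens + output_price * output_tokens) / MILLION


if __name__ == "__main__":
    # 2M input tokens and 500k output tokens on the global deepseek-v3.2 model.
    print(estimate_cost_eur("deepseek-v3.2", 2_000_000, 500_000))
```

For 2,000,000 input tokens and 500,000 output tokens on deepseek-v3.2, this works out to 2 × €0.35 + 0.5 × €1.43 = €1.415. Decimal is used instead of float so that small per-token amounts do not pick up binary rounding error.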

Variants ending in -european

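European variants are pinned to a single EU region and stay inside the EU to satisfy data-residency requirements. They are billed at a different rate than the global variant of the same model: deepseek-v3.2-european, for example, costs €0.81 input and €2.18 output per million tokens, against €0.35 and €1.43 for deepseek-v3.2.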
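
Since the variant is selected purely by model name, a client can choose between the global and European endpoints by adding or omitting the -european suffix. The following is a minimal client-side sketch of that naming rule. The function resolve_model and the CATALOG set are hypothetical helpers, and the set is a snapshot of the catalog above, not something fetched from the platform.

```python
# Snapshot of catalog model names (hypothetical client-side copy).
CATALOG = {
    "deepseek-v3.2",
    "deepseek-v3.2-european",
    "gpt-oss-120b-european",
    "gpt-oss-safeguard-120b-european",
    "gpt-oss-20b-european",
    "gpt-oss-safeguard-20b-european",
}

EU_SUFFIX = "-european"


def resolve_model(name: str, require_eu: bool) -> str:
    """Return the catalog name to call, honoring an EU data-residency flag."""
    if require_eu and not name.endswith(EU_SUFFIX):
        name = name + EU_SUFFIX  # switch to the EU-pinned variant
    if name not in CATALOG:
        raise LookupError(f"model {name!r} is not in the catalog snapshot")
    return name


if __name__ == "__main__":
    print(resolve_model("deepseek-v3.2", require_eu=True))
    print(resolve_model("deepseek-v3.2", require_eu=False))
```

With this snapshot, asking for gpt-oss-120b without the EU flag raises LookupError, because the catalog currently lists that model only as a European variant.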

Where to go next