The AI that runs on your own hardware.
Train a custom small language model on your operations data. Ship it as a desktop app, a CLI, or an edge service. No per-token bills. No data leaving your machine. Built for Indian SMBs and operations-heavy businesses by ZeroOne D.O.T.S AI.
Generic AI is a tax on every conversation.
A custom SLM is the only AI strategy that gets cheaper, more accurate, and more defensible as you use it. Four reasons why this matters for operations-heavy businesses.
Your data trains it. Your data stays.
Tickets, call transcripts, PO data, SOPs, vendor emails — whatever you have. Encrypted in transit, processed on your hardware. We never see it.
Trained for the job you actually do.
Generic LLMs hallucinate on your part numbers, your supplier names, your warranty SOPs. A fine-tuned SLM doesn't — because it learned from yours.
Runs where you run.
Export GGUF, ONNX, or a Docker image. Deploy to a laptop, a Jetson, a 4-core VM in your colo. No internet required. No vendor lock-in.
Zero per-token bills. Forever.
Train once, run forever. Compare against ₹40,000–₹2,00,000/month in GPT-4 API costs for a busy ops team. Your CFO will thank you.
Four steps. No ML PhD required.
Custom SLM is the studio + runtime ZeroOne wished existed when we built our first client model. It's now the same pipeline we use internally.
Bring your data
CSV, JSONL, Markdown, PDFs, call recordings, support tickets, SOPs. The studio normalizes it into instruction-pair training format. Bad rows get flagged, not dropped silently.
~200–10,000 rows is the workable range. We'll tell you if you need more.
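For illustration, here's what normalizing one raw support-ticket row into an instruction-pair training record might look like. This is a minimal sketch: the ticket fields, the `instruction`/`response` key names, and the JSONL output are common fine-tuning conventions, not necessarily the Studio's exact schema.

```python
import json

# Hypothetical raw row from a support-ticket export.
ticket = {
    "subject": "Warranty claim for SKU-4412",
    "body": "Customer reports motor failure after 3 months.",
    "resolution": "Approved replacement under 12-month warranty; RMA issued.",
}

# Instruction pair: the model learns to map (subject + body) -> resolution.
pair = {
    "instruction": f"{ticket['subject']}\n{ticket['body']}",
    "response": ticket["resolution"],
}

# One JSON object per line (JSONL) is the usual training-file format.
line = json.dumps(pair, ensure_ascii=False)
print(line)
```

Each row of your export becomes one such line; malformed rows are the ones the studio flags rather than silently dropping.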
Pick a base model
Qwen 2.5 (3B / 7B), Llama 3.2 (1B / 3B), Phi 3.5, Gemma 2, or our DOTS-tuned starter. Chosen to match your hardware target — desktop, edge, or server — and your latency budget.
Hindi · Marathi · Gujarati · Tamil support varies by base. We benchmark for you.
Fine-tune on your hardware
LoRA or QLoRA fine-tuning on your GPU, your colo, or rented H100 hours. Loss curves, eval-set accuracy, hallucination rate against your hold-out — all visible.
Train time: 20 minutes (3B, LoRA, 1k rows) to 6 hours (7B, full FT, 10k rows).
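For intuition on why LoRA keeps train times and hardware needs this small, here's a back-of-envelope parameter count. The hidden size, layer count, rank, and choice of adapted matrices are illustrative assumptions for a ~3B model, not the Studio's actual defaults.

```python
# LoRA replaces full updates of each weight matrix W (hidden x hidden)
# with two small factors A (hidden x rank) and B (rank x hidden).
hidden = 3072   # hidden dimension of a ~3B model (assumed)
layers = 28     # transformer layers (assumed)
rank = 16       # LoRA rank (assumed)

full_params_per_matrix = hidden * hidden
lora_params_per_matrix = 2 * hidden * rank

# Assume we adapt two attention projections (e.g. q_proj, v_proj) per layer.
matrices = 2 * layers
full = matrices * full_params_per_matrix
lora = matrices * lora_params_per_matrix

print(f"full-update params: {full:,}")
print(f"LoRA params:        {lora:,}  ({100 * lora / full:.1f}% of full)")
```

The trainable fraction is roughly 2 × rank / hidden — about 1% here — which is why a 3B LoRA run fits in minutes on a single GPU while a full fine-tune takes hours.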
Ship anywhere
Export GGUF for llama.cpp / Ollama, ONNX for edge devices, or a Docker image with a built-in OpenAI-compatible server. Update without re-training. Roll back without losing data.
Already wired for: Mac laptops · Jetson Orin · 4-core VMs · Raspberry Pi 5.
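Because the Docker export ships an OpenAI-compatible server, any standard client can talk to the deployed model. A stdlib-only sketch of building such a request — the port, path, and model name (`ops-slm`) are assumptions here; the real values would come from your export's configuration.

```python
import json
from urllib import request

# Assumed local endpoint of the exported container (illustrative).
BASE_URL = "http://localhost:8080/v1/chat/completions"

# Standard OpenAI-style chat payload; "ops-slm" is a hypothetical model name.
payload = {
    "model": "ops-slm",
    "messages": [
        {"role": "user", "content": "What is the RMA process for SKU-4412?"}
    ],
    "temperature": 0.2,
}

req = request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the exported container is running locally:
# with request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Swapping the base URL is the whole migration story for code already written against a hosted OpenAI-compatible API.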
Built in public. Ship dates, not promises.
Four phases. Studio is the day-one product. Hub, Cloud, and Vault unlock as the waitlist converts. Founder seats get a vote on prioritization and a forever-locked rate as we add capability.
Studio
Available in v0.1
Web-based fine-tuning studio. Dataset upload, training runs, eval dashboards, GGUF export. The piece you need to ship your first model.
Hub
Q3 2026
Public + private model registry. Pin a base model, share fine-tunes with your team, fork from community models. Built on Hugging Face conventions with a privacy layer.
Cloud
Q4 2026
Optional hosted inference for teams that don't want to manage GPUs. Per-second billing, OpenAI-compatible API, EU + India data residency.
Vault
Q1 2027
Encrypted dataset vault with row-level audit. SOC 2 + ISO 27001 path. The piece your CISO will want signed off before scaling.
One tier. Twenty-five seats. Then it closes.
We don't know what the right price is yet. The first 25 founders pay ZeroOne to figure that out with us — and lock in their rate forever in exchange. After 25 seats, the next tier is announced, and it won't be this cheap.
Founder
25 seats only. For founders, CTOs, and ops leads at SMBs that want to own their AI stack instead of renting it.
₹59,988 / year · billed monthly or annually · locked for life
- Up to 10 fine-tuning runs per month
- Up to 3 production models exported
- Train on Qwen, Llama 3.2, Phi 3.5, Gemma 2 — and ZeroOne starter bases
- GGUF · ONNX · Docker export · all formats included
- Direct Slack channel with the founding ZeroOne team
- Vote on Hub, Cloud, Vault roadmap priorities
- Forever-locked pricing — no renewal hikes, ever
- Migration support if you're moving off GPT-4 / Claude / Gemini APIs
Tiered pricing on a pre-launch product is theater. We'd rather tell you what we don't know. We don't know what the right per-seat price is at scale. We don't know if you need 10 runs/month or 50.
What we do know: 25 founders paying ₹4,999/month each gives us ₹15L/year — enough runway to ship Studio and Hub without pretending we're an enterprise platform on day one.
Not a founder? You can still
Join the free waitlist. We'll reach out when general access opens. No card, no commitment.
The questions worth asking.
Why a custom SLM instead of GPT-4 or Claude?
Cost, control, and accuracy on your domain. A 3B model fine-tuned on your operations data routinely outperforms GPT-4 on your specific tasks at 100× lower per-inference cost, with zero data leaving your machine. You give up generality; you gain ownership.
Can I train without a data-center GPU?
Yes. Small bases (1–3B) train fine on a Mac M3 / M4 in a few hours, or on a rented L4 GPU hour (~₹100). For larger models we can rent H100 time for you, transparently. The Studio shows you the cheapest option for each run.
How much data do I need?
200 high-quality instruction-response pairs is the floor for a useful fine-tune on a domain task; 1,000–5,000 is the sweet spot. We help you bootstrap from raw data (tickets, transcripts, SOPs) into the right format.
Does it work in Indian languages?
Depends on the base model. Llama 3.2 and Qwen 2.5 both have credible Hindi performance; Tamil and Bengali are weaker. We benchmark each base on your eval set before training, so you know what you're getting.
Where does my data go?
Nowhere we don't tell you. Local training stays on your machine, full stop. If you rent GPU time through us, data is uploaded encrypted, processed in an ephemeral container, and deleted when the run completes. Vault (Q1 2027) adds row-level audit and a SOC 2 path.
What if the model isn't good enough?
The Studio shows you eval-set accuracy, hallucination rate against your hold-out, and per-example failures before you export. We help you diagnose: more data, a different base, different hyperparameters. Bad runs don't burn your credits.
How is this different from other local-AI tools?
Adjacent. Those are excellent for indie devs in the US and EU. We're focused on Indian SMBs and operations-heavy businesses — manufacturing, logistics, BPO, fintech ops — where the use cases differ: regional languages, on-prem deployment, INR economics. We're betting that market is underserved.
Can you just do it for us?
Yes. The consulting layer at zeroonedotsai.consulting offers fine-tuning, eval-set construction, and deployment as a service — for teams that want results, not tools. The platform powers both self-serve and consulting work.