Same API shape as OpenAI. Change the base URL. Pay less — because the network subsidises real compute, not random mining.
Get an API key from Abakos (preview soon)
Set base URL to https://api.abakos.ai/v1
Pick a coder model — streaming works out of the box
Cursor · VS Code + Continue · Cline · Roo Code · Python openai · Node · LangChain · LiteLLM
No custom SDK. If it speaks OpenAI, it speaks Abakos.
Use these configs once API keys are live. Join the waitlist to get early access and free credits.
Settings → Models → Override OpenAI Base URL
Base URL: https://api.abakos.ai/v1
API Key: abk_live_… (from your Abakos dashboard)
Model: abakos/qwen2.5-coder-32b-instruct
~/.continue/config.json
{
"models": [{
"title": "Abakos Coder",
"provider": "openai",
"model": "abakos/qwen2.5-coder-32b-instruct",
"apiBase": "https://api.abakos.ai/v1",
"apiKey": "abk_live_…"
}]
}
from openai import OpenAI
client = OpenAI(
base_url="https://api.abakos.ai/v1",
api_key="abk_live_…",
)
client.chat.completions.create(
model="abakos/qwen2.5-coder-32b-instruct",
messages=[{"role": "user", "content": "Refactor this function…"}],
stream=True,
)
Open weights only (Llama, Qwen, DeepSeek, Mistral, Gemma) — the models you can actually run anywhere. Closed frontier models stay on their makers' APIs.
| Model | Best for | Status |
|---|---|---|
| abakos/qwen2.5-coder-32b-instruct | IDE chat & agents | Launch model |
| abakos/deepseek-coder-v2-lite | Tab autocomplete | Launch model |
| abakos/llama-3.3-70b-instruct | General tasks | Phase 2 |
| abakos/nomic-embed-text | Embeddings / RAG | Phase 2 |
Early API keys, free credits, and launch pricing. We'll email you when the preview is ready.
Preview is in development. Join the waitlist — we're onboarding developers in batches before public launch.
Miners earn network rewards on the same GPU cycles that serve your request — a real subsidy tied to paid work, not speculation. That lets us price below aggregators that only resell cloud GPUs.
No. Pay with a card or prepaid balance. ABK token is optional for a discount later.
IDE use needs streaming under ~800ms first token. The preview routes to optimised partner GPUs; the decentralised network scales as supply grows.