Access free and premium models through a single OpenAI-compatible endpoint.
Smart routing across Google, OpenAI, Anthropic, xAI / Grok, AWS Bedrock, Groq, and Cerebras — no model names to track.
$ curl https://api.lxg2it.com/v1/chat/completions \
    -H "Authorization: Bearer $KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "standard",
      "prefer": "cheap",
      "messages": [{"role": "user", "content": "Hello"}]
    }'
You pick what matters — the capability tier and the optimisation direction. The router picks the model. When a cheaper option launches or a provider goes down, your requests adapt automatically. No model names to track. No code to change.
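The request above can be built programmatically. A minimal sketch, assuming the request body shape shown in the curl example ("model" carries the tier, "prefer" carries the optimisation direction):

```python
import json

# Build the same request body the curl example sends. "standard" is the
# capability tier; "prefer" nudges routing toward a direction such as
# "cheap". Field names are taken from the example above and are
# illustrative of the documented shape, not an exhaustive schema.
def chat_payload(tier: str, prefer: str, user_msg: str) -> str:
    return json.dumps({
        "model": tier,
        "prefer": prefer,
        "messages": [{"role": "user", "content": user_msg}],
    })

body = chat_payload("standard", "cheap", "Hello")
parsed = json.loads(body)
print(parsed["model"], parsed["prefer"])  # standard cheap
```

Because the endpoint is OpenAI-compatible, any OpenAI SDK pointed at https://api.lxg2it.com/v1 can send this same body.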
You can also pass a specific model name (gpt-4.1, claude-sonnet-4-6) to pin routing and bypass tier selection entirely.
Context-window guard: never routes to a model that can't handle your input. Circuit breakers reroute around provider outages automatically.
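The guard's behaviour can be sketched as a filter over candidate models. The model names, window sizes, and the 4-characters-per-token estimate below are illustrative assumptions, not the router's actual table or tokenizer:

```python
# Hypothetical context-window guard: estimate the tokens a request
# needs, then drop any candidate model whose window is too small.
CONTEXT_WINDOWS = {"model-a": 8_192, "model-b": 128_000, "model-c": 1_000_000}

def rough_tokens(messages) -> int:
    # Crude estimate: roughly 4 characters per token.
    return sum(len(m["content"]) for m in messages) // 4

def eligible_models(messages, max_output: int = 1024):
    need = rough_tokens(messages) + max_output
    return [m for m, window in CONTEXT_WINDOWS.items() if window >= need]

msgs = [{"role": "user", "content": "x" * 40_000}]  # ~10k tokens of input
print(eligible_models(msgs))  # ['model-b', 'model-c']
```

A request that overflows the smallest model's window simply never reaches it; routing continues among the models that can hold the full input plus the expected output.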
Fast models via Groq and Cerebras are routed at no cost — no credits, no card required. Sign up and start making requests immediately. Add credits when you need the full range of premium models.
Credit deposits carry a 4% fee. Requests are billed at actual provider market rates; you pay what the model costs, nothing more.
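As a worked example, assuming the 4% is deducted from the deposit rather than charged on top (check the billing page for the exact mechanics):

```python
# Illustrative deposit arithmetic under the assumption that the 4% fee
# is taken out of the deposit: a $100 deposit yields $96 in credits.
def credits_from_deposit(deposit: float, fee_rate: float = 0.04) -> float:
    return round(deposit * (1 - fee_rate), 2)

print(credits_from_deposit(100.00))  # 96.0
```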
Every response includes X-Model-Router-Model and X-Model-Router-Provider
headers, so you always know exactly which model ran and which provider served it.
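Logging that information is a one-liner over the response headers. The header values below are placeholders; with a real HTTP client you would read the same keys from the response's header map:

```python
# Stand-in for a response's headers, using the header names documented
# above. With `requests` you'd read resp.headers the same way.
headers = {
    "X-Model-Router-Model": "some-model",        # placeholder value
    "X-Model-Router-Provider": "some-provider",  # placeholder value
}

def routing_info(h: dict) -> str:
    # Render as provider/model for a log line.
    return f'{h["X-Model-Router-Provider"]}/{h["X-Model-Router-Model"]}'

print(routing_info(headers))  # some-provider/some-model
```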
Set model: "auto" and the router analyses your full conversation context —
system prompt, code blocks, message history, tool use, reasoning markers — and picks the right tier automatically.
No heuristics on individual messages; it reads the whole picture.
Every auto-routed response includes X-Model-Router-Auto-Tier and X-Model-Router-Auto-Score
headers so you can see exactly why a tier was chosen.
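To make the idea concrete, here is a toy whole-conversation scorer. The signals, weights, and tier names are invented for illustration; the router's actual scoring is internal and almost certainly richer:

```python
# Toy illustration of scoring a full conversation rather than a single
# message: each signal found anywhere in the history bumps the score,
# and the score maps to a tier. Not the router's real algorithm.
def score_conversation(messages) -> float:
    text = "\n".join(m["content"] for m in messages)
    score = 0.0
    score += 2.0 if "```" in text else 0.0                          # code blocks
    score += 1.0 if len(text) > 4000 else 0.0                       # long context
    score += 1.5 if any(m.get("role") == "tool" for m in messages) else 0.0
    return score

def pick_tier(score: float) -> str:
    return "premium" if score >= 2.0 else "standard"

msgs = [{"role": "user", "content": "Fix this:\n```py\n1/0\n```"}]
print(pick_tier(score_conversation(msgs)))  # premium
```

The point of the headers above is that you never have to reverse-engineer this: the chosen tier and its score come back with every auto-routed response.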
Circuit breakers detect provider outages and reroute requests in real time. Context-window guards ensure requests never go to a model that can't handle them. Your code doesn't change — routing adapts automatically.
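A minimal circuit-breaker sketch shows the shape of this behaviour (illustrative only; thresholds and cooldowns here are made up): after a few consecutive failures a provider is skipped until a cooldown expires, and traffic falls through to the next provider.

```python
import time

class Breaker:
    """Skip a provider after `threshold` consecutive failures, for
    `cooldown` seconds. A success resets the count."""

    def __init__(self, threshold: int = 3, cooldown: float = 30.0):
        self.threshold, self.cooldown = threshold, cooldown
        self.failures, self.opened_at = 0, 0.0

    def allow(self) -> bool:
        if self.failures < self.threshold:
            return True                      # breaker closed: send traffic
        return (time.monotonic() - self.opened_at) >= self.cooldown

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()  # breaker opens

    def record_success(self):
        self.failures = 0                    # healthy again

b = Breaker(threshold=2)
b.record_failure()
b.record_failure()
print(b.allow())  # False: provider skipped while the breaker is open
```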
Block providers you don't want to fund. Set daily spend limits. Enable auto-recharge so you never hit a wall mid-project. Export request traces to any OTLP backend — Axiom, Grafana, Honeycomb, Datadog.
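The daily limit behaves like a rolling cap that resets each day. A sketch of the semantics (the router enforces this server-side; you only set the number in the dashboard, and the class below is illustrative):

```python
from datetime import date

class SpendLimit:
    """Block requests once the day's spend would exceed the cap."""

    def __init__(self, daily_cap: float):
        self.cap, self.day, self.spent = daily_cap, date.today(), 0.0

    def charge(self, amount: float) -> bool:
        today = date.today()
        if today != self.day:
            self.day, self.spent = today, 0.0  # new day: counter resets
        if self.spent + amount > self.cap:
            return False                        # over cap: request blocked
        self.spent += amount
        return True

limit = SpendLimit(daily_cap=5.00)
print(limit.charge(4.00), limit.charge(2.00))  # True False
```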
Same key, same endpoint pattern. embed-small, embed-large, and embed-titan aliases route to the best available embedding model.
Batch inputs, optional dimension truncation, billed at input tokens only.
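An embeddings request follows the standard OpenAI shape. A sketch of the body, assuming `dimensions` is the truncation parameter as in the OpenAI-compatible embeddings API (the input strings are placeholders):

```python
import json

# Batched embeddings request against the embed-small alias. The
# "dimensions" field is the OpenAI-compatible truncation knob, assumed
# supported here; omit it to get the model's native dimensionality.
payload = json.dumps({
    "model": "embed-small",
    "input": ["first text", "second text"],  # batch of inputs
    "dimensions": 256,                       # optional truncation
})
print(json.loads(payload)["model"])  # embed-small
```

POST this body to https://api.lxg2it.com/v1/embeddings with the same Authorization header as the chat example.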
https://api.lxg2it.com