Mirror of https://github.com/diegosouzapw/OmniRoute.git (synced 2026-05-03 00:30:26 +00:00)
<div align="center">

<img src="./docs/screenshots/MainOmniRoute.png" alt="OmniRoute Dashboard" width="800"/>

# 🚀 OmniRoute — The Free AI Gateway

### Never stop coding. Smart routing to **FREE & low-cost AI models** with automatic fallback.

_Your universal API proxy — one endpoint, 36+ providers, zero downtime._

**Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • 100% TypeScript**

---

[npm](https://www.npmjs.com/package/omniroute) •
[Docker Hub](https://hub.docker.com/r/diegosouzapw/omniroute) •
[License](https://github.com/diegosouzapw/OmniRoute/blob/main/LICENSE) •
[Website](https://omniroute.online) •
[WhatsApp](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t)

[🌐 Website](https://omniroute.online) • [🚀 Quick Start](#-quick-start) • [💡 Features](#-key-features) • [📖 Docs](#-documentation) • [💰 Pricing](#-pricing-at-a-glance) • [💬 WhatsApp](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t)

🌐 **Available in:** 🇺🇸 [English](README.md) | 🇧🇷 [Português (Brasil)](README.pt-BR.md) | 🇪🇸 [Español](README.es.md) | 🇫🇷 [Français](README.fr.md) | 🇮🇹 [Italiano](README.it.md) | 🇷🇺 [Русский](README.ru.md) | 🇨🇳 [中文 (简体)](README.zh-CN.md) | 🇩🇪 [Deutsch](README.de.md) | 🇮🇳 [हिन्दी](README.in.md) | 🇹🇭 [ไทย](README.th.md) | 🇺🇦 [Українська](README.uk-UA.md) | 🇸🇦 [العربية](README.ar.md) | 🇯🇵 [日本語](README.ja.md) | 🇻🇳 [Tiếng Việt](README.vi.md) | 🇧🇬 [Български](README.bg.md) | 🇩🇰 [Dansk](README.da.md) | 🇫🇮 [Suomi](README.fi.md) | 🇮🇱 [עברית](README.he.md) | 🇭🇺 [Magyar](README.hu.md) | 🇮🇩 [Bahasa Indonesia](README.id.md) | 🇰🇷 [한국어](README.ko.md) | 🇲🇾 [Bahasa Melayu](README.ms.md) | 🇳🇱 [Nederlands](README.nl.md) | 🇳🇴 [Norsk](README.no.md) | 🇵🇹 [Português (Portugal)](README.pt.md) | 🇷🇴 [Română](README.ro.md) | 🇵🇱 [Polski](README.pl.md) | 🇸🇰 [Slovenčina](README.sk.md) | 🇸🇪 [Svenska](README.sv.md) | 🇵🇭 [Filipino](README.phi.md)

</div>

---

### 🤖 Free AI Provider for your favorite coding agents

_Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding._

<table>
<tr>
<td align="center" width="110">
<a href="https://github.com/openclaw/openclaw"><img src="./public/providers/openclaw.png" alt="OpenClaw" width="48"/><br/><b>OpenClaw</b></a><br/><sub>⭐ 205K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/HKUDS/nanobot"><img src="./public/providers/nanobot.png" alt="NanoBot" width="48"/><br/><b>NanoBot</b></a><br/><sub>⭐ 20.9K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/sipeed/picoclaw"><img src="./public/providers/picoclaw.jpg" alt="PicoClaw" width="48"/><br/><b>PicoClaw</b></a><br/><sub>⭐ 14.6K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/zeroclaw-labs/zeroclaw"><img src="./public/providers/zeroclaw.png" alt="ZeroClaw" width="48"/><br/><b>ZeroClaw</b></a><br/><sub>⭐ 9.9K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/nearai/ironclaw"><img src="./public/providers/ironclaw.png" alt="IronClaw" width="48"/><br/><b>IronClaw</b></a><br/><sub>⭐ 2.1K</sub>
</td>
</tr>
<tr>
<td align="center" width="110">
<a href="https://github.com/anomalyco/opencode"><img src="./public/providers/opencode.svg" alt="OpenCode" width="48"/><br/><b>OpenCode</b></a><br/><sub>⭐ 106K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/openai/codex"><img src="./public/providers/codex.png" alt="Codex CLI" width="48"/><br/><b>Codex CLI</b></a><br/><sub>⭐ 60.8K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/anthropics/claude-code"><img src="./public/providers/claude.png" alt="Claude Code" width="48"/><br/><b>Claude Code</b></a><br/><sub>⭐ 67.3K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/google-gemini/gemini-cli"><img src="./public/providers/gemini-cli.png" alt="Gemini CLI" width="48"/><br/><b>Gemini CLI</b></a><br/><sub>⭐ 94.7K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/Kilo-Org/kilocode"><img src="./public/providers/kilocode.png" alt="Kilo Code" width="48"/><br/><b>Kilo Code</b></a><br/><sub>⭐ 15.5K</sub>
</td>
</tr>
</table>

<sub>📡 All agents connect via <code>http://localhost:20128/v1</code> or <code>http://cloud.omniroute.online/v1</code> — one config, unlimited models and quota</sub>

---

## 📧 Support

> 💬 **Join our community!** [WhatsApp Group](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t) — Get help, share tips, and stay updated.

- **Website**: [omniroute.online](https://omniroute.online)
- **GitHub**: [github.com/diegosouzapw/OmniRoute](https://github.com/diegosouzapw/OmniRoute)
- **Issues**: [github.com/diegosouzapw/OmniRoute/issues](https://github.com/diegosouzapw/OmniRoute/issues)
- **WhatsApp**: [Community Group](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t)
- **Original Project**: [9router by decolua](https://github.com/decolua/9router)

---

## 🤔 Why OmniRoute?

**Stop wasting money and hitting limits:**

- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Subscription quota expires unused every month
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Rate limits stop you mid-coding
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Expensive APIs ($20-50/month per provider)
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Manual switching between providers

**OmniRoute solves this:**

- ✅ **Maximize subscriptions** - Track quota, use every bit before reset
- ✅ **Auto fallback** - Subscription → API Key → Cheap → Free, zero downtime
- ✅ **Multi-account** - Round-robin between accounts per provider
- ✅ **Universal** - Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, any CLI tool

---

## 🔄 How It Works

```
┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│    Tool     │
└──────┬──────┘
       │  http://localhost:20128/v1
       ↓
┌─────────────────────────────────────────┐
│  OmniRoute (Smart Router)               │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking + Embeddings + Images │
│  • Auto token refresh                   │
└──────┬──────────────────────────────────┘
       │
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │        ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │        ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │        ↓ budget limit
       └─→ [Tier 4: FREE] iFlow, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost
```

---

## 🎯 What OmniRoute Solves — 16 Real Pain Points

> **Every developer using AI tools faces these problems daily.** OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to zero observability.

<details>
<summary><b>💸 1. "I pay for an expensive subscription but still get interrupted by limits"</b></summary>

Developers pay $20–200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even when paying, quota has a ceiling — 5-hour usage windows, weekly limits, or per-minute rate limits. Mid-session, the provider stops responding and the developer loses flow and productivity.

**How OmniRoute solves it:**

- **Smart 4-Tier Fallback** — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
- **Real-Time Quota Tracking** — Shows token consumption in real time with reset countdown (5h, daily, weekly)
- **Multi-Account Support** — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
- **Custom Combos** — Customizable fallback chains with 6 balancing strategies (fill-first, round-robin, P2C, random, least-used, cost-optimized)
- **Codex Business Quotas** — Business/Team workspace quota monitoring directly in the dashboard

</details>

<details>
<summary><b>🔌 2. "I need to use multiple providers but each has a different API"</b></summary>

OpenAI uses one format, Claude (Anthropic) uses another, Gemini yet another. A dev who wants to test models from different providers, or fall back between them, must reconfigure SDKs, change endpoints, and deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.

**How OmniRoute solves it:**

- **Unified Endpoint** — A single `http://localhost:20128/v1` serves as proxy for all 36+ providers
- **Format Translation** — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
- **Response Sanitization** — Strips non-standard fields (`x_groq`, `usage_breakdown`, `service_tier`) that break OpenAI SDK v1.83+
- **Role Normalization** — Converts `developer` → `system` for non-OpenAI providers; `system` → `user` for GLM/ERNIE
- **Think Tag Extraction** — Extracts `<think>` blocks from models like DeepSeek R1 into standardized `reasoning_content`
- **Structured Output for Gemini** — `json_schema` → `responseMimeType`/`responseSchema` automatic conversion
- **`stream` defaults to `false`** — Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs

</details>
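
Two of the normalizations above are easy to picture in code. This is an illustrative sketch, not OmniRoute's actual implementation: map the OpenAI-only `developer` role to `system`, and lift a leading `<think>…</think>` block into a separate `reasoning_content` field:

```python
import re

def normalize_roles(messages: list[dict]) -> list[dict]:
    # "developer" is an OpenAI-specific role; most providers only know "system".
    return [{**m, "role": "system" if m["role"] == "developer" else m["role"]} for m in messages]

def extract_think(text: str) -> dict:
    # Pull a leading <think>...</think> block out of the visible answer.
    match = re.match(r"\s*<think>(.*?)</think>\s*", text, re.DOTALL)
    if not match:
        return {"content": text, "reasoning_content": None}
    return {"content": text[match.end():], "reasoning_content": match.group(1).strip()}

msgs = normalize_roles([{"role": "developer", "content": "Be terse."}])
out = extract_think("<think>2+2=4</think>The answer is 4.")
print(msgs[0]["role"], out["content"])  # system The answer is 4.
```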

<details>
<summary><b>🌐 3. "My AI provider blocks my region/country"</b></summary>

Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like `unsupported_country_region_territory` during OAuth and API connections. This is especially frustrating for developers from developing countries.

**How OmniRoute solves it:**

- **3-Level Proxy Config** — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
- **Color-Coded Proxy Badges** — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
- **OAuth Token Exchange Through Proxy** — OAuth flow also goes through the proxy, solving `unsupported_country_region_territory`
- **Connection Tests via Proxy** — Connection tests use the configured proxy (no more direct bypass)
- **SOCKS5 Support** — Full SOCKS5 proxy support for outbound routing
- **TLS Fingerprint Spoofing** — Browser-like TLS fingerprint via `wreq-js` to bypass bot detection

</details>
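
The three proxy levels above resolve by specificity: a connection-level proxy beats a provider-level one, which beats the global one. A minimal sketch of that precedence (config keys and shape are illustrative):

```python
def resolve_proxy(global_proxy, provider_cfg, connection_cfg):
    """Most specific proxy wins: connection > provider > global > direct."""
    for cfg in (connection_cfg, provider_cfg):
        if cfg and cfg.get("proxy"):
            return cfg["proxy"]
    return global_proxy  # may be None, meaning a direct connection

assert resolve_proxy("socks5://g:1080", {"proxy": "http://p:8080"}, {}) == "http://p:8080"
assert resolve_proxy("socks5://g:1080", {}, {"proxy": "http://c:3128"}) == "http://c:3128"
assert resolve_proxy(None, {}, {}) is None
```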

<details>
<summary><b>🆓 4. "I want to use AI for coding but I have no money"</b></summary>

Not everyone can pay $20–200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.

**How OmniRoute solves it:**

- **Free Tier Providers Built-in** — Native support for 100% free providers: iFlow (8 unlimited models), Qwen (3 unlimited models), Kiro (Claude for free), Gemini CLI (180K/month free)
- **Free-Only Combos** — Chain `gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus` = $0/month with zero downtime
- **NVIDIA NIM Free Credits** — 1000 free credits integrated
- **Cost Optimized Strategy** — Routing strategy that automatically chooses the cheapest available provider

</details>

<details>
<summary><b>🔒 5. "I need to protect my AI gateway from unauthorized access"</b></summary>

When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.

**How OmniRoute solves it:**

- **API Key Management** — Generation, rotation, and scoping per provider with a dedicated `/dashboard/api-manager` page
- **Model-Level Permissions** — Restrict API keys to specific models (`openai/*`, wildcard patterns), with Allow All/Restrict toggle
- **API Endpoint Protection** — Require a key for `/v1/models` and block specific providers from the listing
- **Auth Guard + CSRF Protection** — All dashboard routes protected with `withAuth` middleware + CSRF tokens
- **Rate Limiter** — Per-IP rate limiting with configurable windows
- **IP Filtering** — Allowlist/blocklist for access control
- **Prompt Injection Guard** — Sanitization against malicious prompt patterns
- **AES-256-GCM Encryption** — Credentials encrypted at rest

</details>

<details>
<summary><b>🛑 6. "My provider went down and I lost my coding flow"</b></summary>

AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.

**How OmniRoute solves it:**

- **Circuit Breaker per-provider** — Auto-open/close with configurable thresholds and cooldown (Closed/Open/Half-Open)
- **Exponential Backoff** — Progressive retry delays
- **Anti-Thundering Herd** — Mutex + semaphore protection against concurrent retry storms
- **Combo Fallback Chains** — If the primary provider fails, automatically falls through the chain with no intervention
- **Combo Circuit Breaker** — Auto-disables failing providers within a combo chain
- **Health Dashboard** — Uptime monitoring, circuit breaker states, lockouts, cache stats, p50/p95/p99 latency

</details>
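
"Exponential backoff" means each retry waits up to twice as long as the last, usually with random jitter so concurrent clients don't retry in lockstep (which is the thundering-herd problem mentioned above). A sketch with full jitter; the base and cap values are illustrative, not OmniRoute's defaults:

```python
import random

def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Full-jitter exponential backoff: delay_i is uniform in [0, min(cap, base * 2**i)]."""
    return [random.uniform(0, min(cap, base * 2**i)) for i in range(attempts)]

random.seed(0)
delays = backoff_delays(5)
assert len(delays) == 5
assert all(0 <= d <= min(30.0, 0.5 * 2**i) for i, d in enumerate(delays))
```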

<details>
<summary><b>🔧 7. "Configuring each AI tool is tedious and repetitive"</b></summary>

Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.

**How OmniRoute solves it:**

- **CLI Tools Dashboard** — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
- **GitHub Copilot Config Generator** — Generates `chatLanguageModels.json` for VS Code with bulk model selection
- **Onboarding Wizard** — Guided 4-step setup for first-time users
- **One endpoint, all models** — Configure `http://localhost:20128/v1` once, access 36+ providers

</details>

<details>
<summary><b>🔑 8. "Managing OAuth tokens from multiple providers is hell"</b></summary>

Claude Code, Codex, Gemini CLI, Copilot — all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly, deal with `client_secret is missing`, `redirect_uri_mismatch`, and failures on remote servers. OAuth on LAN/VPS is particularly problematic.

**How OmniRoute solves it:**

- **Auto Token Refresh** — OAuth tokens refresh in background before expiration
- **OAuth 2.0 (PKCE) Built-in** — Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, iFlow
- **Multi-Account OAuth** — Multiple accounts per provider via JWT/ID token extraction
- **OAuth LAN/Remote Fix** — Private IP detection for `redirect_uri` + manual URL mode for remote servers
- **OAuth Behind Nginx** — Uses `window.location.origin` for reverse proxy compatibility
- **Remote OAuth Guide** — Step-by-step guide for Google Cloud credentials on VPS/Docker

</details>
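
"Refresh in background before expiration" usually means refreshing once a token enters a safety margin before its expiry, rather than waiting for a 401. A minimal sketch of that check (the 5-minute margin is an illustrative value, not OmniRoute's configured default):

```python
def needs_refresh(expires_at: float, now: float, margin_s: float = 300.0) -> bool:
    """True once the token is within margin_s seconds of expiring (or already expired)."""
    return now >= expires_at - margin_s

assert not needs_refresh(expires_at=1000.0, now=600.0)  # 400s left, outside the margin
assert needs_refresh(expires_at=1000.0, now=800.0)      # 200s left, refresh proactively
assert needs_refresh(expires_at=1000.0, now=1100.0)     # already expired
```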

<details>
<summary><b>📊 9. "I don't know how much I'm spending or where"</b></summary>

Developers use multiple paid providers but have no consolidated view of spending — each provider has its own billing dashboard, and unexpected costs quietly pile up.

**How OmniRoute solves it:**

- **Cost Analytics Dashboard** — Per-token cost tracking and budget management per provider
- **Budget Limits per Tier** — Spending ceiling per tier that triggers automatic fallback
- **Per-Model Pricing Configuration** — Configurable prices per model
- **Usage Statistics Per API Key** — Request count and last-used timestamp per key
- **Analytics Dashboard** — Stat cards, model usage chart, provider table with success rates and latency

</details>

<details>
<summary><b>🐛 10. "I can't diagnose errors and problems in AI calls"</b></summary>

When a call fails, the dev doesn't know if it was a rate limit, an expired token, a wrong format, or a provider error. Logs are fragmented across different terminals. Without observability, debugging is trial-and-error.

**How OmniRoute solves it:**

- **Unified Logs Dashboard** — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
- **Console Log Viewer** — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
- **SQLite Proxy Logs** — Persistent logs that survive server restarts
- **Translator Playground** — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
- **Request Telemetry** — p50/p95/p99 latency + X-Request-Id tracing
- **File-Based Logging with Rotation** — Console interceptor captures everything to JSON log with size-based rotation

</details>

<details>
<summary><b>🏗️ 11. "Deploying and maintaining the gateway is complex"</b></summary>

Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, `EACCES` on directories, port conflicts, and cross-platform builds add friction.

**How OmniRoute solves it:**

- **npm global install** — `npm install -g omniroute && omniroute` — done
- **Docker Multi-Platform** — AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
- **Docker Compose Profiles** — `base` (no CLI tools) and `cli` (with Claude Code, Codex, OpenClaw)
- **Electron Desktop App** — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
- **Split-Port Mode** — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
- **Cloud Sync** — Config synchronization across devices via Cloudflare Workers
- **DB Backups** — Automatic backup, restore, export and import of all settings

</details>

<details>
<summary><b>🌍 12. "The interface is English-only and my team doesn't speak English"</b></summary>

Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.

**How OmniRoute solves it:**

- **Dashboard i18n — 30 Languages** — All 500+ keys translated including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, English
- **RTL Support** — Right-to-left support for Arabic and Hebrew
- **Multi-Language READMEs** — 30 complete documentation translations
- **Language Selector** — Globe icon in header for real-time switching

</details>

<details>
<summary><b>🔄 13. "I need more than chat — I need embeddings, images, audio"</b></summary>

AI isn't just chat completions. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.

**How OmniRoute solves it:**

- **Embeddings** — `/v1/embeddings` with 6 providers and 9+ models
- **Image Generation** — `/v1/images/generations` with 10 providers and 20+ models (OpenAI, xAI, Together, Fireworks, Nebius, Hyperbolic, NanoBanana, Antigravity, SD WebUI, ComfyUI)
- **Text-to-Video** — `/v1/videos/generations` — ComfyUI (AnimateDiff, SVD) and SD WebUI
- **Text-to-Music** — `/v1/music/generations` — ComfyUI (Stable Audio Open, MusicGen)
- **Audio Transcription** — `/v1/audio/transcriptions` — Whisper + Nvidia NIM, HuggingFace, Qwen3
- **Text-to-Speech** — `/v1/audio/speech` — ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3, + existing providers
- **Moderations** — `/v1/moderations` — Content safety checks
- **Reranking** — `/v1/rerank` — Document relevance reranking
- **Responses API** — Full `/v1/responses` support for Codex

</details>

<details>
<summary><b>🧪 14. "I have no way to test and compare quality across models"</b></summary>

Developers want to know which model is best for their use case — code, translation, reasoning — but comparing models manually is slow, and most setups have no integrated eval tools.

**How OmniRoute solves it:**

- **LLM Evaluations** — Golden set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, safety refusal
- **4 Match Strategies** — `exact`, `contains`, `regex`, `custom` (JS function)
- **Translator Playground Test Bench** — Batch testing with multiple inputs and expected outputs, cross-provider comparison
- **Chat Tester** — Full round-trip with visual response rendering
- **Live Monitor** — Real-time stream of all requests flowing through the proxy

</details>
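
The four match strategies are straightforward to picture. A sketch of the dispatch (the project's `custom` strategy takes a JS function; a callable predicate models the same idea here, and the dispatch shape is illustrative):

```python
import re

def matches(strategy, expected: str, actual: str) -> bool:
    """Compare a model's output against an expected value using one of 4 strategies."""
    if strategy == "exact":
        return actual == expected
    if strategy == "contains":
        return expected in actual
    if strategy == "regex":
        return re.search(expected, actual) is not None
    if callable(strategy):  # "custom": a user-supplied predicate
        return strategy(expected, actual)
    raise ValueError(f"unknown strategy: {strategy!r}")

assert matches("exact", "4", "4")
assert matches("contains", "Paris", "The capital is Paris.")
assert matches("regex", r"\b42\b", "answer: 42")
assert matches(lambda exp, act: act.startswith(exp), "ans", "answer: 42")
```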

<details>
<summary><b>📈 15. "I need to scale without losing performance"</b></summary>

As request volume grows, repeated questions incur duplicate cost without caching, duplicate requests waste processing without idempotency, and per-provider rate limits still have to be respected.

**How OmniRoute solves it:**

- **Semantic Cache** — Two-tier cache (signature + semantic) reduces cost and latency
- **Request Idempotency** — 5s deduplication window for identical requests
- **Rate Limit Detection** — Per-provider RPM, min gap, and max concurrent tracking
- **Editable Rate Limits** — Configurable defaults in Settings → Resilience with persistence
- **API Key Validation Cache** — 3-tier cache for production performance
- **Health Dashboard with Telemetry** — p50/p95/p99 latency, cache stats, uptime

</details>
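
A deduplication window works by hashing the request payload into a signature and rejecting a second identical payload seen within the window. A sketch under the assumption of a SHA-256 signature over the canonicalized JSON body (the project's actual key derivation may differ):

```python
import hashlib
import json

class IdempotencyWindow:
    def __init__(self, window_s: float = 5.0):
        self.window_s = window_s
        self.seen: dict[str, float] = {}  # signature -> first-seen timestamp

    def signature(self, payload: dict) -> str:
        # Canonicalize the body (sorted keys) so equivalent payloads hash equally.
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def is_duplicate(self, payload: dict, now: float) -> bool:
        sig = self.signature(payload)
        first = self.seen.get(sig)
        if first is not None and now - first < self.window_s:
            return True
        self.seen[sig] = now
        return False

w = IdempotencyWindow()
req = {"model": "if/kimi-k2-thinking", "messages": [{"role": "user", "content": "hi"}]}
assert not w.is_duplicate(req, now=0.0)  # first request goes through
assert w.is_duplicate(req, now=2.0)      # same payload within 5s: deduplicated
assert not w.is_duplicate(req, now=9.0)  # window expired: treated as new
```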

<details>
<summary><b>🤖 16. "I want to control model behavior globally"</b></summary>

Some developers want all responses in a specific language or tone, or want to cap reasoning tokens. Configuring this in every tool and every request is impractical.

**How OmniRoute solves it:**

- **System Prompt Injection** — Global prompt applied to all requests
- **Thinking Budget Validation** — Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
- **6 Routing Strategies** — Global strategies that determine how requests are distributed
- **Wildcard Router** — `provider/*` patterns route dynamically to any provider
- **Combo Enable/Disable Toggle** — Toggle combos directly from the dashboard
- **Provider Toggle** — Enable/disable all connections for a provider with one click
- **Blocked Providers** — Exclude specific providers from `/v1/models` listing

</details>

---

## ⚡ Quick Start

**1. Install globally:**

```bash
npm install -g omniroute
omniroute
```

🎉 Dashboard opens at `http://localhost:20128`

| Command                 | Description                                                 |
| ----------------------- | ----------------------------------------------------------- |
| `omniroute`             | Start server (`PORT=20128`, API and dashboard on same port) |
| `omniroute --port 3000` | Set canonical/API port to 3000                              |
| `omniroute --no-open`   | Don't auto-open browser                                     |
| `omniroute --help`      | Show help                                                   |

Optional split-port mode:

```bash
PORT=20128 DASHBOARD_PORT=20129 omniroute
# API:       http://localhost:20128/v1
# Dashboard: http://localhost:20129
```

When ports are split, the API port serves only OpenAI-compatible routes (`/v1`, `/chat/completions`, `/responses`, `/models`, `/codex/*`).

**2. Connect a FREE provider:**

Dashboard → Providers → Connect **Claude Code** or **Antigravity** → OAuth login → Done!

**3. Use in your CLI tool:**

```
Claude Code/Codex/Gemini CLI/OpenClaw/Cursor/Cline Settings:
  Endpoint: http://localhost:20128/v1
  API Key:  [copy from dashboard]
  Model:    if/kimi-k2-thinking
```

**That's it!** Start coding with FREE AI models.
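
Because the endpoint is OpenAI-compatible, any HTTP client can call it with the usual Chat Completions shape. A sketch of the request any such client builds; the key and model are placeholders, and actually sending it requires a running OmniRoute instance:

```python
import json
import urllib.request

def build_chat_request(api_key: str, model: str, prompt: str,
                       base_url: str = "http://localhost:20128/v1"):
    """Assemble url, headers, and body for an OpenAI-style chat completion."""
    url = f"{base_url}/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return url, headers, body

url, headers, body = build_chat_request("sk-local-example", "if/kimi-k2-thinking", "hello")
assert url == "http://localhost:20128/v1/chat/completions"

# With the gateway running and a real key, uncomment to send:
# req = urllib.request.Request(url, json.dumps(body).encode(), headers)
# reply = json.loads(urllib.request.urlopen(req).read())
# print(reply["choices"][0]["message"]["content"])
```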

**Alternative — run from source:**

```bash
cp .env.example .env
npm install
PORT=20128 DASHBOARD_PORT=20129 NEXT_PUBLIC_BASE_URL=http://localhost:20129 npm run dev
```

---

## 🐳 Docker

OmniRoute is available as a public Docker image on [Docker Hub](https://hub.docker.com/r/diegosouzapw/omniroute).

**Quick run:**

```bash
docker run -d \
  --name omniroute \
  --restart unless-stopped \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```

**With environment file:**

```bash
# Copy and edit .env first
cp .env.example .env

docker run -d \
  --name omniroute \
  --restart unless-stopped \
  --env-file .env \
  -p 20128:20128 \
  -v omniroute-data:/app/data \
  diegosouzapw/omniroute:latest
```

**Using Docker Compose:**

```bash
# Base profile (no CLI tools)
docker compose --profile base up -d

# CLI profile (Claude Code, Codex, OpenClaw built-in)
docker compose --profile cli up -d
```

| Image                    | Tag      | Size   | Description           |
| ------------------------ | -------- | ------ | --------------------- |
| `diegosouzapw/omniroute` | `latest` | ~250MB | Latest stable release |
| `diegosouzapw/omniroute` | `1.0.3`  | ~250MB | Current version       |

---

## 🖥️ Desktop App — Offline & Always-On

> 🆕 **NEW!** OmniRoute is now available as a **native desktop application** for Windows, macOS, and Linux.

Run OmniRoute as a standalone desktop app — no terminal, no browser, no internet required for local models. The Electron-based app includes:

- 🖥️ **Native Window** — Dedicated app window with system tray integration
- 🔄 **Auto-Start** — Launch OmniRoute on system login
- 🔔 **Native Notifications** — Get alerts for quota exhaustion or provider issues
- ⚡ **One-Click Install** — NSIS (Windows), DMG (macOS), AppImage (Linux)
- 🌐 **Offline Mode** — Works fully offline with bundled server

### Quick Start

```bash
# Development mode
npm run electron:dev

# Build for your platform
npm run electron:build        # Current platform
npm run electron:build:win    # Windows (.exe)
npm run electron:build:mac    # macOS (.dmg) — x64 & arm64
npm run electron:build:linux  # Linux (.AppImage)
```

### System Tray

When minimized, OmniRoute lives in your system tray with quick actions:

- Open dashboard
- Change server port
- Quit application

📖 Full documentation: [`electron/README.md`](electron/README.md)

---

## 💰 Pricing at a Glance

| Tier                | Provider          | Cost                    | Quota Reset      | Best For             |
| ------------------- | ----------------- | ----------------------- | ---------------- | -------------------- |
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo                  | 5h + weekly      | Already subscribed   |
|                     | Codex (Plus/Pro)  | $20-200/mo              | 5h + weekly      | OpenAI users         |
|                     | Gemini CLI        | **FREE**                | 180K/mo + 1K/day | Everyone!            |
|                     | GitHub Copilot    | $10-19/mo               | Monthly          | GitHub users         |
| **🔑 API KEY**      | NVIDIA NIM        | **FREE** (1000 credits) | One-time         | Free tier testing    |
|                     | DeepSeek          | Pay-per-use             | None             | Best price/quality   |
|                     | Groq              | Free tier + paid        | Rate limited     | Ultra-fast inference |
|                     | xAI (Grok)        | Pay-per-use             | None             | Grok models          |
|                     | Mistral           | Free tier + paid        | Rate limited     | European AI          |
|                     | OpenRouter        | Pay-per-use             | None             | 100+ models          |
| **💰 CHEAP**        | GLM-4.7           | $0.6/1M                 | Daily 10AM       | Budget backup        |
|                     | MiniMax M2.1      | $0.2/1M                 | 5-hour rolling   | Cheapest option      |
|                     | Kimi K2           | $9/mo flat              | 10M tokens/mo    | Predictable cost     |
| **🆓 FREE**         | iFlow             | $0                      | Unlimited        | 8 models free        |
|                     | Qwen              | $0                      | Unlimited        | 3 models free        |
|                     | Kiro              | $0                      | Unlimited        | Claude free          |

**💡 Pro Tip:** Start with Gemini CLI (180K free/month) + iFlow (unlimited free) combo = $0 cost!
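
The per-1M-token prices in the cheap tier make the "cost-optimized" idea concrete: pick the cheapest model and estimate spend from token counts. A sketch using the table's prices (the selection logic is illustrative, not OmniRoute's router):

```python
# $ per 1M tokens, from the pricing table above.
PRICE_PER_1M = {"glm-4.7": 0.6, "minimax-m2.1": 0.2}

def cheapest(prices: dict[str, float]) -> str:
    return min(prices, key=prices.get)

def estimated_cost(model: str, tokens: int) -> float:
    return PRICE_PER_1M[model] * tokens / 1_000_000

assert cheapest(PRICE_PER_1M) == "minimax-m2.1"
assert abs(estimated_cost("minimax-m2.1", 2_000_000) - 0.4) < 1e-9  # ~$0.40 for 2M tokens
```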

---

## 💡 Key Features

### 🧠 Core Routing & Intelligence

| Feature                         | What It Does                                                                   |
| ------------------------------- | ------------------------------------------------------------------------------ |
| 🎯 **Smart 4-Tier Fallback**    | Auto-route: Subscription → API Key → Cheap → Free                              |
| 📊 **Real-Time Quota Tracking** | Live token count + reset countdown per provider                                |
| 🔄 **Format Translation**       | OpenAI ↔ Claude ↔ Gemini ↔ Cursor ↔ Kiro seamless + response sanitization      |
| 👥 **Multi-Account Support**    | Multiple accounts per provider with intelligent selection                      |
| 🔄 **Auto Token Refresh**       | OAuth tokens refresh automatically with retry                                  |
| 🎨 **Custom Combos**            | 6 strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized |
| 🧩 **Custom Models**            | Add any model ID to any provider                                               |
| 🌐 **Wildcard Router**          | Route `provider/*` patterns to any provider dynamically                        |
| 🧠 **Thinking Budget**          | Passthrough, auto, custom, and adaptive modes for reasoning models             |
| 🔀 **Model Aliases**            | Auto-forward deprecated model IDs to current replacements (built-in + custom)  |
| ⚡ **Background Degradation**   | Auto-route background tasks (titles, summaries) to cheaper models              |
| 💬 **System Prompt Injection**  | Global system prompt applied across all requests                               |
| 📄 **Responses API**            | Full OpenAI Responses API (`/v1/responses`) support for Codex                  |
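
Among the combo strategies above, "p2c" (power of two choices) is the least self-explanatory: sample two candidates at random and route to the less loaded one, which balances load nearly as well as tracking a global minimum. A sketch with an illustrative in-flight-request load model, not the project's actual balancer:

```python
import random

def pick_p2c(loads: dict[str, int], rng: random.Random) -> str:
    """Power of two choices: sample two candidates, route to the less loaded."""
    a, b = rng.sample(sorted(loads), 2)
    return a if loads[a] <= loads[b] else b

rng = random.Random(42)
loads = {"acct-1": 10, "acct-2": 3, "acct-3": 7}

# Over many draws, lightly loaded accounts win whenever they are sampled.
counts = {k: 0 for k in loads}
for _ in range(1000):
    counts[pick_p2c(loads, rng)] += 1
assert counts["acct-2"] > counts["acct-1"]
```

Compared with always picking the global least-used account, p2c avoids herding every request onto one target between load updates, which is why it is a common alternative to plain least-used.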
|
||
|
||
### 🎵 Multi-Modal APIs

| Feature | What It Does |
| -------------------------- | -------------------------------------------------------------------------------- |
| 🖼️ **Image Generation** | `/v1/images/generations` — 10 providers, 20+ models (cloud + local) |
| 📐 **Embeddings** | `/v1/embeddings` — 6 providers, 9+ models |
| 🎤 **Audio Transcription** | `/v1/audio/transcriptions` — Whisper + Nvidia NIM, HuggingFace, Qwen3 |
| 🔊 **Text-to-Speech** | `/v1/audio/speech` — ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3 |
| 🎬 **Video Generation** | `/v1/videos/generations` — ComfyUI (AnimateDiff, SVD), SD WebUI |
| 🎵 **Music Generation** | `/v1/music/generations` — ComfyUI (Stable Audio Open, MusicGen) |
| 🛡️ **Moderations** | `/v1/moderations` — Content safety checks |
| 🔀 **Reranking** | `/v1/rerank` — Document relevance reranking |

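Every modality goes through the same base URL and API key as chat. For example, an embeddings request (a sketch — the model ID here is illustrative; list what your providers actually expose via `/v1/models`):

```shell
# Embeddings through the gateway (model ID is an example — check your dashboard)
curl -s http://localhost:20128/v1/embeddings \
  -H "Authorization: Bearer $OMNIROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "nvidia/nv-embed-v1", "input": "one endpoint for every modality"}'
```
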
### 🛡️ Resilience & Security

| Feature | What It Does |
| ------------------------------- | ----------------------------------------------------------------------------- |
| 🔌 **Circuit Breaker** | Auto-open/close per-provider with configurable thresholds |
| 🛡️ **Anti-Thundering Herd** | Mutex + semaphore rate-limit for API key providers |
| 🧠 **Semantic Cache** | Two-tier cache (signature + semantic) reduces cost & latency |
| ⚡ **Request Idempotency** | 5s dedup window for duplicate requests |
| 🔒 **TLS Fingerprint Spoofing** | Bypass TLS-based bot detection via wreq-js |
| 🌐 **IP Filtering** | Allowlist/blocklist for API access control |
| 📊 **Editable Rate Limits** | Configurable RPM, min gap, and max concurrent at system level |
| 💾 **Rate Limit Persistence** | Learned limits survive restarts via SQLite with 60s debounce + 24h staleness |
| 🔄 **Token Refresh Resilience** | Per-provider circuit breaker (5 fails→30min) + 30s timeout per attempt |
| 🛡 **API Endpoint Protection** | Auth gating + provider blocking for the `/models` endpoint |
| 🔒 **Proxy Visibility** | Color-coded badges: 🟢 global, 🟡 provider, 🔵 per-connection with IP display |
| 🌐 **3-Level Proxy Config** | Configure proxies at global, per-provider, or per-connection level |

### 📊 Observability & Analytics

| Feature | What It Does |
| -------------------------- | ---------------------------------------------------------------------- |
| 📝 **Request Logging** | Debug mode with full request/response logs |
| 💾 **SQLite Proxy Logs** | Persistent proxy logs survive server restarts |
| 📊 **Analytics Dashboard** | Recharts-powered: stat cards, model usage chart, provider table |
| 📈 **Progress Tracking** | Opt-in SSE progress events for streaming |
| 🧪 **LLM Evaluations** | Golden set testing with 4 match strategies |
| 🔍 **Request Telemetry** | p50/p95/p99 latency aggregation + X-Request-Id tracing |
| 📋 **Logs Dashboard** | Unified 4-tab page: Request Logs, Proxy Logs, Audit Logs, Console |
| 🖥️ **Console Log Viewer** | Real-time terminal-style viewer with level filter, search, auto-scroll |
| 📑 **File-Based Logging** | Console interceptor captures all output to JSON log file with rotation |
| 🏥 **Health Dashboard** | System uptime, circuit breaker states, lockouts, cache stats |
| 💰 **Cost Tracking** | Budget management + per-model pricing configuration |

### ☁️ Deployment & Sync

| Feature | What It Does |
| ---------------------------- | --------------------------------------------------------------------- |
| 💾 **Cloud Sync** | Sync config across devices via Cloudflare Workers |
| 🌐 **Deploy Anywhere** | Localhost, VPS, Docker, Cloudflare Workers |
| 🔑 **API Key Management** | Generate, rotate, and scope API keys per provider |
| 🧙 **Onboarding Wizard** | 4-step guided setup for first-time users |
| 🔧 **CLI Tools Dashboard** | One-click configure Claude, Codex, Cline, OpenClaw, Kilo, Antigravity |
| 🔄 **DB Backups** | Automatic backup, restore, export & import for all settings |
| 🌐 **Internationalization** | Full i18n with next-intl — 30 languages including RTL support |
| 🌍 **Language Selector** | Globe icon in header for real-time switching between 30 languages |
| 📂 **Custom Data Directory** | `DATA_DIR` env var to override default `~/.omniroute` storage path |

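For example, the `DATA_DIR` override from the table above (a sketch assuming the `omniroute` npm package; with Docker, pass `-e DATA_DIR=...` instead):

```shell
# Keep all OmniRoute state under /srv/omniroute instead of ~/.omniroute
DATA_DIR=/srv/omniroute npx omniroute
```
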
<details>
<summary><b>📖 Feature Details</b></summary>

### 🎯 Smart 4-Tier Fallback

Create combos with automatic fallback:

```
Combo: "my-coding-stack"
1. cc/claude-opus-4-6 (your subscription)
2. nvidia/llama-3.3-70b (free NVIDIA API)
3. glm/glm-4.7 (cheap backup, $0.6/1M)
4. if/kimi-k2-thinking (free fallback)

→ Auto switches when quota runs out or errors occur
```

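Once saved, a combo name is used exactly like a model ID by any OpenAI-compatible client (a sketch — substitute an API key from your dashboard):

```shell
# Call the combo by name; OmniRoute handles the fallback chain internally
curl -s http://localhost:20128/v1/chat/completions \
  -H "Authorization: Bearer $OMNIROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "my-coding-stack", "messages": [{"role": "user", "content": "Refactor this function"}]}'
```
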
### 📊 Real-Time Quota Tracking

- Token consumption per provider
- Reset countdown (5-hour, daily, weekly)
- Cost estimation for paid tiers
- Monthly spending reports

### 🔄 Format Translation

Seamless translation between formats:

- **OpenAI** ↔ **Claude** ↔ **Gemini** ↔ **OpenAI Responses**
- Your CLI tool sends OpenAI format → OmniRoute translates → Provider receives native format
- Works with any tool that supports custom OpenAI endpoints
- **Response sanitization** — Strips non-standard fields for strict OpenAI SDK compatibility
- **Role normalization** — `developer` → `system` for non-OpenAI; `system` → `user` for GLM/ERNIE models
- **Think tag extraction** — `<think>` blocks → `reasoning_content` for thinking models
- **Structured output** — `json_schema` → Gemini's `responseMimeType`/`responseSchema`

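In practice you always speak OpenAI format, even to Anthropic- or Google-backed providers — e.g. an OpenAI-style body sent to a Claude Code model (illustrative sketch; use a key from your dashboard):

```shell
# OpenAI-format request; OmniRoute translates it to the provider's native format
curl -s http://localhost:20128/v1/chat/completions \
  -H "Authorization: Bearer $OMNIROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "cc/claude-opus-4-6", "messages": [{"role": "system", "content": "Be terse."}, {"role": "user", "content": "hello"}]}'
```
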
### 👥 Multi-Account Support

- Add multiple accounts per provider
- Auto round-robin or priority-based routing
- Fallback to next account when one hits quota

### 🔄 Auto Token Refresh

- OAuth tokens automatically refresh before expiration
- No manual re-authentication needed
- Seamless experience across all providers

### 🎨 Custom Combos

- Create unlimited model combinations
- 6 strategies: fill-first, round-robin, power-of-two-choices, random, least-used, cost-optimized
- Share combos across devices with Cloud Sync

### 🏥 Health Dashboard

- System status (uptime, version, memory usage)
- Circuit breaker states per provider (Closed/Open/Half-Open)
- Rate limit status and active lockouts
- Signature cache statistics
- Latency telemetry (p50/p95/p99) + prompt cache
- Reset health status with one click

### 🔧 Translator Playground

OmniRoute includes a powerful built-in Translator Playground with **4 modes** for debugging, testing, and monitoring API translations:

| Mode | Description |
| ------------------- | ----------- |
| **💻 Playground** | Direct format translation — paste any API request body and instantly see how OmniRoute translates it between provider formats (OpenAI ↔ Claude ↔ Gemini ↔ Responses API). Includes example templates and format auto-detection. |
| **💬 Chat Tester** | Send real chat requests through OmniRoute and see the full round-trip: your input, the translated request, the provider response, and the translated response back. Invaluable for validating combo routing. |
| **🧪 Test Bench** | Batch testing mode — define multiple test cases with different inputs and expected outputs, run them all at once, and compare results across providers and models. |
| **📱 Live Monitor** | Real-time request monitoring — watch incoming requests as they flow through OmniRoute, see format translations happening live, and identify issues instantly. |

**Access:** Dashboard → Translator (sidebar)

### 💾 Cloud Sync

- Sync providers, combos, and settings across devices
- Automatic background sync
- Secure encrypted storage

</details>

---

## 🎯 Use Cases

### Case 1: "I have a Claude Pro subscription"

**Problem:** Quota expires unused, rate limits during heavy coding

```
Combo: "maximize-claude"
1. cc/claude-opus-4-6 (use subscription fully)
2. glm/glm-4.7 (cheap backup when quota out)
3. if/kimi-k2-thinking (free emergency fallback)

Monthly cost: $20 (subscription) + ~$5 (backup) = $25 total
vs. $20 + hitting limits = frustration
```

### Case 2: "I want zero cost"

**Problem:** Can't afford subscriptions, need reliable AI coding

```
Combo: "free-forever"
1. gc/gemini-3-flash (180K free/month)
2. if/kimi-k2-thinking (unlimited free)
3. qw/qwen3-coder-plus (unlimited free)

Monthly cost: $0
Quality: Production-ready models
```

### Case 3: "I need 24/7 coding, no interruptions"

**Problem:** Deadlines, can't afford downtime

```
Combo: "always-on"
1. cc/claude-opus-4-6 (best quality)
2. cx/gpt-5.2-codex (second subscription)
3. glm/glm-4.7 (cheap, resets daily)
4. minimax/MiniMax-M2.1 (cheapest, 5h reset)
5. if/kimi-k2-thinking (free unlimited)

Result: 5 layers of fallback = zero downtime
```

### Case 4: "I want FREE AI in OpenClaw"

**Problem:** Need an AI assistant in messaging apps, completely free

```
Combo: "openclaw-free"
1. if/glm-4.7 (unlimited free)
2. if/minimax-m2.1 (unlimited free)
3. if/kimi-k2-thinking (unlimited free)

Monthly cost: $0
Access via: WhatsApp, Telegram, Slack, Discord, iMessage, Signal...
```

---

## 📖 Setup Guide

<details>
<summary><b>💳 Subscription Providers</b></summary>

### Claude Code (Pro/Max)

```bash
Dashboard → Providers → Connect Claude Code
→ OAuth login → Auto token refresh
→ 5-hour + weekly quota tracking

Models:
cc/claude-opus-4-6
cc/claude-sonnet-4-5-20250929
cc/claude-haiku-4-5-20251001
```

**Pro Tip:** Use Opus for complex tasks, Sonnet for speed. OmniRoute tracks quota per model!

### OpenAI Codex (Plus/Pro)

```bash
Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset

Models:
cx/gpt-5.2-codex
cx/gpt-5.1-codex-max
```

### Gemini CLI (FREE 180K/month!)

```bash
Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions/month + 1K/day

Models:
gc/gemini-3-flash-preview
gc/gemini-2.5-pro
```

**Best Value:** Huge free tier! Use this before paid tiers.

### GitHub Copilot

```bash
Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)

Models:
gh/gpt-5
gh/claude-4.5-sonnet
gh/gemini-3-pro
```

</details>

<details>
<summary><b>🔑 API Key Providers</b></summary>

### NVIDIA NIM (FREE 1000 credits!)

1. Sign up: [build.nvidia.com](https://build.nvidia.com)
2. Get a free API key (1000 inference credits included)
3. Dashboard → Add Provider → NVIDIA NIM:
   - API Key: `nvapi-your-key`

**Models:** `nvidia/llama-3.3-70b-instruct`, `nvidia/mistral-7b-instruct`, and 50+ more

**Pro Tip:** OpenAI-compatible API — works seamlessly with OmniRoute's format translation!

### DeepSeek

1. Sign up: [platform.deepseek.com](https://platform.deepseek.com)
2. Get an API key
3. Dashboard → Add Provider → DeepSeek

**Models:** `deepseek/deepseek-chat`, `deepseek/deepseek-coder`

### Groq (Free Tier Available!)

1. Sign up: [console.groq.com](https://console.groq.com)
2. Get an API key (free tier included)
3. Dashboard → Add Provider → Groq

**Models:** `groq/llama-3.3-70b`, `groq/mixtral-8x7b`

**Pro Tip:** Ultra-fast inference — best for real-time coding!

### OpenRouter (100+ Models)

1. Sign up: [openrouter.ai](https://openrouter.ai)
2. Get an API key
3. Dashboard → Add Provider → OpenRouter

**Models:** Access 100+ models from all major providers through a single API key.

</details>

<details>
<summary><b>💰 Cheap Providers (Backup)</b></summary>

### GLM-4.7 (Daily reset, $0.6/1M)

1. Sign up: [Zhipu AI](https://open.bigmodel.cn/)
2. Get an API key from the Coding Plan
3. Dashboard → Add API Key:
   - Provider: `glm`
   - API Key: `your-key`

**Use:** `glm/glm-4.7`

**Pro Tip:** The Coding Plan offers 3× the quota at 1/7 the cost! Resets daily at 10:00 AM.

### MiniMax M2.1 (5h reset, $0.20/1M)

1. Sign up: [MiniMax](https://www.minimax.io/)
2. Get an API key
3. Dashboard → Add API Key

**Use:** `minimax/MiniMax-M2.1`

**Pro Tip:** Cheapest option for long context (1M tokens)!

### Kimi K2 ($9/month flat)

1. Subscribe: [Moonshot AI](https://platform.moonshot.ai/)
2. Get an API key
3. Dashboard → Add API Key

**Use:** `kimi/kimi-latest`

**Pro Tip:** A flat $9/month for 10M tokens = $0.90/1M effective cost!

</details>

<details>
<summary><b>🆓 FREE Providers (Emergency Backup)</b></summary>

### iFlow (8 FREE models)

```bash
Dashboard → Connect iFlow
→ iFlow OAuth login
→ Unlimited usage

Models:
if/kimi-k2-thinking
if/qwen3-coder-plus
if/glm-4.7
if/minimax-m2
if/deepseek-r1
```

### Qwen (3 FREE models)

```bash
Dashboard → Connect Qwen
→ Device code authorization
→ Unlimited usage

Models:
qw/qwen3-coder-plus
qw/qwen3-coder-flash
```

### Kiro (Claude FREE)

```bash
Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage

Models:
kr/claude-sonnet-4.5
kr/claude-haiku-4.5
```

</details>

<details>
<summary><b>🎨 Create Combos</b></summary>

### Example 1: Maximize Subscription → Cheap Backup

```
Dashboard → Combos → Create New

Name: premium-coding
Models:
1. cc/claude-opus-4-6 (Subscription primary)
2. glm/glm-4.7 (Cheap backup, $0.6/1M)
3. minimax/MiniMax-M2.1 (Cheapest fallback, $0.20/1M)

Use in CLI: premium-coding
```

### Example 2: Free-Only (Zero Cost)

```
Name: free-combo
Models:
1. gc/gemini-3-flash-preview (180K free/month)
2. if/kimi-k2-thinking (unlimited)
3. qw/qwen3-coder-plus (unlimited)

Cost: $0 forever!
```

</details>

<details>
<summary><b>🔧 CLI Integration</b></summary>

### Cursor IDE

```
Settings → Models → Advanced:
OpenAI API Base URL: http://localhost:20128/v1
OpenAI API Key: [from OmniRoute dashboard]
Model: cc/claude-opus-4-6
```

### Claude Code

Use the **CLI Tools** page in the dashboard for one-click configuration, or edit `~/.claude/settings.json` manually.

### Codex CLI

```bash
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"

codex "your prompt"
```

### OpenClaw

**Option 1 — Dashboard (recommended):**

```
Dashboard → CLI Tools → OpenClaw → Select Model → Apply
```

**Option 2 — Manual:** Edit `~/.openclaw/openclaw.json`:

```json
{
  "models": {
    "providers": {
      "omniroute": {
        "baseUrl": "http://127.0.0.1:20128/v1",
        "apiKey": "sk_omniroute",
        "api": "openai-completions"
      }
    }
  }
}
```

> **Note:** OpenClaw only works with a local OmniRoute. Use `127.0.0.1` instead of `localhost` to avoid IPv6 resolution issues.

### Cline / Continue / RooCode

```
Settings → API Configuration:
Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from OmniRoute dashboard]
Model: if/kimi-k2-thinking
```

### OpenCode

**Step 1:** Add OmniRoute as a custom provider:

```bash
opencode
/connect
# Select "Other" → Enter ID: "omniroute" → Enter your OmniRoute API key
```

**Step 2:** Create/edit `opencode.json` in your project root:

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "omniroute": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "OmniRoute",
      "options": {
        "baseURL": "http://localhost:20128/v1"
      },
      "models": {
        "cc/claude-sonnet-4-20250514": { "name": "Claude Sonnet 4" },
        "gg/gemini-2.5-pro": { "name": "Gemini 2.5 Pro" },
        "if/kimi-k2-thinking": { "name": "Kimi K2 (Free)" }
      }
    }
  }
}
```

**Step 3:** Select the model in OpenCode:

```bash
/models
# Select any OmniRoute model from the list
```

> **Tip:** Add any model available in your OmniRoute `/v1/models` endpoint to the `models` section. Use the format `provider/model-id` from your OmniRoute dashboard.

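To see which IDs your instance currently exposes (a sketch — requires your running gateway and an API key from the dashboard):

```shell
# List all models/combos routable through your gateway
curl -s http://localhost:20128/v1/models \
  -H "Authorization: Bearer $OMNIROUTE_API_KEY"
```
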
</details>
---

## 🧪 Evaluations (Evals)

OmniRoute includes a built-in evaluation framework to test LLM response quality against a golden set. Access it via **Analytics → Evals** in the dashboard.

### Built-in Golden Set

The pre-loaded "OmniRoute Golden Set" contains 10 test cases covering:

- Greetings, math, geography, code generation
- JSON format compliance, translation, markdown
- Safety refusal (harmful content), counting, boolean logic

### Evaluation Strategies

| Strategy | Description | Example |
| ---------- | ------------------------------------------------ | -------------------------------- |
| `exact` | Output must match exactly | `"4"` |
| `contains` | Output must contain substring (case-insensitive) | `"Paris"` |
| `regex` | Output must match regex pattern | `"1.*2.*3"` |
| `custom` | Custom JS function returns true/false | `(output) => output.length > 10` |

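As a local illustration only (this is not OmniRoute's actual implementation), the `contains` and `regex` strategies behave like these shell one-liners:

```shell
# Illustration of the `contains` and `regex` match strategies (not OmniRoute code)
output="The capital of France is Paris."

# contains — case-insensitive substring match
printf '%s' "$output" | grep -qi "paris" && echo "contains: pass"

# regex — the pattern from the table above
printf '%s' "1, then 2, then 3" | grep -Eq "1.*2.*3" && echo "regex: pass"
```
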
---

## 🐛 Troubleshooting

<details>
<summary><b>Click to expand troubleshooting guide</b></summary>

**"Language model did not provide messages"**

- Provider quota exhausted → Check the dashboard quota tracker
- Solution: Use combo fallback or switch to a cheaper tier

**Rate limiting**

- Subscription quota out → Fall back to GLM/MiniMax
- Add a combo: `cc/claude-opus-4-6 → glm/glm-4.7 → if/kimi-k2-thinking`

**OAuth token expired**

- Auto-refreshed by OmniRoute
- If issues persist: Dashboard → Provider → Reconnect

**High costs**

- Check usage stats in Dashboard → Costs
- Switch the primary model to GLM/MiniMax
- Use free tiers (Gemini CLI, iFlow) for non-critical tasks

**Dashboard/API ports are wrong**

- `PORT` is the canonical base port (and the API port by default)
- `API_PORT` overrides only the OpenAI-compatible API listener
- `DASHBOARD_PORT` overrides only the dashboard/Next.js listener
- Set `NEXT_PUBLIC_BASE_URL` to your dashboard/public URL (for OAuth callbacks)

**Cloud sync errors**

- Verify `BASE_URL` points to your running instance
- Verify `CLOUD_URL` points to your expected cloud endpoint
- Keep `NEXT_PUBLIC_*` values aligned with the server-side values

**First login not working**

- Check `INITIAL_PASSWORD` in `.env`
- If unset, the fallback password is `123456`

**No request logs**

- Set `ENABLE_REQUEST_LOGS=true` in `.env`

**Connection test shows "Invalid" for OpenAI-compatible providers**

- Many providers don't expose a `/models` endpoint
- OmniRoute v1.0.6+ includes fallback validation via chat completions
- Ensure the base URL includes the `/v1` suffix

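When the automatic check fails, you can verify the connection manually by sending a minimal chat completion through the gateway (a sketch — use your own key and a model ID that exists on your dashboard):

```shell
# Manual connection test (model ID is an example)
curl -s http://localhost:20128/v1/chat/completions \
  -H "Authorization: Bearer $OMNIROUTE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "if/kimi-k2-thinking", "messages": [{"role": "user", "content": "ping"}], "max_tokens": 8}'
```
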
### 🔐 Remote OAuth Setup

<a name="oauth-em-servidor-remoto"></a>

> **⚠️ IMPORTANT for users running OmniRoute on a VPS/Docker/remote server**

#### Why does Antigravity / Gemini CLI OAuth fail on remote servers?

The **Antigravity** and **Gemini CLI** providers use **Google OAuth 2.0** for authentication. Google requires the `redirect_uri` used in the OAuth flow to match **exactly** one of the URIs pre-registered for the app in the Google Cloud Console.

The OAuth credentials bundled with OmniRoute are registered **only for `localhost`**. When you access OmniRoute on a remote server (e.g. `https://omniroute.your-server.com`), Google rejects the authentication with:

```
Error 400: redirect_uri_mismatch
```

#### Solution: configure your own OAuth credentials

You need to create an **OAuth 2.0 Client ID** in the Google Cloud Console with your server's URI.

#### Step by step

**1. Open the Google Cloud Console**

Open: [https://console.cloud.google.com/apis/credentials](https://console.cloud.google.com/apis/credentials)

**2. Create a new OAuth 2.0 Client ID**

- Click **"+ Create Credentials"** → **"OAuth client ID"**
- Application type: **"Web application"**
- Name: any name you like (e.g. `OmniRoute Remote`)

**3. Add the Authorized Redirect URIs**

In the **"Authorized redirect URIs"** field, add:

```
https://your-server.com/callback
```

> Replace `your-server.com` with your server's domain or IP (include the port if needed, e.g. `http://45.33.32.156:20128/callback`).

**4. Save and copy the credentials**

After creation, Google will show the **Client ID** and **Client Secret**.

**5. Set the environment variables**

In your `.env` (or in your Docker environment variables):

```bash
# For Antigravity:
ANTIGRAVITY_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
ANTIGRAVITY_OAUTH_CLIENT_SECRET=GOCSPX-your-secret

# For Gemini CLI:
GEMINI_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
GEMINI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
GEMINI_CLI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
```

**6. Restart OmniRoute**

```bash
# If using npm:
npm run dev

# If using Docker:
docker restart omniroute
```

**7. Try connecting again**

Dashboard → Providers → Antigravity (or Gemini CLI) → OAuth

Google will now redirect correctly to `https://your-server.com/callback` and authentication will work.

---

#### Temporary workaround (without configuring your own credentials)

If you don't want to create your own credentials right now, you can still use the **manual URL** flow:

1. OmniRoute opens Google's authorization URL
2. After you authorize, Google tries to redirect to `localhost` (which fails on a remote server)
3. **Copy the full URL** from your browser's address bar (even if the page fails to load)
4. Paste that URL into the field shown in OmniRoute's connection modal
5. Click **"Connect"**

> This workaround works because the authorization code in the URL is valid whether or not the redirect page loaded.

</details>

---

## 🛠️ Tech Stack

- **Runtime**: Node.js 18–22 LTS (⚠️ Node.js 24+ is **not supported** — `better-sqlite3` native binaries are incompatible)
- **Language**: TypeScript 5.9 — **100% TypeScript** across `src/` and `open-sse/` (v1.0.6)
- **Framework**: Next.js 16 + React 19 + Tailwind CSS 4
- **Database**: LowDB (JSON) + SQLite (domain state + proxy logs)
- **Streaming**: Server-Sent Events (SSE)
- **Auth**: OAuth 2.0 (PKCE) + JWT + API Keys
- **Testing**: Node.js test runner (368+ unit tests)
- **CI/CD**: GitHub Actions (auto npm publish + Docker Hub on release)
- **Website**: [omniroute.online](https://omniroute.online)
- **Package**: [npmjs.com/package/omniroute](https://www.npmjs.com/package/omniroute)
- **Docker**: [hub.docker.com/r/diegosouzapw/omniroute](https://hub.docker.com/r/diegosouzapw/omniroute)
- **Resilience**: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing

---

## 📖 Documentation

| Document | Description |
| -------------------------------------------- | ---------------------------------------------- |
| [User Guide](docs/USER_GUIDE.md) | Providers, combos, CLI integration, deployment |
| [API Reference](docs/API_REFERENCE.md) | All endpoints with examples |
| [Troubleshooting](docs/TROUBLESHOOTING.md) | Common problems and solutions |
| [Architecture](docs/ARCHITECTURE.md) | System architecture and internals |
| [Contributing](CONTRIBUTING.md) | Development setup and guidelines |
| [OpenAPI Spec](docs/openapi.yaml) | OpenAPI 3.0 specification |
| [Security Policy](SECURITY.md) | Vulnerability reporting and security practices |
| [VM Deployment](docs/VM_DEPLOYMENT_GUIDE.md) | Complete guide: VM + nginx + Cloudflare setup |
| [Features Gallery](docs/FEATURES.md) | Visual dashboard tour with screenshots |

### 📸 Dashboard Preview

<details>
<summary><b>Click to see dashboard screenshots</b></summary>

| Page | Screenshot |
| -------------- | ------------------------------------------------- |
| **Providers** |  |
| **Combos** |  |
| **Analytics** |  |
| **Health** |  |
| **Translator** |  |
| **Settings** |  |
| **CLI Tools** |  |
| **Usage Logs** |  |
| **Endpoint** |  |

</details>

---

## 🗺️ Roadmap

OmniRoute has **210+ features planned** across multiple development phases. Here are the key areas:

| Category | Planned Features | Highlights |
| ----------------------------- | ---------------- | -------------------------------------------------------------------------------------- |
| 🧠 **Routing & Intelligence** | 25+ | Lowest-latency routing, tag-based routing, quota preflight, P2C account selection |
| 🔒 **Security & Compliance** | 20+ | SSRF hardening, credential cloaking, rate-limit per endpoint, management key scoping |
| 📊 **Observability** | 15+ | OpenTelemetry integration, real-time quota monitoring, cost tracking per model |
| 🔄 **Provider Integrations** | 20+ | Dynamic model registry, provider cooldowns, multi-account Codex, Copilot quota parsing |
| ⚡ **Performance** | 15+ | Dual cache layer, prompt cache, response cache, streaming keepalive, batch API |
| 🌐 **Ecosystem** | 10+ | WebSocket API, config hot-reload, distributed config store, commercial mode |

### 🔜 Coming Soon

- 🔗 **OpenCode Integration** — Native provider support for the OpenCode AI coding IDE
- 🔗 **TRAE Integration** — Full support for the TRAE AI development framework
- 📦 **Batch API** — Asynchronous batch processing for bulk requests
- 🎯 **Tag-Based Routing** — Route requests based on custom tags and metadata
- 💰 **Lowest-Cost Strategy** — Automatically select the cheapest available provider

> 📝 Full feature specifications available in [`docs/new-features/`](docs/new-features/) (217 detailed specs)

---

## 👥 Contributors

[](https://github.com/diegosouzapw/OmniRoute/graphs/contributors)

### How to Contribute

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

### Releasing a New Version

```bash
# Create a release — npm publish happens automatically
gh release create v1.0.6 --title "v1.0.6" --generate-notes
```

---

## 📊 Star History

<a href="https://star-history.com/#diegosouzapw/OmniRoute&Date">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=diegosouzapw/OmniRoute&type=Date&theme=dark" />
    <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=diegosouzapw/OmniRoute&type=Date" />
    <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=diegosouzapw/OmniRoute&type=Date" />
  </picture>
</a>

---

## 🙏 Acknowledgments

Special thanks to **[9router](https://github.com/decolua/9router)** by **[decolua](https://github.com/decolua)** — the original project that inspired this fork. OmniRoute builds upon that incredible foundation with additional features, multi-modal APIs, and a full TypeScript rewrite.

Thanks also to **[CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI)** — the original Go implementation that inspired this JavaScript port.

---

## 📄 License

MIT License - see [LICENSE](LICENSE) for details.

---

<div align="center">
  <sub>Built with ❤️ for developers who code 24/7</sub>
  <br/>
  <sub><a href="https://omniroute.online">omniroute.online</a></sub>
</div>

<!-- GitHub Discussions enabled for community Q&A -->
|