OmniRoute/README.md
2026-03-02 22:28:01 +05:00

<div align="center">

<img src="./docs/screenshots/MainOmniRoute.png" alt="OmniRoute Dashboard" width="800"/>

# 🚀 OmniRoute — The Free AI Gateway

### Never stop coding. Smart routing to **FREE & low-cost AI models** with automatic fallback.

_Your universal API proxy — one endpoint, 36+ providers, zero downtime._

**Chat Completions • Embeddings • Image Generation • Video • Music • Audio • Reranking • 100% TypeScript**

---

[![npm version](https://img.shields.io/npm/v/omniroute?color=cb3837&logo=npm)](https://www.npmjs.com/package/omniroute)
[![Docker Hub](https://img.shields.io/docker/v/diegosouzapw/omniroute?label=Docker%20Hub&logo=docker&color=2496ED)](https://hub.docker.com/r/diegosouzapw/omniroute)
[![License](https://img.shields.io/github/license/diegosouzapw/OmniRoute)](https://github.com/diegosouzapw/OmniRoute/blob/main/LICENSE)
[![Website](https://img.shields.io/badge/Website-omniroute.online-blue?logo=google-chrome&logoColor=white)](https://omniroute.online)
[![WhatsApp](https://img.shields.io/badge/WhatsApp-Community-25D366?logo=whatsapp&logoColor=white)](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t)
[🌐 Website](https://omniroute.online) • [🚀 Quick Start](#-quick-start) • [💡 Features](#-key-features) • [📖 Docs](#-documentation) • [💰 Pricing](#-pricing-at-a-glance) • [💬 WhatsApp](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t)
🌐 **Available in:** 🇺🇸 [English](README.md) | 🇧🇷 [Português (Brasil)](README.pt-BR.md) | 🇪🇸 [Español](README.es.md) | 🇫🇷 [Français](README.fr.md) | 🇮🇹 [Italiano](README.it.md) | 🇷🇺 [Русский](README.ru.md) | 🇨🇳 [中文 (简体)](README.zh-CN.md) | 🇩🇪 [Deutsch](README.de.md) | 🇮🇳 [हिन्दी](README.in.md) | 🇹🇭 [ไทย](README.th.md) | 🇺🇦 [Українська](README.uk-UA.md) | 🇸🇦 [العربية](README.ar.md) | 🇯🇵 [日本語](README.ja.md) | 🇻🇳 [Tiếng Việt](README.vi.md) | 🇧🇬 [Български](README.bg.md) | 🇩🇰 [Dansk](README.da.md) | 🇫🇮 [Suomi](README.fi.md) | 🇮🇱 [עברית](README.he.md) | 🇭🇺 [Magyar](README.hu.md) | 🇮🇩 [Bahasa Indonesia](README.id.md) | 🇰🇷 [한국어](README.ko.md) | 🇲🇾 [Bahasa Melayu](README.ms.md) | 🇳🇱 [Nederlands](README.nl.md) | 🇳🇴 [Norsk](README.no.md) | 🇵🇹 [Português (Portugal)](README.pt.md) | 🇷🇴 [Română](README.ro.md) | 🇵🇱 [Polski](README.pl.md) | 🇸🇰 [Slovenčina](README.sk.md) | 🇸🇪 [Svenska](README.sv.md) | 🇵🇭 [Filipino](README.phi.md)
</div>
---
### 🤖 Free AI Provider for your favorite coding agents
_Connect any AI-powered IDE or CLI tool through OmniRoute — free API gateway for unlimited coding._
<table>
<tr>
<td align="center" width="110">
<a href="https://github.com/openclaw/openclaw">
<img src="./public/providers/openclaw.png" alt="OpenClaw" width="48"/><br/>
<b>OpenClaw</b>
</a><br/>
<sub>⭐ 205K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/HKUDS/nanobot">
<img src="./public/providers/nanobot.png" alt="NanoBot" width="48"/><br/>
<b>NanoBot</b>
</a><br/>
<sub>⭐ 20.9K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/sipeed/picoclaw">
<img src="./public/providers/picoclaw.jpg" alt="PicoClaw" width="48"/><br/>
<b>PicoClaw</b>
</a><br/>
<sub>⭐ 14.6K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/zeroclaw-labs/zeroclaw">
<img src="./public/providers/zeroclaw.png" alt="ZeroClaw" width="48"/><br/>
<b>ZeroClaw</b>
</a><br/>
<sub>⭐ 9.9K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/nearai/ironclaw">
<img src="./public/providers/ironclaw.png" alt="IronClaw" width="48"/><br/>
<b>IronClaw</b>
</a><br/>
<sub>⭐ 2.1K</sub>
</td>
</tr>
<tr>
<td align="center" width="110">
<a href="https://github.com/anomalyco/opencode">
<img src="./public/providers/opencode.svg" alt="OpenCode" width="48"/><br/>
<b>OpenCode</b>
</a><br/>
<sub>⭐ 106K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/openai/codex">
<img src="./public/providers/codex.png" alt="Codex CLI" width="48"/><br/>
<b>Codex CLI</b>
</a><br/>
<sub>⭐ 60.8K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/anthropics/claude-code">
<img src="./public/providers/claude.png" alt="Claude Code" width="48"/><br/>
<b>Claude Code</b>
</a><br/>
<sub>⭐ 67.3K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/google-gemini/gemini-cli">
<img src="./public/providers/gemini-cli.png" alt="Gemini CLI" width="48"/><br/>
<b>Gemini CLI</b>
</a><br/>
<sub>⭐ 94.7K</sub>
</td>
<td align="center" width="110">
<a href="https://github.com/Kilo-Org/kilocode">
<img src="./public/providers/kilocode.png" alt="Kilo Code" width="48"/><br/>
<b>Kilo Code</b>
</a><br/>
<sub>⭐ 15.5K</sub>
</td>
</tr>
</table>
<sub>📡 All agents connect via <code>http://localhost:20128/v1</code> or <code>http://cloud.omniroute.online/v1</code> — one config, unlimited models and quota</sub>
---
## 📧 Support
> 💬 **Join our community!** [WhatsApp Group](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t) — Get help, share tips, and stay updated.
- **Website**: [omniroute.online](https://omniroute.online)
- **GitHub**: [github.com/diegosouzapw/OmniRoute](https://github.com/diegosouzapw/OmniRoute)
- **Issues**: [github.com/diegosouzapw/OmniRoute/issues](https://github.com/diegosouzapw/OmniRoute/issues)
- **WhatsApp**: [Community Group](https://chat.whatsapp.com/JI7cDQ1GyaiDHhVBpLxf8b?mode=gi_t)
- **Original Project**: [9router by decolua](https://github.com/decolua/9router)
---
## 🤔 Why OmniRoute?
**Stop wasting money and hitting limits:**
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Subscription quota expires unused every month
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Rate limits stop you mid-coding
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Expensive APIs ($20-50/month per provider)
- <img src="https://img.shields.io/badge/✗-e74c3c?style=flat-square" height="16"/> Manual switching between providers
**OmniRoute solves this:**
- **Maximize subscriptions** - Track quota, use every bit before reset
- **Auto fallback** - Subscription → API Key → Cheap → Free, zero downtime
- **Multi-account** - Round-robin between accounts per provider
- **Universal** - Works with Claude Code, Codex, Gemini CLI, Cursor, Cline, OpenClaw, any CLI tool
---
## 🔄 How It Works
```
┌─────────────┐
│  Your CLI   │  (Claude Code, Codex, Gemini CLI, OpenClaw, Cursor, Cline...)
│   Tool      │
└──────┬──────┘
       │ http://localhost:20128/v1
┌─────────────────────────────────────────┐
│        OmniRoute (Smart Router)         │
│  • Format translation (OpenAI ↔ Claude) │
│  • Quota tracking + Embeddings + Images │
│  • Auto token refresh                   │
└──────┬──────────────────────────────────┘
       ├─→ [Tier 1: SUBSCRIPTION] Claude Code, Codex, Gemini CLI
       │   ↓ quota exhausted
       ├─→ [Tier 2: API KEY] DeepSeek, Groq, xAI, Mistral, NVIDIA NIM, etc.
       │   ↓ budget limit
       ├─→ [Tier 3: CHEAP] GLM ($0.6/1M), MiniMax ($0.2/1M)
       │   ↓ budget limit
       └─→ [Tier 4: FREE] iFlow, Qwen, Kiro (unlimited)

Result: Never stop coding, minimal cost
```
---
## 🎯 What OmniRoute Solves — 16 Real Pain Points
> **Every developer using AI tools faces these problems daily.** OmniRoute was built to solve them all — from cost overruns to regional blocks, from broken OAuth flows to zero observability.
<details>
<summary><b>💸 1. "I pay for an expensive subscription but still get interrupted by limits"</b></summary>
Developers pay $20-200/month for Claude Pro, Codex Pro, or GitHub Copilot. Even on a paid plan, quota has a ceiling: 5h usage windows, weekly limits, or per-minute rate limits. Mid-session, the provider stops responding and the developer loses flow and productivity.
**How OmniRoute solves it:**
- **Smart 4-Tier Fallback** — If subscription quota runs out, automatically redirects to API Key → Cheap → Free with zero manual intervention
- **Real-Time Quota Tracking** — Shows token consumption in real-time with reset countdown (5h, daily, weekly)
- **Multi-Account Support** — Multiple accounts per provider with auto round-robin — when one runs out, switches to the next
- **Custom Combos** — Customizable fallback chains with 6 balancing strategies (fill-first, round-robin, P2C, random, least-used, cost-optimized)
- **Codex Business Quotas** — Business/Team workspace quota monitoring directly in the dashboard
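
The round-robin behaviour described above can be sketched roughly as follows. This is a minimal illustration, not OmniRoute's actual selection code: the `Account` shape, the `remaining` counter, and the `RoundRobin` class are all hypothetical names introduced here.

```typescript
// Hypothetical account shape; field names are illustrative, not OmniRoute's schema.
interface Account {
  id: string;
  remaining: number; // tokens left before the quota resets
}

// Round-robin picker that skips accounts with no quota left.
class RoundRobin {
  private next = 0;
  private accounts: Account[];

  constructor(accounts: Account[]) {
    this.accounts = accounts;
  }

  // Returns the next account that still has quota, or null when all are exhausted.
  pick(): Account | null {
    for (let i = 0; i < this.accounts.length; i++) {
      const index = (this.next + i) % this.accounts.length;
      const candidate = this.accounts[index];
      if (candidate.remaining > 0) {
        this.next = (index + 1) % this.accounts.length;
        return candidate;
      }
    }
    return null;
  }
}
```

When `pick()` returns `null`, every account for that provider is out of quota, which is the point where a fallback chain would move on to the next tier.
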
</details>
<details>
<summary><b>🔌 2. "I need to use multiple providers but each has a different API"</b></summary>
OpenAI uses one format, Claude (Anthropic) uses another, and Gemini yet another. A dev who wants to test models from different providers, or fall back between them, has to reconfigure SDKs, change endpoints, and deal with incompatible formats. Custom providers (FriendLI, NIM) have non-standard model endpoints.
**How OmniRoute solves it:**
- **Unified Endpoint** — A single `http://localhost:20128/v1` serves as proxy for all 36+ providers
- **Format Translation** — Automatic and transparent: OpenAI ↔ Claude ↔ Gemini ↔ Responses API
- **Response Sanitization** — Strips non-standard fields (`x_groq`, `usage_breakdown`, `service_tier`) that break OpenAI SDK v1.83+
- **Role Normalization** — Converts `developer` → `system` for non-OpenAI providers; `system` → `user` for GLM/ERNIE
- **Think Tag Extraction** — Extracts `<think>` blocks from models like DeepSeek R1 into standardized `reasoning_content`
- **Structured Output for Gemini** — `json_schema` → `responseMimeType`/`responseSchema` automatic conversion
- **`stream` defaults to `false`** — Aligns with OpenAI spec, avoiding unexpected SSE in Python/Rust/Go SDKs
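
To make the think-tag extraction concrete, here is a small sketch of the idea. The function name `extractThink` and the `NormalizedMessage` type are assumptions made for illustration; only the `<think>` tag and the `reasoning_content` field come from the description above.

```typescript
// Shape of the normalized result; `reasoning_content` mirrors the field named above.
interface NormalizedMessage {
  content: string;
  reasoning_content: string | null;
}

// Moves every <think>...</think> block out of `raw` and into `reasoning_content`.
function extractThink(raw: string): NormalizedMessage {
  const pattern = /<think>([\s\S]*?)<\/think>/g;
  const reasoning: string[] = [];
  const content = raw.replace(pattern, (_match: string, inner: string) => {
    reasoning.push(inner.trim());
    return "";
  });
  return {
    content: content.trim(),
    reasoning_content: reasoning.length > 0 ? reasoning.join("\n") : null,
  };
}
```

A response without any `<think>` block passes through with `reasoning_content` set to `null`, so clients that ignore reasoning see no difference.
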
</details>
<details>
<summary><b>🌐 3. "My AI provider blocks my region/country"</b></summary>
Providers like OpenAI/Codex block access from certain geographic regions. Users get errors like `unsupported_country_region_territory` during OAuth and API connections. This is especially frustrating for developers from developing countries.
**How OmniRoute solves it:**
- **3-Level Proxy Config** — Configurable proxy at 3 levels: global (all traffic), per-provider (one provider only), and per-connection/key
- **Color-Coded Proxy Badges** — Visual indicators: 🟢 global proxy, 🟡 provider proxy, 🔵 connection proxy, always showing the IP
- **OAuth Token Exchange Through Proxy** — OAuth flow also goes through the proxy, solving `unsupported_country_region_territory`
- **Connection Tests via Proxy** — Connection tests use the configured proxy (no more direct bypass)
- **SOCKS5 Support** — Full SOCKS5 proxy support for outbound routing
- **TLS Fingerprint Spoofing** — Browser-like TLS fingerprint via `wreq-js` to bypass bot detection
</details>
<details>
<summary><b>🆓 4. "I want to use AI for coding but I have no money"</b></summary>
Not everyone can pay $20-200/month for AI subscriptions. Students, devs from emerging countries, hobbyists, and freelancers need access to quality models at zero cost.
**How OmniRoute solves it:**
- **Free Tier Providers Built-in** — Native support for 100% free providers: iFlow (8 unlimited models), Qwen (3 unlimited models), Kiro (Claude for free), Gemini CLI (180K/month free)
- **Free-Only Combos** — Chain `gc/gemini-3-flash → if/kimi-k2-thinking → qw/qwen3-coder-plus` = $0/month with zero downtime
- **NVIDIA NIM Free Credits** — 1000 free credits integrated
- **Cost Optimized Strategy** — Routing strategy that automatically chooses the cheapest available provider
</details>
<details>
<summary><b>🔒 5. "I need to protect my AI gateway from unauthorized access"</b></summary>
When exposing an AI gateway to the network (LAN, VPS, Docker), anyone with the address can consume the developer's tokens/quota. Without protection, APIs are vulnerable to misuse, prompt injection, and abuse.
**How OmniRoute solves it:**
- **API Key Management** — Generation, rotation, and scoping per provider with a dedicated `/dashboard/api-manager` page
- **Model-Level Permissions** — Restrict API keys to specific models (`openai/*`, wildcard patterns), with Allow All/Restrict toggle
- **API Endpoint Protection** — Require a key for `/v1/models` and block specific providers from the listing
- **Auth Guard + CSRF Protection** — All dashboard routes protected with `withAuth` middleware + CSRF tokens
- **Rate Limiter** — Per-IP rate limiting with configurable windows
- **IP Filtering** — Allowlist/blocklist for access control
- **Prompt Injection Guard** — Sanitization against malicious prompt patterns
- **AES-256-GCM Encryption** — Credentials encrypted at rest
</details>
<details>
<summary><b>🛑 6. "My provider went down and I lost my coding flow"</b></summary>
AI providers can become unstable, return 5xx errors, or hit temporary rate limits. If a dev depends on a single provider, they're interrupted. Without circuit breakers, repeated retries can crash the application.
**How OmniRoute solves it:**
- **Circuit Breaker per-provider** — Auto-open/close with configurable thresholds and cooldown (Closed/Open/Half-Open)
- **Exponential Backoff** — Progressive retry delays
- **Anti-Thundering Herd** — Mutex + semaphore protection against concurrent retry storms
- **Combo Fallback Chains** — If the primary provider fails, automatically falls through the chain with no intervention
- **Combo Circuit Breaker** — Auto-disables failing providers within a combo chain
- **Health Dashboard** — Uptime monitoring, circuit breaker states, lockouts, cache stats, p50/p95/p99 latency
</details>
<details>
<summary><b>🔧 7. "Configuring each AI tool is tedious and repetitive"</b></summary>
Developers use Cursor, Claude Code, Codex CLI, OpenClaw, Gemini CLI, Kilo Code... Each tool needs a different config (API endpoint, key, model). Reconfiguring when switching providers or models is a waste of time.
**How OmniRoute solves it:**
- **CLI Tools Dashboard** — Dedicated page with one-click setup for Claude Code, Codex CLI, OpenClaw, Kilo Code, Antigravity, Cline
- **GitHub Copilot Config Generator** — Generates `chatLanguageModels.json` for VS Code with bulk model selection
- **Onboarding Wizard** — Guided 4-step setup for first-time users
- **One endpoint, all models** — Configure `http://localhost:20128/v1` once, access 36+ providers
</details>
<details>
<summary><b>🔑 8. "Managing OAuth tokens from multiple providers is hell"</b></summary>
Claude Code, Codex, Gemini CLI, Copilot — all use OAuth 2.0 with expiring tokens. Developers need to re-authenticate constantly, deal with `client_secret is missing`, `redirect_uri_mismatch`, and failures on remote servers. OAuth on LAN/VPS is particularly problematic.
**How OmniRoute solves it:**
- **Auto Token Refresh** — OAuth tokens refresh in background before expiration
- **OAuth 2.0 (PKCE) Built-in** — Automatic flow for Claude Code, Codex, Gemini CLI, Copilot, Kiro, Qwen, iFlow
- **Multi-Account OAuth** — Multiple accounts per provider via JWT/ID token extraction
- **OAuth LAN/Remote Fix** — Private IP detection for `redirect_uri` + manual URL mode for remote servers
- **OAuth Behind Nginx** — Uses `window.location.origin` for reverse proxy compatibility
- **Remote OAuth Guide** — Step-by-step guide for Google Cloud credentials on VPS/Docker
</details>
<details>
<summary><b>📊 9. "I don't know how much I'm spending or where"</b></summary>
Developers use multiple paid providers, each with its own billing dashboard, so there is no consolidated view of spending. Unexpected costs can pile up.
**How OmniRoute solves it:**
- **Cost Analytics Dashboard** — Per-token cost tracking and budget management per provider
- **Budget Limits per Tier** — Spending ceiling per tier that triggers automatic fallback
- **Per-Model Pricing Configuration** — Configurable prices per model
- **Usage Statistics Per API Key** — Request count and last-used timestamp per key
- **Analytics Dashboard** — Stat cards, model usage chart, provider table with success rates and latency
</details>
<details>
<summary><b>🐛 10. "I can't diagnose errors and problems in AI calls"</b></summary>
When a call fails, the dev doesn't know whether it was a rate limit, an expired token, a wrong format, or a provider error. Logs are fragmented across different terminals. Without observability, debugging is trial and error.
**How OmniRoute solves it:**
- **Unified Logs Dashboard** — 4 tabs: Request Logs, Proxy Logs, Audit Logs, Console
- **Console Log Viewer** — Real-time terminal-style viewer with color-coded levels, auto-scroll, search, filter
- **SQLite Proxy Logs** — Persistent logs that survive server restarts
- **Translator Playground** — 4 debugging modes: Playground (format translation), Chat Tester (round-trip), Test Bench (batch), Live Monitor (real-time)
- **Request Telemetry** — p50/p95/p99 latency + X-Request-Id tracing
- **File-Based Logging with Rotation** — Console interceptor captures everything to JSON log with size-based rotation
</details>
<details>
<summary><b>🏗️ 11. "Deploying and maintaining the gateway is complex"</b></summary>
Installing, configuring, and maintaining an AI proxy across different environments (local, VPS, Docker, cloud) is labor-intensive. Problems like hardcoded paths, `EACCES` on directories, port conflicts, and cross-platform builds add friction.
**How OmniRoute solves it:**
- **npm global install** — `npm install -g omniroute && omniroute` — done
- **Docker Multi-Platform** — AMD64 + ARM64 native (Apple Silicon, AWS Graviton, Raspberry Pi)
- **Docker Compose Profiles** — `base` (no CLI tools) and `cli` (with Claude Code, Codex, OpenClaw)
- **Electron Desktop App** — Native app for Windows/macOS/Linux with system tray, auto-start, offline mode
- **Split-Port Mode** — API and Dashboard on separate ports for advanced scenarios (reverse proxy, container networking)
- **Cloud Sync** — Config synchronization across devices via Cloudflare Workers
- **DB Backups** — Automatic backup, restore, export and import of all settings
</details>
<details>
<summary><b>🌍 12. "The interface is English-only and my team doesn't speak English"</b></summary>
Teams in non-English-speaking countries, especially in Latin America, Asia, and Europe, struggle with English-only interfaces. Language barriers reduce adoption and increase configuration errors.
**How OmniRoute solves it:**
- **Dashboard i18n — 30 Languages** — All 500+ keys translated including Arabic, Bulgarian, Danish, German, Spanish, Finnish, French, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Dutch, Norwegian, Polish, Portuguese (PT/BR), Romanian, Russian, Slovak, Swedish, Thai, Ukrainian, Vietnamese, Chinese, Filipino, English
- **RTL Support** — Right-to-left support for Arabic and Hebrew
- **Multi-Language READMEs** — 30 complete documentation translations
- **Language Selector** — Globe icon in header for real-time switching
</details>
<details>
<summary><b>🔄 13. "I need more than chat — I need embeddings, images, audio"</b></summary>
AI isn't just chat completion. Devs need to generate images, transcribe audio, create embeddings for RAG, rerank documents, and moderate content. Each API has a different endpoint and format.
**How OmniRoute solves it:**
- **Embeddings** — `/v1/embeddings` with 6 providers and 9+ models
- **Image Generation** — `/v1/images/generations` with 10 providers and 20+ models (OpenAI, xAI, Together, Fireworks, Nebius, Hyperbolic, NanoBanana, Antigravity, SD WebUI, ComfyUI)
- **Text-to-Video** — `/v1/videos/generations` — ComfyUI (AnimateDiff, SVD) and SD WebUI
- **Text-to-Music** — `/v1/music/generations` — ComfyUI (Stable Audio Open, MusicGen)
- **Audio Transcription** — `/v1/audio/transcriptions` — Whisper + Nvidia NIM, HuggingFace, Qwen3
- **Text-to-Speech** — `/v1/audio/speech` — ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3, + existing providers
- **Moderations** — `/v1/moderations` — Content safety checks
- **Reranking** — `/v1/rerank` — Document relevance reranking
- **Responses API** — Full `/v1/responses` support for Codex
</details>
<details>
<summary><b>🧪 14. "I have no way to test and compare quality across models"</b></summary>
Developers want to know which model is best for their use case — code, translation, reasoning — but comparing manually is slow. No integrated eval tools exist.
**How OmniRoute solves it:**
- **LLM Evaluations** — Golden set testing with 10 pre-loaded cases covering greetings, math, geography, code generation, JSON compliance, translation, markdown, safety refusal
- **4 Match Strategies** — `exact`, `contains`, `regex`, `custom` (JS function)
- **Translator Playground Test Bench** — Batch testing with multiple inputs and expected outputs, cross-provider comparison
- **Chat Tester** — Full round-trip with visual response rendering
- **Live Monitor** — Real-time stream of all requests flowing through the proxy
</details>
<details>
<summary><b>📈 15. "I need to scale without losing performance"</b></summary>
As request volume grows, repeated questions incur duplicate costs unless responses are cached, and duplicate requests waste processing unless they are deduplicated. Per-provider rate limits must also be respected.
**How OmniRoute solves it:**
- **Semantic Cache** — Two-tier cache (signature + semantic) reduces cost and latency
- **Request Idempotency** — 5s deduplication window for identical requests
- **Rate Limit Detection** — Per-provider RPM, min gap, and max concurrent tracking
- **Editable Rate Limits** — Configurable defaults in Settings → Resilience with persistence
- **API Key Validation Cache** — 3-tier cache for production performance
- **Health Dashboard with Telemetry** — p50/p95/p99 latency, cache stats, uptime
</details>
<details>
<summary><b>🤖 16. "I want to control model behavior globally"</b></summary>
Some developers want all responses in a specific language or tone, or want to cap reasoning tokens. Configuring this in every tool or request is impractical.
**How OmniRoute solves it:**
- **System Prompt Injection** — Global prompt applied to all requests
- **Thinking Budget Validation** — Reasoning token allocation control per request (passthrough, auto, custom, adaptive)
- **6 Routing Strategies** — Global strategies that determine how requests are distributed
- **Wildcard Router** — `provider/*` patterns route dynamically to any provider
- **Combo Enable/Disable Toggle** — Toggle combos directly from the dashboard
- **Provider Toggle** — Enable/disable all connections for a provider with one click
- **Blocked Providers** — Exclude specific providers from `/v1/models` listing
</details>
---
## ⚡ Quick Start
**1. Install globally:**
```bash
npm install -g omniroute
omniroute
```
🎉 Dashboard opens at `http://localhost:20128`
| Command | Description |
| ----------------------- | ----------------------------------------------------------- |
| `omniroute` | Start server (`PORT=20128`, API and dashboard on same port) |
| `omniroute --port 3000` | Set canonical/API port to 3000 |
| `omniroute --no-open` | Don't auto-open browser |
| `omniroute --help` | Show help |
Optional split-port mode:
```bash
PORT=20128 DASHBOARD_PORT=20129 omniroute
# API: http://localhost:20128/v1
# Dashboard: http://localhost:20129
```
When ports are split, the API port serves only OpenAI-compatible routes (`/v1`, `/chat/completions`, `/responses`, `/models`, `/codex/*`).
**2. Connect a FREE provider:**
Dashboard → Providers → Connect **Claude Code** or **Antigravity** → OAuth login → Done!
**3. Use in your CLI tool:**
```
Claude Code/Codex/Gemini CLI/OpenClaw/Cursor/Cline Settings:
Endpoint: http://localhost:20128/v1
API Key: [copy from dashboard]
Model: if/kimi-k2-thinking
```
**That's it!** Start coding with FREE AI models.
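
For code that talks to the endpoint directly rather than through a CLI tool, a request might be assembled as in the sketch below. It assumes the usual OpenAI-compatible conventions (a `/chat/completions` path under `/v1` and a `Bearer` authorization header); `buildChatRequest` is a hypothetical helper, and the API key is a placeholder.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Builds the URL and options for a fetch() call to an OpenAI-compatible endpoint.
function buildChatRequest(baseUrl: string, apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    // Trim trailing slashes so "…/v1" and "…/v1/" produce the same URL.
    url: baseUrl.replace(/\/+$/, "") + "/chat/completions",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: the dashboard key is sent as a Bearer token.
        Authorization: "Bearer " + apiKey,
      },
      body: JSON.stringify({ model: model, messages: messages, stream: false }),
    },
  };
}

// Usage (requires a running OmniRoute instance, so it is left commented out):
// const req = buildChatRequest("http://localhost:20128/v1", "YOUR_API_KEY",
//   "if/kimi-k2-thinking", [{ role: "user", content: "Hello" }]);
// fetch(req.url, req.init).then((res) => res.json()).then(console.log);
```
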
**Alternative — run from source:**
```bash
cp .env.example .env
npm install
PORT=20128 DASHBOARD_PORT=20129 NEXT_PUBLIC_BASE_URL=http://localhost:20129 npm run dev
```
---
## 🐳 Docker
OmniRoute is available as a public Docker image on [Docker Hub](https://hub.docker.com/r/diegosouzapw/omniroute).
**Quick run:**
```bash
docker run -d \
--name omniroute \
--restart unless-stopped \
-p 20128:20128 \
-v omniroute-data:/app/data \
diegosouzapw/omniroute:latest
```
**With environment file:**
```bash
# Copy and edit .env first
cp .env.example .env
docker run -d \
--name omniroute \
--restart unless-stopped \
--env-file .env \
-p 20128:20128 \
-v omniroute-data:/app/data \
diegosouzapw/omniroute:latest
```
**Using Docker Compose:**
```bash
# Base profile (no CLI tools)
docker compose --profile base up -d
# CLI profile (Claude Code, Codex, OpenClaw built-in)
docker compose --profile cli up -d
```
| Image | Tag | Size | Description |
| ------------------------ | -------- | ------ | --------------------- |
| `diegosouzapw/omniroute` | `latest` | ~250MB | Latest stable release |
| `diegosouzapw/omniroute` | `1.0.3` | ~250MB | Current version |
---
## 🖥️ Desktop App — Offline & Always-On
> 🆕 **NEW!** OmniRoute is now available as a **native desktop application** for Windows, macOS, and Linux.
Run OmniRoute as a standalone desktop app — no terminal, no browser, no internet required for local models. The Electron-based app includes:
- 🖥️ **Native Window** — Dedicated app window with system tray integration
- 🔄 **Auto-Start** — Launch OmniRoute on system login
- 🔔 **Native Notifications** — Get alerts for quota exhaustion or provider issues
- **One-Click Install** — NSIS (Windows), DMG (macOS), AppImage (Linux)
- 🌐 **Offline Mode** — Works fully offline with bundled server
### Quick Start
```bash
# Development mode
npm run electron:dev
# Build for your platform
npm run electron:build # Current platform
npm run electron:build:win # Windows (.exe)
npm run electron:build:mac # macOS (.dmg) — x64 & arm64
npm run electron:build:linux # Linux (.AppImage)
```
### System Tray
When minimized, OmniRoute lives in your system tray with quick actions:
- Open dashboard
- Change server port
- Quit application
📖 Full documentation: [`electron/README.md`](electron/README.md)
---
## 💰 Pricing at a Glance
| Tier | Provider | Cost | Quota Reset | Best For |
| ------------------- | ----------------- | ----------------------- | ---------------- | -------------------- |
| **💳 SUBSCRIPTION** | Claude Code (Pro) | $20/mo | 5h + weekly | Already subscribed |
| | Codex (Plus/Pro) | $20-200/mo | 5h + weekly | OpenAI users |
| | Gemini CLI | **FREE** | 180K/mo + 1K/day | Everyone! |
| | GitHub Copilot | $10-19/mo | Monthly | GitHub users |
| **🔑 API KEY** | NVIDIA NIM | **FREE** (1000 credits) | One-time | Free tier testing |
| | DeepSeek | Pay-per-use | None | Best price/quality |
| | Groq | Free tier + paid | Rate limited | Ultra-fast inference |
| | xAI (Grok) | Pay-per-use | None | Grok models |
| | Mistral | Free tier + paid | Rate limited | European AI |
| | OpenRouter | Pay-per-use | None | 100+ models |
| **💰 CHEAP** | GLM-4.7 | $0.6/1M | Daily 10AM | Budget backup |
| | MiniMax M2.1 | $0.2/1M | 5-hour rolling | Cheapest option |
| | Kimi K2 | $9/mo flat | 10M tokens/mo | Predictable cost |
| **🆓 FREE** | iFlow | $0 | Unlimited | 8 models free |
| | Qwen | $0 | Unlimited | 3 models free |
| | Kiro | $0 | Unlimited | Claude free |
**💡 Pro Tip:** Start with Gemini CLI (180K free/month) + iFlow (unlimited free) combo = $0 cost!
---
## 💡 Key Features
### 🧠 Core Routing & Intelligence
| Feature | What It Does |
| ------------------------------- | ------------------------------------------------------------------------------ |
| 🎯 **Smart 4-Tier Fallback** | Auto-route: Subscription → API Key → Cheap → Free |
| 📊 **Real-Time Quota Tracking** | Live token count + reset countdown per provider |
| 🔄 **Format Translation** | OpenAI ↔ Claude ↔ Gemini ↔ Cursor ↔ Kiro seamless + response sanitization |
| 👥 **Multi-Account Support** | Multiple accounts per provider with intelligent selection |
| 🔄 **Auto Token Refresh** | OAuth tokens refresh automatically with retry |
| 🎨 **Custom Combos** | 6 strategies: fill-first, round-robin, p2c, random, least-used, cost-optimized |
| 🧩 **Custom Models** | Add any model ID to any provider |
| 🌐 **Wildcard Router** | Route `provider/*` patterns to any provider dynamically |
| 🧠 **Thinking Budget** | Passthrough, auto, custom, and adaptive modes for reasoning models |
| 🔀 **Model Aliases** | Auto-forward deprecated model IDs to current replacements (built-in + custom) |
| ⚡ **Background Degradation** | Auto-route background tasks (titles, summaries) to cheaper models |
| 💬 **System Prompt Injection** | Global system prompt applied across all requests |
| 📄 **Responses API** | Full OpenAI Responses API (`/v1/responses`) support for Codex |
### 🎵 Multi-Modal APIs
| Feature | What It Does |
| -------------------------- | -------------------------------------------------------------------------------- |
| 🖼️ **Image Generation** | `/v1/images/generations` — 10 providers, 20+ models (cloud + local) |
| 📐 **Embeddings** | `/v1/embeddings` — 6 providers, 9+ models |
| 🎤 **Audio Transcription** | `/v1/audio/transcriptions` — Whisper + Nvidia NIM, HuggingFace, Qwen3 |
| 🔊 **Text-to-Speech** | `/v1/audio/speech` — ElevenLabs, Nvidia NIM, HuggingFace, Coqui, Tortoise, Qwen3 |
| 🎬 **Video Generation** | `/v1/videos/generations` — ComfyUI (AnimateDiff, SVD), SD WebUI |
| 🎵 **Music Generation** | `/v1/music/generations` — ComfyUI (Stable Audio Open, MusicGen) |
| 🛡️ **Moderations** | `/v1/moderations` — Content safety checks |
| 🔀 **Reranking** | `/v1/rerank` — Document relevance reranking |
### 🛡️ Resilience & Security
| Feature | What It Does |
| ------------------------------- | ----------------------------------------------------------------------------- |
| 🔌 **Circuit Breaker** | Auto-open/close per-provider with configurable thresholds |
| 🛡️ **Anti-Thundering Herd** | Mutex + semaphore rate-limit for API key providers |
| 🧠 **Semantic Cache** | Two-tier cache (signature + semantic) reduces cost & latency |
| ⚡ **Request Idempotency** | 5s dedup window for duplicate requests |
| 🔒 **TLS Fingerprint Spoofing** | Bypass TLS-based bot detection via wreq-js |
| 🌐 **IP Filtering** | Allowlist/blocklist for API access control |
| 📊 **Editable Rate Limits** | Configurable RPM, min gap, and max concurrent at system level |
| 💾 **Rate Limit Persistence** | Learned limits survive restarts via SQLite with 60s debounce + 24h staleness |
| 🔄 **Token Refresh Resilience** | Per-provider circuit breaker (5 fails→30min) + 30s timeout per attempt |
| 🛡 **API Endpoint Protection** | Auth gating + provider blocking for the `/models` endpoint |
| 🔒 **Proxy Visibility** | Color-coded badges: 🟢 global, 🟡 provider, 🔵 per-connection with IP display |
| 🌐 **3-Level Proxy Config** | Configure proxies at global, per-provider, or per-connection level |
### 📊 Observability & Analytics
| Feature | What It Does |
| -------------------------- | ---------------------------------------------------------------------- |
| 📝 **Request Logging** | Debug mode with full request/response logs |
| 💾 **SQLite Proxy Logs** | Persistent proxy logs survive server restarts |
| 📊 **Analytics Dashboard** | Recharts-powered: stat cards, model usage chart, provider table |
| 📈 **Progress Tracking** | Opt-in SSE progress events for streaming |
| 🧪 **LLM Evaluations** | Golden set testing with 4 match strategies |
| 🔍 **Request Telemetry** | p50/p95/p99 latency aggregation + X-Request-Id tracing |
| 📋 **Logs Dashboard** | Unified 4-tab page: Request Logs, Proxy Logs, Audit Logs, Console |
| 🖥️ **Console Log Viewer** | Real-time terminal-style viewer with level filter, search, auto-scroll |
| 📑 **File-Based Logging** | Console interceptor captures all output to JSON log file with rotation |
| 🏥 **Health Dashboard** | System uptime, circuit breaker states, lockouts, cache stats |
| 💰 **Cost Tracking** | Budget management + per-model pricing configuration |
### ☁️ Deployment & Sync
| Feature | What It Does |
| ---------------------------- | --------------------------------------------------------------------- |
| 💾 **Cloud Sync** | Sync config across devices via Cloudflare Workers |
| 🌐 **Deploy Anywhere** | Localhost, VPS, Docker, Cloudflare Workers |
| 🔑 **API Key Management** | Generate, rotate, and scope API keys per provider |
| 🧙 **Onboarding Wizard** | 4-step guided setup for first-time users |
| 🔧 **CLI Tools Dashboard** | One-click configure Claude, Codex, Cline, OpenClaw, Kilo, Antigravity |
| 🔄 **DB Backups** | Automatic backup, restore, export & import for all settings |
| 🌐 **Internationalization** | Full i18n with next-intl — 30 languages including RTL support |
| 🌍 **Language Selector** | Globe icon in header for real-time switching between 30 languages |
| 📂 **Custom Data Directory** | `DATA_DIR` env var to override default `~/.omniroute` storage path |
<details>
<summary><b>📖 Feature Details</b></summary>
### 🎯 Smart 4-Tier Fallback
Create combos with automatic fallback:
```
Combo: "my-coding-stack"
1. cc/claude-opus-4-6 (your subscription)
2. nvidia/llama-3.3-70b (free NVIDIA API)
3. glm/glm-4.7 (cheap backup, $0.6/1M)
4. if/kimi-k2-thinking (free fallback)
→ Auto switches when quota runs out or errors occur
```
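Conceptually, a combo is an ordered list that is tried top to bottom. The sketch below illustrates that idea only; `runCombo` and `ModelCall` are hypothetical names, and the real router likely distinguishes quota errors from other failures in more detail.

```typescript
// Hypothetical types: `ModelCall` stands in for whatever actually sends the request.
type ModelCall = (model: string) => Promise<string>;

async function runCombo(
  models: string[],
  call: ModelCall,
): Promise<{ model: string; output: string }> {
  const errors: string[] = [];
  for (const model of models) {
    try {
      const output = await call(model);
      return { model, output };
    } catch (err) {
      // Quota exhaustion and provider errors both fall through to the next tier.
      errors.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All models in combo failed: ${errors.join("; ")}`);
}
```

The first model that answers wins, so cheaper or free tiers are only reached when everything above them has failed.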
### 📊 Real-Time Quota Tracking
- Token consumption per provider
- Reset countdown (5-hour, daily, weekly)
- Cost estimation for paid tiers
- Monthly spending reports
### 🔄 Format Translation
Seamless translation between formats:
- **OpenAI** ↔ **Claude** ↔ **Gemini** ↔ **OpenAI Responses**
- Your CLI tool sends OpenAI format → OmniRoute translates → Provider receives native format
- Works with any tool that supports custom OpenAI endpoints
- **Response sanitization** — Strips non-standard fields for strict OpenAI SDK compatibility
- **Role normalization** — `developer` → `system` for non-OpenAI; `system` → `user` for GLM/ERNIE models
- **Think tag extraction** — `<think>` blocks → `reasoning_content` for thinking models
- **Structured output** — `json_schema` → Gemini's `responseMimeType`/`responseSchema`
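Two of the normalization steps above are easy to illustrate. The following is a sketch under assumptions: the function names and the `target` flags are hypothetical, and only the `<think>` tag, the `reasoning_content` field, and the role mappings come from the list.

```typescript
// Hypothetical helper names; the field name `reasoning_content` is taken from the list above.
interface ExtractedMessage {
  content: string;
  reasoning_content?: string;
}

function extractThinkTags(raw: string): ExtractedMessage {
  const thoughts: string[] = [];
  // Remove every <think>...</think> block and collect its inner text.
  const content = raw
    .replace(/<think>([\s\S]*?)<\/think>/g, (_match: string, inner: string) => {
      thoughts.push(inner.trim());
      return "";
    })
    .trim();
  return thoughts.length > 0
    ? { content, reasoning_content: thoughts.join("\n") }
    : { content };
}

type Role = "developer" | "system" | "user" | "assistant";

function normalizeRole(
  role: Role,
  target: { isOpenAI: boolean; rejectsSystem: boolean },
): Role {
  // `developer` is only kept for OpenAI targets.
  let out: Role = role === "developer" && !target.isOpenAI ? "system" : role;
  // Assumption: models that reject `system` (the list names GLM/ERNIE) receive `user` instead,
  // including messages that were just mapped from `developer`.
  if (out === "system" && target.rejectsSystem) out = "user";
  return out;
}
```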
### 👥 Multi-Account Support
- Add multiple accounts per provider
- Auto round-robin or priority-based routing
- Fallback to next account when one hits quota
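Round-robin with quota fallback can be sketched as follows. The `Account` shape and the `quotaExhausted` flag are hypothetical stand-ins for whatever the quota tracker reports; this is an illustration, not the project's implementation.

```typescript
// Hypothetical shape: `quotaExhausted` would come from quota tracking.
interface Account {
  id: string;
  quotaExhausted: boolean;
}

class RoundRobinAccounts {
  private cursor = 0;
  private readonly accounts: Account[];

  constructor(accounts: Account[]) {
    this.accounts = accounts;
  }

  /** Returns the next account with quota left, or null when all are exhausted. */
  next(): Account | null {
    for (let i = 0; i < this.accounts.length; i++) {
      const idx = (this.cursor + i) % this.accounts.length;
      const candidate = this.accounts[idx];
      if (candidate !== undefined && !candidate.quotaExhausted) {
        // Advance past the chosen account so the next call starts after it.
        this.cursor = (idx + 1) % this.accounts.length;
        return candidate;
      }
    }
    return null;
  }
}
```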
### 🔄 Auto Token Refresh
- OAuth tokens automatically refresh before expiration
- No manual re-authentication needed
- Seamless experience across all providers
### 🎨 Custom Combos
- Create unlimited model combinations
- 6 strategies: fill-first, round-robin, power-of-two-choices, random, least-used, cost-optimized
- Share combos across devices with Cloud Sync
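Of the strategies listed, power-of-two-choices is probably the least familiar: sample two candidates at random and keep the less loaded one. A minimal sketch, assuming a hypothetical `inFlight` counter as the load signal (the README does not say which signal OmniRoute uses):

```typescript
// Hypothetical: `inFlight` is whatever load signal is tracked per model;
// `rand` is injectable so the choice can be made deterministic in tests.
interface Candidate {
  model: string;
  inFlight: number;
}

function pickPowerOfTwo(
  candidates: Candidate[],
  rand: () => number = Math.random,
): Candidate {
  if (candidates.length === 0) throw new Error("no candidates");
  // Draw one candidate uniformly at random (rand() is expected to be in [0, 1)).
  const pick = (): Candidate =>
    candidates[Math.floor(rand() * candidates.length)] as Candidate;
  const a = pick();
  const b = pick();
  // Keep the less loaded of the two samples.
  return a.inFlight <= b.inFlight ? a : b;
}
```

Compared with always choosing the globally least-used model, this needs only two lookups per request and avoids sending every request to the same momentarily idle target.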
### 🏥 Health Dashboard
- System status (uptime, version, memory usage)
- Circuit breaker states per provider (Closed/Open/Half-Open)
- Rate limit status and active lockouts
- Signature cache statistics
- Latency telemetry (p50/p95/p99) + prompt cache
- Reset health status with one click
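For readers unfamiliar with p50/p95/p99, the sketch below computes them with the nearest-rank method. The choice of method is an assumption; the README does not state how OmniRoute aggregates latency, and the function names are hypothetical.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point drift for integer p.
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1] as number;
}

function summarizeLatency(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```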
### 🔧 Translator Playground
OmniRoute includes a powerful built-in Translator Playground with **4 modes** for debugging, testing, and monitoring API translations:
| Mode | Description |
| ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **💻 Playground** | Direct format translation — paste any API request body and instantly see how OmniRoute translates it between provider formats (OpenAI ↔ Claude ↔ Gemini ↔ Responses API). Includes example templates and format auto-detection. |
| **💬 Chat Tester** | Send real chat requests through OmniRoute and see the full round-trip: your input, the translated request, the provider response, and the translated response back. Invaluable for validating combo routing. |
| **🧪 Test Bench** | Batch testing mode — define multiple test cases with different inputs and expected outputs, run them all at once, and compare results across providers and models. |
| **📱 Live Monitor** | Real-time request monitoring — watch incoming requests as they flow through OmniRoute, see format translations happening live, and identify issues instantly. |
**Access:** Dashboard → Translator (sidebar)
### 💾 Cloud Sync
- Sync providers, combos, and settings across devices
- Automatic background sync
- Secure encrypted storage
</details>
---
## 🎯 Use Cases
### Case 1: "I have Claude Pro subscription"
**Problem:** Quota expires unused, rate limits during heavy coding
```
Combo: "maximize-claude"
1. cc/claude-opus-4-6 (use subscription fully)
2. glm/glm-4.7 (cheap backup when quota out)
3. if/kimi-k2-thinking (free emergency fallback)
Monthly cost: $20 (subscription) + ~$5 (backup) = $25 total
vs. $20 + hitting limits = frustration
```
### Case 2: "I want zero cost"
**Problem:** Can't afford subscriptions, need reliable AI coding
```
Combo: "free-forever"
1. gc/gemini-3-flash (180K free/month)
2. if/kimi-k2-thinking (unlimited free)
3. qw/qwen3-coder-plus (unlimited free)
Monthly cost: $0
Quality: Production-ready models
```
### Case 3: "I need 24/7 coding, no interruptions"
**Problem:** Deadlines, can't afford downtime
```
Combo: "always-on"
1. cc/claude-opus-4-6 (best quality)
2. cx/gpt-5.2-codex (second subscription)
3. glm/glm-4.7 (cheap, resets daily)
4. minimax/MiniMax-M2.1 (cheapest, 5h reset)
5. if/kimi-k2-thinking (free unlimited)
Result: 5 layers of fallback = zero downtime
```
### Case 4: "I want FREE AI in OpenClaw"
**Problem:** Need AI assistant in messaging apps, completely free
```
Combo: "openclaw-free"
1. if/glm-4.7 (unlimited free)
2. if/minimax-m2.1 (unlimited free)
3. if/kimi-k2-thinking (unlimited free)
Monthly cost: $0
Access via: WhatsApp, Telegram, Slack, Discord, iMessage, Signal...
```
---
## 📖 Setup Guide
<details>
<summary><b>💳 Subscription Providers</b></summary>
### Claude Code (Pro/Max)
```
Dashboard → Providers → Connect Claude Code
→ OAuth login → Auto token refresh
→ 5-hour + weekly quota tracking
Models:
cc/claude-opus-4-6
cc/claude-sonnet-4-5-20250929
cc/claude-haiku-4-5-20251001
```
**Pro Tip:** Use Opus for complex tasks, Sonnet for speed. OmniRoute tracks quota per model!
### OpenAI Codex (Plus/Pro)
```
Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset
Models:
cx/gpt-5.2-codex
cx/gpt-5.1-codex-max
```
### Gemini CLI (FREE 180K/month!)
```
Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions/month + 1K/day
Models:
gc/gemini-3-flash-preview
gc/gemini-2.5-pro
```
**Best Value:** Huge free tier! Use this before paid tiers.
### GitHub Copilot
```
Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)
Models:
gh/gpt-5
gh/claude-4.5-sonnet
gh/gemini-3-pro
```
</details>
<details>
<summary><b>🔑 API Key Providers</b></summary>
### NVIDIA NIM (FREE 1000 credits!)
1. Sign up: [build.nvidia.com](https://build.nvidia.com)
2. Get free API key (1000 inference credits included)
3. Dashboard → Add Provider → NVIDIA NIM:
- API Key: `nvapi-your-key`
**Models:** `nvidia/llama-3.3-70b-instruct`, `nvidia/mistral-7b-instruct`, and 50+ more
**Pro Tip:** OpenAI-compatible API — works seamlessly with OmniRoute's format translation!
### DeepSeek
1. Sign up: [platform.deepseek.com](https://platform.deepseek.com)
2. Get API key
3. Dashboard → Add Provider → DeepSeek
**Models:** `deepseek/deepseek-chat`, `deepseek/deepseek-coder`
### Groq (Free Tier Available!)
1. Sign up: [console.groq.com](https://console.groq.com)
2. Get API key (free tier included)
3. Dashboard → Add Provider → Groq
**Models:** `groq/llama-3.3-70b`, `groq/mixtral-8x7b`
**Pro Tip:** Ultra-fast inference — best for real-time coding!
### OpenRouter (100+ Models)
1. Sign up: [openrouter.ai](https://openrouter.ai)
2. Get API key
3. Dashboard → Add Provider → OpenRouter
**Models:** Access 100+ models from all major providers through a single API key.
</details>
<details>
<summary><b>💰 Cheap Providers (Backup)</b></summary>
### GLM-4.7 (Daily reset, $0.6/1M)
1. Sign up: [Zhipu AI](https://open.bigmodel.cn/)
2. Get API key from Coding Plan
3. Dashboard → Add API Key:
- Provider: `glm`
- API Key: `your-key`
**Use:** `glm/glm-4.7`
**Pro Tip:** Coding Plan offers 3× quota at 1/7 cost! Reset daily 10:00 AM.
### MiniMax M2.1 (5h reset, $0.20/1M)
1. Sign up: [MiniMax](https://www.minimax.io/)
2. Get API key
3. Dashboard → Add API Key
**Use:** `minimax/MiniMax-M2.1`
**Pro Tip:** Cheapest option for long context (1M tokens)!
### Kimi K2 ($9/month flat)
1. Subscribe: [Moonshot AI](https://platform.moonshot.ai/)
2. Get API key
3. Dashboard → Add API Key
**Use:** `kimi/kimi-latest`
**Pro Tip:** Fixed $9/month for 10M tokens = $0.90/1M effective cost!
</details>
<details>
<summary><b>🆓 FREE Providers (Emergency Backup)</b></summary>
### iFlow (8 FREE models)
```
Dashboard → Connect iFlow
→ iFlow OAuth login
→ Unlimited usage
Models:
if/kimi-k2-thinking
if/qwen3-coder-plus
if/glm-4.7
if/minimax-m2
if/deepseek-r1
```
### Qwen (3 FREE models)
```
Dashboard → Connect Qwen
→ Device code authorization
→ Unlimited usage
Models:
qw/qwen3-coder-plus
qw/qwen3-coder-flash
```
### Kiro (Claude FREE)
```
Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited usage
Models:
kr/claude-sonnet-4.5
kr/claude-haiku-4.5
```
</details>
<details>
<summary><b>🎨 Create Combos</b></summary>
### Example 1: Maximize Subscription → Cheap Backup
```
Dashboard → Combos → Create New
Name: premium-coding
Models:
1. cc/claude-opus-4-6 (Subscription primary)
2. glm/glm-4.7 (Cheap backup, $0.6/1M)
3. minimax/MiniMax-M2.1 (Cheapest fallback, $0.20/1M)
Use in CLI: premium-coding
```
### Example 2: Free-Only (Zero Cost)
```
Name: free-combo
Models:
1. gc/gemini-3-flash-preview (180K free/month)
2. if/kimi-k2-thinking (unlimited)
3. qw/qwen3-coder-plus (unlimited)
Cost: $0 forever!
```
</details>
<details>
<summary><b>🔧 CLI Integration</b></summary>
### Cursor IDE
```
Settings → Models → Advanced:
OpenAI API Base URL: http://localhost:20128/v1
OpenAI API Key: [from OmniRoute dashboard]
Model: cc/claude-opus-4-6
```
### Claude Code
Use the **CLI Tools** page in the dashboard for one-click configuration, or edit `~/.claude/settings.json` manually.
### Codex CLI
```bash
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"
codex "your prompt"
```
### OpenClaw
**Option 1 — Dashboard (recommended):**
```
Dashboard → CLI Tools → OpenClaw → Select Model → Apply
```
**Option 2 — Manual:** Edit `~/.openclaw/openclaw.json`:
```json
{
"models": {
"providers": {
"omniroute": {
"baseUrl": "http://127.0.0.1:20128/v1",
"apiKey": "sk_omniroute",
"api": "openai-completions"
}
}
}
}
```
> **Note:** OpenClaw only works with local OmniRoute. Use `127.0.0.1` instead of `localhost` to avoid IPv6 resolution issues.
### Cline / Continue / RooCode
```
Settings → API Configuration:
Provider: OpenAI Compatible
Base URL: http://localhost:20128/v1
API Key: [from OmniRoute dashboard]
Model: if/kimi-k2-thinking
```
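Any tool that accepts these three settings ends up sending an OpenAI-style request. The sketch below builds one by hand. It assumes the standard OpenAI `/chat/completions` path under the `/v1` base URL and Bearer-token auth; `buildChatRequest` is a hypothetical helper, not part of OmniRoute.

```typescript
// Assumptions: the standard OpenAI `/chat/completions` path under the `/v1` base URL,
// and Bearer-token auth with the key from the OmniRoute dashboard.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
) {
  return {
    // Strip trailing slashes so "…/v1/" and "…/v1" produce the same URL.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (requires a running OmniRoute instance):
// const { url, init } = buildChatRequest(
//   "http://localhost:20128/v1",
//   "your-omniroute-api-key",
//   "if/kimi-k2-thinking",
//   [{ role: "user", content: "Hello" }],
// );
// const res = await fetch(url, init);
```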
### OpenCode
**Step 1:** Add OmniRoute as a custom provider:
```bash
opencode
/connect
# Select "Other" → Enter ID: "omniroute" → Enter your OmniRoute API key
```
**Step 2:** Create/edit `opencode.json` in your project root:
```json
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"omniroute": {
"npm": "@ai-sdk/openai-compatible",
"name": "OmniRoute",
"options": {
"baseURL": "http://localhost:20128/v1"
},
"models": {
"cc/claude-sonnet-4-20250514": { "name": "Claude Sonnet 4" },
"gg/gemini-2.5-pro": { "name": "Gemini 2.5 Pro" },
"if/kimi-k2-thinking": { "name": "Kimi K2 (Free)" }
}
}
}
}
```
**Step 3:** Select the model in OpenCode:
```bash
/models
# Select any OmniRoute model from the list
```
> **Tip:** Add any model available in your OmniRoute `/v1/models` endpoint to the `models` section. Use the format `provider/model-id` from your OmniRoute dashboard.
</details>
---
## 🧪 Evaluations (Evals)
OmniRoute includes a built-in evaluation framework to test LLM response quality against a golden set. Access it via **Analytics → Evals** in the dashboard.
### Built-in Golden Set
The pre-loaded "OmniRoute Golden Set" contains 10 test cases covering:
- Greetings, math, geography, code generation
- JSON format compliance, translation, markdown
- Safety refusal (harmful content), counting, boolean logic
### Evaluation Strategies
| Strategy | Description | Example |
| ---------- | ------------------------------------------------ | -------------------------------- |
| `exact` | Output must match exactly | `"4"` |
| `contains` | Output must contain substring (case-insensitive) | `"Paris"` |
| `regex` | Output must match regex pattern | `"1.*2.*3"` |
| `custom` | Custom JS function returns true/false | `(output) => output.length > 10` |
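The four strategies in the table map naturally onto a small matcher. This is an illustrative sketch with hypothetical type and function names; the built-in framework may differ, for example in how it handles whitespace.

```typescript
// Hypothetical types; the four strategy names come from the table above.
type EvalCase =
  | { strategy: "exact"; expected: string }
  | { strategy: "contains"; expected: string }
  | { strategy: "regex"; expected: string }
  | { strategy: "custom"; check: (output: string) => boolean };

function evaluate(output: string, testCase: EvalCase): boolean {
  switch (testCase.strategy) {
    case "exact":
      return output === testCase.expected;
    case "contains":
      // Case-insensitive substring match, as described in the table.
      return output.toLowerCase().includes(testCase.expected.toLowerCase());
    case "regex":
      return new RegExp(testCase.expected).test(output);
    case "custom":
      return testCase.check(output);
  }
}
```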
---
## 🐛 Troubleshooting
<details>
<summary><b>Click to expand troubleshooting guide</b></summary>
**"Language model did not provide messages"**
- Provider quota exhausted → Check dashboard quota tracker
- Solution: Use combo fallback or switch to cheaper tier
**Rate limiting**
- Subscription quota out → Fallback to GLM/MiniMax
- Add combo: `cc/claude-opus-4-6 → glm/glm-4.7 → if/kimi-k2-thinking`
**OAuth token expired**
- Auto-refreshed by OmniRoute
- If issues persist: Dashboard → Provider → Reconnect
**High costs**
- Check usage stats in Dashboard → Costs
- Switch primary model to GLM/MiniMax
- Use free tier (Gemini CLI, iFlow) for non-critical tasks
**Dashboard/API ports are wrong**
- `PORT` is the canonical base port (and the API port by default)
- `API_PORT` overrides only the OpenAI-compatible API listener
- `DASHBOARD_PORT` overrides only the dashboard/Next.js listener
- Set `NEXT_PUBLIC_BASE_URL` to your dashboard/public URL (for OAuth callbacks)
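The precedence between the three port variables can be summarized in a few lines. This sketch rests on two assumptions: the default is 20128 (the port used in this README's examples), and the dashboard also falls back to `PORT` when `DASHBOARD_PORT` is unset. `resolvePorts` is a hypothetical helper.

```typescript
// Assumptions: 20128 as the default (the port used in this README's examples),
// and that the dashboard falls back to `PORT` when `DASHBOARD_PORT` is unset.
function resolvePorts(env: Record<string, string | undefined>): {
  apiPort: number;
  dashboardPort: number;
} {
  // Accept only valid TCP port numbers; anything else is treated as unset.
  const parse = (value: string | undefined): number | undefined => {
    const n = Number(value);
    return value !== undefined && Number.isInteger(n) && n > 0 && n < 65536
      ? n
      : undefined;
  };
  const base = parse(env["PORT"]) ?? 20128;
  return {
    apiPort: parse(env["API_PORT"]) ?? base,
    dashboardPort: parse(env["DASHBOARD_PORT"]) ?? base,
  };
}
```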
**Cloud sync errors**
- Verify `BASE_URL` points to your running instance
- Verify `CLOUD_URL` points to your expected cloud endpoint
- Keep `NEXT_PUBLIC_*` values aligned with server-side values
**First login not working**
- Check `INITIAL_PASSWORD` in `.env`
- If unset, fallback password is `123456`
**No request logs**
- Set `ENABLE_REQUEST_LOGS=true` in `.env`
**Connection test shows "Invalid" for OpenAI-compatible providers**
- Many providers don't expose a `/models` endpoint
- OmniRoute v1.0.6+ includes fallback validation via chat completions
- Ensure base URL includes `/v1` suffix
### 🔐 Remote OAuth Setup
<a name="oauth-em-servidor-remoto"></a>
> **⚠️ IMPORTANT for users running OmniRoute on a VPS, in Docker, or on a remote server**
#### Why does Antigravity / Gemini CLI OAuth fail on remote servers?
The **Antigravity** and **Gemini CLI** providers authenticate with **Google OAuth 2.0**. Google requires the `redirect_uri` used in the OAuth flow to match **exactly** one of the URIs pre-registered for the application in the Google Cloud Console.
The OAuth credentials bundled with OmniRoute are registered **for `localhost` only**. When you access OmniRoute on a remote server (e.g. `https://omniroute.myserver.com`), Google rejects the authentication with:
```
Error 400: redirect_uri_mismatch
```
#### Solution: Configure your own OAuth credentials
Create an **OAuth 2.0 Client ID** in the Google Cloud Console that includes your server's URI.
#### Step by step
**1. Open the Google Cloud Console**
Go to: [https://console.cloud.google.com/apis/credentials](https://console.cloud.google.com/apis/credentials)
**2. Create a new OAuth 2.0 Client ID**
- Click **"+ Create Credentials"** → **"OAuth client ID"**
- Application type: **"Web application"**
- Name: any name you like (e.g. `OmniRoute Remote`)
**3. Add the Authorized Redirect URIs**
In the **"Authorized redirect URIs"** field, add:
```
https://your-server.com/callback
```
> Replace `your-server.com` with your server's domain or IP (include the port if needed, e.g. `http://45.33.32.156:20128/callback`).
**4. Save and copy the credentials**
After the client is created, Google shows the **Client ID** and **Client Secret**.
**5. Set the environment variables**
In your `.env` (or in the Docker environment variables):
```bash
# For Antigravity:
ANTIGRAVITY_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
ANTIGRAVITY_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
# For Gemini CLI:
GEMINI_OAUTH_CLIENT_ID=your-client-id.apps.googleusercontent.com
GEMINI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
GEMINI_CLI_OAUTH_CLIENT_SECRET=GOCSPX-your-secret
```
**6. Restart OmniRoute**
```bash
# If using npm:
npm run dev
# If using Docker:
docker restart omniroute
```
**7. Try connecting again**
Dashboard → Providers → Antigravity (or Gemini CLI) → OAuth
Google will now redirect correctly to `https://your-server.com/callback`, and authentication will succeed.
---
#### Temporary workaround (without your own credentials)
If you don't want to create your own credentials yet, you can still use the **manual URL** flow:
1. OmniRoute opens the Google authorization URL
2. After you authorize, Google tries to redirect to `localhost` (which fails on the remote server)
3. **Copy the full URL** from your browser's address bar (even if the page does not load)
4. Paste that URL into the field shown in OmniRoute's connection modal
5. Click **"Connect"**
> This workaround works because the authorization code in the URL is valid whether or not the redirect page loaded.
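The manual URL flow relies on the pasted address still carrying the authorization code. A sketch of how such a URL can be parsed, assuming the standard OAuth 2.0 `code`, `state`, and `error` query parameters; `parseCallbackUrl` is a hypothetical helper, not OmniRoute's own function.

```typescript
// Assumption: the pasted URL carries the standard OAuth 2.0 `code`
// (and optionally `state`) query parameters, or `error` on failure.
function parseCallbackUrl(pasted: string): { code: string; state: string | null } {
  const url = new URL(pasted.trim());
  const error = url.searchParams.get("error");
  if (error) throw new Error(`Authorization failed: ${error}`);
  const code = url.searchParams.get("code");
  if (!code) throw new Error("No authorization code found in the pasted URL");
  return { code, state: url.searchParams.get("state") };
}
```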
</details>
---
## 🛠️ Tech Stack
- **Runtime**: Node.js 18–22 LTS (⚠️ Node.js 24+ is **not supported**: `better-sqlite3` native binaries are incompatible)
- **Language**: TypeScript 5.9 — **100% TypeScript** across `src/` and `open-sse/` (v1.0.6)
- **Framework**: Next.js 16 + React 19 + Tailwind CSS 4
- **Database**: LowDB (JSON) + SQLite (domain state + proxy logs)
- **Streaming**: Server-Sent Events (SSE)
- **Auth**: OAuth 2.0 (PKCE) + JWT + API Keys
- **Testing**: Node.js test runner (368+ unit tests)
- **CI/CD**: GitHub Actions (auto npm publish + Docker Hub on release)
- **Website**: [omniroute.online](https://omniroute.online)
- **Package**: [npmjs.com/package/omniroute](https://www.npmjs.com/package/omniroute)
- **Docker**: [hub.docker.com/r/diegosouzapw/omniroute](https://hub.docker.com/r/diegosouzapw/omniroute)
- **Resilience**: Circuit breaker, exponential backoff, anti-thundering herd, TLS spoofing
---
## 📖 Documentation
| Document | Description |
| -------------------------------------------- | ---------------------------------------------- |
| [User Guide](docs/USER_GUIDE.md) | Providers, combos, CLI integration, deployment |
| [API Reference](docs/API_REFERENCE.md) | All endpoints with examples |
| [Troubleshooting](docs/TROUBLESHOOTING.md) | Common problems and solutions |
| [Architecture](docs/ARCHITECTURE.md) | System architecture and internals |
| [Contributing](CONTRIBUTING.md) | Development setup and guidelines |
| [OpenAPI Spec](docs/openapi.yaml) | OpenAPI 3.0 specification |
| [Security Policy](SECURITY.md) | Vulnerability reporting and security practices |
| [VM Deployment](docs/VM_DEPLOYMENT_GUIDE.md) | Complete guide: VM + nginx + Cloudflare setup |
| [Features Gallery](docs/FEATURES.md) | Visual dashboard tour with screenshots |
### 📸 Dashboard Preview
<details>
<summary><b>Click to see dashboard screenshots</b></summary>
| Page | Screenshot |
| -------------- | ------------------------------------------------- |
| **Providers** | ![Providers](docs/screenshots/01-providers.png) |
| **Combos** | ![Combos](docs/screenshots/02-combos.png) |
| **Analytics** | ![Analytics](docs/screenshots/03-analytics.png) |
| **Health** | ![Health](docs/screenshots/04-health.png) |
| **Translator** | ![Translator](docs/screenshots/05-translator.png) |
| **Settings** | ![Settings](docs/screenshots/06-settings.png) |
| **CLI Tools** | ![CLI Tools](docs/screenshots/07-cli-tools.png) |
| **Usage Logs** | ![Usage](docs/screenshots/08-usage.png) |
| **Endpoint** | ![Endpoint](docs/screenshots/09-endpoint.png) |
</details>
---
## 🗺️ Roadmap
OmniRoute has **210+ features planned** across multiple development phases. Here are the key areas:
| Category | Planned Features | Highlights |
| ----------------------------- | ---------------- | -------------------------------------------------------------------------------------- |
| 🧠 **Routing & Intelligence** | 25+ | Lowest-latency routing, tag-based routing, quota preflight, P2C account selection |
| 🔒 **Security & Compliance** | 20+ | SSRF hardening, credential cloaking, rate-limit per endpoint, management key scoping |
| 📊 **Observability** | 15+ | OpenTelemetry integration, real-time quota monitoring, cost tracking per model |
| 🔄 **Provider Integrations** | 20+ | Dynamic model registry, provider cooldowns, multi-account Codex, Copilot quota parsing |
| ⚡ **Performance** | 15+ | Dual cache layer, prompt cache, response cache, streaming keepalive, batch API |
| 🌐 **Ecosystem** | 10+ | WebSocket API, config hot-reload, distributed config store, commercial mode |
### 🔜 Coming Soon
- 🔗 **OpenCode Integration** — Native provider support for the OpenCode AI coding IDE
- 🔗 **TRAE Integration** — Full support for the TRAE AI development framework
- 📦 **Batch API** — Asynchronous batch processing for bulk requests
- 🎯 **Tag-Based Routing** — Route requests based on custom tags and metadata
- 💰 **Lowest-Cost Strategy** — Automatically select the cheapest available provider
> 📝 Full feature specifications available in [`docs/new-features/`](docs/new-features/) (217 detailed specs)
---
## 👥 Contributors
[![Contributors](https://contrib.rocks/image?repo=diegosouzapw/OmniRoute&max=100&columns=20&anon=1)](https://github.com/diegosouzapw/OmniRoute/graphs/contributors)
### How to Contribute
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.
### Releasing a New Version
```bash
# Create a release — npm publish happens automatically
gh release create v1.0.6 --title "v1.0.6" --generate-notes
```
---
## 📊 Star History
<a href="https://star-history.com/#diegosouzapw/OmniRoute&Date">
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=diegosouzapw/OmniRoute&type=Date&theme=dark" />
<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=diegosouzapw/OmniRoute&type=Date" />
<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=diegosouzapw/OmniRoute&type=Date" />
</picture>
</a>
---
## 🙏 Acknowledgments
Special thanks to **[9router](https://github.com/decolua/9router)** by **[decolua](https://github.com/decolua)** — the original project that inspired this fork. OmniRoute builds upon that incredible foundation with additional features, multi-modal APIs, and a full TypeScript rewrite.
Special thanks to **[CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI)** — the original Go implementation that inspired this JavaScript port.
---
## 📄 License
MIT License - see [LICENSE](LICENSE) for details.
---
<div align="center">
<sub>Built with ❤️ for developers who code 24/7</sub>
<br/>
<sub><a href="https://omniroute.online">omniroute.online</a></sub>
</div>
<!-- GitHub Discussions enabled for community Q&A -->