mirror of
https://github.com/diegosouzapw/OmniRoute.git
synced 2026-05-02 00:00:23 +00:00
🌐 Languages: 🇺🇸 English · 🇧🇷 pt-BR · 🇪🇸 es · 🇫🇷 fr · 🇩🇪 de · 🇮🇹 it · 🇷🇺 ru · 🇨🇳 zh-CN · 🇯🇵 ja · 🇰🇷 ko · 🇸🇦 ar · 🇮🇳 in · 🇹🇭 th · 🇻🇳 vi · 🇮🇩 id · 🇲🇾 ms · 🇳🇱 nl · 🇵🇱 pl · 🇸🇪 sv · 🇳🇴 no · 🇩🇰 da · 🇫🇮 fi · 🇵🇹 pt · 🇷🇴 ro · 🇭🇺 hu · 🇧🇬 bg · 🇸🇰 sk · 🇺🇦 uk-UA · 🇮🇱 he · 🇵🇭 phi
OmniRoute Auto-Combo Engine
Self-managing model chains with adaptive scoring
How It Works
The Auto-Combo Engine dynamically selects the best provider/model for each request using a 6-factor scoring function:
| Factor | Weight | Description |
|---|---|---|
| Quota | 0.20 | Remaining capacity [0..1] |
| Health | 0.25 | Circuit breaker state: CLOSED=1.0, HALF_OPEN=0.5, OPEN=0.0 |
| CostInv | 0.20 | Inverse cost (cheaper = higher score) |
| LatencyInv | 0.15 | Inverse p95 latency (faster = higher) |
| TaskFit | 0.10 | Model × task type fitness score |
| Stability | 0.10 | Low variance in latency/errors |
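As a rough illustration of how the six weighted factors could combine, here is a minimal sketch that computes a weighted sum over normalized factor values. The type names, `scoreCandidate`, and the clamping step are assumptions made for this example; the actual code in `scoring.ts` may be structured differently.

```typescript
// Hypothetical shapes for illustration; the real types in scoring.ts may differ.
type FactorName =
  | "quota" | "health" | "costInv" | "latencyInv" | "taskFit" | "stability";
type Factors = Record<FactorName, number>;
type CircuitState = "CLOSED" | "HALF_OPEN" | "OPEN";

// Default weights taken from the table above (they sum to 1.0).
const DEFAULT_WEIGHTS: Factors = {
  quota: 0.2,
  health: 0.25,
  costInv: 0.2,
  latencyInv: 0.15,
  taskFit: 0.1,
  stability: 0.1,
};

// Health values per circuit breaker state, as listed in the table.
function healthScore(state: CircuitState): number {
  switch (state) {
    case "CLOSED":
      return 1.0;
    case "HALF_OPEN":
      return 0.5;
    case "OPEN":
      return 0.0;
  }
}

// Assumption: every factor is normalized to [0, 1] before weighting.
const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Weighted sum of the six factors; the result is also in [0, 1].
function scoreCandidate(factors: Factors, weights: Factors = DEFAULT_WEIGHTS): number {
  let total = 0;
  for (const name of Object.keys(weights) as FactorName[]) {
    total += weights[name] * clamp01(factors[name]);
  }
  return total;
}
```

For example, a candidate with a HALF_OPEN breaker would pass `health: healthScore("HALF_OPEN")` alongside its other normalized factors.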
Mode Packs
| Pack | Focus | Key Weight |
|---|---|---|
| 🚀 Ship Fast | Speed | latencyInv: 0.35 |
| 💰 Cost Saver | Economy | costInv: 0.40 |
| 🎯 Quality First | Best model | taskFit: 0.40 |
| 📡 Offline Friendly | Availability | quota: 0.40 |
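The table lists only the key weight for each pack, so the remaining weights are not known from this document. The sketch below shows one plausible way to derive a full profile: pin the key weight and scale the other default weights proportionally so the total stays at 1.0. The proportional rescaling and the `applyKeyWeight` helper are assumptions for illustration, not necessarily what `modePacks.ts` does.

```typescript
// Hypothetical types for illustration only.
type FactorName =
  | "quota" | "health" | "costInv" | "latencyInv" | "taskFit" | "stability";
type Weights = Record<FactorName, number>;

// Default weights from the scoring table.
const DEFAULT_WEIGHTS: Weights = {
  quota: 0.2,
  health: 0.25,
  costInv: 0.2,
  latencyInv: 0.15,
  taskFit: 0.1,
  stability: 0.1,
};

// Assumption: pin one weight and rescale the rest so the profile sums to 1.
function applyKeyWeight(base: Weights, key: FactorName, value: number): Weights {
  const scale = (1 - value) / (1 - base[key]);
  const out: Weights = { ...base };
  for (const name of Object.keys(base) as FactorName[]) {
    out[name] = name === key ? value : base[name] * scale;
  }
  return out;
}

// Key weights from the Mode Packs table.
const shipFast = applyKeyWeight(DEFAULT_WEIGHTS, "latencyInv", 0.35);
const costSaver = applyKeyWeight(DEFAULT_WEIGHTS, "costInv", 0.4);
const qualityFirst = applyKeyWeight(DEFAULT_WEIGHTS, "taskFit", 0.4);
const offlineFriendly = applyKeyWeight(DEFAULT_WEIGHTS, "quota", 0.4);
```

Keeping each profile normalized means scores stay comparable across packs, which matters if a fixed exclusion threshold is applied to them.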
Self-Healing
- Temporary exclusion: a candidate scoring below 0.2 is excluded for 5 min; repeated exclusions back off progressively, up to a maximum of 30 min
- Circuit breaker awareness: OPEN → auto-excluded; HALF_OPEN → receives probe requests only
- Incident mode: when more than 50% of circuit breakers are OPEN, exploration is disabled and selection maximizes stability
- Cooldown recovery: after an exclusion ends, the first request is a "probe" sent with a reduced timeout
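The exclusion and incident-mode rules above can be sketched as two small functions. The constants come from the list; the doubling schedule is an assumption (the document only says "progressive backoff"), and the function names are hypothetical rather than the ones used in `selfHealing.ts`.

```typescript
type CircuitState = "CLOSED" | "HALF_OPEN" | "OPEN";

const SCORE_THRESHOLD = 0.2;                 // below this, a candidate is excluded
const BASE_EXCLUSION_MS = 5 * 60 * 1000;     // first exclusion: 5 min
const MAX_EXCLUSION_MS = 30 * 60 * 1000;     // cap: 30 min

// True when a candidate's score falls below the exclusion threshold.
function shouldExclude(score: number): boolean {
  return score < SCORE_THRESHOLD;
}

// Assumption: the exclusion window doubles on each consecutive exclusion.
// `consecutiveExclusions` is 1 for the first exclusion.
function exclusionDurationMs(consecutiveExclusions: number): number {
  const exponent = Math.max(0, consecutiveExclusions - 1);
  return Math.min(BASE_EXCLUSION_MS * Math.pow(2, exponent), MAX_EXCLUSION_MS);
}

// Incident mode: strictly more than 50% of breakers are OPEN.
function isIncidentMode(states: CircuitState[]): boolean {
  if (states.length === 0) return false;
  const open = states.filter((s) => s === "OPEN").length;
  return open / states.length > 0.5;
}
```

Under this doubling assumption the windows run 5, 10, 20, then 30 min, with every later exclusion held at the 30 min cap.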
Bandit Exploration
By default, 5% of requests (configurable) are routed to a randomly chosen provider for exploration. Exploration is disabled in incident mode.
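This behavior resembles an epsilon-greedy bandit. The sketch below assumes that the non-exploring requests simply go to the highest-scoring candidate; the `pickProvider` name, the option names, and the injectable `random` source are all hypothetical and only serve the example.

```typescript
interface ScoredCandidate {
  provider: string;
  score: number;
}

interface PickOptions {
  explorationRate?: number;   // fraction of requests routed randomly (default 5%)
  incidentMode?: boolean;     // exploration is skipped when true
  random?: () => number;      // injectable for deterministic tests
}

function pickProvider(
  candidates: ScoredCandidate[],
  options: PickOptions = {},
): ScoredCandidate {
  if (candidates.length === 0) {
    throw new Error("candidate pool is empty");
  }
  const { explorationRate = 0.05, incidentMode = false, random = Math.random } = options;

  // Explore: pick a uniformly random candidate, unless incident mode is on.
  if (!incidentMode && random() < explorationRate) {
    const index = Math.floor(random() * candidates.length);
    return candidates[index];
  }

  // Exploit: pick the candidate with the highest score.
  return candidates.reduce((best, c) => (c.score > best.score ? c : best));
}
```

Passing a fixed `random` function makes the exploration path reproducible in tests, while production code would rely on the `Math.random` default.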
API
# Create auto-combo
curl -X POST http://localhost:20128/api/combos/auto \
-H "Content-Type: application/json" \
-d '{"id":"my-auto","name":"Auto Coder","candidatePool":["anthropic","google","openai"],"modePack":"ship-fast"}'
# List auto-combos
curl http://localhost:20128/api/combos/auto
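The same create call can be issued from TypeScript. This is a sketch that mirrors the `curl` request above; the `AutoComboRequest` interface and helper names are hypothetical, the response shape is not documented here and is therefore left as `unknown`, and a runtime with a global `fetch` (for example Node.js 18+) is assumed.

```typescript
// Request body fields taken from the curl example; the interface name is hypothetical.
interface AutoComboRequest {
  id: string;
  name: string;
  candidatePool: string[];
  modePack: string;
}

const BASE_URL = "http://localhost:20128";

// Build the URL and fetch options without sending anything.
function buildCreateRequest(combo: AutoComboRequest) {
  return {
    url: `${BASE_URL}/api/combos/auto`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(combo),
    },
  };
}

// Send the request; the response shape is an unknown here.
async function createAutoCombo(combo: AutoComboRequest): Promise<unknown> {
  const { url, init } = buildCreateRequest(combo);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`create auto-combo failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Splitting request construction from sending keeps the payload easy to inspect or test without a running server.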
Task Fitness
30+ models scored across 6 task types (coding, review, planning, analysis, debugging, documentation). Supports wildcard patterns (e.g., *-coder → high coding score).
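A wildcard lookup of this kind could work roughly as follows. The table entries, the numeric scores, the default value, and the first-match rule are all made up for the example; only the `*-coder` pattern and the six task types come from the text above.

```typescript
type TaskType =
  | "coding" | "review" | "planning" | "analysis" | "debugging" | "documentation";

interface FitnessEntry {
  pattern: string;                              // exact name or wildcard pattern
  scores: Partial<Record<TaskType, number>>;
}

// Hypothetical entries; the real table in taskFitness.ts covers 30+ models.
const FITNESS_TABLE: FitnessEntry[] = [
  { pattern: "*-coder", scores: { coding: 0.95, debugging: 0.85 } },
];

// Assumed fallback for models or tasks with no matching entry.
const DEFAULT_FITNESS = 0.5;

// Convert a "*" wildcard pattern into an anchored, case-insensitive RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`, "i");
}

// Return the score of the first matching entry that defines the task.
function taskFit(model: string, task: TaskType): number {
  for (const entry of FITNESS_TABLE) {
    if (patternToRegExp(entry.pattern).test(model)) {
      const score = entry.scores[task];
      if (score !== undefined) return score;
    }
  }
  return DEFAULT_FITNESS;
}
```

With this table, a hypothetical model named `example-coder` would score 0.95 for `coding` and fall back to 0.5 for `review`.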
Files
| File | Purpose |
|---|---|
| open-sse/services/autoCombo/scoring.ts | Scoring function & pool normalization |
| open-sse/services/autoCombo/taskFitness.ts | Model × task fitness lookup |
| open-sse/services/autoCombo/engine.ts | Selection logic, bandit, budget cap |
| open-sse/services/autoCombo/selfHealing.ts | Exclusion, probes, incident mode |
| open-sse/services/autoCombo/modePacks.ts | 4 weight profiles |
| src/app/api/combos/auto/route.ts | REST API |