# OmniRoute Auto-Combo Engine

> Self-managing model chains with adaptive scoring

## How It Works

The Auto-Combo Engine dynamically selects the best provider/model for each request using a **6-factor scoring function**:

| Factor     | Weight | Description                                          |
| :--------- | :----- | :--------------------------------------------------- |
| Quota      | 0.20   | Remaining capacity [0..1]                            |
| Health     | 0.25   | Circuit breaker: CLOSED=1.0, HALF_OPEN=0.5, OPEN=0.0 |
| CostInv    | 0.20   | Inverse cost (cheaper = higher score)                |
| LatencyInv | 0.15   | Inverse p95 latency (faster = higher score)          |
| TaskFit    | 0.10   | Model × task type fitness score                      |
| Stability  | 0.10   | Low variance in latency/errors                       |

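As a minimal sketch of how the weights in the table could combine into a single score, the TypeScript below computes a weighted sum of the six factors. The weighted-sum form, the names `Factors`, `healthScore`, and `score`, and the premise that every factor is already normalized to [0, 1] are assumptions for illustration; they are not taken from `scoring.ts`.

```typescript
// Hypothetical types and names; only the weights and health values come from the table.
type BreakerState = "CLOSED" | "HALF_OPEN" | "OPEN";

interface Factors {
  quota: number;      // remaining capacity in [0, 1]
  health: number;     // derived from the circuit breaker state
  costInv: number;    // assumed pre-normalized: cheaper = closer to 1
  latencyInv: number; // assumed pre-normalized: faster = closer to 1
  taskFit: number;    // model x task fitness in [0, 1]
  stability: number;  // low variance = closer to 1
}

// Default weights from the factor table (they sum to 1.00).
const DEFAULT_WEIGHTS: Factors = {
  quota: 0.2,
  health: 0.25,
  costInv: 0.2,
  latencyInv: 0.15,
  taskFit: 0.1,
  stability: 0.1,
};

const HEALTH_BY_STATE: Record<BreakerState, number> = {
  CLOSED: 1.0,
  HALF_OPEN: 0.5,
  OPEN: 0.0,
};

function healthScore(state: BreakerState): number {
  return HEALTH_BY_STATE[state];
}

// ASSUMPTION: the score is a plain weighted sum of the six factors.
function score(factors: Factors, weights: Factors = DEFAULT_WEIGHTS): number {
  const keys = Object.keys(weights) as (keyof Factors)[];
  return keys.reduce((sum, k) => sum + weights[k] * factors[k], 0);
}

const candidate: Factors = {
  quota: 0.8,
  health: healthScore("CLOSED"),
  costInv: 0.5,
  latencyInv: 0.6,
  taskFit: 0.9,
  stability: 0.7,
};

console.log(score(candidate).toFixed(3)); // prints "0.760"
```

Because the default weights sum to 1, a candidate that scores 1.0 on every factor gets an overall score of 1.0 under this sketch, which keeps the exclusion threshold described under Self-Healing on the same scale.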
## Mode Packs

| Pack                    | Focus        | Key Weight       |
| :---------------------- | :----------- | :--------------- |
| 🚀 **Ship Fast**        | Speed        | latencyInv: 0.35 |
| 💰 **Cost Saver**       | Economy      | costInv: 0.40    |
| 🎯 **Quality First**    | Best model   | taskFit: 0.40    |
| 📡 **Offline Friendly** | Availability | quota: 0.40      |

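The table lists only the key weight of each pack, so the full profiles are not reproduced here. To illustrate how raising one weight shifts the balance, the sketch below sets the key weight and rescales the other five proportionally so the total stays at 1. That rescaling rule, the function name `packWeights`, and every pack identifier other than `ship-fast` (which appears in the API example) are assumptions; the real profiles live in `modePacks.ts`.

```typescript
type Factor = "quota" | "health" | "costInv" | "latencyInv" | "taskFit" | "stability";
type Weights = Record<Factor, number>;

// Default weights from the "How It Works" table.
const DEFAULT_WEIGHTS: Weights = {
  quota: 0.2,
  health: 0.25,
  costInv: 0.2,
  latencyInv: 0.15,
  taskFit: 0.1,
  stability: 0.1,
};

// Key weights from the Mode Packs table.
// ASSUMPTION: pack ids other than "ship-fast" are hypothetical spellings.
const KEY_WEIGHTS: Record<string, [Factor, number]> = {
  "ship-fast": ["latencyInv", 0.35],
  "cost-saver": ["costInv", 0.4],
  "quality-first": ["taskFit", 0.4],
  "offline-friendly": ["quota", 0.4],
};

// ASSUMPTION: the remaining five weights are scaled proportionally so the
// profile still sums to 1. The actual packs may use hand-tuned values.
function packWeights(pack: string, base: Weights = DEFAULT_WEIGHTS): Weights {
  const entry = KEY_WEIGHTS[pack];
  if (!entry) throw new Error(`unknown mode pack: ${pack}`);
  const [key, value] = entry;

  const factors = Object.keys(base) as Factor[];
  const restTotal = factors
    .filter((f) => f !== key)
    .reduce((sum, f) => sum + base[f], 0);
  const scale = (1 - value) / restTotal;

  const out: Weights = { ...base };
  for (const f of factors) {
    out[f] = f === key ? value : base[f] * scale;
  }
  return out;
}

console.log(packWeights("ship-fast"));
```

Under this rule, `ship-fast` raises `latencyInv` from 0.15 to 0.35 and shrinks every other weight by the same ratio, so the relative ordering of the non-key factors is preserved.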
## Self-Healing

- **Temporary exclusion**: Score < 0.2 → excluded for 5 min (progressive backoff, max 30 min)
- **Circuit breaker awareness**: OPEN → auto-excluded; HALF_OPEN → probe requests
- **Incident mode**: >50% OPEN → disable exploration, maximize stability
- **Cooldown recovery**: After exclusion, first request is a "probe" with reduced timeout

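The exclusion and incident-mode rules above can be sketched roughly as follows. The 0.2 threshold, the 5 minute base, the 30 minute cap, and the 50% rule come from the list; modelling "progressive backoff" as doubling per consecutive strike, and all names (`exclusionMs`, `recordScore`, `isIncidentMode`), are assumptions rather than the contents of `selfHealing.ts`.

```typescript
type BreakerState = "CLOSED" | "HALF_OPEN" | "OPEN";

const SCORE_FLOOR = 0.2;                  // scores below this trigger exclusion
const BASE_EXCLUSION_MS = 5 * 60 * 1000;  // 5 min
const MAX_EXCLUSION_MS = 30 * 60 * 1000;  // 30 min cap

interface ExclusionState {
  strikes: number;       // consecutive low-score events
  excludedUntil: number; // epoch ms; 0 means not excluded
}

// ASSUMPTION: backoff doubles with each consecutive strike (5, 10, 20, 30, 30, ...).
function exclusionMs(strikes: number): number {
  const raw = BASE_EXCLUSION_MS * Math.pow(2, strikes - 1);
  return Math.min(raw, MAX_EXCLUSION_MS);
}

// Returns the next state after observing a score at time `now`.
function recordScore(state: ExclusionState, score: number, now: number): ExclusionState {
  if (score >= SCORE_FLOOR) {
    return { strikes: 0, excludedUntil: 0 }; // healthy again: reset the backoff
  }
  const strikes = state.strikes + 1;
  return { strikes, excludedUntil: now + exclusionMs(strikes) };
}

function isExcluded(state: ExclusionState, now: number): boolean {
  return now < state.excludedUntil;
}

// Incident mode: strictly more than half of the breakers are OPEN.
function isIncidentMode(states: BreakerState[]): boolean {
  if (states.length === 0) return false;
  const open = states.filter((s) => s === "OPEN").length;
  return open / states.length > 0.5;
}

let state: ExclusionState = { strikes: 0, excludedUntil: 0 };
state = recordScore(state, 0.1, 0);
console.log(isExcluded(state, 60 * 1000)); // prints true (still inside the 5 min window)
```

Once `isExcluded` turns false, the next request to that provider would be the probe described in the last bullet; the reduced timeout itself is not modelled here.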
## Bandit Exploration

5% of requests (configurable) are routed to random providers for exploration. Exploration is disabled in incident mode.

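This behaviour resembles an epsilon-greedy selection, which could be sketched as below. The 5% default and the incident-mode switch come from the text; the function name `pick`, the option names, and the injectable `random` source (added so the behaviour can be tested deterministically) are assumptions, not the API of `engine.ts`.

```typescript
interface ScoredCandidate {
  id: string;
  score: number;
}

interface PickOptions {
  explorationRate?: number; // fraction of requests routed at random (default 0.05)
  incidentMode?: boolean;   // when true, exploration is skipped entirely
  random?: () => number;    // hypothetical hook; must return a value in [0, 1)
}

function pick(candidates: ScoredCandidate[], options: PickOptions = {}): ScoredCandidate {
  if (candidates.length === 0) throw new Error("empty candidate pool");
  const { explorationRate = 0.05, incidentMode = false, random = Math.random } = options;

  // Explore: with probability `explorationRate`, pick a uniformly random candidate.
  if (!incidentMode && random() < explorationRate) {
    return candidates[Math.floor(random() * candidates.length)];
  }

  // Exploit: otherwise pick the highest-scoring candidate.
  return candidates.reduce((best, c) => (c.score > best.score ? c : best));
}

const pool: ScoredCandidate[] = [
  { id: "anthropic", score: 0.71 },
  { id: "google", score: 0.64 },
  { id: "openai", score: 0.76 },
];

console.log(pick(pool, { incidentMode: true }).id); // prints "openai"
```

Routing a small share of traffic at random keeps scores fresh for providers that would otherwise never be selected, at the cost of occasionally sending a request to a weaker candidate.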
## API

```bash
# Create auto-combo
curl -X POST http://localhost:20128/api/combos/auto \
  -H "Content-Type: application/json" \
  -d '{"id":"my-auto","name":"Auto Coder","candidatePool":["anthropic","google","openai"],"modePack":"ship-fast"}'

# List auto-combos
curl http://localhost:20128/api/combos/auto
```

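For callers that prefer TypeScript over `curl`, the same two requests might look like the sketch below. The endpoint, port, and payload fields are copied from the commands above; the names `AutoComboSpec`, `buildCreateRequest`, `createAutoCombo`, and `listAutoCombos` are hypothetical, and because the response format is not documented here, the body is returned as raw text.

```typescript
// Field names mirror the JSON payload in the curl example.
interface AutoComboSpec {
  id: string;
  name: string;
  candidatePool: string[];
  modePack: string;
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const DEFAULT_BASE_URL = "http://localhost:20128";

// Pure helper: builds the request without sending it.
function buildCreateRequest(spec: AutoComboSpec, baseUrl: string = DEFAULT_BASE_URL): HttpRequest {
  return {
    url: `${baseUrl}/api/combos/auto`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(spec),
  };
}

// Sends the create request using the global fetch (Node 18+ or a browser).
async function createAutoCombo(spec: AutoComboSpec): Promise<string> {
  const { url, ...init } = buildCreateRequest(spec);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`create failed: HTTP ${res.status}`);
  return res.text(); // ASSUMPTION: response shape is unknown, so return it unparsed
}

async function listAutoCombos(baseUrl: string = DEFAULT_BASE_URL): Promise<string> {
  const res = await fetch(`${baseUrl}/api/combos/auto`);
  if (!res.ok) throw new Error(`list failed: HTTP ${res.status}`);
  return res.text();
}

const request = buildCreateRequest({
  id: "my-auto",
  name: "Auto Coder",
  candidatePool: ["anthropic", "google", "openai"],
  modePack: "ship-fast",
});
console.log(request.url); // prints "http://localhost:20128/api/combos/auto"
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without a running OmniRoute instance.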
## Task Fitness

30+ models are scored across 6 task types (`coding`, `review`, `planning`, `analysis`, `debugging`, `documentation`). Wildcard patterns are supported (e.g., `*-coder` → high coding score).

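A wildcard lookup of this kind could be sketched as follows. The six task types and the `*-coder` pattern come from the text; the scores, the neutral default of 0.5, the first-match-wins ordering, and the names `FITNESS_TABLE` and `taskFit` are placeholders for illustration, not values from `taskFitness.ts`.

```typescript
type TaskType = "coding" | "review" | "planning" | "analysis" | "debugging" | "documentation";

interface FitnessEntry {
  pattern: string; // exact model name, or a glob using "*"
  scores: Partial<Record<TaskType, number>>;
}

// ASSUMPTION: placeholder entries and scores, listed in priority order.
const FITNESS_TABLE: FitnessEntry[] = [
  { pattern: "*-coder", scores: { coding: 0.95, debugging: 0.85 } },
];

const DEFAULT_FIT = 0.5; // ASSUMPTION: neutral score when nothing matches

// Converts a glob with "*" wildcards into an anchored regular expression.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");                 // "*" matches any run of characters
  return new RegExp(`^${escaped}$`);
}

// Returns the score of the first matching entry, or the default.
function taskFit(model: string, task: TaskType): number {
  for (const { pattern, scores } of FITNESS_TABLE) {
    if (globToRegExp(pattern).test(model)) {
      return scores[task] ?? DEFAULT_FIT;
    }
  }
  return DEFAULT_FIT;
}

console.log(taskFit("example-coder", "coding")); // prints 0.95
console.log(taskFit("example-chat", "coding"));  // prints 0.5
```

Pattern entries let a new model with a conventional name pick up a sensible score before anyone adds an exact entry for it.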
## Files

| File                                         | Purpose                               |
| :------------------------------------------- | :------------------------------------ |
| `open-sse/services/autoCombo/scoring.ts`     | Scoring function & pool normalization |
| `open-sse/services/autoCombo/taskFitness.ts` | Model × task fitness lookup           |
| `open-sse/services/autoCombo/engine.ts`      | Selection logic, bandit, budget cap   |
| `open-sse/services/autoCombo/selfHealing.ts` | Exclusion, probes, incident mode      |
| `open-sse/services/autoCombo/modePacks.ts`   | 4 weight profiles                     |
| `src/app/api/combos/auto/route.ts`           | REST API                              |