Require dashboard session cookies on protected management APIs and
reject bearer API keys with explicit 403 responses to prevent
privilege escalation across provider, settings, and model alias routes.
Add a dedicated payload rules management surface with dashboard UI,
OpenAPI documentation, route normalization, and tests for hot-reloaded
runtime updates.
Consolidate provider catalog metadata for dashboard pages, add
Perplexity web-cookie provider support, retire the legacy provider
creation page, and improve upstream proxy handling.
Harden startup and runtime behavior by moving cloud sync bootstrap to
server instrumentation, skipping background services during build/test,
making models.dev sync abortable, pruning isolated build artifacts, and
improving DB backup and recovery safeguards.
* fix(streaming): #1211 greedily strip omniModel tags to prevent literal \n\n artifacts
- Changed regex quantifier from ? to * in combo.ts, comboAgentMiddleware.ts,
and contextHandoff.ts to greedily strip all JSON-escaped newline sequences
surrounding <omniModel> tags in SSE streaming chunks
- Added \r to the character class for cross-platform robustness
- Fixed Playwright strict-mode violation in combo-unification.spec.ts
- Bumped OpenAPI version and CHANGELOG to 3.6.6
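The quantifier change above can be sketched as follows. This is an illustrative reconstruction, not the exact pattern from combo.ts; the function name and regex are assumptions based on the commit description:

```typescript
// Before the fix, a lazy (?:\\n)? consumed at most one escaped newline,
// leaving the rest behind as literal "\n\n" artifacts in the stream.
// The fix uses * to greedily consume every escaped \r / \n around the tag
// (\r added for cross-platform robustness). This operates on JSON-escaped
// text inside SSE `data:` payloads, so newlines appear as backslash
// sequences, not control characters.
const OMNI_TAG_RE = /(?:\\[rn])*<\/?omniModel[^>]*>(?:\\[rn])*/g;

function stripOmniModelTags(chunk: string): string {
  return chunk.replace(OMNI_TAG_RE, "");
}
```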
* fix: 3 bugs found during issue triage (#1175, #1187/#1218, #1202)
- fix(gemini): strip VS Code JSON Schema extensions from tool schemas (#1175)
Add enumDescriptions, markdownDescription, markdownEnumDescriptions,
enumItemLabels and tags to UNSUPPORTED_SCHEMA_CONSTRAINTS so the Gemini
sanitizer removes them before forwarding. GitHub Copilot injects these
non-standard fields into tool definitions, causing Gemini to reject with
'Unknown name enumDescriptions at functionDeclarations[n].parameters'.
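A minimal sketch of the sanitizer extension, assuming a recursive field block-list (the real sanitizer's structure may differ; the set contents are taken from the commit text):

```typescript
// VS Code JSON Schema extensions that GitHub Copilot injects into tool
// definitions; Gemini rejects any schema containing them.
const UNSUPPORTED_SCHEMA_CONSTRAINTS = new Set([
  "enumDescriptions",
  "markdownDescription",
  "markdownEnumDescriptions",
  "enumItemLabels",
  "tags",
]);

// Recursively strip the unsupported fields from every schema node before
// the tool definition is forwarded upstream.
function sanitizeSchema(node: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(node)) {
    if (UNSUPPORTED_SCHEMA_CONSTRAINTS.has(key)) continue; // drop non-standard field
    clean[key] =
      value && typeof value === "object" && !Array.isArray(value)
        ? sanitizeSchema(value as Record<string, unknown>)
        : value;
  }
  return clean;
}
```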
- fix(health-check): unwrap proxy config object before passing to getAccessToken (#1187/#1218)
resolveProxyForConnection() returns { proxy, level, levelId }, but the health
check loop was passing the full wrapper to getAccessToken(), which expects the
inner config object (.host, .port, etc.). The proxy dispatcher validated .host
on the wrapper (undefined) and threw 'Context proxy host is required', silently
marking every connection as unhealthy on every sweep. The fix mirrors the pattern
already used in chatHelpers.ts: proxyResult?.proxy || null.
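The shape mismatch can be sketched as follows (field names are taken from the commit text; the interfaces are illustrative, not the project's actual types):

```typescript
// What getAccessToken() expects: the inner config with .host/.port.
interface ProxyConfig { host: string; port: number; }
// What resolveProxyForConnection() actually returns: a wrapper.
interface ProxyResolution { proxy: ProxyConfig | null; level: string; levelId?: string; }

// Mirrors the chatHelpers.ts pattern: pass the inner config (or null),
// never the wrapper, whose .host is undefined and trips the dispatcher.
function unwrapProxy(result: ProxyResolution | null | undefined): ProxyConfig | null {
  return result?.proxy ?? null;
}
```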
- fix(ui): debounce models.dev sync interval slider to save only on release (#1202)
The slider's onChange fired updateInterval() on every drag tick, sending a
PATCH per pixel of movement. Rapid API responses overwrote UI state mid-drag.
Introduce draftIntervalHours for smooth visual feedback; the PATCH fires
on onMouseUp / onBlur once the user releases the control.
* fix(providers): update Xiaomi MiMo token-plan endpoints (#1238)
Integrated into release/v3.6.6
* fix(cc-compatible): trim beta flags and preserve cache passthrough (#1230)
Integrated into release/v3.6.6
* feat(memory+skills): full-featured memory & skills systems with tests (#1228)
Integrated into release/v3.6.6
* fix: forward client x-initiator header to GitHub Copilot upstream (#1227)
Integrated into release/v3.6.6
* feat(bailian-quota): add Alibaba Coding Plan quota monitoring (#1235)
* fix: resolve v3.6.6 backlog bugs (#1206, #1211, #1220, #1231)
- fix(core): #1206 inject startup guard against app/ and src/app/ conflict
- fix(health): #1220 add HEALTHCHECK_STAGGER_MS to prevent token refresh bursting
- fix(proxy): #1231 prioritize HTTP 429 over quota body heuristics
- fix(sse): #1211 strip leading double-newlines in responses API stream
* fix(tests): resolve memory migration and skills route pagination bugs from PR overlaps
* docs: Update CHANGELOG.md with v3.6.6 features (#1182, #1165, #1177)
* chore(release): bump version to 3.6.6
Update package versions for the electron app and open-sse package.
Sync llm.txt metadata and feature headings with the 3.6.6 release.
* feat(core): harden outbound provider calls and add cooldown retries
Add guarded outbound fetch helpers with private/local URL blocking,
controlled retries, timeout normalization, and route-level status
propagation for provider validation and model discovery.
Introduce cooldown-aware chat retries with configurable
requestRetry and maxRetryIntervalSec settings, model-scoped cooldown
responses, and improved rate-limit learning from headers and error
bodies so short upstream lockouts can recover automatically.
Also align Antigravity and Codex header handling, require API keys
for Pollinations, validate web runtime env at startup, restore
sanitized Gemini tool names in translated responses, and inject a
synthetic Claude text block when upstream SSE completes empty.
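The private/local URL blocking can be sketched as a hostname check. The ranges and names below are assumptions for illustration; the project's actual guard likely also resolves DNS and handles more edge cases:

```typescript
// Hostname patterns an outbound fetch guard typically blocks: loopback,
// RFC 1918 private ranges, link-local (cloud metadata), IPv6 loopback.
const PRIVATE_HOST_PATTERNS = [
  /^localhost$/i,
  /^127\./,                      // IPv4 loopback
  /^10\./,                       // RFC 1918
  /^192\.168\./,                 // RFC 1918
  /^172\.(1[6-9]|2\d|3[01])\./,  // RFC 1918 172.16.0.0/12
  /^169\.254\./,                 // link-local / cloud metadata endpoints
  /^\[?::1\]?$/,                 // IPv6 loopback
];

function isBlockedOutboundUrl(raw: string): boolean {
  let url: URL;
  try { url = new URL(raw); } catch { return true; } // unparseable → block
  if (url.protocol !== "http:" && url.protocol !== "https:") return true;
  return PRIVATE_HOST_PATTERNS.some((re) => re.test(url.hostname));
}
```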
* feat(models): add glmt preset and hybrid token counting
Introduce GLM Thinking as a first-class provider preset with shared GLM
model metadata, pricing, usage sync, dashboard support, and provider
request defaults for higher token budgets and longer timeouts.
Use provider-side /messages/count_tokens when a Claude-compatible
upstream supports it, while preserving estimated fallback behavior for
missing models, missing credentials, and upstream failures.
Also add startup seeding for default model aliases and normalize common
cross-proxy model dialects so canonical slashful model ids do not get
misrouted during resolution.
* feat(api): add sync tokens and v1 websocket bridge
Add dedicated sync token storage, issuance, revocation, and bundle
download routes backed by stable config bundle versioning and ETag
support.
Expose the v1 websocket handshake route and custom Next server bridge so
OpenAI-compatible websocket traffic can be upgraded and proxied through
the dashboard and API bridge.
Expand compliance auditing with structured metadata, pagination, request
context, auth and provider credential events, and SSRF-blocked
validation logging.
* docs: Update all documentation for v3.6.6
- CHANGELOG: Add WebSocket bridge, GLM Thinking preset, safe outbound
fetch/SSRF guard, cooldown-aware retries, compliance audit v2, model
alias seeding, and all Internal Improvements for the 3 new commits
- README: Expand v3.6.x highlights table with 10 new features; add
SafeOutboundFetch, CooldownAwareRetry, SSRF guard, TPS metric, sync
tokens, WebSocket bridge to Resilience/Observability/Deployment tables
- ARCHITECTURE: Bump date; add new modules to executive summary, API
routes, SSE core services, Auth/Security section; add SSRF/Outbound
guard failure mode (section 6); expand module mapping
- ENVIRONMENT: Add OMNIROUTE_CRYPT_KEY/OMNIROUTE_API_KEY_BASE64 legacy
aliases, OUTBOUND_SSRF_GUARD_ENABLED, CODEX_CLIENT_VERSION, and
REQUEST_RETRY/MAX_RETRY_INTERVAL_SEC cooldown retry settings
- FEATURES: Add 6 new feature sections — V1 WebSocket Bridge, Sync
Tokens & Config Bundle, GLM Thinking Preset, Safe Outbound Fetch &
SSRF Guard, Cooldown-Aware Retries, Compliance Audit v2
* fix: use api64 for proxy test (#1255)
Integrated into release/v3.6.6 — IPv6 proxy test fix
* fix(page): update custom models section to include all providers #1200 (#1256)
Integrated into release/v3.6.6 — Gemini custom model picker fix
* fix: provide default client_id fallbacks to prevent broken OAuth requests (#1246)
Integrated into release/v3.6.6 — OAuth client_id default fallbacks
* fix: translate max_tokens/max_completion_tokens → max_output_tokens in Chat→Responses translator (#1245)
Integrated into release/v3.6.6 — max_tokens → max_output_tokens Responses API translation + unit tests
* feat(oauth): support cursor-agent CLI as Cursor credential source (#1258)
Integrated into release/v3.6.6 — cursor-agent CLI credential source support
* fix(cc-compatible): restore upstream SSE and correct stream/combo timeout behavior (#1257)
Integrated into release/v3.6.6 — CC-compatible upstream SSE restore + stream timeout fix + README table repair
* fix(cli-tools): resolve API key resolution and model mapping bugs in CLI tools (#1263)
Integrated into release/v3.6.6
* feat(cli-tools): add Qwen Code CLI integration (#1266)
Integrated into release/v3.6.6
* fix(i18n): add missing zh-CN translations and fix logger imports (#1269)
Integrated into release/v3.6.6
* fix(i18n): add Chinese i18n support to dashboard components (#1274)
Integrated into release/v3.6.6
* feat: update Pollinations to require API key, remove free tier flag (#1177)
* feat: friendly error messages for crypto/encryption failures (#1165)
* feat: add TPS (tokens per second) metric column to request logs (#1182)
* feat: merge custom/imported models into filter list for all providers (#1191)
* feat(fallback): Fix provider-profile-driven lockouts (#1267)
This integrates rdself's unify-provider-profile-locks PR manually to handle structural conflicts.
* fix(claude): proper Anthropic SDK integration (#1271)
* fix(healthcheck): use correct proxy wrapper format for getAccessToken (#1272)
* chore(release): v3.6.6 — skills registry stability fix + final integration
* fix(auth): harden bootstrap auth and memory dashboard behavior
Restrict unauthenticated writes to /api/settings/require-login to
the initial bootstrap window while keeping read-only checks public.
This prevents post-setup config changes without blocking first-run
login setup, and the onboarding flow now logs in immediately after
setting the password.
Restore memory API filtering and pagination behavior by supporting q
searches, honoring offset-based requests, and avoiding unrelated
fallback results when FTS misses. Update dashboard stats fallback to
use the response totals consistently.
Package the MCP server with explicit file entries and add regression
tests for bootstrap auth and memory route behavior
* fix(codex): remove max_output_tokens from body for compatibility
* chore(release): v3.6.6 — include PR 1274 fixes in changelog
* chore: exclude additional build artifacts and internal directories from npm package distribution
* fix: update Gemini OAuth test to match registry defaults + codex UI improvements
* fix: restore .mjs refs for scripts/ in test imports after ts migration
* fix: restore next.config.mjs ref in dev-origins test
* fix: implement db migration safety checks and codex config format
* fix: disable mass-migration abort during unit tests based on auto-backup flag
* fix: update script regex in auto-update tests to use .mjs
* feat: Add Perplexity Web (Session) provider (#1289)
Integrated into release/v3.6.6
* fix(cli): resolve codex routing config parsing, standardize select model button positioning, and clarify oauth documentation
* docs(changelog): record recent cli, provider, and test updates
Document the latest fixes for Codex routing configuration parsing and
Lobehub provider icon fallback behavior.
Add the note that the remaining JavaScript test files were migrated to
TypeScript ES modules to reflect the completed test stack transition.
* chore(release): merge #1286 minor improvements manually to avoid testing conflict
* chore(test): rename perplexity-web.test.mjs to .ts to maintain 100% TS codebase
* chore(docs): update CHANGELOG.md for perplexity-web provider
* fix(security): resolve CodeQL incomplete URL substring sanitization via URL parsing in test mocks
* fix: integrate compressContext() into chatCore.ts request pipeline
Proactively compress oversized contexts before sending to upstream providers,
preventing context_length_exceeded errors. Compression triggers at 85% of
model's context limit using the existing 3-layer compressContext() function.
- Import compressContext, estimateTokens, getTokenLimit from contextManager
- Add compression check after translation, before executor dispatch
- Estimate tokens and compare against 85% threshold of model's context limit
- Apply 3-layer compression (trim tools, compress thinking, purify history)
- Log compression events with before/after token counts and layers applied
- Audit compression events for observability
- Add unit tests verifying integration behavior
Closes #1290
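The 85% gate can be sketched with the contextManager helpers injected as dependencies. The signatures below are assumptions (the real compressContext may be async and take more arguments):

```typescript
const COMPRESSION_THRESHOLD = 0.85;

// Runs after translation, before executor dispatch: estimate the request's
// tokens, compare against 85% of the model's context limit, and run the
// 3-layer compression (trim tools → compress thinking → purify history)
// only when over budget.
function maybeCompress(
  messages: unknown[],
  model: string,
  deps: {
    estimateTokens: (messages: unknown[]) => number;
    getTokenLimit: (model: string) => number;
    compressContext: (messages: unknown[]) => unknown[];
  },
): { messages: unknown[]; compressed: boolean } {
  const used = deps.estimateTokens(messages);
  const limit = deps.getTokenLimit(model);
  if (used <= limit * COMPRESSION_THRESHOLD) {
    return { messages, compressed: false }; // under budget: pass through
  }
  return { messages: deps.compressContext(messages), compressed: true };
}
```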
* fix(tests): align reasoning expectations with GLM thinking structure
* fix: prevent orphaned tool_result messages in purifyHistory()
When purifyHistory() drops oldest messages to fit context window, it can
split tool_use/tool_result pairs — keeping the tool_result but dropping
the tool_use that initiated it. This causes upstream providers to reject
the request with format errors.
Add fixToolPairs() that runs after each purification pass to remove:
- OpenAI format: orphaned role='tool' messages without matching tool_calls ID
- Claude format: orphaned tool_result content blocks without matching tool_use ID
Closes #1291
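The OpenAI-format half of the orphan repair can be sketched as follows (the Claude content-block variant follows the same idea). Message shapes are the standard Chat Completions format; the exact implementation in purifyHistory() may differ:

```typescript
interface ChatMessage {
  role: string;
  content?: string;
  tool_calls?: { id: string }[]; // assistant messages that initiate tool calls
  tool_call_id?: string;         // role:'tool' results referencing a call id
}

// After each purification pass: collect every tool_call id that survived,
// then drop any role:'tool' result whose initiating tool_use was trimmed —
// upstream providers reject such orphans with format errors.
function fixToolPairs(messages: ChatMessage[]): ChatMessage[] {
  const known = new Set<string>();
  for (const m of messages) {
    for (const call of m.tool_calls ?? []) known.add(call.id);
  }
  return messages.filter(
    (m) => m.role !== "tool" || (m.tool_call_id !== undefined && known.has(m.tool_call_id)),
  );
}
```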
* fix(tests): supply tool_use in mock so it is not dropped
* chore: convert remaining test to TypeScript
* fix(tests): restore compatibility with compressContext threshold test after tsx migration
* docs: finalize v3.6.6 release documentation
* fix(core): finalize provider removal, type issues, and codex API key config
* fix(dashboard): render Web/Cookie, Search, Audio provider sections and fix TypeScript errors
* fix: increase MCP web_search timeout to 60s (#1278)
* fix: route combo testing properly for embedding models (#1260)
* fix: accumulate excluded accounts in combo fallback loop (#1233)
* fix: strip leading whitespace and newlines from first streaming chunk (#1211)
* docs: clarify VPS and Docker settings for OAuth credentials (#1204)
* fix: return real retry-after for pipeline gates (#1301)
Integrated into release/v3.6.6 — returns real Retry-After values from pipeline gates
* feat: streaming semantic cache, Cursor auto-version detection, and call-log enhancements (#1296)
Integrated into release/v3.6.6 — streaming semantic cache, Cursor auto-version detection, call-log cache_source tracking
* feat(api): support more OpenAI types (image, embeddings, audio-transcriptions, audio-speech) (#1297)
Integrated into release/v3.6.6 — adds embeddings, audio-transcriptions, audio-speech, and images-generations support for custom OpenAI-compatible providers, plus Pollinations image registry
* deps: bump hono from 4.12.12 to 4.12.14 (#1302)
Integrated into release/v3.6.6
* deps: bump hono from 4.12.12 to 4.12.14 (#1306)
Integrated into release/v3.6.6
* chore: stabilization fixes for v3.6.6 (#1298, #1254, #59, CI)
* fix(providers): match correct endpoint for Xiaomi MiMo, strip routing prefix for custom openai endpoints (#1303, #1261)
* feat(storage): add database backup cleanup controls
* chore(release): v3.6.6 — Final Stabilization Push
* Backport call log storage refactor to release/v3.6.6 (#1307)
Integrated into release/v3.6.6
* deps: update dompurify to 3.4.0 to resolve CVE-XYZ (#60)
* test: disable sqlite auto backup in CI to resolve E2E timeout (#24481475058)
* chore(docs): sync CHANGELOG for v3.6.6 with missing features and fixes
* chore(release): prep v3.6.6 infrastructure and type safety fixes
- Migrated legacy .mjs scripts to .ts (bin, prepublish, policies)
- Resolved pre-commit strict lint (t11 budget) errors in combo.ts
- Explicitly typed all TS bindings in pack-artifact policies
- Updated package.json commands to run Node via tsx/esm internally
- Hardened CI/CD with explicit node version 22.22.2 checks
- Completed stage validations for v3.6.6 final release
* chore: fix TS build errors and e2e timeouts in CI
- Migrate nodeRuntimeSupport to TS interfaces avoiding implicit any
- Increase visibility timeouts in skills-marketplace E2E test to 15s to bypass CI flakiness
- Complete migration of .mjs scripts to .ts ensuring type safety
* chore(release): sync package version 3.6.6 across workspaces
* test(e2e): universally increase UI component visibility timeouts from 5s to 15s to bypass CI starvation
* chore(build): inject baseUrl, paths, and types:node into MITM tsconfig within prepublish hook to fix missing types in CI check
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: Jack <5443152+hijak@users.noreply.github.com>
Co-authored-by: Randi <55005611+rdself@users.noreply.github.com>
Co-authored-by: Paijo <14921983+oyi77@users.noreply.github.com>
Co-authored-by: Samuel Cedric <ceds.sam@gmail.com>
Co-authored-by: Max Garmash <max@37bytes.com>
Co-authored-by: Markus Hartung <mail@hartmark.se>
Co-authored-by: Gi99lin <74502520+Gi99lin@users.noreply.github.com>
Co-authored-by: Payne <baboialex95@gmail.com>
Co-authored-by: Benson K B <bensonkbmca@gmail.com>
Co-authored-by: clousky2020 <33016567+clousky2020@users.noreply.github.com>
Co-authored-by: Ravi Tharuma <25951435+RaviTharuma@users.noreply.github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: Hdsje <vovan877@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: xiaoge1688 <moyekongling@gmail.com>
- Add emailPrivacyStore (Zustand + persist) for global toggle state
- Add EmailPrivacyToggle component with eye open/closed icons
- Add pickDisplayValue() visibility-aware masking function
- Integrate toggle into provider detail, usage limits & playground pages
- Per-modal showEmail now uses global store (synced across all pages)
- Default: emails hidden; toggle persists across page reloads
- Add showEmails/hideEmails i18n keys
- Update CHANGELOG.md and openapi.yaml to v3.6.2
* fix(minimax): switch auth from x-api-key to Authorization Bearer (#1076)
Integrated into release/v3.5.6 — MiniMax auth fix with authHeader consistency normalization
* feat(CI,i18n): autogenerate language files + Add missing strings (#1071)
Integrated into release/v3.5.6 — i18n translations for memory, skills, and missing keys across 31 languages
* fix(ci): restore i18n continue-on-error, remove auto-commit race condition
* fix(husky): load nvm in hooks for VS Code compatibility
* fix(husky): gracefully skip hooks when npm is not in PATH
* fix: convert OpenAI function tool_choice to Claude tool format (#1072)
* fix: prevent EPIPE feedback loop filling logs at GB/s (#1006)
* fix: fallback to native fetch when undici dispatcher fails (#1054)
* fix: improve Qoder PAT validation with actionable error messages (#966)
- Add QODER_PERSONAL_ACCESS_TOKEN env var fallback for both validation and execution
- Pre-flight ping check to diagnose connectivity issues (Docker/proxy)
- Detect encrypted auth blobs from ~/.qoder/.auth/user and guide to website PAT
- Clear error messages for auth failures with link to integrations page
- Treat non-auth 4xx as auth-pass (request format issue, not token issue)
- Update tests to cover new validation paths (23 tests, all passing)
* feat: Improve the Chinese translation (#1079)
Integrated into release/v3.5.6
* chore(release): v3.5.6 — i18n updates and credential security fixes
* fix(ci): resolve e2e and docs-sync pipeline failures
* fix(security): bump next to 16.2.3 to resolve SNYK-JS-NEXT-15954202
* fix: guard Memory/Cache UI against null toLocaleString crash (#1083)
* fix: translate OpenAI tool_choice type 'function' to Claude 'tool' format (#1072)
* fix: pass custom baseUrl in provider API key validation (#1078)
* docs: update CHANGELOG with v3.5.6 bug fixes and security patches
* docs: rewrite implement-features workflow with 5-phase harvest-research-report-plan-execute pipeline
* docs: organize _ideia/ into viable/defer/notfit + add Phase 2.5 auto-response workflow
* docs: implementation plans for #1025, #750, #960, #1046 + close already-implemented #833, #973, #982
* feat: mask email addresses in dashboard for privacy (#1025)
* feat: add OpenRouter and GitHub to embedding/image provider registries (#960)
* feat: add model visibility toggle and search filter to provider page (#750)
* docs: move implemented features to notfit, update task plans status
* chore: untrack _ideia/ and _tasks/ from git — private/internal only
* chore(release): bump to v3.5.6 — changelog, docs, version sync & any-budget fix
* fix: remove explicit .ts extension in qoderCli import that caused 500 error in production build
---------
Co-authored-by: Jean Brito <jeanfbrito@gmail.com>
Co-authored-by: zenobit <zenobit@disroot.org>
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: Ethan Hunt <136065060+only4copilot@users.noreply.github.com>
* feat(qoder): native cosy integration
* feat(qoder): implement native COSY encryption algorithm and remove CLI child instances, plus workflow bumps
* feat(resilience): context overflow fallback, OAuth token detection, empty content guard & context-optimized combo strategy
- Add isContextOverflowError + isContextOverflow detectors (400 + token-limit signals)
- Auto-fallback to next family model on context overflow in chatCore
- Add isEmptyContentResponse to catch fake-success empty responses, trigger fallback + recursive retry
- Add OAUTH_INVALID_TOKEN error type (T11) with isOAuthInvalidToken signal matching; warn instead of deactivating node
- Add getModelContextLimit helper in modelsDevSync (reads limit_context from synced capabilities)
- Upgrade getTokenLimit in contextManager to check models.dev DB before registry (fixes gemini-2.5-pro: 1000000→1048576)
- Add findLargerContextModel in modelFamilyFallback for context-aware model selection
- Add sortModelsByContextSize + context-optimized combo strategy in combo.ts
- Update context-manager unit test for corrected gemini-2.5-pro limit
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
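The context-aware selection in findLargerContextModel can be sketched as picking the smallest sufficient window among family candidates. The shapes below are assumptions; only the function name comes from the commit text:

```typescript
interface ModelInfo { id: string; contextLimit: number; }

// On context overflow, prefer a candidate whose window actually exceeds the
// tokens that just failed to fit; choose the smallest such window to avoid
// jumping straight to the most expensive model.
function findLargerContextModel(candidates: ModelInfo[], neededTokens: number): ModelInfo | null {
  const larger = candidates
    .filter((m) => m.contextLimit > neededTokens)
    .sort((a, b) => a.contextLimit - b.contextLimit);
  return larger[0] ?? null;
}
```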
* fix(review): address Gemini code review — tool_calls path, infinite recursion, dedup signals, findLargerContextModel
- Fix isEmptyContentResponse: check message.tool_calls/delta.tool_calls instead
of firstChoice.tool_calls (wrong OpenAI API path, caused tool-call responses
to be falsely flagged as empty)
- Fix empty content fallback: replace recursive handleChatCore call (infinite
recursion risk + wrong model due to original body.model) with non-recursive
pattern — call executeProviderRequest, parse fallback response body, reassign
responseBody and fall through to existing processing
- Fix context overflow: use findLargerContextModel over family candidates first,
fall back to getNextFamilyFallback — ensures we pick a model with actually
larger context window on overflow
- Fix signal dedup: export CONTEXT_OVERFLOW_SIGNALS + CONTEXT_OVERFLOW_REGEX
from errorClassifier.ts; import shared regex in modelFamilyFallback.ts,
removing duplicate signal list and per-call RegExp construction
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
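The corrected detector path can be sketched as follows. tool_calls live on `message` / `delta`, not on the choice itself; the helper name comes from the commit, the rest is illustrative:

```typescript
interface Choice {
  message?: { content?: string | null; tool_calls?: unknown[] };
  delta?: { content?: string | null; tool_calls?: unknown[] };
}

function isEmptyContentResponse(body: { choices?: Choice[] }): boolean {
  const first = body.choices?.[0];
  if (!first) return true;
  const part = first.message ?? first.delta;
  if (!part) return true;
  // A tool-call response carries no text content but is NOT empty.
  // The original bug checked first.tool_calls (a path that does not exist
  // in the OpenAI shape), so tool-call responses were flagged as empty
  // and wrongly triggered fallback.
  if (part.tool_calls && part.tool_calls.length > 0) return false;
  return !part.content || part.content.trim() === "";
}
```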
* fix(UI): add context-optimized strategy to frontend schema and options
* fix(sse): preserve Responses API events in stream translation
When translating Claude-format responses (e.g. GLM) to Responses API
format for Codex CLI, the sanitizer stripped {event, data} structured
items to {"object":"chat.completion.chunk"}, losing all content and
the critical response.completed event.
Only run sanitizeStreamingChunk on OpenAI Chat Completions chunks,
skipping items that have the Responses API {event, data} structure.
* test(sse): add regression test for Claude→Responses stream sanitization
Verifies that {event,data} structured items from the Responses API
translator bypass sanitizeStreamingChunk when translating Claude-format
providers (e.g. GLM) to Responses API format for Codex CLI.
* fix(sse): strengthen Responses API event detection with response. prefix check
Use explicit `response.` prefix check instead of generic `event && data`
presence check, as recommended in PR review.
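The routing guard can be sketched as follows (names are illustrative; the real code lives in the SSE stream translation layer):

```typescript
interface StreamItem { event?: string; data?: unknown; [key: string]: unknown; }

// Explicit `response.` prefix check, per review: a generic `event && data`
// presence test would also match unrelated structured chunks.
function isResponsesApiItem(item: StreamItem): boolean {
  return typeof item.event === "string" && item.event.startsWith("response.");
}

// Only OpenAI Chat Completions chunks pass through the sanitizer;
// Responses API {event, data} items bypass it untouched, preserving
// content and the critical response.completed event.
function routeChunk(item: StreamItem, sanitize: (chunk: StreamItem) => StreamItem): StreamItem {
  return isResponsesApiItem(item) ? item : sanitize(item);
}
```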
* fix: pin Next.js to 16.0.10 to prevent Turbopack hashed module bug
Remove ^ prefix from next and eslint-config-next to prevent
automatic upgrades to 16.1.x+ which introduced content-based
hashing for external module references in Turbopack.
Also remove duplicate Material Symbols @import from globals.css
(font already loaded via <link> in layout.tsx).
Fixes #509
* align cc-compatible cache handling with client passthrough
* chore: integrate resilience and turbopack fixes (PRs #992, #990, #987)
* chore(release): bump to v3.5.2 — changelog, docs, version sync
* docs(i18n): sync documentation updates to 33 languages
* fix(qoder): replace any with unknown to comply with strict any-budget
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: oyi77 <oyi77@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Chris Staley <christopher-s@users.noreply.github.com>
Co-authored-by: Ivan <shanin-i2011@yandex.ru>
Co-authored-by: R.D. <rogerproself@gmail.com>
* chore(release): v3.2.8 — Docker auto-update UI and cache analytics fixes
* fix(sse): remove race condition in cache metrics tracking (#758)
- Remove in-memory metrics tracking (currentMetrics, trackCacheMetrics, updateCacheMetrics)
- Cache metrics now computed on-the-fly from usage_history table (single source of truth)
- Fixes CRITICAL issue from code review: concurrent requests overwriting metrics
- Fixes WARNING: duplicate metric tracking logic in streaming/non-streaming paths
Ref: PR #752 (merged before this fix was included)
* fix: handle allRateLimited credentials & forward extra body keys in embeddings/images routes (#757)
* fix: handle allRateLimited credentials in embeddings and images routes
When getProviderCredentials() returns an allRateLimited object (truthy,
but without apiKey/accessToken), the embeddings and images routes
incorrectly passed it to handlers as valid credentials. The handlers
then sent upstream requests without Authorization headers, causing
401 errors from providers (e.g. NVIDIA NIM).
This only manifested under concurrent requests: a chat/completions
call could trigger rate limiting on a provider account, and a
simultaneous embeddings request would receive the allRateLimited
sentinel — but treat it as valid credentials.
The chat pipeline already handled this case correctly. This commit
adds the same allRateLimited guard to all affected routes:
- POST /v1/embeddings
- POST /v1/providers/{provider}/embeddings
- POST /v1/images/generations
- POST /v1/providers/{provider}/images/generations
Also adds a defense-in-depth guard in the embeddings handler itself:
if no auth token is available for a non-local provider, return 401
immediately instead of sending an unauthenticated request upstream.
Made-with: Cursor
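The guard can be sketched as a credential check that distinguishes the sentinel from real credentials. Field names are assumptions based on the commit text:

```typescript
// getProviderCredentials() may return an allRateLimited sentinel: truthy,
// but carrying no apiKey/accessToken. A plain truthiness check lets it
// through as "valid", and the handler then sends an unauthenticated
// upstream request that 401s.
interface Credentials { apiKey?: string; accessToken?: string; allRateLimited?: boolean; }

function isUsableCredential(creds: Credentials | null | undefined): boolean {
  if (!creds || creds.allRateLimited) return false; // sentinel → respond 429, don't go upstream
  return Boolean(creds.apiKey || creds.accessToken); // defense-in-depth: require a real token
}
```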
* fix(embeddings): forward extra body keys to upstream providers
The embeddings handler only forwarded model, input, dimensions, and
encoding_format to upstream providers, silently dropping any additional
fields. This broke asymmetric embedding APIs (e.g. NVIDIA NIM
nv-embedqa-e5-v5) that require input_type, and other providers
expecting user or truncate parameters.
Add a KNOWN_FIELDS exclusion set and forward all unrecognized body
keys to the upstream request, matching the passthrough pattern used
by the chat pipeline's DefaultExecutor.transformRequest().
Made-with: Cursor
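The exclusion-set passthrough can be sketched as follows (the set contents come from the commit text; the merge function is illustrative):

```typescript
// Fields the embeddings handler already translates explicitly; everything
// else in the client body is forwarded verbatim, so provider-specific
// extras like input_type, user, or truncate ride along unchanged —
// matching the chat pipeline's DefaultExecutor.transformRequest() pattern.
const KNOWN_FIELDS = new Set(["model", "input", "dimensions", "encoding_format"]);

function buildUpstreamBody(
  clientBody: Record<string, unknown>,
  translated: Record<string, unknown>,
): Record<string, unknown> {
  const out = { ...translated };
  for (const [key, value] of Object.entries(clientBody)) {
    if (!KNOWN_FIELDS.has(key)) out[key] = value; // forward unrecognized keys
  }
  return out;
}
```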
* fix(auth): redirect and unconditional 401 on disabled requireLogin + fix test cases
* fix(build): remove legacy proxy.ts causing Next.js build collision
* fix(build): revert middleware.ts rename to proxy.ts because of Next.js Edge constraints
---------
Co-authored-by: diegosouzapw <diegosouzapw@users.noreply.github.com>
Co-authored-by: tombii <tombii@users.noreply.github.com>
Co-authored-by: Gorchakov-Pressure <117600961+Gorchakov-Pressure@users.noreply.github.com>