feat: add OpenRouter support and configuration options

- Introduced OpenRouter as a new provider option in settings and environment configuration.
- Updated README.md to include instructions for using OpenRouter.
- Enhanced the message converter to support reasoning content for OpenRouter.
- Added tests for OpenRouter provider functionality and message conversion.
- Updated the provider wiring in `api/dependencies.py` to include OpenRouterProvider.
Alishahryar1 2026-02-15 10:50:53 -08:00
parent 2d72dc7304
commit e5a096049d
13 changed files with 788 additions and 24 deletions


@ -1,3 +1,6 @@
# Provider: "nvidia_nim" | "open_router"
PROVIDER_TYPE=nvidia_nim
# All Claude model requests are mapped to this model
MODEL="stepfun-ai/step-3.5-flash"
@ -8,6 +11,12 @@ NVIDIA_NIM_RATE_LIMIT=40
NVIDIA_NIM_RATE_WINDOW=60
# OpenRouter Config
OPENROUTER_API_KEY=""
OPENROUTER_RATE_LIMIT=1
OPENROUTER_RATE_WINDOW=1
# Telegram Config
TELEGRAM_BOT_TOKEN=""
ALLOWED_TELEGRAM_USER_ID=""


@ -2,7 +2,7 @@
# 🚀 Free Claude Code
### Use Claude Code for free with NVIDIA NIM
### Use Claude Code for free with NVIDIA NIM or OpenRouter
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
[![Python 3.14](https://img.shields.io/badge/python-3.14-3776ab.svg?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/downloads/)
@ -12,10 +12,10 @@
[![Code style: Ruff](https://img.shields.io/badge/code%20formatting-ruff-f5a623.svg?style=for-the-badge)](https://github.com/astral-sh/ruff)
[![Logging: Loguru](https://img.shields.io/badge/logging-loguru-4ecdc4.svg?style=for-the-badge)](https://github.com/Delgan/loguru)
A lightweight proxy that converts Claude Code's Anthropic API requests to NVIDIA NIM format.
**40 reqs/min free** · **Telegram bot** · **VSCode & CLI**
A lightweight proxy that converts Claude Code's Anthropic API requests to NVIDIA NIM or OpenRouter format.
**40 reqs/min free** · **Provider switching** · **Telegram bot** · **VSCode & CLI**
[Quick Start](#quick-start) · [Telegram Bot](#telegram-bot-integration) · [Models](#available-models) · [Configuration](#configuration)
[Quick Start](#quick-start) · [Provider Switching](#provider-switching) · [Telegram Bot](#telegram-bot-integration) · [Models](#available-models) · [Configuration](#configuration)
---
@ -27,7 +27,9 @@ A lightweight proxy that converts Claude Code's Anthropic API requests to NVIDIA
### 1. Prerequisites
1. Get a new API key from [build.nvidia.com/settings/api-keys](https://build.nvidia.com/settings/api-keys)
1. Get an API key:
- **NVIDIA NIM**: [build.nvidia.com/settings/api-keys](https://build.nvidia.com/settings/api-keys)
- **OpenRouter**: [openrouter.ai/keys](https://openrouter.ai/keys)
2. Install [claude-code](https://github.com/anthropics/claude-code)
3. Install [uv](https://github.com/astral-sh/uv)
@ -40,13 +42,22 @@ cd free-claude-code
cp .env.example .env
```
Edit `.env`:
Edit `.env` for **NVIDIA NIM** (default):
```dotenv
PROVIDER_TYPE=nvidia_nim
NVIDIA_NIM_API_KEY=nvapi-your-key-here
MODEL=moonshotai/kimi-k2-thinking
```
Or for **OpenRouter**:
```dotenv
PROVIDER_TYPE=open_router
OPENROUTER_API_KEY=sk-or-your-key-here
MODEL=stepfun/step-3.5-flash:free
```
---
### Claude Code CLI
@ -63,7 +74,7 @@ uv run uvicorn server:app --host 0.0.0.0 --port 8082
ANTHROPIC_AUTH_TOKEN=freecc ANTHROPIC_BASE_URL=http://localhost:8082 claude
```
That's it! Claude Code now uses NVIDIA NIM for free.
That's it! Claude Code now uses your configured provider for free.
---
@ -90,7 +101,20 @@ uv run uvicorn server:app --host 0.0.0.0 --port 8082
6. **If you see the login screen** ("How do you want to log in?"): Click **Anthropic Console**, then authorize. The extension will start working. You may be redirected to buy credits in the browser—ignore that; the extension already works.
That's it! The Claude Code VSCode extension now uses NVIDIA NIM for free. To go back to Anthropic models, just comment out the added block and reload extensions.
That's it! The Claude Code VSCode extension now uses your configured provider for free. To go back to Anthropic models, just comment out the added block and reload extensions.
---
### Provider Switching
Switch between **NVIDIA NIM** and **OpenRouter** via `PROVIDER_TYPE`:
| Provider | `PROVIDER_TYPE` | API Key Variable | Base URL |
| ------------- | ---------------- | ---------------------- | --------------------------------- |
| NVIDIA NIM | `nvidia_nim` | `NVIDIA_NIM_API_KEY` | `integrate.api.nvidia.com/v1` |
| OpenRouter | `open_router` | `OPENROUTER_API_KEY` | `openrouter.ai/api/v1` |
OpenRouter gives access to hundreds of models (stepfun, OpenAI, Anthropic, etc.) through a single API. Set `MODEL` to any OpenRouter model ID, e.g. `stepfun/step-3.5-flash:free`.
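
To sanity-check a switch, you can hit the proxy directly. A minimal smoke-test sketch (not part of this commit): it assumes the proxy is running on `localhost:8082` as in the Quick Start, that it exposes the standard Anthropic `/v1/messages` route, and that non-streaming requests are accepted.

```python
# Hypothetical smoke test against the local proxy. The requested model name
# is irrelevant: all Claude model requests are mapped to MODEL from .env.
import requests

resp = requests.post(
    "http://localhost:8082/v1/messages",
    headers={
        "Authorization": "Bearer freecc",  # matches ANTHROPIC_AUTH_TOKEN
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    },
    json={
        "model": "claude-sonnet",
        "max_tokens": 128,
        "messages": [{"role": "user", "content": "Say hello"}],
    },
)
print(resp.status_code)
print(resp.text[:500])
```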
---
@ -139,9 +163,7 @@ uv run uvicorn server:app --host 0.0.0.0 --port 8082
## Available Models
See [`nvidia_nim_models.json`](nvidia_nim_models.json) for the full list of supported models.
Popular choices:
**NVIDIA NIM** (`PROVIDER_TYPE=nvidia_nim`): See [`nvidia_nim_models.json`](nvidia_nim_models.json) for the full list. Popular choices:
- `z-ai/glm5`
- `stepfun-ai/step-3.5-flash`
@ -149,21 +171,31 @@ Popular choices:
- `minimaxai/minimax-m2.1`
- `mistralai/devstral-2-123b-instruct-2512`
Browse all models at [build.nvidia.com](https://build.nvidia.com/explore/discover)
Browse at [build.nvidia.com](https://build.nvidia.com/explore/discover).
### Updating the Model List
### Updating the NIM Model List
To update `nvidia_nim_models.json` with the latest models from NVIDIA NIM, run the following command:
To update `nvidia_nim_models.json` with the latest models from NVIDIA NIM:
```bash
curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json
```
**OpenRouter** (`PROVIDER_TYPE=open_router`): Hundreds of models from stepfun, OpenAI, Anthropic, Google, etc. Examples:
- `stepfun/step-3.5-flash:free`
- `openai/gpt-4o-mini`
- `anthropic/claude-3.5-sonnet`
Browse at [openrouter.ai/models](https://openrouter.ai/models).
## Configuration
| Variable | Description | Default |
| --------------------------------- | ------------------------------- | ----------------------------- |
| `NVIDIA_NIM_API_KEY` | Your NVIDIA API key | required |
| `PROVIDER_TYPE` | Provider: `nvidia_nim` or `open_router` | `nvidia_nim` |
| `NVIDIA_NIM_API_KEY` | Your NVIDIA API key (NIM provider) | required |
| `OPENROUTER_API_KEY` | Your OpenRouter API key (OpenRouter provider) | required |
| `MODEL` | Model to use for all requests | `stepfun-ai/step-3.5-flash` |
| `CLAUDE_WORKSPACE` | Directory for agent workspace | `./agent_workspace` |
| `ALLOWED_DIR` | Allowed directories for agent | `""` |
@ -177,10 +209,13 @@ curl "https://integrate.api.nvidia.com/v1/models" > nvidia_nim_models.json
| `ALLOWED_TELEGRAM_USER_ID` | Allowed Telegram User ID | `""` |
| `MESSAGING_RATE_LIMIT` | Telegram messages per window | `1` |
| `MESSAGING_RATE_WINDOW` | Messaging window (seconds) | `1` |
| `NVIDIA_NIM_RATE_LIMIT` | API requests per window | `40` |
| `NVIDIA_NIM_RATE_WINDOW` | Rate limit window (seconds) | `60` |
| `NVIDIA_NIM_RATE_LIMIT` | NIM API requests per window | `40` |
| `NVIDIA_NIM_RATE_WINDOW` | NIM rate limit window (seconds) | `60` |
| `OPENROUTER_RATE_LIMIT` | OpenRouter requests per window | `40` |
| `OPENROUTER_RATE_WINDOW` | OpenRouter rate limit window | `60` |
The NVIDIA NIM base URL is fixed to `https://integrate.api.nvidia.com/v1`.
- **NVIDIA NIM** base URL: `https://integrate.api.nvidia.com/v1`
- **OpenRouter** base URL: `https://openrouter.ai/api/v1`
**NIM Settings (prefix `NVIDIA_NIM_`)**


@ -35,14 +35,26 @@ def get_provider() -> BaseProvider:
)
_provider = NvidiaNimProvider(config)
logger.info("Provider initialized: %s", settings.provider_type)
elif settings.provider_type == "open_router":
from providers.open_router import OpenRouterProvider
config = ProviderConfig(
api_key=settings.open_router_api_key,
base_url="https://openrouter.ai/api/v1",
rate_limit=settings.open_router_rate_limit,
rate_window=settings.open_router_rate_window,
nim_settings=settings.nim,
)
_provider = OpenRouterProvider(config)
logger.info("Provider initialized: %s", settings.provider_type)
else:
logger.error(
"Unknown provider_type: '%s'. Supported: 'nvidia_nim'",
"Unknown provider_type: '%s'. Supported: 'nvidia_nim', 'open_router'",
settings.provider_type,
)
raise ValueError(
f"Unknown provider_type: '{settings.provider_type}'. "
f"Supported: 'nvidia_nim'"
f"Supported: 'nvidia_nim', 'open_router'"
)
return _provider
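
For illustration, a minimal sketch of exercising this factory with the new provider (it mirrors `test_get_provider_open_router` further down; the key is a placeholder, and the env vars must be set before settings are first loaded):

```python
# Illustration only: select OpenRouter via the environment, then ask the
# factory for a provider instance.
import os

os.environ["PROVIDER_TYPE"] = "open_router"
os.environ["OPENROUTER_API_KEY"] = "sk-or-demo"  # placeholder key

from api.dependencies import get_provider  # noqa: E402 (env must be set first)

provider = get_provider()
print(type(provider).__name__)  # OpenRouterProvider
print(provider._base_url)       # https://openrouter.ai/api/v1
```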


@ -19,8 +19,18 @@ class Settings(BaseSettings):
"""Application settings loaded from environment variables."""
# ==================== Provider Selection ====================
# Valid: "nvidia_nim" | "open_router"
provider_type: str = "nvidia_nim"
# ==================== OpenRouter Config ====================
open_router_api_key: str = Field(default="", validation_alias="OPENROUTER_API_KEY")
open_router_rate_limit: int = Field(
default=40, validation_alias="OPENROUTER_RATE_LIMIT"
)
open_router_rate_window: int = Field(
default=60, validation_alias="OPENROUTER_RATE_WINDOW"
)
# ==================== Messaging Platform Selection ====================
messaging_platform: str = "telegram"
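
A standalone sketch of how these aliases resolve, assuming pydantic-settings v2 (`DemoSettings` is illustrative, not the project's class):

```python
# validation_alias maps the OPENROUTER_* env vars onto snake_case fields;
# provider_type matches the PROVIDER_TYPE env var case-insensitively.
import os

from pydantic import Field
from pydantic_settings import BaseSettings


class DemoSettings(BaseSettings):
    provider_type: str = "nvidia_nim"
    open_router_api_key: str = Field(default="", validation_alias="OPENROUTER_API_KEY")
    open_router_rate_limit: int = Field(default=40, validation_alias="OPENROUTER_RATE_LIMIT")


os.environ["PROVIDER_TYPE"] = "open_router"
os.environ["OPENROUTER_API_KEY"] = "sk-or-demo"

s = DemoSettings()
print(s.provider_type, s.open_router_api_key, s.open_router_rate_limit)
# open_router sk-or-demo 40
```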


@ -2,6 +2,7 @@
from .base import BaseProvider, ProviderConfig
from .nvidia_nim import NvidiaNimProvider
from .open_router import OpenRouterProvider
from .exceptions import (
ProviderError,
AuthenticationError,
@ -15,6 +16,7 @@ __all__ = [
"BaseProvider",
"ProviderConfig",
"NvidiaNimProvider",
"OpenRouterProvider",
"ProviderError",
"AuthenticationError",
"InvalidRequestError",


@ -22,8 +22,17 @@ class AnthropicToOpenAIConverter:
"""Converts Anthropic message format to OpenAI format."""
@staticmethod
def convert_messages(messages: List[Any]) -> List[Dict[str, Any]]:
"""Convert a list of Anthropic messages to OpenAI format."""
def convert_messages(
messages: List[Any],
*,
include_reasoning_for_openrouter: bool = False,
) -> List[Dict[str, Any]]:
"""Convert a list of Anthropic messages to OpenAI format.
When include_reasoning_for_openrouter is True, assistant messages with
thinking blocks get reasoning_content added for OpenRouter multi-turn
reasoning continuation.
"""
result = []
for msg in messages:
@ -35,7 +44,10 @@ class AnthropicToOpenAIConverter:
elif isinstance(content, list):
if role == "assistant":
result.extend(
AnthropicToOpenAIConverter._convert_assistant_message(content)
AnthropicToOpenAIConverter._convert_assistant_message(
content,
include_reasoning_for_openrouter=include_reasoning_for_openrouter,
)
)
elif role == "user":
result.extend(
@ -47,9 +59,14 @@ class AnthropicToOpenAIConverter:
return result
@staticmethod
def _convert_assistant_message(content: List[Any]) -> List[Dict[str, Any]]:
def _convert_assistant_message(
content: List[Any],
*,
include_reasoning_for_openrouter: bool = False,
) -> List[Dict[str, Any]]:
"""Convert assistant message blocks, preserving interleaved thinking+text order."""
content_parts: List[str] = []
thinking_parts: List[str] = []
tool_calls: List[Dict[str, Any]] = []
for block in content:
@ -60,6 +77,8 @@ class AnthropicToOpenAIConverter:
elif block_type == "thinking":
thinking = get_block_attr(block, "thinking", "")
content_parts.append(f"<think>\n{thinking}\n</think>")
if include_reasoning_for_openrouter:
thinking_parts.append(thinking)
elif block_type == "tool_use":
tool_input = get_block_attr(block, "input", {})
tool_calls.append(
@ -88,6 +107,8 @@ class AnthropicToOpenAIConverter:
}
if tool_calls:
msg["tool_calls"] = tool_calls
if include_reasoning_for_openrouter and thinking_parts:
msg["reasoning_content"] = "\n".join(thinking_parts)
return [msg]
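
To make the flag's effect concrete, a small sketch (`SimpleNamespace` stands in for real message/block objects, which the converter reads attribute-style via `get_block_attr`):

```python
from types import SimpleNamespace

from providers.nvidia_nim.utils.message_converter import AnthropicToOpenAIConverter

msg = SimpleNamespace(
    role="assistant",
    content=[
        SimpleNamespace(type="thinking", thinking="Add 2 and 2."),
        SimpleNamespace(type="text", text="The answer is 4."),
    ],
)

out = AnthropicToOpenAIConverter.convert_messages(
    [msg], include_reasoning_for_openrouter=True
)
print(out[0]["reasoning_content"])  # Add 2 and 2.
print(out[0]["content"])            # <think>-wrapped thinking, then the text
```

With the flag off (the default), only the inline `<think>` wrapper is kept and no `reasoning_content` key is set, as the existing tests assert.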


@ -0,0 +1,5 @@
"""OpenRouter provider - OpenAI-compatible API for hundreds of models."""
from .client import OpenRouterProvider
__all__ = ["OpenRouterProvider"]


@ -0,0 +1,370 @@
"""OpenRouter provider implementation."""
import json
import logging
import uuid
from typing import Any, AsyncIterator
from loguru import logger as loguru_logger
from openai import AsyncOpenAI
from providers.base import BaseProvider, ProviderConfig
from providers.rate_limit import GlobalRateLimiter
from providers.nvidia_nim.errors import map_error
from providers.nvidia_nim.utils import (
SSEBuilder,
map_stop_reason,
ThinkTagParser,
HeuristicToolParser,
ContentType,
)
from .request import build_request_body
logger = logging.getLogger(__name__)
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
class OpenRouterProvider(BaseProvider):
"""OpenRouter provider using OpenAI-compatible API."""
def __init__(self, config: ProviderConfig):
super().__init__(config)
self._api_key = config.api_key
self._base_url = (config.base_url or OPENROUTER_BASE_URL).rstrip("/")
self._global_rate_limiter = GlobalRateLimiter.get_instance(
rate_limit=config.rate_limit,
rate_window=config.rate_window,
)
self._client = AsyncOpenAI(
api_key=self._api_key,
base_url=self._base_url,
max_retries=0,
timeout=300.0,
)
def _build_request_body(self, request: Any) -> dict:
"""Internal helper for tests and shared building."""
return build_request_body(request)
async def stream_response(
self,
request: Any,
input_tokens: int = 0,
*,
request_id: str | None = None,
) -> AsyncIterator[str]:
"""Stream response in Anthropic SSE format."""
with loguru_logger.contextualize(request_id=request_id):
async for event in self._stream_response_impl(
request, input_tokens, request_id
):
yield event
async def _stream_response_impl(
self,
request: Any,
input_tokens: int,
request_id: str | None,
) -> AsyncIterator[str]:
"""Internal streaming implementation with context bound."""
message_id = f"msg_{uuid.uuid4()}"
sse = SSEBuilder(message_id, request.model, input_tokens)
body = self._build_request_body(request)
req_tag = f" request_id={request_id}" if request_id else ""
logger.info(
"OPENROUTER_STREAM:%s model=%s msgs=%d tools=%d",
req_tag,
body.get("model"),
len(body.get("messages", [])),
len(body.get("tools", [])),
)
yield sse.message_start()
think_parser = ThinkTagParser()
heuristic_parser = HeuristicToolParser()
finish_reason = None
usage_info = None
error_occurred = False
error_message = ""
try:
stream = await self._global_rate_limiter.execute_with_retry(
self._client.chat.completions.create, **body, stream=True
)
async for chunk in stream:
if getattr(chunk, "usage", None):
usage_info = chunk.usage
if not chunk.choices:
continue
choice = chunk.choices[0]
delta = choice.delta
if delta is None:
continue
if choice.finish_reason:
finish_reason = choice.finish_reason
logger.debug("OPENROUTER finish_reason: %s", finish_reason)
# Handle reasoning_content (OpenRouter/OpenAI extended format)
reasoning = getattr(delta, "reasoning_content", None)
if reasoning:
for event in sse.ensure_thinking_block():
yield event
yield sse.emit_thinking_delta(reasoning)
# Handle reasoning_details (e.g. stepfun models)
reasoning_details = getattr(delta, "reasoning_details", None)
if reasoning_details and isinstance(reasoning_details, list):
for item in reasoning_details:
text = item.get("text", "") if isinstance(item, dict) else ""
if text:
for event in sse.ensure_thinking_block():
yield event
yield sse.emit_thinking_delta(text)
# Handle text content
if delta.content:
for part in think_parser.feed(delta.content):
if part.type == ContentType.THINKING:
for event in sse.ensure_thinking_block():
yield event
yield sse.emit_thinking_delta(part.content)
else:
filtered_text, detected_tools = heuristic_parser.feed(
part.content
)
if filtered_text:
for event in sse.ensure_text_block():
yield event
yield sse.emit_text_delta(filtered_text)
for tool_use in detected_tools:
for event in sse.close_content_blocks():
yield event
block_idx = sse.blocks.allocate_index()
if tool_use.get("name") == "Task" and isinstance(
tool_use.get("input"), dict
):
tool_use["input"]["run_in_background"] = False
yield sse.content_block_start(
block_idx,
"tool_use",
id=tool_use["id"],
name=tool_use["name"],
)
yield sse.content_block_delta(
block_idx,
"input_json_delta",
json.dumps(tool_use["input"]),
)
yield sse.content_block_stop(block_idx)
# Handle native tool calls
if delta.tool_calls:
for event in sse.close_content_blocks():
yield event
for tc in delta.tool_calls:
tc_info = {
"index": tc.index,
"id": tc.id,
"function": {
"name": tc.function.name,
"arguments": tc.function.arguments,
},
}
for event in self._process_tool_call(tc_info, sse):
yield event
except Exception as e:
req_tag = f" request_id={request_id}" if request_id else ""
logger.error("OPENROUTER_ERROR:%s %s: %s", req_tag, type(e).__name__, e)
mapped_e = map_error(e)
error_occurred = True
error_message = str(mapped_e)
logger.info(
"OPENROUTER_STREAM: Emitting SSE error event for %s%s",
type(e).__name__,
req_tag,
)
for event in sse.close_content_blocks():
yield event
for event in sse.emit_error(error_message):
yield event
# Flush remaining content
remaining = think_parser.flush()
if remaining:
if remaining.type == ContentType.THINKING:
for event in sse.ensure_thinking_block():
yield event
yield sse.emit_thinking_delta(remaining.content)
else:
for event in sse.ensure_text_block():
yield event
yield sse.emit_text_delta(remaining.content)
for tool_use in heuristic_parser.flush():
for event in sse.close_content_blocks():
yield event
block_idx = sse.blocks.allocate_index()
yield sse.content_block_start(
block_idx,
"tool_use",
id=tool_use["id"],
name=tool_use["name"],
)
if tool_use.get("name") == "Task" and isinstance(
tool_use.get("input"), dict
):
tool_use["input"]["run_in_background"] = False
yield sse.content_block_delta(
block_idx,
"input_json_delta",
json.dumps(tool_use["input"]),
)
yield sse.content_block_stop(block_idx)
if (
not error_occurred
and sse.blocks.text_index == -1
and not sse.blocks.tool_indices
):
for event in sse.ensure_text_block():
yield event
yield sse.emit_text_delta(" ")
for event in self._flush_task_arg_buffers(sse):
yield event
for event in sse.close_all_blocks():
yield event
output_tokens = (
usage_info.completion_tokens
if usage_info and hasattr(usage_info, "completion_tokens")
else sse.estimate_output_tokens()
)
if usage_info and hasattr(usage_info, "prompt_tokens"):
provider_input = usage_info.prompt_tokens
if isinstance(provider_input, int):
diff = provider_input - input_tokens
logger.debug(
"TOKEN_ESTIMATE: our=%d provider=%d diff=%+d",
input_tokens,
provider_input,
diff,
)
yield sse.message_delta(map_stop_reason(finish_reason), output_tokens)
yield sse.message_stop()
yield sse.done()
def _process_tool_call(self, tc: dict, sse: Any):
"""Process a single tool call delta and yield SSE events."""
tc_index = tc.get("index", 0)
if tc_index < 0:
tc_index = len(sse.blocks.tool_indices)
fn_delta = tc.get("function", {})
incoming_name = fn_delta.get("name")
if incoming_name is not None:
prev = sse.blocks.tool_names.get(tc_index, "")
if not prev:
sse.blocks.tool_names[tc_index] = incoming_name
elif prev == incoming_name:
pass
elif isinstance(prev, str) and isinstance(incoming_name, str):
if incoming_name.startswith(prev):
sse.blocks.tool_names[tc_index] = incoming_name
elif prev.startswith(incoming_name):
pass
else:
sse.blocks.tool_names[tc_index] = prev + incoming_name
else:
sse.blocks.tool_names[tc_index] = str(prev) + str(incoming_name)
if tc_index not in sse.blocks.tool_indices:
name = sse.blocks.tool_names.get(tc_index, "")
if name or tc.get("id"):
tool_id = tc.get("id") or f"tool_{uuid.uuid4()}"
yield sse.start_tool_block(tc_index, tool_id, name)
sse.blocks.tool_started[tc_index] = True
elif not sse.blocks.tool_started.get(tc_index) and sse.blocks.tool_names.get(
tc_index
):
tool_id = tc.get("id") or f"tool_{uuid.uuid4()}"
name = sse.blocks.tool_names[tc_index]
yield sse.start_tool_block(tc_index, tool_id, name)
sse.blocks.tool_started[tc_index] = True
args = fn_delta.get("arguments", "")
if args:
if not sse.blocks.tool_started.get(tc_index):
tool_id = tc.get("id") or f"tool_{uuid.uuid4()}"
name = sse.blocks.tool_names.get(tc_index, "tool_call") or "tool_call"
yield sse.start_tool_block(tc_index, tool_id, name)
sse.blocks.tool_started[tc_index] = True
current_name = sse.blocks.tool_names.get(tc_index, "")
if current_name == "Task":
if not sse.blocks.task_args_emitted.get(tc_index, False):
buf = sse.blocks.task_arg_buffer.get(tc_index, "") + args
sse.blocks.task_arg_buffer[tc_index] = buf
try:
args_json = json.loads(buf)
except Exception:
return
if args_json.get("run_in_background") is not False:
logger.info(
"OPENROUTER_INTERCEPT: Forcing run_in_background=False for Task %s",
tc.get("id")
or sse.blocks.tool_ids.get(tc_index, "unknown"),
)
args_json["run_in_background"] = False
sse.blocks.task_args_emitted[tc_index] = True
sse.blocks.task_arg_buffer.pop(tc_index, None)
yield sse.emit_tool_delta(tc_index, json.dumps(args_json))
return
yield sse.emit_tool_delta(tc_index, args)
def _flush_task_arg_buffers(self, sse: Any):
"""Emit buffered Task args as a single JSON delta (best-effort)."""
for tool_index, buf in list(sse.blocks.task_arg_buffer.items()):
if sse.blocks.task_args_emitted.get(tool_index, False):
sse.blocks.task_arg_buffer.pop(tool_index, None)
continue
tool_id = sse.blocks.tool_ids.get(tool_index, "unknown")
out = "{}"
try:
args_json = json.loads(buf)
if args_json.get("run_in_background") is not False:
logger.info(
"OPENROUTER_INTERCEPT: Forcing run_in_background=False for Task %s",
tool_id,
)
args_json["run_in_background"] = False
out = json.dumps(args_json)
except Exception as e:
prefix = buf[:120]
logger.warning(
"OPENROUTER_INTERCEPT: Task args invalid JSON (id=%s len=%d prefix=%r): %s",
tool_id,
len(buf),
prefix,
e,
)
sse.blocks.task_args_emitted[tool_index] = True
sse.blocks.task_arg_buffer.pop(tool_index, None)
yield sse.emit_tool_delta(tool_index, out)
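
Distilled to its essentials, the Task-argument interception above does the following (standalone sketch):

```python
# Buffer streamed argument fragments until they parse as complete JSON,
# force run_in_background=False, then emit the args once as a single delta.
import json

buf = ""
for fragment in ['{"prompt": "run te', 'sts", "run_in_background": true}']:
    buf += fragment
    try:
        args = json.loads(buf)
    except json.JSONDecodeError:
        continue  # incomplete JSON: keep buffering
    if args.get("run_in_background") is not False:
        args["run_in_background"] = False
    print(json.dumps(args))  # one input_json_delta, not many partial ones
    break
# {"prompt": "run tests", "run_in_background": false}
```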


@ -0,0 +1,80 @@
"""Request builder for OpenRouter provider."""
import logging
from typing import Any, Dict
from providers.nvidia_nim.utils.message_converter import AnthropicToOpenAIConverter
logger = logging.getLogger(__name__)
OPENROUTER_DEFAULT_MAX_TOKENS = 8192
def _set_if_not_none(body: Dict[str, Any], key: str, value: Any) -> None:
if value is not None:
body[key] = value
def build_request_body(request_data: Any) -> dict:
"""Build OpenAI-format request body from Anthropic request for OpenRouter."""
logger.debug(
"OPENROUTER_REQUEST: conversion start model=%s msgs=%d",
getattr(request_data, "model", "?"),
len(getattr(request_data, "messages", [])),
)
messages = AnthropicToOpenAIConverter.convert_messages(
request_data.messages, include_reasoning_for_openrouter=True
)
# Add system prompt
system = getattr(request_data, "system", None)
if system:
system_msg = AnthropicToOpenAIConverter.convert_system_prompt(system)
if system_msg:
messages.insert(0, system_msg)
body: Dict[str, Any] = {
"model": request_data.model,
"messages": messages,
}
max_tokens = getattr(request_data, "max_tokens", None)
_set_if_not_none(body, "max_tokens", max_tokens or OPENROUTER_DEFAULT_MAX_TOKENS)
_set_if_not_none(body, "temperature", getattr(request_data, "temperature", None))
_set_if_not_none(body, "top_p", getattr(request_data, "top_p", None))
stop_sequences = getattr(request_data, "stop_sequences", None)
if stop_sequences:
body["stop"] = stop_sequences
tools = getattr(request_data, "tools", None)
if tools:
body["tools"] = AnthropicToOpenAIConverter.convert_tools(tools)
tool_choice = getattr(request_data, "tool_choice", None)
if tool_choice:
body["tool_choice"] = tool_choice
# OpenRouter reasoning: extra_body={"reasoning": {"enabled": True}}
extra_body: Dict[str, Any] = {}
request_extra = getattr(request_data, "extra_body", None)
if request_extra:
extra_body.update(request_extra)
thinking = getattr(request_data, "thinking", None)
thinking_enabled = (
thinking.enabled if thinking and hasattr(thinking, "enabled") else True
)
if thinking_enabled:
extra_body.setdefault("reasoning", {"enabled": True})
if extra_body:
body["extra_body"] = extra_body
logger.debug(
"OPENROUTER_REQUEST: conversion done model=%s msgs=%d tools=%d",
body.get("model"),
len(body.get("messages", [])),
len(body.get("tools", [])),
)
return body
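
For reference, a sketch of the body this produces for a bare request (`SimpleNamespace` stands in for the real request model):

```python
from types import SimpleNamespace

from providers.open_router.request import build_request_body

req = SimpleNamespace(
    model="stepfun/step-3.5-flash:free",
    messages=[SimpleNamespace(role="user", content="Hello")],
    max_tokens=None,   # falls back to OPENROUTER_DEFAULT_MAX_TOKENS
    temperature=None,
    top_p=None,
    system=None,
    stop_sequences=None,
    tools=None,
    tool_choice=None,
    extra_body=None,
    thinking=None,     # absent thinking config defaults to enabled
)

body = build_request_body(req)
print(body["max_tokens"])               # 8192
print(body["extra_body"]["reasoning"])  # {'enabled': True}
```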


@ -37,6 +37,13 @@ def nim_provider(provider_config):
return NvidiaNimProvider(provider_config)
@pytest.fixture
def open_router_provider(provider_config):
from providers.open_router import OpenRouterProvider
return OpenRouterProvider(provider_config)
@pytest.fixture
def mock_cli_session():
session = MagicMock(spec=CLISession)


@ -178,6 +178,23 @@ def test_convert_assistant_message_thinking():
"<think>\nI need to calculate this.\n</think>\n\nThe answer is 4."
)
assert result[0]["content"] == expected_content
assert "reasoning_content" not in result[0]
def test_convert_assistant_message_thinking_include_reasoning_for_openrouter():
"""When include_reasoning_for_openrouter=True, reasoning_content is added."""
content = [
MockBlock(type="thinking", thinking="I need to calculate this."),
MockBlock(type="text", text="The answer is 4."),
]
messages = [MockMessage("assistant", content)]
result = AnthropicToOpenAIConverter.convert_messages(
messages, include_reasoning_for_openrouter=True
)
assert len(result) == 1
assert result[0]["reasoning_content"] == "I need to calculate this."
assert "<think>" in result[0]["content"]
def test_convert_assistant_message_tool_use():


@ -2,6 +2,7 @@ import pytest
from unittest.mock import AsyncMock, MagicMock, patch
from api.dependencies import get_provider, get_settings, cleanup_provider
from providers.nvidia_nim import NvidiaNimProvider
from providers.open_router import OpenRouterProvider
from config.nim import NimSettings
@ -12,6 +13,9 @@ def _make_mock_settings(**overrides):
mock.nvidia_nim_api_key = "test_key"
mock.nvidia_nim_rate_limit = 40
mock.nvidia_nim_rate_window = 60
mock.open_router_api_key = "test_openrouter_key"
mock.open_router_rate_limit = 40
mock.open_router_rate_window = 60
mock.nim = NimSettings()
for key, value in overrides.items():
setattr(mock, key, value)
@ -74,6 +78,19 @@ async def test_cleanup_provider_no_client():
# Should not raise
@pytest.mark.asyncio
async def test_get_provider_open_router():
"""Test that provider_type=open_router returns OpenRouterProvider."""
with patch("api.dependencies.get_settings") as mock_settings:
mock_settings.return_value = _make_mock_settings(provider_type="open_router")
provider = get_provider()
assert isinstance(provider, OpenRouterProvider)
assert provider._base_url == "https://openrouter.ai/api/v1"
assert provider._api_key == "test_openrouter_key"
@pytest.mark.asyncio
async def test_get_provider_unknown_type():
"""Test that unknown provider_type raises ValueError."""

tests/test_open_router.py (new file, 179 lines)

@ -0,0 +1,179 @@
"""Tests for OpenRouter provider."""
import pytest
import json
from unittest.mock import MagicMock, AsyncMock, patch
from providers.open_router import OpenRouterProvider
from providers.base import ProviderConfig
from config.nim import NimSettings
class MockMessage:
def __init__(self, role, content):
self.role = role
self.content = content
class MockRequest:
def __init__(self, **kwargs):
self.model = "stepfun/step-3.5-flash:free"
self.messages = [MockMessage("user", "Hello")]
self.max_tokens = 100
self.temperature = 0.5
self.top_p = 0.9
self.system = "System prompt"
self.stop_sequences = None
self.tools = []
self.extra_body = {}
self.thinking = MagicMock()
self.thinking.enabled = True
for k, v in kwargs.items():
setattr(self, k, v)
@pytest.fixture
def open_router_config():
return ProviderConfig(
api_key="test_openrouter_key",
base_url="https://openrouter.ai/api/v1",
rate_limit=10,
rate_window=60,
nim_settings=NimSettings(),
)
@pytest.fixture(autouse=True)
def mock_rate_limiter():
"""Mock the global rate limiter to prevent waiting."""
with patch("providers.open_router.client.GlobalRateLimiter") as mock:
instance = mock.get_instance.return_value
instance.wait_if_blocked = AsyncMock(return_value=False)
async def _passthrough(fn, *args, **kwargs):
return await fn(*args, **kwargs)
instance.execute_with_retry = AsyncMock(side_effect=_passthrough)
yield instance
@pytest.fixture
def open_router_provider(open_router_config):
return OpenRouterProvider(open_router_config)
def test_init(open_router_config):
"""Test provider initialization."""
with patch("providers.open_router.client.AsyncOpenAI") as mock_openai:
provider = OpenRouterProvider(open_router_config)
assert provider._api_key == "test_openrouter_key"
assert provider._base_url == "https://openrouter.ai/api/v1"
mock_openai.assert_called_once()
def test_build_request_body_has_reasoning_extra(open_router_provider):
"""Request body has extra_body.reasoning.enabled for thinking models."""
req = MockRequest()
body = open_router_provider._build_request_body(req)
assert body["model"] == "stepfun/step-3.5-flash:free"
assert body["temperature"] == 0.5
assert len(body["messages"]) == 2 # System + User
assert body["messages"][0]["role"] == "system"
assert body["messages"][0]["content"] == "System prompt"
assert "extra_body" in body
assert "reasoning" in body["extra_body"]
assert body["extra_body"]["reasoning"]["enabled"] is True
def test_build_request_body_base_url_and_model(open_router_provider):
"""Base URL and model are correct in provider config."""
assert open_router_provider._base_url == "https://openrouter.ai/api/v1"
req = MockRequest(model="stepfun/step-3.5-flash:free")
body = open_router_provider._build_request_body(req)
assert body["model"] == "stepfun/step-3.5-flash:free"
@pytest.mark.asyncio
async def test_stream_response_text(open_router_provider):
"""Test streaming text response."""
req = MockRequest()
mock_chunk1 = MagicMock()
mock_chunk1.choices = [
MagicMock(
delta=MagicMock(content="Hello", reasoning_content=None),
finish_reason=None,
)
]
mock_chunk1.usage = None
mock_chunk2 = MagicMock()
mock_chunk2.choices = [
MagicMock(
delta=MagicMock(content=" World", reasoning_content=None),
finish_reason="stop",
)
]
mock_chunk2.usage = MagicMock(completion_tokens=10)
async def mock_stream():
yield mock_chunk1
yield mock_chunk2
with patch.object(
open_router_provider._client.chat.completions, "create", new_callable=AsyncMock
) as mock_create:
mock_create.return_value = mock_stream()
events = []
async for event in open_router_provider.stream_response(req):
events.append(event)
assert len(events) > 0
assert "event: message_start" in events[0]
text_content = ""
for e in events:
if "event: content_block_delta" in e and '"text_delta"' in e:
for line in e.splitlines():
if line.startswith("data: "):
data = json.loads(line[6:])
if "delta" in data and "text" in data["delta"]:
text_content += data["delta"]["text"]
assert "Hello World" in text_content
@pytest.mark.asyncio
async def test_stream_response_reasoning_content(open_router_provider):
"""Test streaming with reasoning_content delta."""
req = MockRequest()
mock_chunk = MagicMock()
mock_chunk.choices = [
MagicMock(
delta=MagicMock(content=None, reasoning_content="Thinking..."),
finish_reason=None,
)
]
mock_chunk.usage = None
async def mock_stream():
yield mock_chunk
with patch.object(
open_router_provider._client.chat.completions, "create", new_callable=AsyncMock
) as mock_create:
mock_create.return_value = mock_stream()
events = []
async for event in open_router_provider.stream_response(req):
events.append(event)
found_thinking = False
for e in events:
if "event: content_block_delta" in e and '"thinking_delta"' in e:
if "Thinking..." in e:
found_thinking = True
assert found_thinking