feat: deepseek api support (#118)

## Summary

* add native DeepSeek provider support via the shared OpenAI-compatible
provider base
* allow `deepseek/...` model prefixes in config validation
* add `DEEPSEEK_API_KEY` and `DEEPSEEK_BASE_URL` settings
* add DeepSeek entries to `.env.example` and `config/env.example`
* implement `DeepSeekProvider` and register it in provider dependencies
* add a DeepSeek request builder with DeepSeek-specific thinking payload
handling
* preserve Anthropic thinking blocks as `reasoning_content` for
DeepSeek-compatible continuation flows
* update `claude-pick` to discover DeepSeek models from the DeepSeek API
* document DeepSeek usage in `README.md`
* add tests for config validation, provider dependency wiring, request
building, and streaming behavior

## Motivation

DeepSeek exposes an OpenAI-compatible API and can be used directly
without routing through OpenRouter. This lets users spend their existing
DeepSeek balance through the proxy while keeping the same Claude Code
workflow and per-model provider mapping.

## Example

```dotenv
DEEPSEEK_API_KEY="sk-..."
DEEPSEEK_BASE_URL="https://api.deepseek.com"

MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat"

---------

Co-authored-by: Alishahryar1 <alishahryar2@gmail.com>


@@ -6,6 +6,10 @@ NVIDIA_NIM_API_KEY=""
OPENROUTER_API_KEY=""
# DeepSeek Config
DEEPSEEK_API_KEY=""
# LM Studio Config (local provider, no API key required)
LM_STUDIO_BASE_URL="http://localhost:1234/v1"
@@ -16,7 +20,7 @@ LLAMACPP_BASE_URL="http://localhost:8080/v1"
# All Claude model requests are mapped to these models, plain model is fallback
# Format: provider_type/model/name
# Valid providers: "nvidia_nim" | "open_router" | "lmstudio" | "llamacpp"
# Valid providers: "nvidia_nim" | "open_router" | "deepseek" | "lmstudio" | "llamacpp"
MODEL_OPUS="nvidia_nim/z-ai/glm4.7"
MODEL_SONNET="open_router/arcee-ai/trinity-large-preview:free"
MODEL_HAIKU="open_router/stepfun/step-3.5-flash:free"


@@ -12,7 +12,7 @@
[![Code style: Ruff](https://img.shields.io/badge/code%20formatting-ruff-f5a623.svg?style=for-the-badge)](https://github.com/astral-sh/ruff)
[![Logging: Loguru](https://img.shields.io/badge/logging-loguru-4ecdc4.svg?style=for-the-badge)](https://github.com/Delgan/loguru)
A lightweight proxy that routes Claude Code's Anthropic API calls to **NVIDIA NIM** (40 req/min free), **OpenRouter** (hundreds of models), **LM Studio** (fully local), or **llama.cpp** (local with Anthropic endpoints).
A lightweight proxy that routes Claude Code's Anthropic API calls to **NVIDIA NIM** (40 req/min free), **OpenRouter** (hundreds of models), **DeepSeek** (direct API), **LM Studio** (fully local), or **llama.cpp** (local with Anthropic endpoints).
[Quick Start](#quick-start) · [Providers](#providers) · [Discord Bot](#discord-bot) · [Configuration](#configuration) · [Development](#development) · [Contributing](#contributing)
@@ -31,7 +31,7 @@ A lightweight proxy that routes Claude Code's Anthropic API calls to **NVIDIA NI
| -------------------------- | ----------------------------------------------------------------------------------------------- |
| **Zero Cost** | 40 req/min free on NVIDIA NIM. Free models on OpenRouter. Fully local with LM Studio |
| **Drop-in Replacement** | Set 2 env vars. No modifications to Claude Code CLI or VSCode extension needed |
| **4 Providers** | NVIDIA NIM, OpenRouter (hundreds of models), LM Studio (local), llama.cpp (`llama-server`) |
| **5 Providers** | NVIDIA NIM, OpenRouter, DeepSeek, LM Studio (local), llama.cpp (`llama-server`) |
| **Per-Model Mapping** | Route Opus / Sonnet / Haiku to different models and providers. Mix providers freely |
| **Thinking Token Support** | Parses `<think>` tags and `reasoning_content` into native Claude thinking blocks |
| **Heuristic Tool Parser** | Models outputting tool calls as text are auto-parsed into structured tool use |
@@ -48,6 +48,7 @@ A lightweight proxy that routes Claude Code's Anthropic API calls to **NVIDIA NI
1. Get an API key (or use LM Studio / llama.cpp locally):
- **NVIDIA NIM**: [build.nvidia.com/settings/api-keys](https://build.nvidia.com/settings/api-keys)
- **OpenRouter**: [openrouter.ai/keys](https://openrouter.ai/keys)
- **DeepSeek**: [platform.deepseek.com/api_keys](https://platform.deepseek.com/api_keys)
- **LM Studio**: No API key needed. Run locally with [LM Studio](https://lmstudio.ai)
- **llama.cpp**: No API key needed. Run `llama-server` locally.
2. Install [Claude Code](https://github.com/anthropics/claude-code)
@@ -100,6 +101,20 @@ MODEL="open_router/stepfun/step-3.5-flash:free" # fallback
</details>
<details>
<summary><b>DeepSeek</b> (direct API)</summary>
```dotenv
DEEPSEEK_API_KEY="your-deepseek-key-here"
MODEL_OPUS="deepseek/deepseek-reasoner"
MODEL_SONNET="deepseek/deepseek-chat"
MODEL_HAIKU="deepseek/deepseek-chat"
MODEL="deepseek/deepseek-chat" # fallback
```
</details>
<details>
<summary><b>LM Studio</b> (fully local, no API key)</summary>
@@ -293,6 +308,7 @@ The proxy also exposes Claude-compatible probe routes: `GET /v1/models`, `POST /
| -------------- | ------------ | ---------- | ------------------------------------ |
| **NVIDIA NIM** | Free | 40 req/min | Daily driver, generous free tier |
| **OpenRouter** | Free / Paid | Varies | Model variety, fallback options |
| **DeepSeek** | Usage-based | Varies | Direct access to DeepSeek chat/reasoner |
| **LM Studio** | Free (local) | Unlimited | Privacy, offline use, no rate limits |
| **llama.cpp** | Free (local) | Unlimited | Lightweight local inference engine |
@@ -302,6 +318,7 @@ Models use a prefix format: `provider_prefix/model/name`. An invalid prefix caus
| ---------- | ----------------- | -------------------- | ----------------------------- |
| NVIDIA NIM | `nvidia_nim/...` | `NVIDIA_NIM_API_KEY` | `integrate.api.nvidia.com/v1` |
| OpenRouter | `open_router/...` | `OPENROUTER_API_KEY` | `openrouter.ai/api/v1` |
| DeepSeek | `deepseek/...` | `DEEPSEEK_API_KEY` | `api.deepseek.com` |
| LM Studio | `lmstudio/...` | (none) | `localhost:1234/v1` |
| llama.cpp | `llamacpp/...` | (none) | `localhost:8080/v1` |
@@ -334,6 +351,18 @@ Browse: [openrouter.ai/models](https://openrouter.ai/models) · [Free models](ht
</details>
<details>
<summary><b>DeepSeek models</b></summary>
The DeepSeek direct API currently exposes two models:
- `deepseek/deepseek-chat`
- `deepseek/deepseek-reasoner`
Browse: [api-docs.deepseek.com](https://api-docs.deepseek.com)
</details>
<details>
<summary><b>LM Studio models</b></summary>
@@ -455,6 +484,7 @@ Configure via `WHISPER_DEVICE` (`cpu` | `cuda` | `nvidia_nim`) and `WHISPER_MODE
| `NVIDIA_NIM_API_KEY` | NVIDIA API key | required for NIM |
| `ENABLE_THINKING` | Global switch for provider reasoning requests and Claude thinking blocks. Set `false` to hide thinking across all providers. | `true` |
| `OPENROUTER_API_KEY` | OpenRouter API key | required for OpenRouter |
| `DEEPSEEK_API_KEY` | DeepSeek API key | required for DeepSeek |
| `LM_STUDIO_BASE_URL` | LM Studio server URL | `http://localhost:1234/v1` |
| `LLAMACPP_BASE_URL` | llama.cpp server URL | `http://localhost:8080/v1` |
@@ -514,7 +544,7 @@ See [`.env.example`](.env.example) for all supported parameters.
free-claude-code/
├── server.py # Entry point
├── api/ # FastAPI routes, request detection, optimization handlers
├── providers/ # BaseProvider, OpenAICompatibleProvider, NIM, OpenRouter, LM Studio, llamacpp
├── providers/ # BaseProvider, OpenAICompatibleProvider, NIM, OpenRouter, DeepSeek, LM Studio, llamacpp
│ └── common/ # Shared utils (SSE builder, message converter, parsers, error mapping)
├── messaging/ # MessagingPlatform ABC + Discord/Telegram bots, session management
├── config/ # Settings, NIM config, logging


@@ -7,6 +7,7 @@ from config.settings import Settings
from config.settings import get_settings as _get_settings
from providers.base import BaseProvider, ProviderConfig
from providers.common import get_user_facing_error_message
from providers.deepseek import DEEPSEEK_BASE_URL, DeepSeekProvider
from providers.exceptions import AuthenticationError
from providers.llamacpp import LlamaCppProvider
from providers.lmstudio import LMStudioProvider
@@ -60,6 +61,24 @@ def _create_provider_for_type(provider_type: str, settings: Settings) -> BasePro
enable_thinking=settings.enable_thinking,
)
return OpenRouterProvider(config)
if provider_type == "deepseek":
if not settings.deepseek_api_key or not settings.deepseek_api_key.strip():
raise AuthenticationError(
"DEEPSEEK_API_KEY is not set. Add it to your .env file. "
"Get a key at https://platform.deepseek.com/api_keys"
)
config = ProviderConfig(
api_key=settings.deepseek_api_key,
base_url=DEEPSEEK_BASE_URL,
rate_limit=settings.provider_rate_limit,
rate_window=settings.provider_rate_window,
max_concurrency=settings.provider_max_concurrency,
http_read_timeout=settings.http_read_timeout,
http_write_timeout=settings.http_write_timeout,
http_connect_timeout=settings.http_connect_timeout,
enable_thinking=settings.enable_thinking,
)
return DeepSeekProvider(config)
if provider_type == "lmstudio":
config = ProviderConfig(
api_key="lm-studio",
@@ -87,12 +106,12 @@ def _create_provider_for_type(provider_type: str, settings: Settings) -> BasePro
)
return LlamaCppProvider(config)
logger.error(
"Unknown provider_type: '{}'. Supported: 'nvidia_nim', 'open_router', 'lmstudio', 'llamacpp'",
"Unknown provider_type: '{}'. Supported: 'nvidia_nim', 'open_router', 'deepseek', 'lmstudio', 'llamacpp'",
provider_type,
)
raise ValueError(
f"Unknown provider_type: '{provider_type}'. "
f"Supported: 'nvidia_nim', 'open_router', 'lmstudio', 'llamacpp'"
f"Supported: 'nvidia_nim', 'open_router', 'deepseek', 'lmstudio', 'llamacpp'"
)
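
For context, here is a self-contained sketch of the new branch in action; `SimpleNamespace` stands in for `Settings`, and the sketch assumes the listed attributes are the only ones this branch reads:

```python
# Sketch: exercising the new "deepseek" branch of the provider factory.
from types import SimpleNamespace

from api.dependencies import _create_provider_for_type
from providers.deepseek import DEEPSEEK_BASE_URL, DeepSeekProvider

settings = SimpleNamespace(
    deepseek_api_key="sk-...",   # empty or blank raises AuthenticationError
    provider_rate_limit=10,
    provider_rate_window=60,
    provider_max_concurrency=5,
    http_read_timeout=300.0,
    http_write_timeout=10.0,
    http_connect_timeout=2.0,
    enable_thinking=True,
)

provider = _create_provider_for_type("deepseek", settings)
assert isinstance(provider, DeepSeekProvider)
assert provider._base_url == DEEPSEEK_BASE_URL  # fixed base URL, no override
```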


@@ -6,13 +6,17 @@ NVIDIA_NIM_API_KEY=""
OPENROUTER_API_KEY=""
# DeepSeek Config
DEEPSEEK_API_KEY=""
# LM Studio Config (local provider, no API key required)
LM_STUDIO_BASE_URL="http://localhost:1234/v1"
# All Claude model requests are mapped to these models, plain model is fallback
# Format: provider_type/model/name
# Valid providers: "nvidia_nim" | "open_router" | "lmstudio"
# Valid providers: "nvidia_nim" | "open_router" | "deepseek" | "lmstudio" | "llamacpp"
MODEL_OPUS="nvidia_nim/z-ai/glm4.7"
MODEL_SONNET="open_router/arcee-ai/trinity-large-preview:free"
MODEL_HAIKU="open_router/stepfun/step-3.5-flash:free"
@@ -68,4 +72,4 @@ FAST_PREFIX_DETECTION=true
ENABLE_NETWORK_PROBE_MOCK=true
ENABLE_TITLE_GENERATION_SKIP=true
ENABLE_SUGGESTION_MODE_SKIP=true
ENABLE_FILEPATH_EXTRACTION_MOCK=true


@@ -81,6 +81,9 @@ class Settings(BaseSettings):
# ==================== OpenRouter Config ====================
open_router_api_key: str = Field(default="", validation_alias="OPENROUTER_API_KEY")
# ==================== DeepSeek Config ====================
deepseek_api_key: str = Field(default="", validation_alias="DEEPSEEK_API_KEY")
# ==================== Messaging Platform Selection ====================
# Valid: "telegram" | "discord"
messaging_platform: str = Field(
@@ -219,7 +222,13 @@ class Settings(BaseSettings):
def validate_model_format(cls, v: str | None) -> str | None:
if v is None:
return None
valid_providers = ("nvidia_nim", "open_router", "lmstudio", "llamacpp")
valid_providers = (
"nvidia_nim",
"open_router",
"deepseek",
"lmstudio",
"llamacpp",
)
if "/" not in v:
raise ValueError(
f"Model must be prefixed with provider type. "
@@ -230,7 +239,7 @@
if provider not in valid_providers:
raise ValueError(
f"Invalid provider: '{provider}'. "
f"Supported: 'nvidia_nim', 'open_router', 'lmstudio', 'llamacpp'"
f"Supported: 'nvidia_nim', 'open_router', 'deepseek', 'lmstudio', 'llamacpp'"
)
return v
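
To see the accept/reject behavior at a glance, here is a standalone mirror of the validator's logic (not the pydantic hook itself; `openai/gpt-4` is a made-up invalid value):

```python
# Standalone mirror of validate_model_format's provider check.
VALID_PROVIDERS = ("nvidia_nim", "open_router", "deepseek", "lmstudio", "llamacpp")


def check_model(v: str) -> str:
    if "/" not in v:
        raise ValueError("Model must be prefixed with provider type.")
    provider = v.split("/", 1)[0]
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"Invalid provider: '{provider}'.")
    return v


assert check_model("deepseek/deepseek-chat") == "deepseek/deepseek-chat"
try:
    check_model("openai/gpt-4")  # hypothetical unsupported prefix
except ValueError as exc:
    assert "Invalid provider: 'openai'" in str(exc)
```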


@@ -1,6 +1,7 @@
"""Providers package - implement your own provider by extending BaseProvider."""
from .base import BaseProvider, ProviderConfig
from .deepseek import DeepSeekProvider
from .exceptions import (
APIError,
AuthenticationError,
@@ -18,6 +19,7 @@ __all__ = [
"APIError",
"AuthenticationError",
"BaseProvider",
"DeepSeekProvider",
"InvalidRequestError",
"LMStudioProvider",
"LlamaCppProvider",


@@ -27,11 +27,12 @@ class AnthropicToOpenAIConverter:
*,
include_thinking: bool = True,
include_reasoning_for_openrouter: bool = False,
include_reasoning_content: bool = False,
) -> list[dict[str, Any]]:
"""Convert a list of Anthropic messages to OpenAI format.
When include_reasoning_for_openrouter is True, assistant messages with
thinking blocks get reasoning_content added for OpenRouter multi-turn
When reasoning_content preservation is enabled, assistant messages with
thinking blocks get reasoning_content added for provider multi-turn
reasoning continuation.
"""
result = []
@@ -49,6 +50,7 @@
content,
include_thinking=include_thinking,
include_reasoning_for_openrouter=include_reasoning_for_openrouter,
include_reasoning_content=include_reasoning_content,
)
)
elif role == "user":
@@ -66,11 +68,15 @@
*,
include_thinking: bool = True,
include_reasoning_for_openrouter: bool = False,
include_reasoning_content: bool = False,
) -> list[dict[str, Any]]:
"""Convert assistant message blocks, preserving interleaved thinking+text order."""
content_parts: list[str] = []
thinking_parts: list[str] = []
tool_calls: list[dict[str, Any]] = []
emit_reasoning_content = (
include_reasoning_for_openrouter or include_reasoning_content
)
for block in content:
block_type = get_block_type(block)
@@ -82,7 +88,7 @@
continue
thinking = get_block_attr(block, "thinking", "")
content_parts.append(f"<think>\n{thinking}\n</think>")
if include_reasoning_for_openrouter:
if emit_reasoning_content:
thinking_parts.append(thinking)
elif block_type == "tool_use":
tool_input = get_block_attr(block, "input", {})
@@ -112,7 +118,7 @@
}
if tool_calls:
msg["tool_calls"] = tool_calls
if include_reasoning_for_openrouter and thinking_parts:
if emit_reasoning_content and thinking_parts:
msg["reasoning_content"] = "\n".join(thinking_parts)
return [msg]
@@ -191,6 +197,7 @@ def build_base_request_body(
default_max_tokens: int | None = None,
include_thinking: bool = True,
include_reasoning_for_openrouter: bool = False,
include_reasoning_content: bool = False,
) -> dict[str, Any]:
"""Build the common parts of an OpenAI-format request body.
@@ -204,6 +211,7 @@
request_data.messages,
include_thinking=include_thinking,
include_reasoning_for_openrouter=include_reasoning_for_openrouter,
include_reasoning_content=include_reasoning_content,
)
system = getattr(request_data, "system", None)
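
To make the new flag concrete, here is the schematic effect on an assistant turn that carries a thinking block, mirroring `test_build_request_body_preserves_reasoning_content` below; the exact joining of content parts is handled by the converter and only sketched here:

```python
# Input: Anthropic-format assistant blocks.
anthropic_blocks = [
    {"type": "thinking", "thinking": "First think"},
    {"type": "text", "text": "Then answer"},
]

# Schematic output with reasoning_content preservation enabled: the thinking
# text appears both inside a <think> tag in content and as a separate
# reasoning_content field for multi-turn reasoning continuation.
openai_message = {
    "role": "assistant",
    "content": "<think>\nFirst think\n</think>\nThen answer",  # schematic join
    "reasoning_content": "First think",
}
```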


@@ -0,0 +1,5 @@
"""DeepSeek provider exports."""
from .client import DEEPSEEK_BASE_URL, DeepSeekProvider
__all__ = ["DEEPSEEK_BASE_URL", "DeepSeekProvider"]


@@ -0,0 +1,29 @@
"""DeepSeek provider implementation."""
from typing import Any
from providers.base import ProviderConfig
from providers.openai_compat import OpenAICompatibleProvider
from .request import build_request_body
DEEPSEEK_BASE_URL = "https://api.deepseek.com"
class DeepSeekProvider(OpenAICompatibleProvider):
"""DeepSeek provider using OpenAI-compatible chat completions."""
def __init__(self, config: ProviderConfig):
super().__init__(
config,
provider_name="DEEPSEEK",
base_url=config.base_url or DEEPSEEK_BASE_URL,
api_key=config.api_key,
)
def _build_request_body(self, request: Any) -> dict:
"""Internal helper for tests and shared building."""
return build_request_body(
request,
thinking_enabled=self._is_thinking_enabled(request),
)
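
Constructing the provider directly looks like this; a minimal sketch matching the `deepseek_config` fixture in the tests below, with all other `ProviderConfig` fields left at their defaults:

```python
from providers.base import ProviderConfig
from providers.deepseek import DEEPSEEK_BASE_URL, DeepSeekProvider

config = ProviderConfig(
    api_key="sk-...",            # the DEEPSEEK_API_KEY value
    base_url=DEEPSEEK_BASE_URL,  # "https://api.deepseek.com"
    rate_limit=10,
    rate_window=60,
    enable_thinking=True,
)
provider = DeepSeekProvider(config)
assert provider._base_url == DEEPSEEK_BASE_URL
```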


@@ -0,0 +1,39 @@
"""Request builder for DeepSeek provider."""
from typing import Any
from loguru import logger
from providers.common.message_converter import build_base_request_body
def build_request_body(request_data: Any, *, thinking_enabled: bool) -> dict:
"""Build OpenAI-format request body from Anthropic request for DeepSeek."""
logger.debug(
"DEEPSEEK_REQUEST: conversion start model={} msgs={}",
getattr(request_data, "model", "?"),
len(getattr(request_data, "messages", [])),
)
body = build_base_request_body(
request_data,
include_reasoning_content=True,
)
extra_body: dict[str, Any] = {}
request_extra = getattr(request_data, "extra_body", None)
if request_extra:
extra_body.update(request_extra)
if thinking_enabled and body.get("model") != "deepseek-reasoner":
extra_body.setdefault("thinking", {"type": "enabled"})
if extra_body:
body["extra_body"] = extra_body
logger.debug(
"DEEPSEEK_REQUEST: conversion done model={} msgs={} tools={}",
body.get("model"),
len(body.get("messages", [])),
len(body.get("tools", [])),
)
return body
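
For a thinking-enabled `deepseek-chat` request, the builder yields roughly this shape (schematic; message and tool conversion come from the shared base builder, and `deepseek-reasoner` skips the extra thinking key entirely):

```python
# Schematic result of build_request_body; compare the provider tests below.
body = {
    "model": "deepseek-chat",
    "messages": [
        {"role": "system", "content": "System prompt"},
        {"role": "user", "content": "Hello"},
    ],
    "extra_body": {"thinking": {"type": "enabled"}},
    # max_tokens, temperature, tools, etc. are filled in by the base builder
}
```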


@@ -10,6 +10,7 @@ from api.dependencies import (
get_settings,
)
from config.nim import NimSettings
from providers.deepseek import DeepSeekProvider
from providers.lmstudio import LMStudioProvider
from providers.nvidia_nim import NvidiaNimProvider
from providers.open_router import OpenRouterProvider
@@ -25,11 +26,13 @@ def _make_mock_settings(**overrides):
mock.provider_rate_window = 60
mock.provider_max_concurrency = 5
mock.open_router_api_key = "test_openrouter_key"
mock.deepseek_api_key = "test_deepseek_key"
mock.lm_studio_base_url = "http://localhost:1234/v1"
mock.nim = NimSettings()
mock.http_read_timeout = 300.0
mock.http_write_timeout = 10.0
mock.http_connect_timeout = 2.0
mock.enable_thinking = True
for key, value in overrides.items():
setattr(mock, key, value)
return mock
@@ -120,6 +123,49 @@ async def test_get_provider_lmstudio():
assert provider._base_url == "http://localhost:1234/v1"
@pytest.mark.asyncio
async def test_get_provider_deepseek():
"""Test that provider_type=deepseek returns DeepSeekProvider."""
with patch("api.dependencies.get_settings") as mock_settings:
mock_settings.return_value = _make_mock_settings(provider_type="deepseek")
provider = get_provider()
assert isinstance(provider, DeepSeekProvider)
assert provider._base_url == "https://api.deepseek.com"
assert provider._api_key == "test_deepseek_key"
assert provider._config.enable_thinking is True
@pytest.mark.asyncio
async def test_get_provider_deepseek_uses_fixed_base_url():
"""DeepSeek provider always uses the fixed provider base URL."""
with patch("api.dependencies.get_settings") as mock_settings:
mock_settings.return_value = _make_mock_settings(
provider_type="deepseek",
)
provider = get_provider()
assert isinstance(provider, DeepSeekProvider)
assert provider._base_url == "https://api.deepseek.com"
@pytest.mark.asyncio
async def test_get_provider_deepseek_passes_enable_thinking():
"""DeepSeek provider receives the global thinking toggle."""
with patch("api.dependencies.get_settings") as mock_settings:
mock_settings.return_value = _make_mock_settings(
provider_type="deepseek",
enable_thinking=False,
)
provider = get_provider()
assert isinstance(provider, DeepSeekProvider)
assert provider._config.enable_thinking is False
@pytest.mark.asyncio
async def test_get_provider_lmstudio_uses_lm_studio_base_url():
"""LM Studio provider uses lm_studio_base_url from settings."""
@@ -200,6 +246,23 @@ async def test_get_provider_open_router_missing_api_key():
assert "openrouter.ai" in exc_info.value.detail
@pytest.mark.asyncio
async def test_get_provider_deepseek_missing_api_key():
"""DeepSeek with empty API key raises HTTPException 503."""
with patch("api.dependencies.get_settings") as mock_settings:
mock_settings.return_value = _make_mock_settings(
provider_type="deepseek",
deepseek_api_key="",
)
with pytest.raises(HTTPException) as exc_info:
get_provider()
assert exc_info.value.status_code == 503
assert "DEEPSEEK_API_KEY" in exc_info.value.detail
assert "platform.deepseek.com" in exc_info.value.detail
@pytest.mark.asyncio
async def test_get_provider_unknown_type():
"""Test that unknown provider_type raises ValueError."""


@@ -359,6 +359,7 @@ class TestPerModelMapping:
"open_router/anthropic/claude-3-opus",
"open_router/anthropic/claude-3-haiku",
),
({"MODEL": "deepseek/deepseek-chat"}, "deepseek/deepseek-chat", None),
({"MODEL": "lmstudio/qwen2.5-7b"}, "lmstudio/qwen2.5-7b", None),
({"MODEL": "llamacpp/local-model"}, "llamacpp/local-model", None),
],
@@ -494,6 +495,7 @@ class TestPerModelMapping:
assert Settings.parse_provider_type("nvidia_nim/meta/llama") == "nvidia_nim"
assert Settings.parse_provider_type("open_router/deepseek/r1") == "open_router"
assert Settings.parse_provider_type("deepseek/deepseek-chat") == "deepseek"
assert Settings.parse_provider_type("lmstudio/qwen") == "lmstudio"
assert Settings.parse_provider_type("llamacpp/model") == "llamacpp"
@@ -502,5 +504,6 @@
from config.settings import Settings
assert Settings.parse_model_name("nvidia_nim/meta/llama") == "meta/llama"
assert Settings.parse_model_name("deepseek/deepseek-chat") == "deepseek-chat"
assert Settings.parse_model_name("lmstudio/qwen") == "qwen"
assert Settings.parse_model_name("llamacpp/model") == "model"


@@ -0,0 +1,181 @@
"""Tests for DeepSeek provider."""
from unittest.mock import AsyncMock, MagicMock, patch
import pytest
from providers.base import ProviderConfig
from providers.deepseek import DEEPSEEK_BASE_URL, DeepSeekProvider
class MockMessage:
def __init__(self, role, content):
self.role = role
self.content = content
class MockBlock:
def __init__(self, **kwargs):
for key, value in kwargs.items():
setattr(self, key, value)
class MockRequest:
def __init__(self, **kwargs):
self.model = "deepseek-chat"
self.messages = [MockMessage("user", "Hello")]
self.max_tokens = 100
self.temperature = 0.5
self.top_p = 0.9
self.system = "System prompt"
self.stop_sequences = None
self.tools = []
self.extra_body = {}
self.thinking = MagicMock()
self.thinking.enabled = True
for key, value in kwargs.items():
setattr(self, key, value)
@pytest.fixture
def deepseek_config():
return ProviderConfig(
api_key="test_deepseek_key",
base_url=DEEPSEEK_BASE_URL,
rate_limit=10,
rate_window=60,
enable_thinking=True,
)
@pytest.fixture(autouse=True)
def mock_rate_limiter():
"""Mock the global rate limiter to prevent waiting."""
with patch("providers.openai_compat.GlobalRateLimiter") as mock:
instance = mock.get_instance.return_value
instance.wait_if_blocked = AsyncMock(return_value=False)
async def _passthrough(fn, *args, **kwargs):
return await fn(*args, **kwargs)
instance.execute_with_retry = AsyncMock(side_effect=_passthrough)
yield instance
@pytest.fixture
def deepseek_provider(deepseek_config):
return DeepSeekProvider(deepseek_config)
def test_init(deepseek_config):
"""Test provider initialization."""
with patch("providers.openai_compat.AsyncOpenAI") as mock_openai:
provider = DeepSeekProvider(deepseek_config)
assert provider._api_key == "test_deepseek_key"
assert provider._base_url == DEEPSEEK_BASE_URL
mock_openai.assert_called_once()
def test_build_request_body_enables_thinking_for_chat_model(deepseek_provider):
"""Thinking-enabled requests add DeepSeek's thinking payload for chat model."""
req = MockRequest(model="deepseek-chat")
body = deepseek_provider._build_request_body(req)
assert body["model"] == "deepseek-chat"
assert body["extra_body"]["thinking"] == {"type": "enabled"}
assert body["messages"][0]["role"] == "system"
def test_build_request_body_global_disable_blocks_request_thinking():
"""Global disable suppresses provider-side thinking even if the request enables it."""
provider = DeepSeekProvider(
ProviderConfig(
api_key="test_deepseek_key",
base_url=DEEPSEEK_BASE_URL,
rate_limit=10,
rate_window=60,
enable_thinking=False,
)
)
req = MockRequest(model="deepseek-chat")
body = provider._build_request_body(req)
assert "extra_body" not in body or "thinking" not in body["extra_body"]
def test_build_request_body_request_disable_blocks_global_thinking(deepseek_provider):
"""Request-level disable suppresses provider-side thinking when global is enabled."""
req = MockRequest(model="deepseek-chat")
req.thinking.enabled = False
body = deepseek_provider._build_request_body(req)
assert "extra_body" not in body or "thinking" not in body["extra_body"]
def test_build_request_body_reasoner_skips_thinking_extra(deepseek_provider):
"""deepseek-reasoner does not need an extra thinking payload."""
req = MockRequest(model="deepseek-reasoner")
body = deepseek_provider._build_request_body(req)
assert body["model"] == "deepseek-reasoner"
assert "extra_body" not in body or "thinking" not in body["extra_body"]
def test_build_request_body_preserves_caller_thinking_override(deepseek_provider):
"""Caller-provided thinking payload should not be overwritten."""
req = MockRequest(
model="deepseek-chat",
extra_body={"thinking": {"type": "manual"}},
)
body = deepseek_provider._build_request_body(req)
assert body["extra_body"]["thinking"] == {"type": "manual"}
def test_build_request_body_preserves_reasoning_content(deepseek_provider):
"""Thinking blocks are mirrored into reasoning_content for continuation."""
req = MockRequest(
system=None,
messages=[
MockMessage(
"assistant",
[
MockBlock(type="thinking", thinking="First think"),
MockBlock(type="text", text="Then answer"),
],
)
],
)
body = deepseek_provider._build_request_body(req)
assert body["messages"][0]["reasoning_content"] == "First think"
@pytest.mark.asyncio
async def test_stream_response_reasoning_content(deepseek_provider):
"""reasoning_content deltas are emitted as thinking blocks."""
req = MockRequest()
mock_chunk = MagicMock()
mock_chunk.choices = [
MagicMock(
delta=MagicMock(content=None, reasoning_content="Thinking..."),
finish_reason="stop",
)
]
mock_chunk.usage = MagicMock(completion_tokens=2)
async def mock_stream():
yield mock_chunk
with patch.object(
deepseek_provider._client.chat.completions, "create", new_callable=AsyncMock
) as mock_create:
mock_create.return_value = mock_stream()
events = [event async for event in deepseek_provider.stream_response(req)]
assert any(
'"thinking_delta"' in event and "Thinking..." in event for event in events
)