Commit graph

103 commits

Author SHA1 Message Date
Bennet Bo Fenner
68340172a1
agent: Remove unused LanguageModelImage APIs (#57050)
Pulled out from #56866. Will help with MCP image support

Release Notes:

- N/A
2026-05-18 12:22:22 +00:00
marius851000
53c910982c
open_ai: Fix parsing response if token use info is unspecified (#55919)
I tried to use Google Cloud to test gemma4 and compare the results with
those of Ollama. It returned responses such as:

```json
{"choices":[{"delta":{"content":"Hello","reasoning_content":null,"role":null,"tool_calls":null},"finish_reason":null,"index":0,"logprobs":null,"matched_stop":null}],"created":1778081610,"id":"KV_7adz7Ov20xN8Py-angQ8","model":"google/gemma-4-26b-a4b-it-maas","object":"chat.completion.chunk","usage":{"extra_properties":{"google":{"traffic_type":"ON_DEMAND"}}}}
```

(Notice that, while "usage" is present, it does not contain any of the
usual values.)

Eventually, I ran into more issues when parsing the response (unrelated
to this one), so I decided to try the Google AI endpoint, which has its
own set of issues.

These simple changes should only loosen the accepted format, so no new
compatibility errors are expected (but I haven’t tried other
providers).

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [ ] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
(no change)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable


Release Notes:

- Improved open-ai compatibility when token usage info is absent
2026-05-17 19:50:14 +00:00
morgankrey
37f6d7a15c
Add ChatGPT subscription provider via OAuth 2.0 PKCE (#53166)
Adds a new language model provider that lets users authenticate with
their ChatGPT Plus/Pro subscription and use OpenAI models
(codex-mini-latest, o4-mini, o3) directly in the Zed agent — without
needing a separate API key.

## How it works

1. **OAuth 2.0 + PKCE sign-in**: Uses OpenAI's official Codex CLI client
ID to run an authorization code flow. A local HTTP server on
`127.0.0.1:1455` captures the callback, exchanges the code for tokens,
and stores them in the system keychain.

2. **Token refresh**: Access tokens are automatically refreshed when
they're within 5 minutes of expiry, using the stored refresh token.

3. **Responses API**: Requests go to
`https://chatgpt.com/backend-api/codex/responses` using the existing
`open_ai::responses` client (Responses API format, not Chat Completions
which was deprecated for this endpoint in Feb 2026).

4. **Required headers**: `originator: zed`, `OpenAI-Beta:
responses=experimental`, `ChatGPT-Account-Id` (extracted from JWT),
`store: false` in the body.
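
The refresh rule in step 2 could be sketched roughly as follows. This is only an illustration under assumptions: `needs_refresh`, `REFRESH_WINDOW`, and the `expires_at` parameter are hypothetical names, not the provider's actual code.

```rust
use std::time::{Duration, SystemTime};

/// Refresh window from the description above: five minutes before expiry.
const REFRESH_WINDOW: Duration = Duration::from_secs(5 * 60);

/// Returns true when the token expires within `REFRESH_WINDOW` of `now`
/// or has already expired. `expires_at` is a hypothetical stored value.
fn needs_refresh(now: SystemTime, expires_at: SystemTime) -> bool {
    match expires_at.duration_since(now) {
        // Token is still valid: refresh only if little time remains.
        Ok(remaining) => remaining <= REFRESH_WINDOW,
        // `expires_at` lies in the past, so the token has expired.
        Err(_) => true,
    }
}
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check easy to test.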

## Files changed

- `crates/open_ai/src/responses.rs`: Add `store: Option<bool>` field to
`Request`; add `extra_headers` param to `stream_response` for
per-provider header injection
- `crates/language_models/src/provider/openai_subscribed.rs`: New
provider (sign-in UI, OAuth flow, token storage/refresh, model list)
- `crates/language_models/src/provider/open_ai.rs`,
`open_ai_compatible.rs`, `opencode.rs`: Pass `vec![]` for new
`extra_headers` param
- `crates/language_models/src/language_models.rs`: Register the new
provider
- `crates/language_models/Cargo.toml`: Add `rand` and `sha2` deps for
PKCE

## Open questions / known gaps

- [ ] **Terms of service**: Usage appears to be within OpenAI's ToS
(interactive use via their official CLI client ID), but needs legal
sign-off before shipping
- [ ] **Redirect URI**: Currently `http://localhost:1455/auth/callback`
— may need to match exactly what OpenAI's Codex CLI uses
- [ ] **UI polish**: The sign-in card is functional but minimal; needs
design review
- [ ] **Error messages**: OAuth error responses from the callback URL
aren't surfaced to the user yet
- [ ] **`o3` availability**: o3 may require a higher subscription tier;
consider gating it

## Testing

Sign-in flow was designed to match the Copilot Chat provider pattern.
Manual testing against the live OAuth endpoint is needed.

Release Notes:

- Added ChatGPT subscription provider, allowing users to use their
ChatGPT Plus/Pro subscription with the Zed agent

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
Co-authored-by: Richard Feldman <richard@zed.dev>
Co-authored-by: Richard Feldman <oss@rtfeldman.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-05-14 21:03:56 +00:00
Ben Brandt
78c889c21d
open_ai: Responses API improvements (#56476)
Release Notes:

- Removed deprecated OpenAI models
- Added support for gpt-5.4-nano/mini models for OpenAI provider
- Improved output quality when using OpenAI models

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
Co-authored-by: Gaauwe Rombouts <mail@grombouts.nl>
2026-05-12 14:47:16 +00:00
Bennet Bo Fenner
aeb05899b3
open_ai: Support specifying reasoning effort (#56411)
Closes #54875

Release Notes:

- Added support for specifying effort level when using OpenAI models
2026-05-11 13:48:33 +00:00
kangxl
febc3ebcb4
agent: Fix Preserve tool call ID/name across DashScope empty delta chunks (#54872)
Release Notes:

- Fixed: DashScope (Aliyun) tool calls now preserve id and name across
streaming delta chunks
   

---------------------------------------------------------------------------------------------------
    Aliyun (DashScope) SSE streaming sends id="" and name="" in
    subsequent tool_calls delta chunks after the first chunk. Previously
    these empty strings would unconditionally overwrite the accumulated
    id and name values, causing tool calls to lose identity and fail.
    
    Add is_empty() guards so id and name are only updated when the
    delta provides a non-empty value (falsy guard pattern), matching how
    Hermes Agent and OpenAI SDK handle this provider edge case.
    
    Test stream_maps_preserves_tool_id_and_name_across_empty_deltas
    simulates DashScope's actual streaming behavior and asserts that
    the completed ToolUse retains the correct id, name, and arguments.
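
A minimal sketch of the guard described above, assuming simplified stand-in types (`ToolCallState` and `ToolCallDelta` are hypothetical and do not mirror the real structs in `completion.rs`):

```rust
/// Accumulated state for one streamed tool call (hypothetical, simplified).
#[derive(Debug, Default, PartialEq)]
struct ToolCallState {
    id: String,
    name: String,
    arguments: String,
}

/// One streamed delta chunk; `None` and `Some("")` both mean "no new value".
struct ToolCallDelta<'a> {
    id: Option<&'a str>,
    name: Option<&'a str>,
    arguments: Option<&'a str>,
}

impl ToolCallState {
    fn apply(&mut self, delta: &ToolCallDelta) {
        // Only overwrite id and name when the chunk carries a non-empty value.
        if let Some(id) = delta.id.filter(|id| !id.is_empty()) {
            self.id = id.to_string();
        }
        if let Some(name) = delta.name.filter(|name| !name.is_empty()) {
            self.name = name.to_string();
        }
        // Argument fragments are always appended in arrival order.
        if let Some(arguments) = delta.arguments {
            self.arguments.push_str(arguments);
        }
    }
}
```

Feeding a first chunk with a real id and name, then a second chunk with `id: Some("")` and `name: Some("")`, leaves the original id and name in place while the arguments keep growing.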
    
    Files changed: 1 (+148/-2)
    - crates/open_ai/src/completion.rs
    
    CLA signed.
    
- [x] I've reviewed my own diff for quality, security, and reliability
    - [x] Tests cover the new/changed behavior
    - [x] Performance impact has been considered and is acceptable
    
 
<img width="980" height="1392" alt="CleanShot 2026-04-26 at 00 03 20@2x"
src="https://github.com/user-attachments/assets/428a845b-82a0-44eb-9e43-1a351de6ca6a"
/>

After fix
<img width="900" height="1398" alt="CleanShot 2026-04-26 at 00 02 15@2x"
src="https://github.com/user-attachments/assets/604e36fd-bf90-4549-9e60-8a927033d3e9"
/>

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2026-05-11 13:47:30 +00:00
Bennet Bo Fenner
bf3fc2336d
agent: Allow tools to output multiple content parts (#54518)
Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- N/A
2026-04-27 12:36:11 +00:00
Bennet Bo Fenner
83cc2ec054
open_ai: Use responses API for all models (#54910)
From the
[docs](https://developers.openai.com/api/docs/guides/migrate-to-responses#responses-benefits):

> Better performance: Using reasoning models, like GPT-5, with Responses
will result in better model intelligence when compared to Chat
Completions. Our internal evals reveal a 3% improvement in SWE-bench
with same prompt and setup.
Agentic by default: The Responses API is an agentic loop, allowing the
model to call multiple tools, like web_search, image_generation,
file_search, code_interpreter, remote MCP servers, as well as your own
custom functions, within the span of one API request.
Lower costs: Results in lower costs due to improved cache utilization
(40% to 80% improvement when compared to Chat Completions in internal
tests).
Stateful context: Use store: true to maintain state from turn to turn,
preserving reasoning and tool context from turn-to-turn.
Flexible inputs: Pass a string with input or a list of messages; use
instructions for system-level guidance.
Encrypted reasoning: Opt-out of statefulness while still benefiting from
advanced reasoning.
Future-proof: Future-proofed for upcoming models.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [ ] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Closes #ISSUE

Release Notes:

- Always use Responses API for OpenAI models
2026-04-27 09:52:51 +00:00
Bennet Bo Fenner
db1d79a5ca
open_ai: Add support for gpt-5.5 (#54820)
Release Notes:

- Added support for GPT 5.5 and GPT 5.5 Pro via the OpenAI provider
2026-04-24 20:26:56 +00:00
Matt Van Horn
adab7b8871
language_models: Honor images capability for custom OpenAI models (#54223)
## Summary

Users who add custom OpenAI models under
`language_models.openai.available_models` can set `capabilities.images:
true` to declare that the endpoint accepts image inputs. Today, that
setting is silently ignored: the Agent panel's image-attach button stays
disabled regardless, and the only workaround is to switch to a built-in
OpenAI model, attach the image, and switch back.

Root cause: `Model::Custom` does not carry a `supports_images` field,
and the OpenAI provider's `supports_images()` for the `Custom` arm
hardcodes `false`.

## Changes

1. `crates/settings_content/src/language_model.rs`: add `images: bool`
to `OpenAiModelCapabilities` with `#[serde(default)]` so existing
settings.json files keep working unchanged.
2. `crates/open_ai/src/open_ai.rs`: add `supports_images: bool` to
`Model::Custom` with a matching serde default.
3. `crates/language_models/src/provider/open_ai.rs`: pass
`model.capabilities.images` into the `Model::Custom` variant in
`provided_models`, and return the stored value from `supports_images()`
for `Custom`.

Existing `Model::Custom { .. }` match sites (`completion.rs:829`,
various in `open_ai.rs`) all use `..` so they continue to compile
without change.
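
As a rough illustration of the change, here is a simplified stand-in for the model enum. The `BuiltIn` variant and the reduced field set are assumptions made for this sketch; the real `Model` type has more variants and fields.

```rust
/// Simplified stand-in for the provider's model enum.
enum Model {
    /// A built-in model, assumed here to accept images.
    BuiltIn,
    /// A user-defined model from `available_models`.
    Custom {
        name: String,
        supports_images: bool,
    },
}

impl Model {
    fn supports_images(&self) -> bool {
        match self {
            Model::BuiltIn => true,
            // Return the stored capability instead of a hardcoded `false`.
            Model::Custom { supports_images, .. } => *supports_images,
        }
    }
}
```

The `..` in the `Custom` arm is the same pattern that lets existing match sites ignore the new field.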

## Testing

- `cargo check -p settings_content -p open_ai -p language_models`:
clean.
- I was not able to complete `./script/clippy` locally: the build
stalled on the first-time `webrtc-sys` download for livekit-rust-sdks
(TLS close_notify failure on docs.rs mirror). Happy to rerun once CI has
cached artifacts.
- Manually verified the capability plumbing by tracing: settings.json ->
`OpenAiModelCapabilities.images` -> `Model::Custom { supports_images }`
-> `supports_images()` -> `Thread::prompt_capabilities` ->
`SessionCapabilities.supports_images()` -> `build_add_context_menu` gate
in `thread_view.rs`.

## Related Issues

Closes #50752

Release Notes:

- Fixed custom OpenAI models ignoring the `capabilities.images` setting
in `language_models.openai.available_models`.

This contribution was developed with AI assistance (Codex).

---------

Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
2026-04-24 10:36:46 +00:00
Ben Brandt
2eafa6e6aa
language_models: Remove unused language model token counting (#54177)
Drop the `count_tokens` API and related implementations across
providers, and remove the unused `tiktoken-rs` dependency.

I was going to update the dependency because they finally released a fix
we needed. But then I realized we only used this API in one place, the
Rules library. And for most models it would have been wildly incorrect
because we use tiktoken, i.e. OpenAI tokenizers, for almost every model,
which is going to give incorrect results.

Given that, I just removed these because the difference in how we get
these has caused plenty of confusion in the past.

Self-Review Checklist:

- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable

Release Notes:

- N/A
2026-04-22 13:39:48 +00:00
Guilherme do Amaral Alves
7b082cbb6f
Add interleaved_reasoning option to openai compatible models (#54016)
Release Notes:

- Added interleaved_reasoning option to openai compatible models

---

This PR adds the interleaved_reasoning option for OpenAI-compatible
models, addressing the issue described in
https://github.com/ggml-org/llama.cpp/issues/20837.

In my testing, enabling interleaved_reasoning not only resolved the
tool-calling issues encountered by Qwen3.5 models in llama.cpp, but also
appeared to improve the model's coding capabilities. I have also
verified the outgoing requests using a proxy to ensure the parameter is
being sent correctly. It is also likely that this change will benefit
other models and providers as well.

Note: While I used AI to assist with the implementation, I have reviewed
and tested the changes. As I am relatively new to Rust and the Zed
codebase, I would appreciate any feedback or suggestions for
improvement. I am happy to make further adjustments if needed.

Thank you all for building such an amazing editor!

Co-authored-by: Oleksiy Syvokon <oleksiy@zed.dev>
2026-04-22 10:40:37 +00:00
Danilo Leal
399d3d267e
docs: Update mentions to "assistant panel" (#53514)
We don't use this terminology anymore; now it's "agent panel".

Release Notes:

- N/A
2026-04-09 10:42:21 -03:00
Agus Zubiaga
98c17ca160
language_models: Refactor deps and extract cloud (#53270)
- `language_model` no longer depends on provider-specific crates such as
`anthropic` and `open_ai` (inverted dependency)
- `language_model_core` was extracted from `language_model` which
contains the types for the provider-specific crates to convert to/from.
- `gpui::SharedString` has been extracted into its own crate (still
exposed by `gpui`), so `language_model_core` and provider API crates
don't have to depend on `gpui`.
- Removes some unnecessary `&'static str` | `SharedString` -> `String`
-> `SharedString` conversions across the codebase.
- Extracts the core logic of the cloud `LanguageModelProvider` into its
own crate with simpler dependencies.


Release Notes:

- N/A

---------

Co-authored-by: John Tur <john-tur@outlook.com>
2026-04-07 12:28:19 -03:00
Ben Brandt
be3a5e2c06
open_ai: Support structured OpenAI tool output content (#51832)
Allow function call outputs to carry either plain text or a list
of input content items, so image tool results are serialized as
image content instead of a raw base64 string.

Release Notes:

- N/A
2026-03-18 12:06:54 +00:00
Elier
905d28cc54
Add stream_options.include_usage for OpenAI-compatible API token usage (#45812)
## Summary

This PR enables token usage reporting in streaming responses for
OpenAI-compatible APIs (OpenAI, xAI/Grok, OpenRouter, etc).

## Problem

Currently, the token counter UI in the Agent Panel doesn't display usage
for some OpenAI-compatible providers because they don't return usage
data during streaming by default. According to OpenAI's API
documentation, the `stream_options.include_usage` parameter must be set
to `true` to receive usage statistics in streaming responses.

## Solution

- Added StreamOptions struct with `include_usage` field to the open_ai
crate
- Added `stream_options` field to the Request struct
- Automatically set `stream_options: { include_usage: true }` when
`stream: true`
- Updated edit_prediction requests with `stream_options: None`
(non-streaming)
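
The rule in the list above (set `include_usage` only when streaming) might look roughly like this. The structs are hypothetical, trimmed-down versions of the real request types, and `Request::new` is an assumed constructor used only for illustration.

```rust
/// Hypothetical, simplified request types; the real `Request` has more fields.
#[derive(Debug, PartialEq)]
struct StreamOptions {
    include_usage: bool,
}

#[derive(Debug, PartialEq)]
struct Request {
    model: String,
    stream: bool,
    stream_options: Option<StreamOptions>,
}

impl Request {
    fn new(model: &str, stream: bool) -> Self {
        Request {
            model: model.to_string(),
            stream,
            // Usage statistics are requested only for streaming calls.
            stream_options: stream.then(|| StreamOptions { include_usage: true }),
        }
    }
}
```

Non-streaming callers end up with `stream_options: None`, which matches the edit_prediction case in the last bullet.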

## Testing

Tested with xAI Grok models - token counter now correctly shows usage
after sending a message.

## References

- [OpenAI Chat Completions API -
stream_options](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)
- [xAI API Documentation](https://docs.x.ai/api)
2026-03-17 10:38:14 +00:00
Neel
175707f95c
open_ai: Support reasoning summaries in OpenAI Responses API (#50959)
Related to AI-79.

Release Notes:

- N/A
2026-03-09 13:51:22 +00:00
Richard Feldman
3b3ffc022e
Add GPT-5.4 and GPT-5.4-pro BYOK models (#50858)
Add GPT-5.4 and GPT-5.4-pro as Bring Your Own Key model options for the
OpenAI provider.

**GPT-5.4** (`gpt-5.4`):
- 1,050,000 token context window, 128K max output
- Supports chat completions, images, parallel tool calls
- Default reasoning effort: none

**GPT-5.4-pro** (`gpt-5.4-pro`):
- 1,050,000 token context window, 128K max output
- Responses API only (no chat completions)
- Default reasoning effort: medium (supports medium/high/xhigh)

Also fixes context window sizes for GPT-5 mini and GPT-5 nano (272K →
400K) to match current OpenAI docs.

Closes AI-78

Release Notes:

- Added GPT-5.4 and GPT-5.4-pro as available models when using your own
OpenAI API key.
2026-03-05 23:40:03 -05:00
Richard Feldman
a18b7727ee
Add GPT-5.3-Codex BYOK model under the OpenAI provider (#50122)
Adds `gpt-5.3-codex` as a built-in model under the OpenAI provider for
BYOK usage.

Model specs:
- 400,000 context window
- 128,000 max output tokens
- Reasoning token support (default medium effort)
- Uses the Responses API (like other codex models)
- Token counting falls back to the gpt-5 tokenizer

Closes AI-59

Release Notes:

- Added support for GPT-5.3-Codex as a bring-your-own-key model in the
OpenAI provider.
2026-02-25 16:29:01 -05:00
Richard Feldman
0b8424a14c
Remove deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini (#49082)
Remove GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini from BYOK model
options in Zed before OpenAI retires these models.

These models are being retired by OpenAI (ChatGPT workspace support ends
April 3, 2026), so they have been removed from the available models list
in Zed's BYOK provider.

Closes AI-4

Release Notes:

- Removed deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini models
from OpenAI BYOK provider
2026-02-13 04:54:22 +00:00
Oleksiy Syvokon
757ee0571e
ep: Use rejected_output for DPO training + OpenAI support (#47697)
Release Notes:

- N/A

---------

Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2026-01-27 13:02:40 +00:00
Aero
7bd3075d53
open_ai: Support reasoning content (#43662)
Support for Kimi K2 Thinking

Release Notes:

- Added support for thinking traces when using OpenAI-API-compatible AI providers

---------

Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>
2026-01-21 10:08:59 +00:00
Richard Feldman
e5706f2349
Add BYOK GPT-5.2-codex support (#47025)
<img width="449" height="559" alt="Screenshot 2026-01-16 at 4 52 12 PM"
src="https://github.com/user-attachments/assets/1b5583d7-9b90-46b1-a32f-9821543ea542"
/>

Release Notes:

- Add support for GPT-5.2-Codex via OpenAI API Key
2026-01-16 17:09:08 -05:00
Marshall Bowers
c6a38f2cfb
open_ai: Use proper type for Responses API input (#46526)
This PR makes it so we use a proper type for the Responses API `input`
rather than a `serde_json::Value`.

It should have never used `serde_json::Value` to begin with.

Release Notes:

- N/A
2026-01-10 17:40:20 +00:00
Marshall Bowers
30f776e47f
open_ai: Move responses module to its own file (#46450)
This PR moves the `responses` module to its own module in the `open_ai`
crate.

Release Notes:

- N/A
2026-01-09 14:29:08 +00:00
Matt Stallone
84017bca89
Add OpenAI Responses API support with chat_completions capability flag (#39989)
Add support for OpenAI's /responses endpoint for models that don't
support /chat/completions API. This enables compatibility with newer
model variants (`gpt-5-codex`, `gpt-5-pro`, `o3-pro`, etc) while
maintaining compatibility with existing configs

Changes:
- Add `supports_chat_completions` flag to model capabilities that
defaults to true for existing behavior
- Implement responses API client with streaming support as per [OpenAI
documentation](https://app.stainless.com/api/spec/documented/openai/openapi.documented.yml).
- Add `ResponseEventMapper` to convert responses events to completion
events for maintainer simplicity
- Update UI to allow toggling `chat_completions` capability
- Add `gpt-5-codex` model

Closes #38858

Release Notes:
- Added support for `gpt-5-codex` model

---------

Co-authored-by: Bennet Bo Fenner <bennet@zed.dev>
2026-01-05 18:15:54 +01:00
Richard Feldman
b5a0a3322d
Add GPT-5.2 support (#44656)
<img width="429" height="188" alt="Screenshot 2025-12-11 at 3 45 26 PM"
src="https://github.com/user-attachments/assets/fe9f1b86-7268-4c63-a8c2-75ac671012c9"
/>


Release Notes:

- Added GPT-5.2 support when using your own OpenAI key
2025-12-11 15:49:10 -05:00
Agus Zubiaga
f08fd732a7
Add experimental mercury edit prediction provider (#44256)
Release Notes:

- N/A

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
2025-12-06 10:08:44 +00:00
Mikayla Maki
53eb35f5b2
Add GPT 5.1 to Zed BYOK (#43492)
Release Notes:

- Added support for OpenAI's GPT 5.1 model to BYOK
2025-11-25 14:17:27 -08:00
Tim McLean
fb90b12073
Add retry support for OpenAI-compatible LLM providers (#37891)
Automatically retry the agent's LLM completion requests when the
provider returns 429 Too Many Requests. Uses the Retry-After header to
determine the retry delay if it is available.
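
A minimal sketch of the delay selection, under assumptions: `retry_delay` and the 5-second fallback are invented for this example, and only the delta-seconds form of `Retry-After` is handled (the HTTP-date form falls back to the default).

```rust
use std::time::Duration;

/// Fallback delay when the header is missing or unparsable (assumed value).
const DEFAULT_RETRY_DELAY: Duration = Duration::from_secs(5);

/// Returns how long to wait before retrying, or `None` when the status is
/// not 429 Too Many Requests.
fn retry_delay(status: u16, retry_after: Option<&str>) -> Option<Duration> {
    if status != 429 {
        return None;
    }
    let delay = retry_after
        // Accept only a whole number of seconds.
        .and_then(|value| value.trim().parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(DEFAULT_RETRY_DELAY);
    Some(delay)
}
```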

Many providers are frequently overloaded or have low rate limits. These
providers are essentially unusable without automatic retries.

Tested with Cerebras configured via openai_compatible.

Related: #31531 

Release Notes:

- Added automatic retries for OpenAI-compatible LLM providers

---------

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-11-13 14:15:46 +00:00
Max Brunsfeld
784fdcaee3
zeta2: Build edit prediction prompt and process model output in client (#41870)
Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
2025-11-06 18:36:58 -05:00
Techy
27a18843d4
open_ai: Make the deltas optional (#39142)
I am using an Azure OpenAI instance since that is what is provided at
work, and with how they have it set up, not all responses contain a delta,
which led to errors and truncated responses. This is related to how
they are filtering potentially offensive requests and responses. I don't
believe this filter was made in-house, instead I believe it is provided
by Microsoft/Azure, so I suspect this fix may help other users.

Release Notes:

- N/A

Co-authored-by: Bennet Bo Fenner <bennetbo@gmx.de>
2025-11-05 13:47:14 +01:00
Julia Ryan
ef5b8c6fed
Remove workspace-hack (#40216)
We've been considering removing workspace-hack for a couple reasons:
- Lukas ran into a situation where its build script seemed to be causing
spurious rebuilds. This seems more likely to be a cargo bug than an
issue with workspace-hack itself (given that it has an empty build
script), but we don't necessarily want to take the time to hunt that
down right now.
- Marshall mentioned hakari interacts poorly with automated crate
updates (in our case provided by Renovate) because you'd need to run
`cargo hakari generate && cargo hakari manage-deps` after their changes
and we prefer to not have actions that make commits.

Currently removing workspace-hack causes our workspace to grow from
~1700 to ~2000 crates being built (depending on platform), which is
mainly a problem when you're building the whole workspace or running
tests across the normal and remote binaries (which is where
feature-unification nets us the most sharing). It doesn't impact
incremental times noticeably when you're just iterating on `-p zed`, and
we'll hopefully get these savings back in the future when
rust-lang/cargo#14774 (which re-implements the functionality of hakari)
is finished.

Release Notes:

- N/A
2025-10-17 18:58:14 +00:00
Conrad Irwin
fcdab160f9
Settings refactor (#38367)
Co-Authored-By: Ben K <ben@zed.dev>
Co-Authored-By: Anthony <anthony@zed.dev>
Co-Authored-By: Mikayla <mikayla@zed.dev>

Release Notes:

- settings: Major internal changes to settings. The primary user-facing
effect is that some settings which did not make sense in project
settings files are no longer read from there (for example, the inline
blame settings).

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Mikayla Maki <mikayla.c.maki@gmail.com>
Co-authored-by: Anthony <anthony@zed.dev>
2025-09-18 16:47:23 +00:00
ZhangJun
7091c70a1e
open_ai: Trim newline before "data:" prefix and account for the possibility of no space after ":" (#37644)
I'm using an OpenAI-compatible model, but got nothing in the agent thread
panel, and the Zed log has a "Model generated an empty summary" line.

I added a log statement to open_ai.rs:
<img width="2454" height="626" alt="Image"
src="https://github.com/user-attachments/assets/85354c7d-a0cc-4bba-86fd-2a640038a13e"
/>

and got:

<img width="3456" height="278" alt="Image"
src="https://github.com/user-attachments/assets/7746aedd-5d76-44b5-90f2-e129a1507178"
/>

It appears that `let line = line.strip_prefix("data: ")?;` cannot handle
a line with a leading newline or with no space after the colon.
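
A more tolerant version of that parsing step could be sketched as below. `sse_data` is a hypothetical helper written for illustration and is not the code that landed in this commit.

```rust
/// Extracts the payload from one SSE line, tolerating leading newlines or
/// whitespace and an optional space after `data:`.
fn sse_data(line: &str) -> Option<&str> {
    // Drop any newline or whitespace that precedes the field name.
    let line = line.trim_start();
    let payload = line.strip_prefix("data:")?;
    // The space after the colon is optional, so strip it only if present.
    Some(payload.strip_prefix(' ').unwrap_or(payload))
}
```

With this shape, `data: {...}`, `data:{...}`, and a line that starts with `\n` all yield the same payload, while lines for other fields return `None`.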

Release Notes:

- N/A

---------

Co-authored-by: Ben Brandt <benjamin.j.brandt@gmail.com>
2025-09-08 22:01:55 +02:00
Umesh Yadav
9f749881b3
language_models: Fix tool_choice null issue for other providers (#34554)
Follow up: #34532

Closes #35434 

Mostly fixes an issue where, when tool_choice is none, it was getting
serialised as null. This was fixed for OpenRouter; I just wanted to follow
up and clean up the other providers that might have this issue, as this
is against the spec.

Release Notes:

- N/A
2025-09-03 01:22:57 +02:00
Antonio Scandurra
39d86eeb7f
Trim API key when submitting requests to LLM providers (#37082)
This prevents the common footgun of copy/pasting an API key
starting/ending with extra newlines, which would lead to a "bad request"
error.
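
For illustration only, the trimming could be as small as the following. `authorization_header` is a hypothetical helper name, and the bearer format is the usual one for OpenAI-style APIs rather than something this commit specifies.

```rust
/// Builds the bearer header value from a key exactly as the user pasted it.
fn authorization_header(api_key: &str) -> String {
    // `trim` removes surrounding whitespace, including stray newlines.
    format!("Bearer {}", api_key.trim())
}
```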

Closes #37038 

Release Notes:

- agent: Support pasting language model API keys that contain newlines.
2025-08-28 12:00:44 +00:00
Michael Sloan
0470baca50
open_ai: Remove model field from ResponseStreamEvent (#36902)
Closes #36901

Release Notes:

- Fixed use of Open WebUI as an LLM provider.
2025-08-25 19:50:08 +00:00
Piotr Osiewicz
05fc0c432c
Fix a bunch of other low-hanging style lints (#36498)
- **Fix a bunch of low hanging style lints like unnecessary-return**
- **Fix single worktree violation**
- **And the rest**

Release Notes:

- N/A
2025-08-19 21:26:17 +02:00
Oleksiy Syvokon
42ffa8900a
open_ai: Fix error response parsing (#36390)
Closes #35925

Release Notes:

- Fixed OpenAI error response parsing in some cases
2025-08-18 08:54:31 +00:00
Oleksiy Syvokon
2a57b160b0
openai: Don't send prompt_cache_key for OpenAI-compatible models (#36231)
Some APIs fail when they get this parameter

Closes #36215

Release Notes:

- Fixed OpenAI-compatible providers that don't support prompt caching
and/or reasoning
2025-08-15 13:54:24 +03:00
Oleksiy Syvokon
a3dcc76687
openai: Don't send reasoning_effort if it's not set (#36228)
Release Notes:

- N/A
2025-08-15 09:12:18 +00:00
Cretezy
8ff2e3e195
language_models: Add reasoning_effort for custom models (#35929)
Release Notes:

- Added `reasoning_effort` support to custom models

Tested using the following config:
```json5
  "language_models": {
    "openai": {
      "available_models": [
        {
          "name": "gpt-5-mini",
          "display_name": "GPT 5 Mini (custom reasoning)",
          "max_output_tokens": 128000,
          "max_tokens": 272000,
          "reasoning_effort": "high" // Can be minimal, low, medium (default), and high
        }
      ],
      "version": "1"
    }
  }
```

Docs:
https://platform.openai.com/docs/api-reference/chat/create#chat_create-reasoning_effort

This work could be used to split GPT 5/5-mini/5-nano into each of
its reasoning effort variants, e.g. `gpt-5`, `gpt-5 low`, `gpt-5
minimal`, `gpt-5 high`, and the same for mini/nano.

Release Notes:

* Added a setting to control `reasoning_effort` in OpenAI models
2025-08-13 06:09:16 +00:00
Oleksiy Syvokon
7167f193c0
open_ai: Send prompt_cache_key to improve caching (#36065)
Release Notes:

- N/A

Co-authored-by: Michael Sloan <mgsloan@gmail.com>
2025-08-12 21:51:23 +03:00
Oleksiy Syvokon
7ff0f1525e
open_ai: Log inputs that caused parsing errors (#36063)
Release Notes:

- N/A

Co-authored-by: Michael Sloan <mgsloan@gmail.com>
2025-08-12 21:49:19 +03:00
Richard Feldman
7d4d8b8398
Add GPT-5 support through OpenAI API (#35822)
(This PR does not add GPT-5 to Zed Pro, but rather adds access if you're
using your own OpenAI API key.)

<img width="772" height="333" alt="Screenshot 2025-08-07 at 2 23 18 PM"
src="https://github.com/user-attachments/assets/42e75082-118a-4737-89b6-a740ae33b169"
/>

---

**NOTE:** If your API key is not through a verified organization, you
may see this error:

<img width="549" height="253" alt="Screenshot 2025-08-07 at 2 04 54 PM"
src="https://github.com/user-attachments/assets/d0b6d739-9c39-4af3-88d7-0c9609b0e6ba"
/>

Even if your org is verified, you still may not have access to GPT-5, in
which case you could see this error:

<img width="543" height="98" alt="Screenshot 2025-08-07 at 2 09 18 PM"
src="https://github.com/user-attachments/assets/e3ed31e3-2a11-4f07-8f3c-5b410fbe4540"
/>

One way to test if you're in this situation is to visit
https://platform.openai.com/chat/edit?models=gpt-5 and see if you get
the same "you don't have access to GPT-5" error on OpenAI's official
playground. It looks like this:

<img width="581" height="196" alt="Screenshot 2025-08-07 at 2 15 25 PM"
src="https://github.com/user-attachments/assets/ea1454ca-3c10-4703-8126-c02cb92a34f2"
/>

Release Notes:

- Added GPT-5, as well as its mini and nano variants. To use this, you
need to have an OpenAI API key configured via the `OPENAI_API_KEY`
environment variable.
2025-08-07 23:35:41 +00:00
Umesh Yadav
3f4098e87b
open_ai: Make OpenAI error message generic (#33383)
Context: In this PR: https://github.com/zed-industries/zed/pull/33362,
we started to use the underlying open_ai crate for making API calls for
Vercel as well. Now whenever we get an error, it looks something like the
one below, where one part of the error mentions OpenAI but the rest is
the actual error from the provider. This PR tries to make the
error generic for now so that people don't get confused by seeing OpenAI
in their v0 integration.

```
Error interacting with language model
Failed to connect to OpenAI API: 403 Forbidden {"success":false,"error":"Premium or Team plan required to access the v0 API: https://v0.dev/chat/settings/billing"}
```

Release Notes:

- N/A
2025-06-28 14:38:27 +02:00
Umesh Yadav
108162423d
language_models: Emit UsageUpdate events for token usage in DeepSeek and OpenAI (#33242)
Closes #ISSUE

Release Notes:

- N/A
2025-06-25 09:42:30 +02:00
Bennet Bo Fenner
c34b24b5fb
open_ai: Fix issues with OpenAI compatible APIs (#32982)
Ran into this while adding support for Vercel's v0 models:
- The timestamp seems to be returned in milliseconds instead of seconds,
so it breaks the bounds of `created: u32`. We did not use this field
anywhere, so we just decided to remove it
- Sometimes the `choices` field can be empty when the last chunk comes
in because it only contains `usage`

Release Notes:

- N/A
2025-06-18 21:51:51 +00:00
Richard Feldman
5405c2c2d3
Standardize on u64 for token counts (#32869)
Previously we were using a mix of `u32` and `usize`, e.g. `max_tokens:
usize, max_output_tokens: Option<u32>` in the same `struct`.

Although [tiktoken](https://github.com/openai/tiktoken) uses `usize`,
token counts should be consistent across targets (e.g. the same model
doesn't suddenly get a smaller context window if you're compiling for
wasm32), and these token counts could end up getting serialized using a
binary protocol, so `usize` is not the right choice for token counts.

I chose to standardize on `u64` over `u32` because we don't store many
of them (so the extra size should be insignificant) and future models
may exceed `u32::MAX` tokens.
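
A small sketch of what the unified representation looks like. `ModelLimits` and `from_legacy` are hypothetical names used only to show widening the old mixed types into `u64`.

```rust
/// Token limits stored as `u64` so they have the same width on every target.
struct ModelLimits {
    max_tokens: u64,
    max_output_tokens: Option<u64>,
}

/// Widens the previous mixed `usize`/`u32` values into `u64`.
fn from_legacy(max_tokens: usize, max_output_tokens: Option<u32>) -> ModelLimits {
    ModelLimits {
        // `usize` has no `From` conversion to `u64`, so a cast is used here.
        max_tokens: max_tokens as u64,
        // `u32` to `u64` is always lossless.
        max_output_tokens: max_output_tokens.map(u64::from),
    }
}
```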

Release Notes:

- N/A
2025-06-17 10:43:07 -04:00