zed/crates/eval
Richard Feldman 0b8424a14c
Remove deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini (#49082)
Remove GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini from BYOK model
options in Zed before OpenAI retires these models.

These models are being retired by OpenAI (ChatGPT workspace support ends
April 3, 2026), so they have been removed from the available models list
in Zed's BYOK provider.

Closes AI-4

Release Notes:

- Removed deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini models
from OpenAI BYOK provider
2026-02-13 04:54:22 +00:00
docs eval: Add HTML overview for evaluation runs (#29413) 2025-04-25 17:49:05 +03:00
src Remove deprecated GPT-4o, GPT-4.1, GPT-4.1-mini, and o4-mini (#49082) 2026-02-13 04:54:22 +00:00
.gitignore Add judge to new eval + provide LSP diagnostics (#28713) 2025-04-14 20:18:47 +00:00
build.rs Use distinct user agents in agent eval and zeta-cli (#35897) 2025-08-08 23:26:38 +00:00
Cargo.toml eval: Port to agent2 (#40704) 2025-10-22 17:55:26 +00:00
LICENSE-GPL Lay the groundwork for a Rust-based eval (#28488) 2025-04-10 04:45:27 +00:00
README.md eval: Add support for reading from a .env file (#29426) 2025-04-25 15:53:02 +00:00
runner_settings.json Replace always_allow_tool_actions with tool_permissions.default (#48553) 2026-02-10 18:57:31 -05:00

Eval

This eval assumes the working directory is the root of the repository. Run it with:

cargo run -p eval

The eval optionally reads a .env file in crates/eval, which you can use to set environment variables such as API keys.
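As a rough illustration, a crates/eval/.env file might look like the sketch below. The variable name ANTHROPIC_API_KEY and its placeholder value are assumptions made for this example, not something this README specifies; use whichever variables your model provider expects. The `#` comment lines assume the usual dotenv-style syntax.

```shell
# crates/eval/.env (hypothetical example; keep this file out of version control)
# One KEY=value pair per line. The variable name below is an assumption.
ANTHROPIC_API_KEY=replace-with-your-key
```

With the file in place, running `cargo run -p eval` from the repository root should pick the values up without exporting them in your shell.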

Explorer Tool

The explorer tool generates a self-contained HTML view from one or more thread JSON files. It provides a visual interface for exploring the agent thread, including tool calls and results. See ./docs/explorer.md for more details.

Usage

cargo run -p eval --bin explorer -- --input <path-to-json-files> --output <output-html-path>

Example:

cargo run -p eval --bin explorer -- --input ./runs/2025-04-23_15-53-30/fastmcp_bugifx/*/last.messages.json --output /tmp/explorer.html