Commit graph

65 commits

Author SHA1 Message Date
Ben Kunkle
cd0b373501
ep_cli: Add filter languages subcommand (#47242)
Closes #ISSUE

Release Notes:

- N/A *or* Added/Fixed/Improved ...
2026-01-20 15:58:50 -05:00
Ben Kunkle
37185ea864
ep_cli: Fix "Too many open files" errors (#47243)
Closes #ISSUE

Release Notes:

- N/A *or* Added/Fixed/Improved ...
2026-01-20 15:41:31 -05:00
Max Brunsfeld
a0728db61b
Add --offset flag to ep cli (#47175)
Release Notes:

- N/A

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-20 14:08:15 -05:00
Oleksiy Syvokon
9fce07599a
ep: Make --provider optional, skip prediction when results exist (#47225)
When --provider is not provided, `ep` will now use whatever provider is
recorded in the data.

Release Notes:

- N/A
2026-01-20 17:26:37 +02:00
Oleksiy Syvokon
8e48a16193
Add ep parse-output command (#47220)
This command takes raw LLM outputs (`predictions.actual_output`) that
could be generated elsewhere and parses them into a canonical unified
diff (`predictions.actual_patch`).

This is useful for simplifying the evaluation pipeline and for rerunning
the parser without having to generate LLM outputs.

Release Notes:

- N/A
2026-01-20 17:12:58 +02:00
Oleksiy Syvokon
9a97c5c3db
ep: Add a prompt with git-style merge markers (#47215)
Release Notes:

- N/A
2026-01-20 13:33:43 +00:00
Oleksiy Syvokon
3e309abe59
ep: Fix teacher prompt formatting (#47172)
Release Notes:

- N/A
2026-01-19 20:57:56 +00:00
Oleksiy Syvokon
f98acf4ca9
Make ep split-commit respect --failed=skip (#47150)
Release Notes:

- N/A
2026-01-19 17:26:41 +02:00
Oleksiy Syvokon
ad7c30e539
ep: Missing newlines in teacher prompt (#47143)
Release Notes:

- N/A
2026-01-19 14:45:30 +00:00
Max Brunsfeld
50a90d35b2
Add a 'rejected patch' field to example specs, for DPO examples (#47043)
The `capture example` action now populates the markdown file with a noop
"Rejected Patch", so that you can easily specify the good and bad
output.

Release Notes:

- N/A
2026-01-18 20:25:23 -08:00
Max Brunsfeld
1468ee2ae5
Fix more errors found when retrieving context for a huge example batch (#47039)
Release Notes:

- N/A
2026-01-16 16:43:11 -08:00
Max Brunsfeld
afaccf9c67
Fix edit history clearing bug in ep (#47017)
We were including changes due to Buffer.reload in the edit history.

Release Notes:

- N/A

---------

Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
2026-01-16 15:44:38 -08:00
Ben Kunkle
d1e4ef09ee
Fix not sending file_chunks parameter to Sweep in evals (#46999)
Closes #ISSUE

Release Notes:

- N/A *or* Added/Fixed/Improved ...
2026-01-16 11:56:33 -05:00
Agus Zubiaga
3ce386a118
ep: Add 180 token editable region experiment (#46945)
Release Notes:

- N/A

Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-15 17:46:22 -03:00
Agus Zubiaga
189c9f4124
ep cli: Compute editable region during format-prompt (#46929)
Release Notes:

- N/A

Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-15 18:50:15 +00:00
Oleksiy Syvokon
a10fdfd2b8
ep: Combine PredictionProvider and ZetaVersion (#46896)
We can specify prompt version in the provider name itself, like this
`--provider zeta2:0113`.

This kind of tag will also be stored in the `provider` field of
jsonlines files.

This drops the `--version` parameter.


Release Notes:

- N/A
2026-01-15 14:00:21 -03:00
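A minimal sketch of how a combined provider tag such as `zeta2:0113` (from #46896 above) could be split into a provider name and an optional prompt version. `ProviderTag` and `parse_provider_tag` are hypothetical names used for illustration; they are not taken from the `ep` sources.

```rust
/// Hypothetical representation of a `--provider` value such as `zeta2:0113`.
#[derive(Debug, PartialEq)]
struct ProviderTag {
    name: String,
    /// Prompt version after the colon, if one was given.
    version: Option<String>,
}

/// Splits `name[:version]` at the first colon. Returns `None` when the
/// name is empty or when a colon is present but the version is empty.
fn parse_provider_tag(input: &str) -> Option<ProviderTag> {
    let (name, version) = match input.split_once(':') {
        Some((name, version)) => (name, Some(version)),
        None => (input, None),
    };
    if name.is_empty() || version.is_some_and(|v| v.is_empty()) {
        return None;
    }
    Some(ProviderTag {
        name: name.to_string(),
        version: version.map(str::to_string),
    })
}
```

Under these assumptions, `parse_provider_tag("zeta2:0113")` yields the name `zeta2` with version `0113`, and a bare `zeta2` yields no version.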
Max Brunsfeld
ceecf82287
Allow EP synthesize command to take multiple repos (#46853)
Release Notes:

- N/A

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-01-15 01:23:40 +00:00
Max Brunsfeld
445c95aa3c
Fix issues processing captured edit prediction examples (#46773)
Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-01-14 14:32:42 -08:00
Max Brunsfeld
20284e4f21
Introduce zeta2 format with cursor content in original order (#46732)
This one does `fim_prefix`, `fim_middle`, and `fim_suffix` in that
order, in the prompt, instead of putting the current middle last.

Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-13 21:53:44 +00:00
Max Brunsfeld
d67c8f2884
Prevent stale related excerpts by avoiding storing their contents as strings (#46666)
This fixes an issue that we noticed in particular with Mercury edit
predictions.

* [x] fix storage to not go stale
* [x] exclude excerpts that intersect the cursor excerpt
* [x] see if string representation of excerpts can be cached, to avoid
rebuilding it on every prediction

Release Notes:

- N/A

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-13 13:31:23 -08:00
Max Brunsfeld
6b1eb25370
Fix a missing newline in the zeta prompt (#46677)
Release Notes:

- N/A
2026-01-12 23:27:06 -08:00
Oleksiy Syvokon
cbb6e2f563
ep: Fix applying patch to text without a trailing newline (#46471)
Release Notes:

- N/A

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-01-12 14:59:25 +00:00
Oleksiy Syvokon
e6467fc47a
ep: Fix editable region for teacher models (#46459)
Editable region was different for Zeta2 and Teacher, leading to "Edits
outside of editable region" errors.

Release Notes:

- N/A

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-01-09 16:01:19 +00:00
Oleksiy Syvokon
f42d714d33
Add ep --failed=skip to exclude errored examples from output (#46453)
Release Notes:

- N/A
2026-01-09 14:45:08 +00:00
Max Brunsfeld
ad369ca2b7
ep: Cache Anthropic client (#46406)
This makes running `predict` with the teacher model much faster, when
there are many examples.

Release Notes:

- N/A

---------

Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-09 00:04:40 +00:00
Max Brunsfeld
0f75c079a5
Edit prediction: teacher prompt improvements (#46392)
Release Notes:

- N/A

---------

Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-08 22:12:32 +00:00
Agus Zubiaga
23f571d64c
ep: Enable workspace test-support (#46395)
Some code got added to `workspace` that prevents us from running tests
for the `edit_prediction(cli)` crates specifically without the
`test-support` feature flag.

Release Notes:

- N/A

Co-authored-by: Ben Kunkle <ben@zed.dev>
2026-01-08 18:30:07 -03:00
Max Brunsfeld
8bbc3c36c4
Fix EP CLI output flicker (#46313)
Release Notes:

- N/A
2026-01-08 17:08:24 +00:00
Oleksiy Syvokon
11cfdb1e62
Add ep split subcommand for dataset splitting (#46364)
Adds a new `ep split` command that splits JSONL datasets into multiple
output files with stratification by `repository_url` when present.

Example usage:

  ep split input.jsonl train.jsonl=80% valid.jsonl=rest

Release Notes:

- N/A
2026-01-08 13:31:26 +00:00
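The output arguments in the `ep split` usage above (#46364) pair a file path with a share of the input. A rough sketch of parsing one such argument follows; the `Share` type and `parse_split_spec` function are hypothetical, and only the two forms shown in the example (`NN%` and `rest`) are assumed to exist.

```rust
/// How much of the input an output file receives (hypothetical model).
#[derive(Debug, PartialEq)]
enum Share {
    /// A percentage between 0 and 100, as in `train.jsonl=80%`.
    Percent(f64),
    /// Whatever is left over, as in `valid.jsonl=rest`.
    Rest,
}

/// Parses `path=NN%` or `path=rest`. Splits at the last `=` so that
/// paths containing `=` still work. Returns `None` on malformed input.
fn parse_split_spec(spec: &str) -> Option<(&str, Share)> {
    let (path, share) = spec.rsplit_once('=')?;
    if path.is_empty() {
        return None;
    }
    let share = if share == "rest" {
        Share::Rest
    } else {
        let pct: f64 = share.strip_suffix('%')?.parse().ok()?;
        // Rejects out-of-range values as well as NaN and infinity.
        if !(0.0..=100.0).contains(&pct) {
            return None;
        }
        Share::Percent(pct)
    };
    Some((path, share))
}
```

This covers only argument parsing; the stratification by `repository_url` described in the commit is not modeled here.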
Oleksiy Syvokon
4c46872ab7
ep: Handle errored requests in Anthropic batches (#46351)
Also, save all requests in a single sqlite transaction -- much faster.

Release Notes:

- N/A
2026-01-08 10:59:03 +00:00
Agus Zubiaga
42af91ddee
ep cli: Resume from output file (#46293)
2026-01-07 21:50:46 -03:00
Mikayla Maki
97c35c084b
gpui: Actually remove the Result from AsyncApp (#45809)
Depends on: https://github.com/zed-industries/zed/pull/45768

Refactor plan:
https://gist.github.com/mikayla-maki/6c4bf263fd80050715ba01f45478796e
Overall plan:
https://gist.github.com/mikayla-maki/7bb5078e4385a2e683e1e1eb40d17d38

This is the big one.

Release Notes:

- N/A

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-07 12:48:24 -08:00
Oleksiy Syvokon
76fed22668
ep: Add file deletion support in unified diff parsing (#46279)
- Replace `is_new_file: bool` with `FileStatus` enum
(Created/Modified/Deleted) in udiff.rs to properly track file status
through diff operations
- Handle deleted files in `apply_diff` by calling
`project.delete_file()`
- Fix diff serialization in reorder_patch.rs to output `+++ /dev/null`
for file deletions and parse both `--- /dev/null` and `+++ /dev/null`
correctly
- Add bounds check for edit ranges exceeding buffer length

Also includes edit_prediction_cli improvements:
- Track `context_range` and `editable_range` in ExampleBuffer for more
precise prompt formatting
- Export MAX_CONTEXT_TOKENS and MAX_REWRITE_TOKENS from zeta2
- Wait for buffer parsing before computing ranges
- Respect NO_COLOR env var and enable info-level logging


Release Notes:

- N/A

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-01-07 18:54:22 +00:00
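The `FileStatus` change in #46279 above relies on the unified diff convention that `/dev/null` on the old side marks a created file and `/dev/null` on the new side marks a deleted one. A minimal sketch of that classification is shown below. The enum variants come from the commit message; the `file_status` and `header_path` helpers are hypothetical and are not the code in udiff.rs.

```rust
/// Mirrors the three states named in the commit message.
#[derive(Debug, PartialEq, Clone, Copy)]
enum FileStatus {
    Created,
    Modified,
    Deleted,
}

/// Classifies a file from the paths in its `--- old` / `+++ new` headers.
fn file_status(old_path: &str, new_path: &str) -> FileStatus {
    match (old_path, new_path) {
        ("/dev/null", _) => FileStatus::Created,
        (_, "/dev/null") => FileStatus::Deleted,
        _ => FileStatus::Modified,
    }
}

/// Extracts the path from a header line such as `--- a/src/main.rs`,
/// dropping any tab-separated timestamp that follows the path.
/// `marker` is `"---"` or `"+++"`.
fn header_path<'a>(line: &'a str, marker: &str) -> Option<&'a str> {
    let rest = line.strip_prefix(marker)?.strip_prefix(' ')?;
    rest.split('\t').next().map(str::trim_end)
}
```

With these helpers, a diff whose headers are `--- a/old.rs` and `+++ /dev/null` would be classified as `FileStatus::Deleted`.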
Agus Zubiaga
30dc8c5f30
ep cli: More substatus granularity (#46266)
Helps narrow down steps that are taking too long

Release Notes:

- N/A
2026-01-07 15:49:18 -03:00
Oleksiy Syvokon
49c4dcb1ef
ep: Fix code block extraction to require closing fence at line start (#46270)
`extract_last_codeblock` was using `find()` to locate closing fences,
which would match backticks anywhere in the text. This caused incorrect
parsing when the content contained inline backticks or nested code
blocks, resulting in wrong diffs.

Fix by requiring the closing fence to be preceded by a newline.

Release Notes:

- N/A
2026-01-07 16:52:04 +00:00
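The idea behind #46270 above, that a fence only counts when it starts a line, can be illustrated with a line-based scan. This is a sketch of the idea and not the actual `extract_last_codeblock` implementation, which the commit describes as `find()`-based; the function name `last_codeblock` is hypothetical.

```rust
/// Returns the contents of the last complete fenced code block in `text`.
/// A fence is recognized only at the start of a line, so triple backticks
/// appearing mid-line inside the block do not terminate it.
fn last_codeblock(text: &str) -> Option<String> {
    // Built at runtime so this example can itself sit inside a fenced block.
    let fence = "`".repeat(3);
    let mut last = None;
    let mut current: Option<Vec<&str>> = None;
    for line in text.lines() {
        if line.starts_with(fence.as_str()) {
            match current.take() {
                // Closing fence: the block is complete.
                Some(body) => last = Some(body.join("\n")),
                // Opening fence: start collecting (the info string is ignored).
                None => current = Some(Vec::new()),
            }
        } else if let Some(body) = current.as_mut() {
            body.push(line);
        }
    }
    // An unterminated block is discarded; only complete blocks are returned.
    last
}
```

Because lines that merely contain backticks are treated as content, a diff line that mentions a fence inline stays part of the extracted block.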
Oleksiy Syvokon
82826e723e
ep: Fix revision resolution for tilde expressions (#46258)
When a revision like `abcd~1` wasn't found locally, the function would
fetch and then return `FETCH_HEAD`, which points to the tip of the
fetched branch rather than the requested revision expression.

Now it re-resolves the original revision after fetching, correctly
handling tilde expressions and other git revision syntax.

Release Notes:

- N/A
2026-01-07 14:18:54 +00:00
Oleksiy Syvokon
0136e41327
ep: Fix incorrect example count in failure summary (#46257)
The 'X of Y examples failed' message was counting completed steps/tasks
instead of actual examples. For example, with 2 examples each going
through Load and Context steps, it would report '1 of 3' instead of '1
of 2'.

Release Notes:

- N/A
2026-01-07 14:09:33 +00:00
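The counting problem fixed in #46257 above comes down to counting distinct examples instead of step events. A minimal sketch, assuming a hypothetical `StepResult` record with one entry per finished step:

```rust
use std::collections::HashSet;

/// One finished pipeline step for one example (hypothetical record shape).
struct StepResult<'a> {
    example: &'a str,
    failed: bool,
}

/// Returns `(failed_examples, total_examples)`, counting each example once
/// regardless of how many steps it went through.
fn failure_summary(results: &[StepResult<'_>]) -> (usize, usize) {
    let mut all = HashSet::new();
    let mut failed = HashSet::new();
    for result in results {
        all.insert(result.example);
        if result.failed {
            failed.insert(result.example);
        }
    }
    (failed.len(), all.len())
}
```

For two examples where one completes two steps and the other fails on its first step, this reports 1 of 2, whereas counting step events would give 1 of 3.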
Agus Zubiaga
523c27ab1f
ep cli: Clean leftover git locks (#46255)
Sometimes git locks are left over from crashed runs. We now assume
there's only one process of the CLI running and clean them up. If we
want to run multiple processes at the same time, we should consider our
own file-based lock, but it seems fine for an internal tool.

Release Notes:

- N/A
2026-01-07 14:01:57 +00:00
Agus Zubiaga
134c5e6bf9
ep cli: Handle opening buffers from files created by the edit history (#46254)
Since we don't persist new files to disk, they don't have entries, so we
have to look them up in memory first.

Release Notes:

- N/A
2026-01-07 13:52:42 +00:00
Max Brunsfeld
8ca638150a
Fix some issues with edit prediction CLI (#46197)
* Added `--repo` and `--name` flags for running only examples with a
specific name, or repo (substring matching)
* Fixed a race condition that caused hangs when running multiple
examples at the same repo and sha
* Fixed a bug where scoring was completely wrong because I had passed
the arguments to `apply_diff_to_string` in the wrong order
* The current evals now run quickly and without errors.

Release Notes:

- N/A
2026-01-06 14:11:19 -08:00
Agus Zubiaga
f0e0213552
ep cli: Fix finalize counter (#46198)
Release Notes:

- N/A
2026-01-06 22:04:36 +00:00
Agus Zubiaga
5be7d6f641
ep cli: Refresh cursor path entry (#46195)
Release Notes:

- N/A
2026-01-06 21:41:58 +00:00
Agus Zubiaga
a8bc84c43f
ep cli: Include cursor file in errors (#46192)
Errors now include the cursor file path so we can just cmd+click it for
inspection

Release Notes:

- N/A
2026-01-06 21:16:56 +00:00
Agus Zubiaga
114bc699a8
ep: Support both full paths and relative paths in examples (#46177)
Release Notes:

- N/A

Co-authored-by: Max Brunsfeld <maxbrunsfeld@gmail.com>
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
2026-01-06 15:14:36 -03:00
Oleksiy Syvokon
efeea7973e
Edit prediction changes (#46169)
1. Handle diffs with no trailing newlines
2. ep: Don't assume workdir name in edit history paths
3. Fix `imitate_human_edits()` for pure insertions

Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <agus@zed.dev>
2026-01-06 16:49:26 +00:00
Oleksiy Syvokon
92b0f144c0
Changes to make ep split-commits work (#46160)
1. `apply_diff` will create a file if the diff says so (by starting with
`--- /dev/null`)
2. Update examples format to match recent changes
3. `ep split-commits` can work with a stream of inputs and generate `n`
samples per input
4. Unicode handling fixes

Release Notes:

- N/A
2026-01-06 14:38:44 +00:00
Agus Zubiaga
583a479f77
ep cli: Load captured examples from Snowflake (#46102)
Release Notes:

- N/A
2026-01-06 10:40:13 -03:00
Oleksiy Syvokon
2a45dbf63e
Add ep split-commit command (#46067)
Generates a training or evaluation example from a
chronologically-ordered commit. This is a port from the Python codebase
(except for reorder_patch.rs, which was originally written in Rust).


Release Notes:

- N/A
2026-01-05 13:13:46 +02:00
Max Brunsfeld
9a79cb8ba1
Improve support for collecting edit prediction training and eval examples (#45914)
* Fix some bugs in capture of EP examples from running app
* Tweak markdown format for EP examples
    * Store repo and revision in TOML front matter
    * Represent cursor position using a comment line
* Allow multiple expected patches in evals
* Remove line-based scoring criteria for evals
* Add a `synthesize` subcommand to the EP cli that generates examples
from git commits

Release Notes:

- N/A
2026-01-03 16:08:35 -08:00
Max Brunsfeld
07ada58466
Improve edit prediction example capture (#45536)
This PR improves the `edit prediction: Capture Example` in several ways:
* fixed bugs in how the uncommitted diff was calculated
* added an `edit_predictions.examples_dir` setting that can be set in
order to have the action automatically save examples into the given
folder
* moved the action into the `edit_predictions` crate, in preparation for
collecting this data passively from end users, when they have opted in
to data sharing, similar to what we did for Zeta 1

Release Notes:

- N/A
2025-12-22 20:40:02 +00:00