This format generates fewer token while maintaining the quality:
```
Model Generated tokens ↓ DeltaChrF ↑
0306-seed-multi-regions 46,239 80.62
0304-seed-no-edits 110,871 80.61
0303-seed 271,457 79.62
```
In addition to the student format, this change adds a new teacher
prompt. It seems to be worse than the original, but I haven't optimized
it at all. Keeping it for now as a base for potential improvements.
Release Notes:
- N/A
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Previously we didn't distinguish between an empty `.related_files[]` and
a case where context collection hadn't run yet. As a result, context
retrieval was always attempted for examples with empty `related_files`.
Release Notes:
- N/A
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Emit client-side organization changed events through
`RefreshLlmTokenListener` so it produces the same `RefreshLlmTokenEvent`
used for server-pushed `UserUpdated` messages.
This keeps token refresh fan-out in one place.
Closes CLO-383.
Release Notes:
- N/A
---------
Co-authored-by: Tom Houlé <tom@tomhoule.com>
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
This can be used like `ep predict --provider
baseten:V0131GitMergeMarkersPrefix`. Since it doesn't require
load_project, it can be used with captured requests.
Release Notes:
- N/A
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
I've also fixed a race condition with the programmatic context retrieval
in the CLI, which was causing no excerpts to be fetched for the Rust
examples.
Release Notes:
- N/A
This is a staff only toggle for now, since the consequences of
activating it are not obvious and quite dire (tokens costs 6 times
more).
Also, persist thinking, thinking effort and fast mode in DbThread so the
thinking mode toggle and thinking effort are persisted.
Release Notes:
- Agent: The thinking mode toggle and thinking effort are now persisted
when selecting a thread from history.
In some evals, the teacher produced hallucinations, seemingly due to
context rot. This makes the zeta prompt crate's budgeted rendering
usable by the teacher, so that it can truncate the list of excerpts.
I've also cleaned up the implementation of zeta_prompt's
`format_related_files_within_budget`, and changed the behavior so that
it filters the the excerpts by priority but renders the files in their
original order.
Release Notes:
- N/A
Previously, we were not computing excerpt regions correctly for EP
examples captured from prod. This PR fixes that, and also simplifies the
data flow in the EP CLI. Examples either come from a concise spec (like
the markdown evals), or are collected from prod. Either way, we compute
from them a `ZetaPromptInput`, and the downstream steps like
prompt-formatting and scoring are derived from that.
Release Notes:
- N/A
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
This will not affect how Zeta 2 behaves in production until we update
Cloud to pull in the changes to the `zeta_prompt` crate. But from some
early testing, it seems to improve behavior, not worsen it, even though
the editable region size differs from the currently-deployed model's
training data.
Release Notes:
- N/A
---------
Co-authored-by: Oleksiy Syvokon <oleksiy.syvokon@gmail.com>
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
Duplicates are defined as cursor positions that have an approximate
Jaccard similarity greater than 0.5 (over token 5-grams).
From the resulting clusters of near-duplicates, we select up to N
examples that are maximally different from each other.
Release Notes:
- N/A
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Release Notes:
- Added the ability to use a self-hosted OpenAI-compatible server for
edit predictions.
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
#2874 on steroids
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A
---------
Co-authored-by: Eric Holk <eric@zed.dev>
Closes #ISSUE
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [ ] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A *or* Added/Fixed/Improved ...
* More conservative predictions for prose
* Explain "user accepted prediction" in the teacher prompt
* Sonnet 4.6 support
* Don't strip comments in teacher prompt's edit history
Release Notes:
- N/A
This allows us to just pull requests that have the latest EP request
schema with the `predicted` boolean on events in the edit history.
Release Notes:
- N/A
The cache isn't needed, now that we have a better way of reducing
resource consumption (disabling worktree scanning), and it adds race
conditions.
Release Notes:
- N/A
It now creates multi-turn conversation, where first two messages are the
original teacher prompt and output.
This way, we'll have changes in teacher prompt automatically applied to
repair without having to explicitly sync them.
Release Notes:
- N/A
Other changes:
- Changed tokenization to more code-aware tokenization from split-commit
- Fixed word-diff implementation which was inefficient and sometimes
incorrect
Release Notes:
- N/A