Helps with https://github.com/zed-industries/zed/issues/38927
- **editor: Add a benchmark for find/replace**
- **text: batch fragment insertions before turning them into a SumTree**
## Context
<!-- What does this PR do, and why? How is it expected to impact users?
Not just what changed, but what motivated it and why this approach.
Link to Linear issue (e.g., ENG-123) or GitHub issue (e.g., Closes#456)
if one exists — helps with traceability. -->
## How to Review
<!-- Help reviewers focus their attention:
- For small PRs: note what to focus on (e.g., "error handling in
foo.rs")
- For large PRs (>400 LOC): provide a guided tour — numbered list of
files/commits to read in order. (The `large-pr` label is applied
automatically.)
- See the review process guidelines for comment conventions -->
## Self-Review Checklist
<!-- Check before requesting review: -->
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [ ] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Release Notes:
- Improved performance of "Replace All" in buffer search
---------
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>
TODO:
- [x] merge main
- [x] nonshrinking `set_excerpts_for_path`
- [x] Test-drive potential problem areas in the app
- [x] prepare cloud side
- [x] test collaboration
- [ ] docstrings
- [ ] ???
## Context
### Background
Currently, a multibuffer consists of an arbitrary list of
anchor-delimited excerpts from individual buffers. Excerpt ranges for a
fixed buffer are permitted to overlap, and can appear in any order in
the multibuffer, possibly separated by excerpts from other buffers.
However, in practice all code that constructs multibuffers does so using
the APIs defined in the `path_key` submodule of the `multi_buffer` crate
(`set_excerpts_for_path` etc.) If you only use these APIs, the resulting
multibuffer will maintain the following invariants:
- All excerpts for the same buffer appear contiguously in the
multibuffer
- Excerpts for the same buffer cannot overlap
- Excerpts for the same buffer appear in order
- The placement of the excerpts for a specific buffer in the multibuffer
are determined by the `PathKey` passed to `set_excerpts_for_path`. There
is exactly one `PathKey` per buffer in the multibuffer
### Purpose of this PR
This PR changes the multibuffer so that the invariants maintained by the
`path_key` APIs *always* hold. It's no longer possible to construct a
multibuffer with overlapping excerpts, etc. The APIs that permitted
this, like `insert_excerpts_with_ids_after`, have been removed in favor
of the `path_key` suite.
The main upshot of this is that given a `text::Anchor` and a
multibuffer, it's possible to efficiently figure out the unique excerpt
that includes that anchor, if any:
```
impl MultiBufferSnapshot {
fn buffer_anchor_to_anchor(&self, anchor: text::Anchor) -> Option<multi_buffer::Anchor>;
}
```
And in the other direction, given a `multi_buffer::Anchor`, we can look
at its `text::Anchor` to locate the excerpt that contains it. That means
we don't need an `ExcerptId` to create or resolve
`multi_buffer::Anchor`, and in fact we can delete `ExcerptId` entirely,
so that excerpts no longer have any identity outside their
`Range<text::Anchor>`.
There are a large number of changes to `editor` and other downstream
crates as a result of removing `ExcerptId` and multibuffer APIs that
assumed it.
### Other changes
There are some other improvements that are not immediate consequences of
that big change, but helped make it smoother. Notably:
- The `buffer_id` field of `text::Anchor` is no longer optional.
`text::Anchor::{MIN, MAX}` have been removed in favor of
`min_for_buffer`, etc.
- `multi_buffer::Anchor` is now a three-variant enum (inlined slightly):
```
enum Anchor {
Min,
Excerpt {
text_anchor: text::Anchor,
path_key_index: PathKeyIndex,
diff_base_anchor: Option<text::Anchor>,
},
Max,
}
```
That means it's no longer possible to unconditionally access the
`text_anchor` field, which is good because most of the places that were
doing that were buggy for min/max! Instead, we have a new API that
correctly resolves min/max to the start of the first excerpt or the end
of the last excerpt:
```
impl MultiBufferSnapshot {
fn anchor_to_buffer_anchor(&self, anchor: multi_buffer::Anchor) -> Option<text::Anchor>;
}
```
- `MultiBufferExcerpt` has been removed in favor of a new
`map_excerpt_ranges` API directly on `MultiBufferSnapshot`.
## Self-Review Checklist
<!-- Check before requesting review: -->
- [x] I've reviewed my own diff for quality, security, and reliability
- [x] Unsafe blocks (if any) have justifying comments
- [x] The content is consistent with the [UI/UX
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
- [x] Tests cover the new/changed behavior
- [x] Performance impact has been considered and is acceptable
Release Notes:
- N/A
---------
Co-authored-by: Conrad Irwin <conrad.irwin@gmail.com>
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Co-authored-by: Jakub Konka <kubkon@jakubkonka.com>
Co-authored-by: Conrad <conrad@zed.dev>
When undoing a paste, it is really confusing when that actually also
removes what was type right before the paste if the paste happened fast
enough after.
Release Notes:
- Fixed undoing a paste sometimes also undoing edits right before the
paste
To do
* [x] turn the operations from collab into a failing test
* [x] fix the crash
* [ ] turn the huge set of operations into a succinct test that can be
checked in on main
Release Notes:
- N/A
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Closes#43329
## Summary
This fixes `path:line:column` navigation for files containing non-ASCII
text.
Before this change, open path flows were passing the external column
directly into `go_to_singleton_buffer_point`. That happened to work for
ASCII, but it was wrong for Unicode because external columns are
user-visible character positions while the editor buffer stores columns
as UTF-8 byte offsets.
This PR adds a shared text layer conversion for external row/column
coordinates and uses it in the affected open-path flows:
- file finder navigation
- recent project remote connection navigation
- recent project remote server navigation
It also adds regression coverage for the Unicode case that originally
failed.
As a small - necessary - prerequisite, this also adds
`remote_connection` test support to `file_finder`'s dev-deps so the
local regression test can build and run on this branch. That follows the
same feature mismatch pattern previously fixed in #48280. I wasn't able
to locally verify the tests/etc. w/o the addition... so I've rolled it
into this PR. Tests are green.
The earlier attempt in #47093 was headed in the right direction, but it
did not land and did not include the final regression coverage requested
in review.
## Verification
- `cargo fmt --all -- --check`
- `./script/clippy -p text`
- `./script/clippy -p file_finder`
- `./script/clippy -p recent_projects`
- `cargo test -p file_finder --lib --no-run`
- `cargo test -p file_finder
file_finder_tests::test_row_column_numbers_query_inside_file -- --exact`
- `cargo test -p file_finder
file_finder_tests::test_row_column_numbers_query_inside_unicode_file --
--exact`
- `cargo test -p text
tests::test_point_for_row_and_column_from_external_source -- --exact`
## Manual
I reproduced locally on my machine (macOS) w/ a stateless launch using a
Unicode file (Cyrillic)
Before:
- `:1:5` landed too far left
- `:1:10` landed around the 4th visible Cyrillic character
After:
- `:1:5` lands after 4 visible characters / before the 5th
- `:1:10` lands after 9 visible characters / before the 10th
Release Notes:
- Fixed `path:line:column` navigation so non-ASCII columns land on the
correct character.
---------
Co-authored-by: Kirill Bulatov <kirill@zed.dev>
We are seeing the timestamp comparison fail in an unexpected way, e.g.
https://zed-dev.sentry.io/issues/7293482758/events/c7f7eab3f8f2463f879d4889a80d623e,
where it seems like `text::Anchor::is_max` should be returning true but
it apparently isn't. Add some more information when this panic happens
to understand what's going on.
Release Notes:
- N/A
Co-authored-by: Piotr Osiewicz <24362066+osiewicz@users.noreply.github.com>
Reduces memory usage of `InsertionSlice` from 32 to 24 bytes, `Fragment`
from 120 to 96 bytes by narrowing offsets that are relative to
individual insertion operations from `usize` to `u32`. These offsets are
bounded by the size of a single insertion, not the total buffer size, so
`u32` is sufficient.
To prevent any single insertion from exceeding `u32::MAX` bytes, both
`Buffer::new_normalized` and `apply_local_edit`/`apply_remote_edit` now
split large text insertions into multiple fragments via
`push_fragments_for_insertion`.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Helps #38927
- In the common case, each fragment has at most one deletion, so we can
skip the heap allocation
- SmallVec is smaller than an empty HashSet
- Makes cloning cheaper
- From my analysis of the code it's not possible to have duplicate ticks
(I may be wrong tho)
Before you mark this PR as ready for review, make sure that you have:
- [ ] Added a solid test coverage and/or screenshots from doing manual
testing
- [x] Done a self-review taking into account security and performance
aspects
- [ ] Aligned any UI changes with the [UI
checklist](https://github.com/zed-industries/zed/blob/main/CONTRIBUTING.md#uiux-checklist)
Release Notes:
- N/A
This shrinks the size of `text::Anchor` from 32 bytes to 24 and
`multi_buffer::Anchor` from 72 bytes to 56
Release Notes:
- Improved the memory usage of Zed a bit
- Ensure that both sides are passed the appropriate companion data to
preserve spacers when syncing
- Remove companion handling in codepaths related to range folding, since
this isn't supported in the side-by-side diff
- Move handling of buffer folding into the block map
- Rework `set_companion` to handle both `DisplayMap`s at once
- DRY some code around block map syncing in the `DisplayMap`
TODO:
- [x] diagnose and fix issue that causes balancing blocks not to render
properly when they are adjacent to spacers (e.g. merge conflict buttons)
- [x] clear balancing blocks when clearing companion
- [x] additional tests: interaction between spacers and balancing
blocks, resizing
Release Notes:
- N/A
When the version is the same we used to still seek through the
underlying fragment sumtree despite having nothing to return. This has a
lot of unnecessary overhead.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Part of #7450
Big thanks to @macmv for pushing this forwards so much!
Rebased version of https://github.com/zed-industries/zed/pull/39539 as
working on an in-org branch simplifies a lot of things for us)
Release Notes:
- Added LSP semantic tokens highlighting support
---------
Co-authored-by: Neil Macneale V <neil.macneale.v@gmail.com>
Co-authored-by: Kirill Bulatov <kirill@zed.dev>
Co-authored-by: Zed Zippy <234243425+zed-zippy[bot]@users.noreply.github.com>
The side-by-side diff heavily relies on a primitive from `buffer_diff`
that converts a point on one side of the diff to a range of points on
the other side. The way this primitive is set up on main is pretty
naive--every time we call `points_to_base_text_points` (or
`base_text_points_to_points`), we need to iterate over all hunks in the
diff. That's particularly bad for the case of constructing a new
side-by-side diff starting from a multibuffer, because we call those
APIs once per excerpt, and the number of excerpts is ~equal to the
number of hunks.
This PR changes the point translation APIs exposed by `buffer_diff` to
make it easier to use them efficiently in `editor`. The new shape is a
pair of functions that return a patch that can be used to translate from
the main buffer to the base text or vice versa. When syncing edits
through the block map that touch several excerpts for the same buffer,
we can reuse this patch for excerpts after the first--so when building a
new side-by-side diff, we'll iterate over each hunk just once.
The shape of the new APIs also sets us up to scale down to cases like
editing on the right-hand side of the diff: we can pass in a point range
and give them permission to return an approximate patch that's only
guaranteed to give the correct results when used with points in that
range. For edits that only affect one excerpt, and given how the project
diff is set up, that should allow us to skip iterating over most of the
hunks in a buffer.
Release Notes:
- N/A
---------
Co-authored-by: cameron <cameron.studdstreet@gmail.com>
The logic in `anchor_at_offset` and `to_offset` seems inverted when
checking character boundaries. `assert_char_boundary` returns `true`
when the offset IS valid, but the code was adjusting offsets when the
function returned `true` (valid) instead of `false` (invalid).
Release Notes:
- N/A
This PR reworks the (still feature-gated) side-by-side diff view to use
a different approach to representing the multibuffers on the left- and
right-hand sides.
Previously, these two multibuffers used identical sets of buffers and
excerpts, and were made to behave differently by adding a new knob to
the multibuffer controlling how diffs are displayed. Specifically, the
left-hand side multibuffer would filter out the added range of each hunk
from the excerpts using a new `FilteredInsertedHunk` diff transform, and
the right-hand side would simply not show the deleted sides of expanded
hunks. This approach has some problems:
- Line numbers, and actions that navigate by line number, behaved
incorrectly for the left-hand side.
- Syntax highlighting and other features that use the buffer syntax tree
also behaved incorrectly for the left-hand side.
In this PR, we've switched to using independent buffers to build the
left-hand side. These buffers are constructed using the base texts for
the corresponding diffs, and their lifecycle is managed by `BufferDiff`.
The red "deleted" regions on the left-hand side are represented by
`BufferContent` diff transforms, not `DeletedHunk` transforms. This
means each excerpt on the left represents a contiguous slice of a single
buffer, which fixes the above issues by construction.
The tradeoff with this new approach is that we now have to manually
synchronize excerpt ranges from the right side to the left, which we do
using `BufferDiffSnapshot::row_to_base_text_row`.
Release Notes:
- N/A
---------
Co-authored-by: cameron <cameron.studdstreet@gmail.com>
Co-authored-by: HactarCE <6060305+HactarCE@users.noreply.github.com>
Co-authored-by: Miguel Raz Guzmán Macedo <miguel@zed.dev>
Co-authored-by: Anthony <anthony@zed.dev>
Co-authored-by: Cameron <cameron@zed.dev>
Currently we have a single cache for this data shared between all
snapshots which is incorrect, as we might update the cache to a new
version while having old snapshots around which then may try to access
new data with old offsets/rows.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
To do
* [x] Default to no context retrieval. Allow opting in to LSP-based
retrieval via a setting (for users in `zeta2` feature flag)
* [x] Feed this context to models when enabled
* [x] Make the zeta2 context view work well with LSP retrieval
* [x] Add a UI for the setting (for feature-flagged users)
* [x] Ensure Zeta CLI `context` command is usable
---
* [ ] Filter out LSP definitions that are too large / entire files (e.g.
modules)
* [ ] Introduce timeouts
* [ ] Test with other LSPs
* [ ] Figure out hangs
Release Notes:
- N/A
---------
Co-authored-by: Ben Kunkle <ben@zed.dev>
Co-authored-by: Agus Zubiaga <agus@zed.dev>
It is easy for us to get the two fields out of sync causing weird
problems, there is no reason to have both here so.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Co-authored by: Antonio Scandurra <antonio@zed.dev>
These appear in a lot of stacktraces (especially on windows) despite
them being plain forwarding calls.
Also removes some intermediate calls within gpui that will only turn
into more unnecessary compiler work.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
This PR introduces a new `MultiBufferOffset` new type wrapping size. The
goal of this is to make it clear at the type level when we are
interacting with offsets of a multi buffer versus offsets of a language
/ text buffer. This improves readability of things quite a bit by making
it clear what kind of offsets one is working with while also reducing
accidental bugs by using the wrong kin of offset for the wrong API.
This PR also uncovered two minor bugs due to that.
Does not yet introduce the MultiBufferPoint equivalent, that is for a
follow up PR.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
As discussed in the first responders meeting. We have collected a lot of
backtraces from these, but it's not quite clear yet what causes this.
Removing these should ideally make things a bit more stable even if we
may run into panics later one when the faulty anchor is used still.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
We only use it a handful of times and the default amount of threads
(logical cpu core number) its spawns is overkill for this. This also
gives the threads names oppose to being labeled `<unknown>`
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Closes https://github.com/zed-industries/zed/issues/38453
Current `Buffer` API only allows getting buffer text with `\n` line
breaks — even if the `\r\n` was used in the original file's text.
This it not correct in certain cases like LSP formatting, where language
servers need to have original document context for e.g. formatting
purposes.
Added new `Buffer` API, replaced all buffer LSP registration places with
the new one and added more tests.
Release Notes:
- Fixed ESLint linebreak-style errors by preserving line endings in LSP
communication
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Kirill Bulatov <kirill@zed.dev>
In an attempt to figure out what's wrong with `point_to_buffer_offset`
for crash https://github.com/zed-industries/zed/issues/40453. We want to
know which branch among these two is the bad one.
Release Notes:
- N/A
Co-authored-by: Lukas Wirth <lukas@zed.dev>
Closes#5294
This PR adds a line ending indicator to the status bar, hidden by
default as discussed in
https://github.com/zed-industries/zed/issues/5294.
### Changes
- 8b063a22d8700bed9c93989b9e0f6a064b2e86cf add the indicator and
`status_bar.line_endings_button` setting.
- ~~9926237b709dd4e25ce58d558fd385d63b405f3b changes
`status_bar.line_endings_button` from a boolean to an enum:~~
<details> <summary> show details </summary>
- `always` Always show line endings indicator.
- `non_native` Indicate when line endings do not match the current
platform.
- `lf_only` Indicate when using unix-style (LF) line endings only.
- `crlf_only` Indicate when using windows-style (CRLF) line endings
only.
- `never` Do not show line endings indicator.
I know this many options might be overdoing it, but I was torn between
the pleasant default of `non_native` and the simplicity of `lf_only` /
`crlf_only`.
My thinking was if one is developing on a project which exclusively uses
one line-ending style or the other, it would be nice to be able to
configure no-indicator-in-the-happy-case behavior regardless of the
platform zed is running on. But I'm not really familiar with any
projects that use exclusively CRLF line endings in practice. Is this a
scenario worth supporting or just something I dreamed up?
</details>
- 01174191e4cf337069e7a31b0f0432ae94c52515 rename the action context for
`line ending: Toggle` -> `line ending selector: Toggle`.
When running the action in the command palette with the old name I felt
surprised to be greeted with an additional menu, with the new name it
feels more predictable (plus now it matches
`language_selector::Toggle`!)
### Future work
Hidden status bar items still get padding, creating inconsistent spacing
(and it kind of stands out where I placed the line-endings button):
<img alt="the gap after the indicator is larger than for other buttons"
src="https://github.com/user-attachments/assets/24a346d4-3ff6-4f7f-bd87-64d453c2441a"
/>
I started a new follow-up PR to address that:
https://github.com/zed-industries/zed/pull/39992
Release Notes:
- Added line ending indicator to the status bar (disabled by default;
enabled by setting `status_bar.line_endings_button` to `true`)
Reduces peak stack usage in these functions and should generally be a
bit performant.
Display map benchmark results
```
To tab point/to_tab_point/1024
time: [531.40 ns 532.10 ns 532.97 ns]
change: [-2.1824% -2.0054% -1.8125%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
To fold point/to_fold_point/1024
time: [530.81 ns 531.30 ns 531.80 ns]
change: [-2.0295% -1.9054% -1.7716%] (p = 0.00 < 0.05)
Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
```
Release Notes:
- N/A *or* Added/Fixed/Improved ...
- Notable change is the use of a newtype for `ReplicaId`
- Fixes `WorktreeStore::create_remote_worktree` creating a remote
worktree with the local replica id, though this is not currently used
- Fixes observing the `Agent` (that is following the agent) causing
global clocks to allocate 65535 elements
- Shrinks the size of `Global` a bit. In a local or non-collab remote
session it won't ever allocate still.
Release Notes:
- N/A *or* Added/Fixed/Improved ...
Closes#5355
Release Notes:
- Fixed rendering glitches with files with more than 16 million lines
(that occured due to floating number rounding errors).
---------
Co-authored-by: Smit Barmase <heysmitbarmase@gmail.com>