zed/crates/language/benches/highlight_map.rs
Nathan Sobo 3ce0cd11ec
Extract language_core and grammars crates from language (#52238)
This extracts a `language_core` crate from the existing `language`
crate, and creates a `grammars` data crate. The goal is to separate
tree-sitter grammar infrastructure, language configuration, and LSP
adapter types from the heavier buffer/editor integration layer in
`language`.

## Motivation

The `language` crate pulls in `text`, `theme`, `settings`, `rpc`,
`task`, `fs`, `clock`, `sum_tree`, and `fuzzy` — all of which are needed
for buffer integration (`Buffer`, `SyntaxMap`, `Outline`,
`DiagnosticSet`) but not for grammar parsing or language configuration.
Extracting the core types lets downstream consumers depend on
`language_core` without pulling in the full integration stack.

## Dependency graph after extraction

```
language_core   ← gpui, lsp, tree-sitter, util, collections
grammars        ← language_core, rust_embed, tree-sitter-{rust,python,...}
language        ← language_core, text, theme, settings, rpc, task, fs, ...
languages       ← language, grammars
```

## What moved to `language_core`

- `Grammar`, `GrammarId`, and all query config/builder types
- `LanguageConfig`, `LanguageMatcher`, bracket/comment/indent config
types
- `HighlightMap`, `HighlightId` (theme-dependent free functions
`highlight_style` and `highlight_name` stay in `language`)
- `LanguageName`, `LanguageId`
- `LanguageQueries`, `QUERY_FILENAME_PREFIXES`
- `CodeLabel`, `CodeLabelBuilder`, `Symbol`
- `Diagnostic`, `DiagnosticSourceKind`
- `Toolchain`, `ToolchainScope`, `ToolchainList`, `ToolchainMetadata`
- `ManifestName`
- `SoftWrap`
- LSP data types: `BinaryStatus`, `ServerHealth`,
`LanguageServerStatusUpdate`, `PromptResponseContext`, `ToLspPosition`

## What stays in `language`

- `Buffer`, `BufferSnapshot`, `SyntaxMap`, `Outline`, `DiagnosticSet`,
`LanguageScope`
- `LspAdapter`, `CachedLspAdapter`, `LspAdapterDelegate` (reference
`Arc<Language>` and `WorktreeId`)
- `ToolchainLister`, `LanguageToolchainStore` (reference `task` and
`settings` types)
- `ManifestQuery`, `ManifestProvider`, `ManifestDelegate` (reference
`WorktreeId`)
- Parser/query cursor pools, `PLAIN_TEXT`, point conversion functions

## What the `grammars` crate provides

- Embedded `.scm` query files and `config.toml` files for all built-in
languages (via `rust_embed`)
- `load_queries(name)`, `load_config(name)`,
`load_config_for_feature(name, grammars_loaded)`, and `get_file(path)`
functions
- `native_grammars()` for tree-sitter grammar registration (behind
`load-grammars` feature)

## Pre-cleanup (also in this PR)

- Removed unused `Option<&Buffer>` from
`LspAdapter::process_diagnostics`
- Removed unused `&App` from `LspAdapter::retain_old_diagnostic`
- Removed `fs: &dyn Fs` from `ToolchainLister` trait methods
(`PythonToolchainProvider` captures `fs` at construction time instead)
- Moved `Diagnostic`/`DiagnosticSourceKind` out of `buffer.rs` into
their own module
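
The `ToolchainLister` change above follows a common pattern: a dependency that used to be passed to every trait method is captured once, when the implementor is constructed. A minimal sketch of that pattern follows. The `Fs` and `Lister` traits, the `PythonProvider` type, and the `.venv` lookup are all hypothetical stand-ins; the real Zed signatures are not shown in this commit message.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a filesystem abstraction.
trait Fs {
    fn exists(&self, path: &str) -> bool;
}

// Hypothetical lister trait: `list` no longer takes an `fs` parameter.
trait Lister {
    fn list(&self, root: &str) -> Vec<String>;
}

// The provider owns a handle to the filesystem, captured at construction.
struct PythonProvider {
    fs: Arc<dyn Fs>,
}

impl PythonProvider {
    fn new(fs: Arc<dyn Fs>) -> Self {
        Self { fs }
    }
}

impl Lister for PythonProvider {
    fn list(&self, root: &str) -> Vec<String> {
        // Illustrative logic only: report a `.venv` directory if it exists.
        let candidate = format!("{root}/.venv");
        if self.fs.exists(&candidate) {
            vec![candidate]
        } else {
            Vec::new()
        }
    }
}

// A fake filesystem for demonstration: only paths ending in `/.venv` exist.
struct FakeFs;

impl Fs for FakeFs {
    fn exists(&self, path: &str) -> bool {
        path.ends_with("/.venv")
    }
}

fn main() {
    let provider = PythonProvider::new(Arc::new(FakeFs));
    println!("{:?}", provider.list("/project"));
}
```

Because the trait no longer mentions the filesystem type, a crate that only defines or consumes the trait does not need to depend on the crate that provides it.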

## Backward compatibility

The `language` crate re-exports everything from `language_core`, so
existing `use language::Grammar` (etc.) continues to work unchanged. The
only downstream change required is importing `CodeLabelExt` where
`.fallback_for_completion()` is called on the now-foreign `CodeLabel`
type.
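
The reason for the `CodeLabelExt` import is that a crate cannot add inherent methods to a type defined in another crate, so such methods have to move to an extension trait that callers bring into scope. A minimal sketch of the re-export plus extension-trait arrangement, with modules standing in for crates; the `text` field and the signature and body of `fallback_for_completion` are assumptions made for illustration, not the real definitions:

```rust
// Modules stand in for crates in this sketch.
mod language_core {
    #[derive(Debug, Default, Clone, PartialEq)]
    pub struct CodeLabel {
        pub text: String, // hypothetical field
    }
}

mod language {
    // Re-export everything so `language::CodeLabel` keeps resolving.
    pub use crate::language_core::*;

    // `CodeLabel` would be a foreign type in the real crate layout, so the
    // method lives on an extension trait instead of an inherent impl.
    pub trait CodeLabelExt {
        // Hypothetical signature, for illustration only.
        fn fallback_for_completion(text: &str) -> Self;
    }

    impl CodeLabelExt for CodeLabel {
        fn fallback_for_completion(text: &str) -> Self {
            CodeLabel {
                text: text.to_string(),
            }
        }
    }
}

// Existing import paths still work; only the trait import is new.
use language::{CodeLabel, CodeLabelExt};

fn main() {
    let label = CodeLabel::fallback_for_completion("my_fn");
    println!("{}", label.text);
}
```

Without the `CodeLabelExt` import, the call in `main` would fail to compile because trait methods are only callable when the trait is in scope.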

Release Notes:

- N/A

---------

Co-authored-by: Agus Zubiaga <agus@zed.dev>
Co-authored-by: Tom Houlé <tom@tomhoule.com>
2026-03-25 23:41:09 +00:00


use criterion::{BenchmarkId, Criterion, black_box, criterion_group, criterion_main};
use gpui::rgba;
use language::build_highlight_map;
use theme::SyntaxTheme;

fn syntax_theme(highlight_names: &[&str]) -> SyntaxTheme {
    SyntaxTheme::new(highlight_names.iter().enumerate().map(|(i, name)| {
        let r = ((i * 37) % 256) as u8;
        let g = ((i * 53) % 256) as u8;
        let b = ((i * 71) % 256) as u8;
        let color = rgba(u32::from_be_bytes([r, g, b, 0xff]));
        (name.to_string(), color.into())
    }))
}

static SMALL_THEME_KEYS: &[&str] = &[
    "comment", "function", "keyword", "string", "type", "variable",
];

static LARGE_THEME_KEYS: &[&str] = &[
    "attribute",
    "boolean",
    "comment",
    "comment.doc",
    "constant",
    "constant.builtin",
    "constructor",
    "embedded",
    "emphasis",
    "emphasis.strong",
    "function",
    "function.builtin",
    "function.method",
    "function.method.builtin",
    "function.special.definition",
    "keyword",
    "keyword.control",
    "keyword.control.conditional",
    "keyword.control.import",
    "keyword.control.repeat",
    "keyword.control.return",
    "keyword.modifier",
    "keyword.operator",
    "label",
    "link_text",
    "link_uri",
    "number",
    "operator",
    "property",
    "punctuation",
    "punctuation.bracket",
    "punctuation.delimiter",
    "punctuation.list_marker",
    "punctuation.special",
    "string",
    "string.escape",
    "string.regex",
    "string.special",
    "string.special.symbol",
    "tag",
    "text.literal",
    "title",
    "type",
    "type.builtin",
    "type.super",
    "variable",
    "variable.builtin",
    "variable.member",
    "variable.parameter",
    "variable.special",
];

static SMALL_CAPTURE_NAMES: &[&str] = &[
    "function",
    "keyword",
    "string.escape",
    "type.builtin",
    "variable.builtin",
];

static LARGE_CAPTURE_NAMES: &[&str] = &[
    "attribute",
    "boolean",
    "comment",
    "comment.doc",
    "constant",
    "constant.builtin",
    "constructor",
    "function",
    "function.builtin",
    "function.method",
    "keyword",
    "keyword.control",
    "keyword.control.conditional",
    "keyword.control.import",
    "keyword.modifier",
    "keyword.operator",
    "label",
    "number",
    "operator",
    "property",
    "punctuation.bracket",
    "punctuation.delimiter",
    "punctuation.special",
    "string",
    "string.escape",
    "string.regex",
    "string.special",
    "tag",
    "type",
    "type.builtin",
    "variable",
    "variable.builtin",
    "variable.member",
    "variable.parameter",
];

fn bench_build_highlight_map(c: &mut Criterion) {
    let mut group = c.benchmark_group("build_highlight_map");

    for (capture_label, capture_names) in [
        ("small_captures", SMALL_CAPTURE_NAMES as &[&str]),
        ("large_captures", LARGE_CAPTURE_NAMES as &[&str]),
    ] {
        for (theme_label, theme_keys) in [
            ("small_theme", SMALL_THEME_KEYS as &[&str]),
            ("large_theme", LARGE_THEME_KEYS as &[&str]),
        ] {
            let theme = syntax_theme(theme_keys);
            group.bench_with_input(
                BenchmarkId::new(capture_label, theme_label),
                &(capture_names, &theme),
                |b, (capture_names, theme)| {
                    b.iter(|| build_highlight_map(black_box(capture_names), black_box(theme)));
                },
            );
        }
    }

    group.finish();
}

criterion_group!(benches, bench_build_highlight_map);
criterion_main!(benches);