fix(core): address /diff PR review comments

Addresses the five open review threads on #3491:

- parseShortstat: anchor and bound the regex (`^...$` with `\d{1,10}`)
  so adversarial input can no longer drive polynomial backtracking. Closes
  CodeQL alert #137.
- fetchGitDiff: only parse the untracked-path list when we actually need
  it; the fast path now counts NUL bytes in the raw `ls-files -z` stdout
  (wenshao P1).
- fetchGitDiff: base the `MAX_FILES_FOR_DETAILS` short-circuit on
  `tracked + untracked`, so repos with few edits but many untracked files
  still take the summary-only path (wenshao P2).
- fetchGitDiff: count newlines in each untracked text file (binary sniff +
  1 MB read cap) and fold that into both the header `+N` and the per-file
  row, so a brand-new file no longer renders as `+0 / -0` (BZ-D P2).
- parseGitNumstat: switch to `git diff --numstat -z`. The parser now uses
  index-based slicing and a rename-pair state machine, so tracked
  filenames containing tabs/newlines/non-ASCII keep their real bytes
  (BZ-D P3). Renames collapse into a single `old => new` entry.
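
For reviewers, the `--numstat -z` parsing described in the last bullet can
be sketched roughly as follows. This is a minimal illustration, not the
code in this commit: `parseNumstatZ`, `parseCount`, and `NumstatEntry` are
hypothetical names, and it assumes the documented `-z` layout in which a
plain record is `added<TAB>removed<TAB>path<NUL>` and a rename is
`added<TAB>removed<TAB><NUL>old<NUL>new<NUL>`.

```typescript
// Hypothetical shape; the real parseGitNumstat may return something else.
interface NumstatEntry {
  path: string; // "old => new" for renames
  added: number | null; // null when git prints "-" (binary file)
  removed: number | null;
}

type PendingRename = {
  added: number | null;
  removed: number | null;
  oldPath?: string;
};

function parseCount(field: string): number | null {
  return field === "-" ? null : Number.parseInt(field, 10);
}

function parseNumstatZ(output: string): NumstatEntry[] {
  const entries: NumstatEntry[] = [];
  let pos = 0;
  // Non-null while we are in the middle of a rename record.
  let pending: PendingRename | null = null;

  while (pos < output.length) {
    const end = output.indexOf("\0", pos);
    if (end === -1) break; // unterminated tail: ignore it
    const token = output.slice(pos, end);
    pos = end + 1;

    if (pending !== null) {
      if (pending.oldPath === undefined) {
        pending.oldPath = token; // first NUL token after the counts
      } else {
        entries.push({
          path: `${pending.oldPath} => ${token}`,
          added: pending.added,
          removed: pending.removed,
        });
        pending = null;
      }
      continue;
    }

    // Only the first two tabs are separators; later tabs belong to the path.
    const t1 = token.indexOf("\t");
    const t2 = t1 === -1 ? -1 : token.indexOf("\t", t1 + 1);
    if (t2 === -1) continue; // malformed record: skip it

    const added = parseCount(token.slice(0, t1));
    const removed = parseCount(token.slice(t1 + 1, t2));
    const path = token.slice(t2 + 1);
    if (path === "") {
      pending = { added, removed }; // empty path marks a rename pair
    } else {
      entries.push({ path, added, removed });
    }
  }
  return entries;
}
```

Slicing by index instead of splitting on tabs is what lets a filename that
itself contains a tab survive intact.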

UI: untracked rows render as `+N filename (new)` (or
`~ filename (binary, new)`) instead of the placeholder `?` marker;
`/diff` now shows real additions for fresh files.
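
A rough sketch of how an untracked file could be classified and rendered
into the row formats above. The names `statUntracked`,
`renderUntrackedRow`, and `SNIFF_BYTES` are hypothetical, and the
NUL-byte binary check and its window size are assumptions; the commit
message only states "binary sniff + 1 MB read cap".

```typescript
const MAX_READ_BYTES = 1024 * 1024; // 1 MB cap, as stated in the commit message
const SNIFF_BYTES = 8000; // assumption: how far to look for a NUL byte

type UntrackedStat = { binary: true } | { binary: false; added: number };

// Classify the (already read) file bytes and count newline bytes in text files.
function statUntracked(bytes: Uint8Array): UntrackedStat {
  const limit = Math.min(bytes.length, MAX_READ_BYTES);
  const sniffEnd = Math.min(limit, SNIFF_BYTES);
  for (let i = 0; i < sniffEnd; i++) {
    if (bytes[i] === 0) return { binary: true }; // NUL byte: treat as binary
  }
  let added = 0;
  for (let i = 0; i < limit; i++) {
    if (bytes[i] === 0x0a) added++; // count "\n" bytes only
  }
  return { binary: false, added };
}

// Render one untracked row in the formats described above.
function renderUntrackedRow(filename: string, stat: UntrackedStat): string {
  return stat.binary
    ? `~ ${filename} (binary, new)`
    : `+${stat.added} ${filename} (new)`;
}
```

Because this sketch counts newline bytes, a final line without a trailing
newline is not counted; whether the real implementation does the same is
not stated here.
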
克竟 2026-04-24 16:13:32 +08:00
parent d88eba1752
commit 005f88e2e4
7 changed files with 363 additions and 82 deletions

@@ -1,5 +1,5 @@
 {
-"generatedAt": "2026-04-24T07:26:06.808Z",
+"generatedAt": "2026-04-24T08:13:04.067Z",
 "keys": [
 " Models: Qwen latest models\n",
 " qwen auth qwen-oauth - Authenticate with Qwen OAuth (discontinued)",