unsloth/scripts
Daniel Han 0c8eb10e4c
scripts: ship deterministic comment / docstring-only diff verifier (#5422)
scripts/verify_comment_only_diff.py compares a list of changed files
between two git refs and reports whether each diff is strictly comments
or docstrings.

  * .py: parse both revs into AST, strip module / class / function
    docstrings, then compare ast.unparse output. Pure Python comments
    are discarded by ast.parse by construction, so any post-strip diff
    is real code.
  * .yml / .yaml: parse both sides with yaml.safe_load and compare the
    resulting Python objects (YAML comments never reach the parsed
    data); if scalar values still differ, also strip shell comments
    inside any multi-line scalar (e.g. `run: |` script bodies) before
    comparing.
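
The .py branch described above can be sketched roughly as follows. This
is a minimal illustration, not the code in the script: the helper names
(strip_docstrings, normalized_source, is_comment_only_change) are
hypothetical, and it assumes Python 3.9+ for ast.unparse.

```python
import ast

# Node types that can carry a docstring as their first statement.
_DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def strip_docstrings(tree: ast.AST) -> ast.AST:
    """Remove module / class / function docstrings in place; return the tree."""
    for node in ast.walk(tree):
        if not isinstance(node, _DOC_OWNERS):
            continue
        body = node.body
        if (
            body
            and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)
        ):
            rest = body[1:]
            if not rest and not isinstance(node, ast.Module):
                # A class or function body may not be empty, so leave a `pass`.
                rest = [ast.Pass()]
            node.body = rest
    return tree


def normalized_source(source: str) -> str:
    """Parse, drop docstrings, and re-emit canonical source text."""
    # ast.parse never sees `#` comments, so they vanish here as well.
    return ast.unparse(strip_docstrings(ast.parse(source)))


def is_comment_only_change(old: str, new: str) -> bool:
    """True when the two sources differ only in comments or docstrings."""
    return normalized_source(old) == normalized_source(new)
```

Because both sides go through ast.unparse, whitespace and quoting
differences are normalized away too, so the comparison reports only
changes that alter the parsed program.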

Exit code is 0 if every file is comment-only, 1 otherwise. The script
also prints a tight diff snippet for any FAIL line so a reviewer can
spot the real code change at a glance.
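
One way the exit-code and snippet behaviour could look, as a sketch
only: the function names, the OK / FAIL line layout, and the 20-line
snippet cap are assumptions for illustration, not taken from the
script. It uses difflib from the standard library.

```python
import difflib


def diff_snippet(old: str, new: str, max_lines: int = 20) -> str:
    """Return a short unified diff between two normalized texts."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        "base",
        "head",
        n=1,          # one line of context keeps the snippet tight
        lineterm="",
    )
    return "\n".join(list(lines)[:max_lines])


def report(results: dict) -> int:
    """Print one status line per file and return the process exit code.

    `results` maps a path to a (normalized_base, normalized_head) pair.
    """
    exit_code = 0
    for path, (old, new) in results.items():
        if old == new:
            print(f"OK    {path}")
        else:
            exit_code = 1  # any single real code change fails the run
            print(f"FAIL  {path}")
            print(diff_snippet(old, new))
    return exit_code
```

A caller would typically finish with sys.exit(report(results)) so that
CI can gate on the result.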

This is what I used to gate the trim PRs #5418 (this repo) and #640
(unsloth-zoo). Shipping it under scripts/ so any contributor can
deterministically prove a comment / docstring refactor is truly
comment-only, without manually eyeballing every line of a 4000-line
diff.

Usage:

    python scripts/verify_comment_only_diff.py [--base REF] [--head REF] path ...

Defaults: --base origin/main, --head HEAD. Paths are repo-relative.
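
The command line above maps naturally onto argparse. The sketch below
is an assumption about how such a front end might be wired, not the
script's actual code; build_parser and read_at_ref are hypothetical
names. It reads each revision of a file with `git show REF:path`.

```python
import argparse
import subprocess


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented usage line and defaults."""
    parser = argparse.ArgumentParser(
        description="Check that changed files differ only in comments or docstrings."
    )
    parser.add_argument("--base", default="origin/main", metavar="REF")
    parser.add_argument("--head", default="HEAD", metavar="REF")
    parser.add_argument("paths", nargs="+", metavar="path",
                        help="repo-relative paths to verify")
    return parser


def read_at_ref(ref: str, path: str):
    """Return the file's text at `ref`, or None if git cannot show it."""
    proc = subprocess.run(
        ["git", "show", f"{ref}:{path}"],
        capture_output=True,
        text=True,
    )
    return proc.stdout if proc.returncode == 0 else None
```

Returning None for a path that is missing at one ref lets the caller
treat an added or deleted file as a real change rather than crash.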

Smoke test against the squash-merged PR #5418 (a real 3-file pure trim):

    git diff --name-only 6994d07f~1..6994d07f \
      | xargs python scripts/verify_comment_only_diff.py --base 6994d07f~1 --head 6994d07f

reports OK for all 3 files.
2026-05-14 05:02:37 -07:00
data CI: scope GITHUB_TOKEN permissions, add MLX CI, unblock ~60 skipped tests (#5312) 2026-05-11 03:19:13 -07:00
check_new_install_scripts.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
enforce_kwargs_spacing.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
install_gemma4_mlx.sh security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
install_qwen3_6_mlx.sh security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
lint_workflow_triggers.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
lockfile_supply_chain_audit.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
notebook_to_python.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
notebook_validator.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
run_ruff_format.py Formatting & bug fixes (#3563) 2025-11-07 06:00:22 -08:00
scan_npm_packages.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
scan_packages.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
stamp_studio_release.py security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397) 2026-05-13 04:58:12 -07:00
verify_comment_only_diff.py scripts: ship deterministic comment / docstring-only diff verifier (#5422) 2026-05-14 05:02:37 -07:00