unsloth/scripts/data
Daniel Han bfb5c2872c CI(notebooks): cross-repo validator for unslothai/notebooks
New PR-time + scheduled workflow that walks every nb/, kaggle/, and
original_template/ notebook in unslothai/notebooks and statically
validates the install cells and user-facing code against:

  - googlecolab/backend-info pip-freeze.gpu.txt (Colab oracle, refreshed
    on every run; fallback snapshot committed under scripts/data/).
  - PyPI metadata for transitive constraint resolution.
  - Hardcoded torch/torchcodec ABI table.
  - Hardcoded peft/torchao floor table.
  - The live unsloth + trl API surface, introspected under
    tests/_zoo_aggressive_cuda_spoof.py so the api job runs on a
    GPU-less ubuntu-latest runner.
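
As a rough illustration of how the pip-freeze oracle and the peft/torchao
floor table could be combined, here is a minimal sketch. It is not the
code in scripts/notebook_validator.py: the function names
(parse_pip_freeze, version_tuple, peft_torchao_ok) are hypothetical, only
the single floor stated in this message (peft 0.19+ needs torchao
0.16.0+) is encoded, and treating an absent torchao as passing is an
assumption.

```python
import re

def parse_pip_freeze(text):
    """Map normalized package name -> version string from `pip freeze` text."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and entries without an exact pin
        # (editable installs, "name @ url" direct references).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[re.sub(r"[-_.]+", "-", name.strip()).lower()] = version.strip()
    return pins

def version_tuple(version):
    """Leading numeric release, padded to 3 parts: '0.16.0+cu128' -> (0, 16, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    parts = [int(p) for p in match.group(0).split(".")] if match else []
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)

def peft_torchao_ok(pins):
    """Hypothetical R-INST-003-style check: peft 0.19+ requires torchao 0.16.0+."""
    peft = pins.get("peft")
    if peft is None or version_tuple(peft) < (0, 19, 0):
        return True  # floor does not apply
    torchao = pins.get("torchao")
    if torchao is None:
        return True  # assumption: nothing installed, nothing to conflict with
    return version_tuple(torchao) >= (0, 16, 0)
```

A caller would feed parse_pip_freeze() the text of the refreshed
pip-freeze.gpu.txt (or the committed fallback snapshot) and report a
finding whenever peft_torchao_ok() returns False.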

Catches the bug classes from notebooks#258 / #260 / #261 / #264 / #221
and commit 51b1462 mechanically:

  R-INST-001  forbid git+ HEAD installs (notebooks#221)
  R-INST-002  --no-deps + transitive constraint violation
  R-INST-003  peft 0.19+ requires torchao 0.16.0+ (notebooks#258)
  R-INST-004  torch <-> torchcodec ABI mismatch (notebooks#261a)
  R-INST-005  --no-deps transformers + Colab tokenizers drift
              (notebooks#261b / #264)
  R-INST-006  forbid !!pip
  R-API-003   adamw_torch_fused -> adamw_8bit hint (warning)
  R-API-004   notebook references symbols outside live unsloth surface
  R-EXC-001   DONT_UPDATE_EXCEPTIONS notebooks must satisfy the same
              policy clauses as generated notebooks (notebooks#260)
  R-DRIFT-001 update_all_notebooks.py emits no diff (commit 51b1462)
  R-CONV-001  notebook_to_python.py converts every .ipynb cleanly
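
To make the two simplest install rules concrete, the following sketch
shows one way R-INST-001 and R-INST-006 could be checked against a single
install-cell line. It is illustrative only: lint_install_line is a
hypothetical helper, and it assumes that a "HEAD install" means a git+
URL with no @ref after the repository path. How the real validator
treats branch refs such as @main is not stated here.

```python
import re

# Any VCS requirement of the form git+<scheme>://...
GIT_URL = re.compile(r"git\+\S+")
PIP_INSTALL = re.compile(r"\bpip\s+install\b")

def is_unpinned_git_url(url):
    """True when the git+ URL carries no @ref after the host part."""
    rest = url.split("://", 1)[-1]           # drop "git+https://"
    path = rest.split("/", 1)[-1] if "/" in rest else ""
    # An '@' in the host part (git@github.com) is a user, not a ref,
    # so only the path after the first '/' is inspected.
    return "@" not in path

def lint_install_line(line):
    """Return the rule IDs (as named in this commit) that the line violates."""
    findings = []
    stripped = line.strip()
    if PIP_INSTALL.search(stripped):
        if any(is_unpinned_git_url(u) for u in GIT_URL.findall(stripped)):
            findings.append("R-INST-001")
    if stripped.startswith("!!pip"):
        findings.append("R-INST-006")
    return findings
```

For example, lint_install_line("!!pip install unsloth") returns
["R-INST-006"], while a line that pins its git+ URL to a commit with
@<sha> returns an empty list.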

Files:
  .github/workflows/notebooks-ci.yml          PR-time + cron + dispatch
  scripts/notebook_validator.py               1148 LOC, single-file
  scripts/notebook_to_python.py               battle-tested converter
  scripts/data/colab_pip_freeze.gpu.txt       fallback snapshot
  scripts/data/colab_to_cpu_pin.json          cu128 -> CPU wheel map
  tests/notebooks/test_validator_fixtures.py  21 golden tests, all green

CPU-only by design. The api-introspect job follows the existing
consolidated-tests-ci spoof pattern (lines 309/417/536/626/826/1081/
1586/1998 of consolidated-tests-ci.yml). The smoke-install job is
opt-in via workflow_dispatch and stubs torchcodec since no CPU wheel
exists.
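
The torchcodec stub mentioned above could take several forms. One
minimal possibility, shown here purely as an assumption about the
approach and not as what the workflow actually does, is to register an
empty placeholder module so that a bare `import torchcodec` succeeds on a
CPU-only runner. ensure_stub is a hypothetical helper name.

```python
import importlib.util
import sys
import types

def ensure_stub(name):
    """Register an empty placeholder module if `name` is not importable.

    Returns True when a stub was installed, False when a module by that
    name is already loaded or can be found on the import path.
    """
    if name in sys.modules:
        return False
    if importlib.util.find_spec(name) is not None:
        return False
    # The placeholder has no attributes; code that only imports the
    # package works, code that calls into it will raise AttributeError.
    sys.modules[name] = types.ModuleType(name)
    return True
```

Calling ensure_stub("torchcodec") before the notebook code is imported
would be enough for import-time smoke checks, but not for anything that
actually decodes media.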

Validated on the live unslothai/notebooks@7af0ac0f tree: every fixture
test passes, the exceptions check reports nothing, and lint surfaces 27
errors + 6 warnings on real notebooks (a mix of #258-class regressions
in 6 nb/ notebooks that the previous template fixes did not reach, plus
14 git+ HEAD installs in hand-tuned exception notebooks).
2026-05-07 11:42:57 +00:00
colab_pip_freeze.gpu.txt CI(notebooks): cross-repo validator for unslothai/notebooks 2026-05-07 11:42:57 +00:00
colab_to_cpu_pin.json CI(notebooks): cross-repo validator for unslothai/notebooks 2026-05-07 11:42:57 +00:00