mirror of
https://github.com/unslothai/unsloth.git
synced 2026-05-19 07:42:36 +00:00
92 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
44989ea2cb
|
ci: deterministic check for studio/frontend dep removals (#5478)
* ci: deterministic check for studio/frontend dep removals
Adds a CI gate that catches the common foot-gun: a dep dropped from
studio/frontend/package.json that something in src/ still imports.
scripts/check_frontend_dep_removal.py
Diffs package.json against a git base ref, collects every package
no longer declared, and for each one:
1. Greps the entire repo for any usage pattern (static / dynamic /
side-effect imports, require, CSS @import, HTML script/link
src, new URL(), triple-slash references, template literals,
bare quoted strings in JS-like files).
2. Resolves whether the package would still install by BFS'ing
the dep graph in the new lockfile starting from the new
package.json's declared deps (so a stale lockfile does not
give false OK-via-transitive results).
3. Distinguishes top-level node_modules/<name> from nested copies
under other packages. Bare src/ imports only resolve to the
top-level path.
4. Pip-installed playwright references are filtered, so removing
the npm playwright (CI uses the pip one) is reported correctly.
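The reachability walk in step 2 can be sketched as a breadth-first search. This is a minimal illustration, not the script's actual code: the name `reachable_packages` and the simplified lockfile shape (a flat mapping from package name to the names it depends on) are assumptions made for the example; a real npm lockfile keys its entries by `node_modules/...` path.

```python
from collections import deque


def reachable_packages(declared, dep_graph):
    """Return every package reachable from the declared deps.

    declared:  names listed in the new package.json (hypothetical input).
    dep_graph: simplified stand-in for the lockfile, mapping a package
               name to the names it depends on.
    """
    seen = set()
    queue = deque(declared)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        # Names missing from the graph simply have no outgoing edges.
        queue.extend(dep_graph.get(name, ()))
    return seen


# Toy graph: clsx survives through the still-declared "ui-kit", while
# dexie is only a stale lockfile entry and is never reached.
graph = {"ui-kit": ["clsx"], "clsx": [], "dexie": []}
still_installed = reachable_packages(["ui-kit"], graph)
```

Starting the walk from the new package.json rather than from the lockfile root is what keeps a stale lockfile entry from counting as installed.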
Additional hygiene checks (warnings, fail with --strict):
- lockfile <root> dep map matches package.json (catches drift).
- @types/X is not orphaned when X is no longer declared.
- No src/ import points at a package not declared in any field.
tests/studio/test_frontend_dep_removal.py
24 deterministic cases. Each patches a copy of the head
package.json, runs the script, and asserts (exit status,
reported FAIL list). Covers:
- Genuinely-breaking removals: next-themes, @xyflow/react,
@huggingface/hub, dexie, motion, canvas-confetti, recharts,
node-forge, mammoth, unpdf.
- Safe-via-transitive removals: katex, clsx, react,
@radix-ui/react-slot, zustand, tailwind-merge, remark-gfm,
date-fns, js-yaml, @tauri-apps/api.
- Mixed multi-removal failing on the unsafe entries only.
- Non-existent / not-in-base names (no-op).
- Move from deps to devDeps (not a removal).
.github/workflows/studio-frontend-ci.yml
Runs the checker on pull_request events against
origin/${{ github.base_ref }}, plus the edge-case suite.
* scripts: harden frontend dep removal check + adversarial suite
classify() now catches sneaky shapes that an earlier line-only scan
would miss:
- multi-line `import { a, b } from "pkg"` and the same shape for
`export { ... } from "pkg"` / `export * from "pkg"` /
`export type ... from "pkg"`.
- JSDoc `@import("pkg")` references.
- Word-boundary fix so `foo` no longer matches `foobar` (subpath gate:
after the package name we require closing quote or `/`).
- Negative-lookbehind on `(?<!@)\bimport\b` so CSS `@import "X"` is
classified as css_import, not side_effect_import.
find_usage() now feeds an 8-line window (4 above / 4 below the grep
hit) into classify() so multi-line import statements are picked up
even though the initial grep is line-based.
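A rough sketch of how the subpath gate, the `@import` lookbehind, and the line window fit together. The helper names `static_import_pattern` and `window` are hypothetical, and the regex is a simplified stand-in for whatever classify() actually uses.

```python
import re


def static_import_pattern(pkg):
    """Regex for `import ... from "pkg"` or `import "pkg"`.

    (?<!@) keeps CSS `@import` out of this class. After the package name
    only a closing quote or a `/subpath` is allowed, so `foo` does not
    match `foobar`.
    """
    name = re.escape(pkg)
    return re.compile(
        r"(?<!@)\bimport\b[^'\";]*['\"]" + name + r"(?:/[^'\"]*)?['\"]"
    )


def window(lines, hit, radius=4):
    """Join the lines around a grep hit so multi-line imports are visible."""
    start = max(0, hit - radius)
    return "\n".join(lines[start:hit + radius + 1])


source = ["import {", "  a,", "  b,", '} from "foo";']
pattern = static_import_pattern("foo")

# The grep hit is the last line; the window pulls in the `import` keyword.
multi_line_hit = bool(pattern.search(window(source, hit=3)))
single_line_only = bool(pattern.search(source[3]))
```

The negated character class matches newlines, so the pattern spans a multi-line import once the window supplies the surrounding lines.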
tests/studio/test_frontend_dep_removal.py now exercises three suites:
- 24 edge cases: subprocess-driven, full-pipeline.
- 28 classify() unit cases: direct function call against hand-crafted
snippets. Covers static / side-effect / dynamic / require /
css_import / html_script / html_link / re_export (4 variants) /
template_literal / new_url / tsc_triple_slash / jsdoc_import /
string_literal, plus false-positive guards (substring collision,
plain-text comments, URL path tails, Python files, markdown).
- 12 adversarial cases: write synthetic files under
studio/frontend/src/__dep_check_adversarial__/, run the full
script, then clean up. Confirms multi-line imports, re-exports,
JSDoc @import, new URL, dynamic imports all FAIL when the
underlying package is removed.
Current total: 64 / 64 cases pass.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* scripts: detect bin references in package.json scripts
Catches the last common false-negative: removing a package whose
bin is only referenced through `package.json` scripts (e.g. dropping
typescript while `"build": "tsc -b && vite build"` calls tsc).
Cross-checked the patterns Vercel/Next.js, Vite, and TanStack use
in their own manifests; the bin/scripts pairing is the one
consumer-side pattern dep checkers commonly miss.
How it works:
- Build a bin-to-package map from each lockfile entry's `bin`
field. The map is global so a stale lockfile still resolves
bins from packages about to be pruned.
- Tokenize each script value, splitting on `&&`, `||`, `;`, `|`.
Strip env-var assignments and `npx / pnpx / yarn / pnpm / bunx`
prefixes, plus `./node_modules/.bin/` and `node_modules/.bin/`
path prefixes. Look up the leading token in the bin map.
- Hits are reported as `script_bin` and feed the same reachability
gate as source imports. A bin still installed transitively
(e.g. vite via @vitejs/plugin-react peer) is OK-via-transitive;
an orphaned bin is FAIL.
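The tokenize-and-look-up step above might look roughly like this. It is a sketch under stated assumptions: `leading_bins`, `script_bin_hits`, and the toy `bin_map` are illustrative names, and the whitespace split ignores shell quoting.

```python
import re

RUNNERS = {"npx", "pnpx", "yarn", "pnpm", "bunx"}
BIN_PREFIXES = ("./node_modules/.bin/", "node_modules/.bin/")
ENV_ASSIGN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")


def leading_bins(script):
    """Yield the leading command of each segment in a package.json script."""
    # Two-character operators come first so `&&` / `||` are matched whole.
    for segment in re.split(r"&&|\|\||;|\|", script):
        tokens = segment.split()
        # Drop `FOO=bar` assignments and package-manager runner prefixes.
        while tokens and (ENV_ASSIGN.match(tokens[0]) or tokens[0] in RUNNERS):
            tokens.pop(0)
        if not tokens:
            continue
        token = tokens[0]
        for prefix in BIN_PREFIXES:
            if token.startswith(prefix):
                token = token[len(prefix):]
                break
        yield token


def script_bin_hits(scripts, bin_to_pkg):
    """Map each script name to the packages whose bins it invokes."""
    hits = {}
    for name, value in scripts.items():
        pkgs = {bin_to_pkg[b] for b in leading_bins(value) if b in bin_to_pkg}
        if pkgs:
            hits[name] = pkgs
    return hits


# Toy bin map; the real one is built from each lockfile entry's `bin` field.
bin_map = {"tsc": "typescript", "vite": "vite"}
hits = script_bin_hits({"build": "tsc -b && vite build"}, bin_map)
```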
Test additions:
- 5 new edge cases: removing vite, typescript, eslint, @biomejs/biome,
and (@biomejs/biome + @vitejs/plugin-react) together. Correctly
flags @biomejs/biome and the combo as FAIL while vite / typescript
/ eslint are kept by peers.
- 8 new classify() unit cases: TypeScript ambient `declare module`,
namespace imports, combined default+named, default-as-named,
re-export default (4 forms), `.then()` dynamic imports without
await, and TypeScript `import()` in type position.
Current total: 29 edge + 36 classify-unit + 12 adversarial = 77 / 77.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* scripts: detect package.json field references to packages
After surveying package.json patterns in 10+ popular repos (React,
Vue/Svelte/Astro/Next.js, Vite, Storybook, TanStack/Query, Tailwind,
ESLint, TypeScript, Prettier, SvelteKit), several config fields in
package.json itself can reference packages by string. My checker
filtered all of package.json out of the string_literal fallback,
so removing a package that is only referenced from one of these
fields was a false negative.
Now covered (new pkg_json_field kind):
- overrides / resolutions / pnpm.overrides keys
- pnpm.patchedDependencies keys
- peerDependenciesMeta keys
- prettier: "@my/prettier-config" string
- eslintConfig.extends (string or array)
- stylelint.extends / stylelint.plugins
- babel.presets / babel.plugins
- jest.preset / jest.setupFiles / jest.transform
- commitlint.extends
- renovate.extends
- remarkConfig.plugins
- any other tool config field whose strings/keys equal the pkg
name or `pkg/subpath`
False-positive guards (do not flag string values inside):
- browserslist (browser queries)
- keywords (free-form strings)
- engines / engineStrict / packageManager / volta (version pins)
- files / directories / publishConfig (paths)
- workspaces (paths/globs)
- main / module / browser / types / typings / exports / imports /
bin / man (author-side fields)
- scripts (already handled separately via scripts_bin_refs)
- name / version / description / author / repository / homepage etc.
Test additions: new PkgFieldCase suite with 19 cases covering each
tool config field, subpath references, and the 5 false-positive
guards. Combined with the existing 29 edge / 36 classify / 12
adversarial cases, the suite is 96 / 96.
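As an illustration of the field scan, the sketch below walks every top-level package.json field that is not on the guard list and reports the ones that mention a package by key or string value. `fields_referencing` and the sample manifest are hypothetical; skipping the four dependency fields here is an assumption (they are the subject of the diff itself).

```python
SKIP_FIELDS = {
    "browserslist", "keywords", "engines", "engineStrict", "packageManager",
    "volta", "files", "directories", "publishConfig", "workspaces", "main",
    "module", "browser", "types", "typings", "exports", "imports", "bin",
    "man", "scripts", "name", "version", "description", "author",
    "repository", "homepage",
    # Assumption: dependency fields are handled by the removal diff itself.
    "dependencies", "devDependencies", "peerDependencies",
    "optionalDependencies",
}


def _mentions(value, pkg):
    """True for the exact package name or a `pkg/subpath` reference."""
    return value == pkg or value.startswith(pkg + "/")


def _walk(node, pkg):
    """Recursively look for the package in string values and dict keys."""
    if isinstance(node, str):
        return _mentions(node, pkg)
    if isinstance(node, dict):
        return any(_mentions(k, pkg) or _walk(v, pkg) for k, v in node.items())
    if isinstance(node, list):
        return any(_walk(item, pkg) for item in node)
    return False


def fields_referencing(pkg_json, pkg):
    """Return the top-level tool-config fields that reference `pkg`."""
    return sorted(
        field
        for field, value in pkg_json.items()
        if field not in SKIP_FIELDS and _walk(value, pkg)
    )


manifest = {
    "name": "studio",
    "keywords": ["prettier"],
    "prettier": "@my/prettier-config",
    "overrides": {"katex": "1.0.0"},
    "stylelint": {"extends": ["@my/stylelint-config/strict"]},
}
```

With this manifest, `fields_referencing(manifest, "katex")` reports `overrides`, while the `keywords` entry never counts as a reference to `prettier`.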
* scripts: enumerate dead deps in studio/frontend
Adds an opt-in dead-dep enumeration to the existing safety check.
Iterates every package declared in studio/frontend/package.json
(all four dep fields combined) and reports each as one of:
used at least one detected reference -- in src/, a
config file, package.json scripts (bin), a
package.json tool-config field (overrides /
prettier / eslintConfig / stylelint / babel /
jest / commitlint / renovate / etc.), or
tsconfig.compilerOptions.types
unused no detected reference anywhere
type_pkg_kept @types/X where X is still declared (or X = node,
always implicit)
type_pkg_orphan @types/X where X is no longer declared --
candidate for removal alongside X
Wiring:
- New CLI flag `--enumerate-dead` (off by default).
- CI workflow now passes `--enumerate-dead` so the report shows on
every PR run; the report is informational unless `--strict` is
also set.
- With `--strict`, unused / type_pkg_orphan entries fail the run.
Tests:
- 5 new EnumCase scenarios:
E01 fake dep with no usage -> reported unused
E02 fake dep imported by a synthetic src file -> reported used
E03 fake dep referenced only in overrides -> reported used
E04 @types/X paired with X (also imported) -> kept
E05 @types/X without X -> orphan
Running the new flag against the current main reproduces exactly the
11 deps PR #5477 removed, validating the heuristic end to end.
Current total: 29 edge + 36 classify + 12 adversarial + 19 pkg-json
field + 5 enumeration = 101 / 101.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* ci: fetch base ref before running dep removal safety check
actions/checkout uses fetch-depth: 1 by default, so when the
dependency removal check ran `git show origin/main:.../package.json`
the ref wasn't available locally and the script exited 2 with
"could not read base package.json at origin/main:...".
Fetch the single base commit before invoking the check so the
git-show lookup resolves. --depth=1 keeps the extra fetch cheap.
* ci: address bot review on PR 5478
Five issues flagged across gemini and codex:
* --base-lock argparse arg was defined and advertised in the
docstring, but main() always read args.head_lock in both branches
-- the flag did nothing. Dropped the dead arg and the misleading
docstring line; the lockfile-reachability analysis only needs the
head lockfile.
* lock_resolvable() was defined but never called. Removed.
* read_pkg_file() did not specify an encoding for read_text().
Added encoding="utf-8" for cross-platform stability.
* read_pkg_file() returned {} when the path did not exist, so a
bad --head-lock value silently bypassed the reachability checks
(false PASS for removals that resolve through npm script bins).
main() now exits 2 with a clear message when the head lockfile
is missing, matching the existing behavior for the head pkg.
* studio-frontend-ci.yml pull_request paths filter only matched
studio/frontend/** and the workflow file, so PRs that modified
the checker script or its test could skip this job. Added both
files to the trigger.
* ci: address 10x reviewer findings on dep removal safety check
Eight P1s and three P2s surfaced across 10 codex reviewers; this
commit addresses all of them.
P1s:
1. Workflow refspec. `git fetch --depth=1 origin <base_ref>` may only
create FETCH_HEAD in shallow PR checkouts; the checker then dies
with `fatal: invalid object name 'origin/main'`. Use the explicit
refspec `<base>:refs/remotes/origin/<base>` so origin/<base> is
reliably created.
2. `_deps_of()` was counting optional peer dependencies as reachable.
npm only installs an optional peer when another package declares
the same dep, so for "is this removed package still in the tree"
they cannot keep it alive on their own. Skip entries marked
`optional: true` in `peerDependenciesMeta`.
3. JS-syntactic classifiers (static_import, side_effect_import,
dynamic_import, require, re_export, jsdoc_import, template_literal,
tsc_triple_slash, new_url) now gate on file extension. Previously
only the final string-literal fallback was gated, so a JS-shaped
string inside a Python fixture or a Markdown code fence triggered
a false FAIL. Added U37-U40 covering .py / .md / .sh / .yml.
4. HTML `<script src=>` and `<link href=>` patterns now respect a
package-name boundary so `/node_modules/foo-extra/...` is not
treated as a usage of `foo`. Added U41-U43.
5. New `find_command_usage()` detects CLI invocations in .sh / .yml
/ .yaml / .ps1 / .bat / Dockerfile* (npx pkg, bunx pkg, pnpm exec
pkg, yarn dlx pkg, or a bare pkg --flag). Also covers scoped CLI
packages exposed by their unscoped tail (@biomejs/biome -> biome).
6. `build_bin_to_pkg(head_lock)` was losing the bin -> package map
for packages the PR correctly removed from the lockfile, so
`scripts.biome:check` no longer flagged when @biomejs/biome was
being dropped. Now also read the base lockfile (via `git show` or
the new `--base-lock` override) and layer its bin map on top for
any package in the removed set.
7. `--strict` now runs hygiene checks (lockfile sync, @types
orphans, undeclared imports, dead-deps) on the no-removal path
too. Previously the early return at "[OK] no dependencies removed"
skipped them, so `--strict` silently passed on a tree with
uncommitted lockfile drift or unused deps.
8. Removed `@types/X` packages are now matched against the runtime
target name `X`: `/// <reference types="X" />`, tsconfig
compilerOptions.types entries, AND runtime `import "X"` shapes.
Handles the npm scope encoding (`@types/foo__bar` -> `@foo/bar`).
P2s:
9. CSS `url(...)` now accepts both quoted and unquoted forms (added
U44-U45). The previous regex required `/{pkg}/` after a slash,
missing bare-package urls like `url(katex/fonts/x.woff2)`.
10. `find_imports_without_decl()` now covers all static-import
shapes: `import "pkg"`, `import Foo from "pkg"`,
`import { Foo } from "pkg"`, `import type { Foo } from "pkg"`,
`await import("pkg")`, `require("pkg")`.
11. (Same as #8.) Removed `@types/X` is also linked to runtime
imports of `X`, not just type-only references.
Test suite expanded from 101 to 110 cases; all pass. Real-world
enumerate-dead still flags the same 11 unused packages on
studio/dep-removal-safety-check (matches PR 5477's removal set).
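The scope encoding mentioned in item 8 is mechanical enough to show directly. A minimal sketch, with `runtime_name` as a hypothetical helper name:

```python
def runtime_name(types_pkg):
    """Map `@types/X` to the runtime package name it describes."""
    prefix = "@types/"
    if not types_pkg.startswith(prefix):
        raise ValueError(f"not a @types package: {types_pkg}")
    tail = types_pkg[len(prefix):]
    # Scoped packages are encoded as `scope__name` under @types.
    if "__" in tail:
        scope, name = tail.split("__", 1)
        return f"@{scope}/{name}"
    return tail
```

For example, `runtime_name("@types/foo__bar")` gives `@foo/bar`, and `runtime_name("@types/node")` gives `node`.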
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* ci: address 4x Opus reviewer findings on dep removal check
Three blockers from the parallel Opus review batch:
1. scripts_bin_refs ignored every script that began with a wrapper.
The original "first non-env token wins" heuristic credited
cross-env / dotenv / dotenvx / env-cmd as the bin, so a script like
`cross-env CI=1 biome check` left @biomejs/biome looking unused.
Rewrote into _next_real_bin(), which peels env prefixes, the
leading package-manager runner (npx / pnpx / bunx / pnpm exec /
yarn dlx), and the known wrapper bins (with --/-flag-arg handling)
before returning the real CLI. shlex tokenization preserves quoted
env values like `FOO="a b"`.
2. enumerate_dep_usage skipped find_command_usage. The non-enumerate
path already credited deps used only from CI / Dockerfile / shell
scripts, but `--enumerate-dead` did not, so packages referenced
only from a workflow were silently listed as dead. Added the same
call (gated against @types/* to avoid the unscoped-tail false
positive).
3. classify multi-line window was ±4 lines. Prettier formats long
named-import lists one identifier per line, so a 20-import block
pushed the `import` keyword out of the window and the dep dropped
to the string-literal fallback (or worse, was missed entirely).
Widened to ±25 -- still bounded enough to keep false-positives
negligible, wide enough for the realistic Prettier ceiling.
Tests: added 10 _next_real_bin unit cases + 4 scripts_bin_refs
end-to-end cases (W01-W10 + I01-I04) and a 22-identifier multi-line
import adversarial case (A13). Full suite: 125/125.
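The wrapper-peeling idea from item 1 can be sketched as below. This is not the real `_next_real_bin()`: `real_bin` is a hypothetical name, it handles a single command segment only, and it assumes wrapper flags take no separate argument.

```python
import re
import shlex

ENV_ASSIGN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")
ONE_WORD_RUNNERS = {"npx", "pnpx", "bunx"}
TWO_WORD_RUNNERS = {("pnpm", "exec"), ("yarn", "dlx")}
WRAPPERS = {"cross-env", "dotenv", "dotenvx", "env-cmd"}


def real_bin(command):
    """Return the CLI a script segment really runs, or None if there is none."""
    # shlex keeps quoted env values such as FOO="a b" in one token.
    tokens = shlex.split(command)
    while tokens:
        head = tokens[0]
        if ENV_ASSIGN.match(head):
            tokens = tokens[1:]
        elif head in WRAPPERS:
            tokens = tokens[1:]
            # Skip the wrapper's own flags and a bare `--` separator.
            while tokens and tokens[0].startswith("-"):
                tokens = tokens[1:]
        elif tuple(tokens[:2]) in TWO_WORD_RUNNERS:
            tokens = tokens[2:]
        elif head in ONE_WORD_RUNNERS:
            tokens = tokens[1:]
        else:
            return head
    return None
```

Looping until no prefix matches is what lets `cross-env CI=1 biome check` resolve to `biome` instead of crediting `cross-env`.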
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
54a86c3514
|
ci: route every hf download through xet-tuned stall-retry wrapper (#5476)
Some checks are pending
Security audit / npm scan-packages (Studio frontend tarballs) (push) Waiting to run
Security audit / workflow-trigger lint (pull_request_target / cache-poisoning) (push) Waiting to run
Security audit / pytest tests/security (push) Waiting to run
Security audit / npm provenance + new install-script diff (push) Waiting to run
Studio API CI / Studio API & Auth Tests (push) Waiting to run
Backend CI / (Python 3.10) (push) Waiting to run
Backend CI / (Python 3.11) (push) Waiting to run
Backend CI / (Python 3.12) (push) Waiting to run
Backend CI / (Python 3.13) (push) Waiting to run
Backend CI / Repo tests (CPU) (push) Waiting to run
Frontend CI / Frontend build + bundle sanity (push) Waiting to run
Studio GGUF CI / OpenAI, Anthropic API tests (push) Waiting to run
Studio GGUF CI / Tool calling Tests (push) Waiting to run
Studio GGUF CI / JSON, images (push) Waiting to run
Mac Studio API CI / Studio API & Auth Tests (push) Waiting to run
Mac Studio GGUF CI / OpenAI, Anthropic API tests (push) Waiting to run
Mac Studio GGUF CI / Tool calling Tests (push) Waiting to run
Mac Studio GGUF CI / JSON, images (push) Waiting to run
Mac Studio UI CI / Chat UI Tests (push) Waiting to run
Mac Studio Update CI / Studio Updating Tests (push) Waiting to run
Studio Tauri CI / Tauri Linux debug build (no codesign) (push) Waiting to run
Studio UI CI / Chat UI Tests (push) Waiting to run
Studio Update CI / Studio Updating Tests (push) Waiting to run
Windows Studio API CI / Studio API & Auth Tests (push) Waiting to run
Windows Studio GGUF CI / OpenAI, Anthropic API tests (push) Waiting to run
Windows Studio GGUF CI / Tool calling Tests (push) Waiting to run
Windows Studio GGUF CI / JSON, images (push) Waiting to run
Windows Studio UI CI / Chat UI Tests (push) Waiting to run
Windows Studio Update CI / Studio Updating Tests (push) Waiting to run
Wheel CI / Wheel build + content sanity + import smoke (push) Waiting to run
Root cause of the Mac json-images 30 min timeout (run 25950714888 / PR #5430): huggingface_hub>=1.15 deprecated `hf_transfer` and routes every transfer through `hf-xet`. The CI step's unpinned `pip install --upgrade huggingface_hub hf_transfer` jumped to 1.15.0 + hf-xet 1.5.0, the 940 MB mmproj finished in ~21s, then the 3 GB gemma-4 GGUF made it to ~46% and went completely silent for the remaining 29 minutes -- no progress bytes, no error, no exit -- until the job timeout fired.
This wraps every CI `hf download` in a new `.github/scripts/hf-download-with-retry.sh`:
* Drops the no-op `HF_HUB_ENABLE_HF_TRANSFER=1` prefix and the `hf_transfer` install (both are deprecated on 1.15+ and only emit a FutureWarning now).
* Exports the hf-xet high-performance knobs Daniel asked for:
HF_XET_HIGH_PERFORMANCE=1
HF_XET_CHUNK_CACHE_SIZE_BYTES=0
HF_XET_NUM_CONCURRENT_RANGE_GETS=64
HF_XET_RECONSTRUCT_WRITE_SEQUENTIALLY=0
HF_XET_CLIENT_READ_TIMEOUT=500
* Watchdogs each attempt: if `hf download` has not exited after HF_DOWNLOAD_STALL_SECONDS (default 180s = 3 min), SIGTERM, sleep 2, SIGKILL, then loop. Retries are unbounded; the enclosing job's `timeout-minutes` is the real cap.
* Optional 3rd positional `LOCAL_DIR` -- omitted lets `hf` use the default HF_HUB_CACHE, which is what the HF_HOME-priming jobs need.
19 call sites migrated across mlx-ci.yml + 9 studio-*-smoke.yml workflows. The inline `python -c "from huggingface_hub import hf_hub_download; ..."` block in mlx-ci.yml is also routed through the wrapper so every hf transfer in CI gets the same treatment.
Also reverts the json-images timeout 45 -> 30 from #5475: the bump was masking this hang, not fixing it.
|
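The watchdog-and-retry loop can be illustrated in Python, although the actual wrapper is a shell script. Everything here is a sketch under assumptions: `run_with_watchdog` and `max_attempts` are hypothetical (the real script retries without bound), and retrying on a non-zero exit is an assumption the commit message does not confirm.

```python
import os
import subprocess

# Knobs copied from the commit message above.
XET_ENV = {
    "HF_XET_HIGH_PERFORMANCE": "1",
    "HF_XET_CHUNK_CACHE_SIZE_BYTES": "0",
    "HF_XET_NUM_CONCURRENT_RANGE_GETS": "64",
    "HF_XET_RECONSTRUCT_WRITE_SEQUENTIALLY": "0",
    "HF_XET_CLIENT_READ_TIMEOUT": "500",
}


def run_with_watchdog(cmd, stall_seconds=180.0, max_attempts=None):
    """Run cmd; kill and retry any attempt that outlives stall_seconds.

    Returns the number of attempts used. max_attempts=None retries forever,
    leaving the enclosing job timeout as the only cap.
    """
    env = {**os.environ, **XET_ENV}
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        proc = subprocess.Popen(cmd, env=env)
        try:
            if proc.wait(timeout=stall_seconds) == 0:
                return attempt
        except subprocess.TimeoutExpired:
            proc.terminate()              # SIGTERM first
            try:
                proc.wait(timeout=2)      # short grace period
            except subprocess.TimeoutExpired:
                proc.kill()               # then SIGKILL
                proc.wait()
    raise RuntimeError(f"gave up after {attempt} attempts: {cmd}")
```

Note that the deadline is on total wall time per attempt, not on missing progress bytes, which matches the "has not exited after HF_DOWNLOAD_STALL_SECONDS" wording.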
||
|
|
295844670b
|
ci: bump Mac json-images timeout 30 -> 45 min (cache-miss path) (#5475)
The `JSON, images` job in `studio-mac-inference-smoke.yml` (Job 3 of Mac Studio GGUF CI) downloads ~4 GB on a cache miss: 3 GB gemma-4-E2B-it-UD-Q4_K_XL.gguf + ~1 GB mmproj-F16.gguf. The 30 min cap was tight even with `HF_HUB_ENABLE_HF_TRANSFER=1` and parallel downloads, and timed out the cache-miss run on PR #5430 mid-download (run 25950714888) before Studio install or the smoke assertions ran. Once the actions/cache restore hits, the job comes in under 10 min, so 45 min only costs runner time on the first run after a cache key bump (v1->v2 was just bumped in #5459, which is what produced this failure). Jobs 1 (openai-anthropic, 270M model) and 2 (tool-calling, ~1.5 GB model) are not bumped -- their 25 min cap has been comfortable. |
||
|
|
fb4bd0b777
|
ci: drop cache: 'npm' from setup-node (silent abort on Windows) (#5474)
`actions/setup-node@v6.4.0` with `cache: 'npm'` silently aborts the entire job on Windows runners when the npm cache path returned by `npm config get cache` (`C:\npm\cache`) does not yet exist on a fresh runner -- the step exits 24s in with no error message and every following step gets skipped. See npm/cli#7308 for the underlying EEXIST / missing-dir race in the npm cache directory. This mirrors the existing precedent in `studio-windows-ui-smoke.yml`'s `setup-python` block, which already dropped `cache: 'pip'` for the same reason (post-step fatal error on a missing pip cache dir). The frontend `npm ci` is fast enough without the cache that the reliability gain is worth the ~30s. |
||
|
|
85cf0a41ea
|
ci: switch Windows Stop Studio to a cmd no-op marker (#5462)
The prior set +e + redirect + exit 0 fix in #5460 did not stop the Stop Studio step from exiting 143 (SIGTERM) on Git Bash; bash on windows-latest exits with that signal before any inline guard runs, regardless of redirection. The teardown does not gate correctness -- the runner reclaims the Studio child process at job end -- so swap the shell from Git Bash to cmd and just emit a marker line. After this, Job 3 (JSON, images) and the two other Windows GGUF CI jobs cannot fail at the teardown step. |
||
|
|
ac3e9e98f2
|
ci: make Windows Stop Studio teardown tolerate Git Bash signal exit (#5460)
The Windows-runner "Stop Studio" step's kill + sleep block has been observed to exit 143 (SIGTERM) even when the upstream test work passed. Most recently caught on PR #5432 Job 3 "JSON, images": all four assertions (json_object, plain inference, image/openai, image/anthropic) printed PASS, then the kill step ran for ~2 seconds and exited 143, failing the job. Teardown does not gate correctness. Wrap all three Stop Studio steps with set +e + redirected error streams + explicit exit 0 so transient Git Bash signal weirdness no longer masks a green test run. |
||
|
|
90ac4c87f7
|
ci: stop a partial mmproj cache from poisoning Mac Studio GGUF CI (#5459)
The "JSON, images" Mac Studio GGUF CI job hit a stale cache for
${{ runner.os }}-gguf-...-mmproj-F16.gguf-v1 that contains only the
main GGUF, not the mmproj sibling. cache-hit==true so the download
step was skipped, then the post-load `ls` failed:
ls: ...gguf-cache/mmproj-F16.gguf: No such file or directory
Three guards layered:
1) Bump cache key v1 -> v2 to invalidate the poisoned entry on the
GitHub-hosted side.
2) New verify-cache step explicitly checks BOTH files are present
before trusting cache-hit. If not, fall through to download.
3) Save step gains a hashFiles() check on the mmproj path so a
partial mmproj download cannot land back in the cache.
Behaviour on a clean run is unchanged; cache hit + verify ok skips
the re-download, partial-hit triggers fresh download, success
saves a complete archive.
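Guard 2 amounts to refusing to trust cache-hit unless every expected file is on disk. A minimal sketch: `cache_is_complete`, `needs_download`, and the `model.gguf` placeholder name are hypothetical, and the non-empty check is an extra assumption beyond the "present" wording above.

```python
from pathlib import Path

# "model.gguf" is a placeholder; mmproj-F16.gguf is the sibling named above.
REQUIRED = ("model.gguf", "mmproj-F16.gguf")


def cache_is_complete(cache_dir, required=REQUIRED):
    """True only if every required file exists and is non-empty."""
    root = Path(cache_dir)
    return all(
        (root / name).is_file() and (root / name).stat().st_size > 0
        for name in required
    )


def needs_download(cache_hit, cache_dir, required=REQUIRED):
    """A cache hit is trusted only when the verify step also passes."""
    return not (cache_hit and cache_is_complete(cache_dir, required))
```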
|
||
|
|
51dd5fac79
|
ci: add tx >=5,<6 slow compile model_types to KNOWN_BROKEN_COMPILE (#5458)
The per-model SIGALRM cap that landed in the previous fix now exposes beit / sam / sam_hq as compile-too-slow on transformers >=5,<6 + trl >=1,<2 -- each exceeds the 60s per-model budget. They are real slow paths in unsloth_compile_transformers's source rewriter when handling beit / SAM's encoder layers on the new transformers line, not infra flakes (the prior fix logged sweep progress every 25 models so the slow ones are pinpointable in CI logs). Bucket them into Category F (compile exceeds budget) so the sweep stays green and each is tracked for follow-up zoo fixes in the same shape as the existing 27 known-broken entries. Surface behaviour stays identical: any NEW slow model_type still fails the cell with a TimeoutError tag. |
||
|
|
c7c3840b5f
|
ci: cap each compiler-sweep iteration with SIGALRM + log progress (#5456)
Core (HF=latest + TRL=latest) (transformers >=5,<6, trl >=1,<2) hangs 30+ minutes in the compiler-sweep test under the new shim layout, exceeding the 35-min job timeout and showing up as cancelled with no log of which model_type wedged. unsloth_compile_transformers does real source rewriting + torch.compile decoration and can deadlock inside a single problem model on a new transformers point release. Per-model SIGALRM cap (60s) so one infinite-loop model_type cannot wedge the whole sweep. Print sweep progress every 25 models so the log surfaces the slow model_type the next time this regresses -- crucial for finding the upstream/transformers compile bug. Timeout errors land in the same KNOWN / NEW_FAILURES bucket as any other compile exception, so the matrix still surfaces real regressions instead of silently absorbing them. |
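A per-iteration SIGALRM cap with periodic progress logging could look like the following. This is a sketch, not the test's code: `time_limit`, `sweep`, and `compile_one` are hypothetical names, and the approach only works on Unix in the main thread.

```python
import signal
from contextlib import contextmanager


@contextmanager
def time_limit(seconds):
    """Raise TimeoutError if the body runs longer than `seconds`."""
    def _on_alarm(signum, frame):
        raise TimeoutError(f"exceeded {seconds}s budget")

    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)                          # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def sweep(model_types, compile_one, budget=60, log_every=25):
    """Compile each model_type under a time budget; return the failures."""
    failures = {}
    for index, model_type in enumerate(model_types, start=1):
        try:
            with time_limit(budget):
                compile_one(model_type)
        except Exception as exc:
            # Timeouts land in the same bucket as any other compile error.
            failures[model_type] = f"{type(exc).__name__}: {exc}"
        if index % log_every == 0:
            print(f"[sweep] {index}/{len(model_types)} done, last={model_type}")
    return failures
```

Because the handler raises inside the running frame, a model stuck in a pure-Python loop or a sleep is interrupted; code blocked inside a C extension that never returns to the interpreter would not be.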
||
|
|
7e90cae345
|
ci: compiler-cache-shim must mutate live module globals + skip rerun (#5452)
The shim test pinned UNSLOTH_COMPILE_LOCATION via env before importing unsloth_zoo.compiler, but tests/conftest.py runs `import unsloth` first, which transitively imports unsloth_zoo.compiler with the default cache path. The shim's later env-set never took effect on the captured module global, so the compiler silently wrote artefacts to the default cache and the per-model file assertion failed under Core (HF=4.57.6 + TRL<1).
Two fixes:
1) After import, mutate the live module globals directly (UNSLOTH_COMPILE_LOCATION, UNSLOTH_COMPILE_USE_TEMP) so they reflect the hermetic tmp dir regardless of who imported the module first. The same pattern is already used in _compiler_cache_invariants_shim._isolate_cache.
2) test_compile_real_modeling_module no longer re-runs unsloth_compile_transformers after a sweep already patched the module. The compile is not idempotent in-process: re-running on a module whose class forwards were already rewritten corrupts the inspect source/line cache and the second-pass emitted file raises IndentationError / OSError "lineno is out of bounds" on import. The sweep already emitted a valid cache file for every non-KNOWN_BROKEN model_type, so verify that artefact directly; trigger a compile only when running this test in isolation.
Verified locally:
pytest -q tests/_zoo_compiler_cache_shim.py (5 passed, 1 skipped)
pytest -q tests/.._real_modeling_module (3 passed)
|
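The import-order problem is easy to reproduce with a toy module. In this sketch, `demo_compiler`, `CACHE_DIR`, and `DEMO_COMPILE_LOCATION` are stand-ins invented for the example, not names from unsloth_zoo.

```python
import os
import types

# A module that reads an env var once, at import time.
SOURCE = '''
import os
CACHE_DIR = os.environ.get("DEMO_COMPILE_LOCATION", "default_cache")

def cache_path(name):
    return os.path.join(CACHE_DIR, name)
'''

os.environ.pop("DEMO_COMPILE_LOCATION", None)
demo = types.ModuleType("demo_compiler")
exec(SOURCE, demo.__dict__)            # stands in for the early import

# Setting the env var after import does not touch the captured global.
os.environ["DEMO_COMPILE_LOCATION"] = "/tmp/hermetic"
late_env = demo.cache_path("llama.py")

# Mutating the live module global does take effect.
demo.CACHE_DIR = "/tmp/hermetic"
patched = demo.cache_path("llama.py")
```

`late_env` still points into `default_cache`, while `patched` points into `/tmp/hermetic`, which is the behaviour fix 1 relies on.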
||
|
|
e0e606a24a
|
ci: make compiler-cache shim test order-independent (#5449)
The shim test_compile_real_modeling_module[*] was failing on all three RMSNorm families (llama / qwen3 / gemma3) on the Core 4.57.6 matrix cell because the preceding test_compile_every_transformers_model_type sweep already invokes unsloth_compile_transformers for every model_type, which sets modeling.__UNSLOTH_PATCHED__ = True. unsloth_zoo.compiler.unsloth_compile_transformers (zoo compiler.py:3318-3324) early-returns when that marker is already set, without re-emitting the cache file. The targeted shim test then asserts the file exists and fails with "compiler did not write" against the temp cache path. Drop the unsloth-added marker (and any leftover cache file from the sweep) before invoking the compile so the test exercises a fresh emit regardless of collection order. Marker-only fix -- transformers version-agnostic (works on 4.57.6 + 5.x); does not touch zoo internals. |
||
|
|
e81b942d26
|
ci: merge duplicate with: keys in workflow checkout steps (#5447)
Two `with:` mapping keys on the same step caused GitHub's workflow loader to reject the file (silently dropping persist-credentials: false under YAML "last key wins"). Merge into a single `with:` block in notebooks-ci.yml (3 sites) and version-compat-ci.yml (1 site). |
||
|
|
9a81a5e8e7
|
Update version-compat-ci.yml (#5445) | ||
|
|
5345b10b6a
|
ci: install ipython so transformers.utils.notebook imports cleanly in zoo pytest (#5437)
unsloth_zoo's drift-detector tests/test_zoo_source_upstream_refs.py::test_logging_utils_utils_notebook resolves transformers.utils.notebook, which executes ``import IPython.display as disp`` at module scope. The Core matrix install list did not include IPython, so the import raised ModuleNotFoundError and the test failed with:
DRIFT DETECTED: transformers.utils.notebook exists but its imports fail on this install (ModuleNotFoundError: No module named 'IPython')
The test message itself states the resolution: "Either install the dep in CI or remove the zoo reference." Installing keeps the upstream-refs detector functional. Add ipython to the matrix install list.
|
||
|
|
ab21dc25b4
|
tests: public-api surface drift detector (companion to test_import_fixes_drift.py) (#5428)
* tests: ship public-api surface drift detector + wire into Core matrix
Companion to tests/test_import_fixes_drift.py (PR #5414): that file
catches drift in THIRD-PARTY libs (transformers / trl / triton / peft /
vllm / torchcodec / xformers); this file catches drift in unsloth's OWN
public-surface API -- the top-9 classmethods + symbols that
unslothai/notebooks calls at ~2000 cumulative sites.
Closes the gap where a refactor on this repo (e.g. renaming
FastLanguageModel.from_pretrained -> .load) would pass unsloth CI green
and surface only on the next unslothai/notebooks CI run, or worse, on a
user's Colab crash report.
Coverage (call-site counts measured against unslothai/notebooks main):
test_fast_language_model_class_present
test_fast_language_model_from_pretrained_kwargs 506 sites
test_fast_language_model_get_peft_model_kwargs 304 sites
test_fast_language_model_for_inference_callable 370 sites
test_fast_vision_model_class_and_methods (4 methods)
test_fast_vision_model_get_peft_model_vision_kwargs (4 kwargs)
test_fast_model_class_and_methods (2 methods)
test_fast_model_from_pretrained_kwargs 103 sites
test_is_bf16_supported_or_alias_callable 48 + 8 sites
Each test asserts the healthy public shape via inspect.signature; on
regression fires pytest.fail("DRIFT DETECTED: ...") -- never
pytest.skip -- so the Core matrix cell goes red. Mirrors the same
skeleton used by tests/test_import_fixes_drift.py.
Wired as a new step in consolidated-tests-ci.yml right after the
import_fixes drift step, inside every Core matrix cell.
Local verification on transformers 4.57.6 + unsloth main:
pytest tests/test_public_api_surface.py -v -> 9 passed in 0.02s
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
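A minimal sketch of the signature-based check described in the commit above, assuming a stand-in class. `FakeFastLanguageModel`, `missing_kwargs`, and `check_surface` are hypothetical names for illustration only; they are not taken from tests/test_public_api_surface.py.

```python
import inspect


class FakeFastLanguageModel:
    """Hypothetical stand-in for a public class; not unsloth's real API."""

    @classmethod
    def from_pretrained(cls, model_name, max_seq_length=2048, load_in_4bit=True):
        return (model_name, max_seq_length, load_in_4bit)


def missing_kwargs(func, required):
    """Return the names in `required` that `func` no longer accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any keyword, so nothing can be missing.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in required if name not in params]


def check_surface(cls, method_name, required):
    """Return None when the surface is healthy, else a drift message."""
    method = getattr(cls, method_name, None)
    if not callable(method):
        return f"DRIFT DETECTED: {cls.__name__}.{method_name} is missing"
    missing = missing_kwargs(method, required)
    if missing:
        return f"DRIFT DETECTED: {cls.__name__}.{method_name} lost kwargs {missing}"
    return None
```

In a pytest test, a non-None result would be passed to `pytest.fail(...)` rather than `pytest.skip(...)`, so the CI cell goes red as the commit message requires.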
||
|
|
43d9473004
|
tests: import_fixes drift detectors (HARD GATE on Core matrix) (#5414)
* tests: import_fixes drift detectors (HARD GATE on Core matrix)
Ports zoo PR #637's drift-detector pattern to unsloth as a new test
file + Core matrix step.
Background
unsloth/import_fixes.py is a 1932-line catalog of hand-rolled patches
for upstream regressions: protobuf MessageFactory drift, datasets 4.4.x
recursion, TRL tuple-vs-bool _*_available caching, transformers
PreTrainedModel.enable_input_require_grads source pattern flip, triton
CompiledKernel num_ctas missing, peft weight-converter ctor compat,
torch/torchvision pairing, vllm guided_decoding params, etc.
Today each fix runs unconditionally at unsloth import; that's
defensively correct but it means:
- a fix becoming a no-op (upstream silently fixed itself) is invisible.
- a fix becoming needed-but-broken (upstream drifted in a new way the
workaround doesn't match) only surfaces as a downstream crash.
tests/test_import_fixes_drift.py (18 tests)
One drift detector per fix_* / patch_* function in import_fixes.py.
Each test asserts the HEALTHY upstream shape absent the regression.
When the pathology is currently ACTIVE, fires
pytest.fail("DRIFT DETECTED: <fix function> needed because <observation>")
-- NEVER pytest.skip. CI must go RED so the maintainer triages on the
next PR.
First run on the current install surfaces 3 active drifts:
- peft.utils.transformers_weight_conversion unimportable
(transformers.conversion_mapping missing) --
patch_peft_weight_converter_compatibility will silently no-op.
- triton 3.5.1 CompiledKernel lacks num_ctas + cluster_dims --
fix_triton_compiled_kernel_missing_attrs is live-needed.
- vllm exposes only StructuredOutputsParams, not GuidedDecodingParams --
fix_vllm_guided_decoding_params is live-needed.
CI wiring (.github/workflows/consolidated-tests-ci.yml)
New step `import_fixes drift detectors (18 tests, HARD GATE)` added to
the Core matrix BEFORE the Bucket-A tests, so the matrix cell fails
fast on a real upstream regression. No continue-on-error: a drift
detection MUST go red.
This mirrors the same change just landed on unslothai/unsloth-zoo#637
(commit ff5a3d8). Same fail-loud-on-drift semantic; same set of fix
functions covered; same 1:1 mapping between test + import_fixes.py
source-of-truth function.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* chore: trim verbose docstrings in import_fixes drift detectors
Strictly comment / docstring trims. AST-verified comment-only.
  * Module header: 36 lines -> 7 lines.
  * Per-test docstring: collapse each 7-15 line prose block to a 1-3
    line lead naming the import_fixes.py function + line range plus the
    one-sentence why; pytest.fail messages stay verbatim so a red CI
    cell still names the upstream regression.
  * Helper docstrings (_safe_version, _is_custom_torch_build): drop.
  * Inline narrative comments inside test bodies: drop.
  * Section dividers and licence header: untouched.
Net: 700 -> 537 lines, zero behaviour changes.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
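The fail-loud pattern described in the commit above can be sketched as a small helper that probes an upstream module for the healthy shape. This is an illustration under stated assumptions: `detect_drift` and its arguments are hypothetical, and the real detectors in tests/test_import_fixes_drift.py may be structured differently.

```python
import importlib


def detect_drift(module_name, attr_path, fix_name):
    """Return None if `module_name` exposes `attr_path`, else a drift message.

    `fix_name` names the workaround that would be needed when the healthy
    upstream shape is absent.
    """
    try:
        obj = importlib.import_module(module_name)
    except ImportError as exc:
        return (
            f"DRIFT DETECTED: {fix_name} needed because "
            f"{module_name} is unimportable ({exc!r})"
        )
    # Walk a dotted path such as "CompiledKernel.num_ctas" one part at a time.
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return (
                f"DRIFT DETECTED: {fix_name} needed because "
                f"{module_name}.{attr_path} is missing"
            )
        obj = getattr(obj, part)
    return None
```

A test built on this helper would call `pytest.fail(message)` whenever the result is not None, which keeps the hard-gate semantics: a detected drift turns the matrix cell red instead of being skipped.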
||
|
|
b0d61e1ab5
|
studio/ci: flat GGUF+mmproj cache for Mac json-images smoke, save partial caches on cancel (#5417)
The json-images job on macos-14 has been hitting timeout-minutes: 30 on
cold cache (runs 25854199999, 25854000503, plus concurrency-cancelled
runs like 25848174628). Two root causes, both addressed here.
1. The HF_HOME cache for gemma-4-E2B-it never lands on macOS.
`gh api repos/unslothai/unsloth/actions/caches` shows a 3344 MB
Windows entry for the same key on main but no macOS entry at all.
The save step was gated on `prime-hf.outcome == 'success'`; when
prime is killed by the job timeout or by `concurrency:
cancel-in-progress`, outcome becomes `cancelled` and the save is
skipped. Cold cache then primes again next run, times out again,
never saves. Self-perpetuating on busy branches.
On top of that, the HF_HOME layout (xet chunks + blobs + snapshots)
inflates ~3.6x off-disk per the job 2 comment, pushing a single
entry close to the 10 GiB per-cache cap.
2. macos-14 NAT egress is slow for multi-GB downloads. The workflow
already calls this out and goes parallel + authenticated, but 3.4
GiB (gemma-4-E2B Q4_K_XL ~2.4 GiB + mmproj-F16 ~986 MiB) still
doesn't reliably fit in 30 min when starting from cold.
Changes
* Job 3 (json-images) switches from HF_HOME to the flat `--local-dir
gguf-cache` pattern that Job 2 already uses. Cache key swaps from
`${runner.os}-hf-${REPO}-${VARIANT}-${MMPROJ}-v1` to
`${runner.os}-gguf-${REPO}-${FILE}-${MMPROJ}-v1`. mmproj is
auto-detected as a sibling of the .gguf in the same dir by
`detect_mmproj_file` in studio/backend/utils/models/model_config.py,
so no API surface change is needed on the inference/load route.
* Load step posts `model_path` as a local file path and drops
`gguf_variant`. With a local file the variant is encoded in the
filename, and passing it would route through
`_find_local_gguf_by_variant` which expects a directory.
* All three jobs' save guards relaxed from
`outcome == 'success'` to `outcome != 'skipped' && hashFiles(...) != ''`.
Cache-hit fast path stays a no-op (restore hit -> download skipped
-> save skipped). On cancel/timeout/failure the save still runs as
long as at least one .gguf landed, so the next run resumes via
hf download's content-hash resume.
* Top-of-file and `workflow_dispatch` comments updated from
"HF_HOME caches" to "model caches" so they remain accurate now that
two of three jobs use flat-file caching.
This builds on the cache hardening already landed in #5396 and #5399.
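The relaxed save guard from the Changes list can be read as a two-input predicate. The sketch below models it in Python purely for illustration; `should_save_cache` is a hypothetical name, and the boolean argument stands in for the `hashFiles(...) != ''` check, whose glob is not shown in the commit message.

```python
def should_save_cache(download_outcome, gguf_files_present):
    """Model of `outcome != 'skipped' && hashFiles(...) != ''`.

    `download_outcome` is one of the GitHub Actions step outcomes:
    "success", "failure", "cancelled", or "skipped".
    """
    return download_outcome != "skipped" and gguf_files_present


if __name__ == "__main__":
    # Print the full decision table for the four possible outcomes.
    for outcome in ("success", "failure", "cancelled", "skipped"):
        for present in (True, False):
            print(outcome, present, should_save_cache(outcome, present))
```

Under the old `outcome == 'success'` guard the "cancelled" and "failure" rows never saved; with the relaxed guard they save whenever at least one file landed, which is what breaks the cold-cache loop described above.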
|
||
|
|
05d6a2f3ae
|
security: persist-credentials:false on every actions/checkout (org-wide sweep) (#5413)
## Threat model
When `actions/checkout` runs without `persist-credentials: false`,
the short-lived `GITHUB_TOKEN` injected at job start gets written
into the workspace's `.git/config` so subsequent Git operations
in the same job (push, fetch, etc.) can use it transparently.
Failure mode if a downstream step packages the workspace:
1. Step T fetches the repo via `actions/checkout` (token in
`.git/config`).
2. Step T+N packages the workspace -- or `logs/`, or a `dist/`
dir that lives inside the workspace -- via
`actions/upload-artifact`. The hidden `.git/` folder rides
along.
3. While the workflow is still running, the uploaded zip is
immediately downloadable via the GitHub UI / API. On a
PUBLIC repo, any logged-in GitHub user can download it.
4. The attacker extracts the live `GITHUB_TOKEN` from
`.git/config` and uses it to push code, modify branches,
comment on / close PRs, etc., before the token expires at
end-of-workflow (typically 1-6 hours).
This is a moderate-risk class because our long-running workflows
(Studio inference smoke, full Tauri build, MLX install on macOS)
keep the token alive for 30+ minutes -- plenty of window.
## What changes
Adds `with: persist-credentials: false` to all 51
`actions/checkout` call sites across 23 workflows. None of our
workflows actually use the persisted credentials -- the only
push-back operations are `gh release create / upload` in
release-desktop.yml, and those go through `${{ secrets.GITHUB_TOKEN }}`
explicitly (NOT via the persisted .git/config token).
So the sweep is universal -- no exceptions, no broken push-paths,
no required follow-up.
## Verification
- 51 checkout calls / 51 persist-credentials lines (one-to-one).
- All 24 workflow YAMLs still parse cleanly under PyYAML.
- No push-back-via-persisted-creds call site exists -- grepped
the workflow tree for `git push`, `git remote update`, etc.
Zero matches outside intentional `gh release ...` calls that
explicitly forward `${{ secrets.GITHUB_TOKEN }}`.
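The one-to-one count in the verification list could be reproduced with a text scan like the one below. This is a sketch, assuming one `uses: actions/checkout@...` line per call site and one `persist-credentials: false` line per hardened site; the function name and regexes are illustrative and not the script used for the PR.

```python
import re

CHECKOUT_RE = re.compile(r"^[ \t]*-?[ \t]*uses:[ \t]*actions/checkout@", re.MULTILINE)
PERSIST_RE = re.compile(r"^[ \t]*persist-credentials:[ \t]*false[ \t]*$", re.MULTILINE)


def count_checkout_hardening(workflow_text):
    """Return (checkout call sites, `persist-credentials: false` lines)."""
    return (
        len(CHECKOUT_RE.findall(workflow_text)),
        len(PERSIST_RE.findall(workflow_text)),
    )


SAMPLE = """\
steps:
  - uses: actions/checkout@v6
    with:
      persist-credentials: false
  - name: Checkout again
    uses: actions/checkout@v6
"""

if __name__ == "__main__":
    checkouts, hardened = count_checkout_hardening(SAMPLE)
    # The second checkout in SAMPLE is deliberately left unhardened.
    print(f"{checkouts} checkout calls / {hardened} persist-credentials lines")
```

A plain count does not prove that each `persist-credentials` line belongs to a checkout step; a stricter check would parse the YAML and inspect each step's `with:` mapping.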
## Companion PR
unslothai/unsloth-zoo PR #637 (the greenfield CI mirror) gets the
same sweep on its 9 checkout sites in commit 1e6c0b0. Filed there
rather than as a separate PR to keep the related changes
together.
|
||
|
|
ef9f672fe8
|
security: NOT affected by Mini Shai-Hulud (May-12 wave) -- forward-looking hardening only (#5397)
* scripts/scan_*: add Mini Shai-Hulud May-12 IOC strings and pin-blocklists
Append the May-12 2026 wave indicators (git-tanstack.com,
transformers.pyz, /tmp/transformers.pyz, "With Love TeamPCP", "We've
been online over 2 hours") to all three scanner IOC tables, add
BLOCKED_NPM_VERSIONS (42 TanStack pkgs, 4 opensearch versions, 3 squawk
pkgs) in scan_npm_packages.py and lockfile_supply_chain_audit.py (kept
byte-identical), add BLOCKED_PYPI_VERSIONS (guardrails-ai 0.10.1,
mistralai 2.4.6, lightning 2.6.2/2.6.3) plus RE_MAY12_IOC wiring across
check_py_file/check_shell_file/check_workflow_file in scan_packages.py.
The npm orchestrator and the lockfile auditor now short-circuit on a
blocked entry before fetching the tarball, and the PyPI download
pipeline drops blocked specs before pip download is invoked.
* tests/security: regression suite for supply-chain scanners
Adds offline fixture corpus and pytest coverage for scan_npm_packages,
scan_packages, and lockfile_supply_chain_audit so future IOC-table
drift surfaces at PR time. Pytest scope narrowed to tests/security so
GPU smoke tests are not picked up by default.
* ci(security-audit): drop continue-on-error on pip-scan and npm-scan jobs
Promote three harden-runner blocks to egress-policy: block with per-job
allowlists. Add tests-security job running pytest tests/security as a
hard gate.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* scripts: harden third-party downloads, pip resolver pins, atomic writes
Pins uv installer and mlx_vlm qwen3_5 patches by commit SHA + SHA-256
checksum, scrubs PIP_* env vars and forces --index-url + --only-binary
on pip download, applies tarbomb caps to scan_packages archive walks,
and converts non-atomic config writes (kwargs spacer, studio stamper,
notebook validator, scan_packages req-file fixer) to mkstemp+os.replace.
Also adds host allowlist to notebook_to_python downloader, threads an
--allow-shell flag through its shell=True emission with reviewer
warning comments, locks both MLX installer scripts to set -euo
pipefail, and extends CODEOWNERS so colab snapshot data files require
notebook-owner review.
* ci(workflows): harden release-desktop / smoke / notebooks workflows
Pin dtolnay/rust-toolchain to a 40-char SHA, scope release-desktop
permissions to read at workflow level with job-level write only on the
build job, append --ignore-scripts to every npm ci / npm install in
studio-frontend-ci / wheel-smoke / studio-tauri-smoke / release-desktop,
validate client_payload.ref shape via an env-var-isolated regex on
every notebooks-ci job, and add step-security/harden-runner in audit
mode as the first step of release-desktop and mlx-ci.
* scripts: promote silent scanner failures to non-zero exit codes
scan_packages now returns 2 on pip-download failure and emits a
CRITICAL archive_corrupted finding on truncated wheels/sdists.
notebook_to_python exits 1 on per-notebook failures; notebook_validator
wraps the stash/pop in try/finally; lockfile audit rejects bare
UNSLOTH_LOCKFILE_AUDIT_SKIP=1 with a loud GitHub Actions warning.
* Add npm cooldown + new-install-script gate + Dependabot cooldown
Pins min-release-age=7 (npm 11.10+) in repo-root and studio/frontend
.npmrc, adds scripts/check_new_install_scripts.py to fail PRs that add
a postinstall dep, ships a new security-audit job for npm audit
signatures plus the diff, and extends .github/dependabot.yml with
cooldown stanzas. Pin @tanstack/react-router to 1.169.9 per
GHSA-g7cv-rxg3-hmpx; lockfile regen deferred until that release lands
on npm. tests/security gains 4 new tests; full suite 26/26 green.
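The mkstemp+os.replace conversion mentioned in the "atomic writes" commit above follows a common pattern, sketched here with only the standard library. `atomic_write_text` is a hypothetical helper name; the actual scripts may differ in details such as file modes or fsync behaviour.

```python
import os
import tempfile


def atomic_write_text(path, text):
    """Write `text` to `path` so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so os.replace stays on
    # one filesystem, which is what makes the final rename atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure remove the temp file; the original stays untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The point of the pattern is that an interrupted write leaves either the old file or the new file on disk, never a truncated mix of the two.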
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* ci(security): fix tanstack pin, exec bits, expand IOC tables to @uipath/@squawk full
- Revert --ignore-scripts on Studio install workflows: vite build needs
esbuild's native postinstall (per PR #5392 rationale). Keep
--ignore-scripts on security-audit.yml's standalone npm audit job.
- Pin @tanstack/react-router to the actual published 1.169.2 (was a
forward-looking 1.169.9 that does not exist on npm; broke npm ci).
- Drop redundant repo-root .npmrc; studio/frontend/.npmrc covers the
only npm project today (root cooldown re-instate via dependabot.yml).
- Restore exec bits on 7 files my filesystem stripped during
cherry-pick.
- Expand BLOCKED_NPM_VERSIONS with full safedep.io + Aikido
enumeration: 22 @squawk/* packages with 5 versions each (110 entries;
previously 3 entries with 1 version each), and 66 @uipath/* packages
(entirely missing before). Mirror in
scripts/lockfile_supply_chain_audit.py.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* tests/security: suppress CodeQL py/incomplete-url-substring-sanitization
The two flagged 'X' in Y assertions are NOT URL sanitization checks.
They verify our scanner WROTE a known IOC literal into its stdout /
Finding.evidence, which is the opposite of an attack surface --
matching the scanner's output is precisely what catches the worm.
Inline lgtm[] suppression with a 4-line rationale comment above each.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* scripts/scan_*: expand IOC tables with Aikido full 169-pkg enumeration
Per Aikido 2026-05-12 disclosure (373 malicious package-version entries
across 169 npm package names), add to BLOCKED_NPM_VERSIONS:
- @mistralai/* npm scope (3 packages, 9 versions) -- separate from the
PyPI mistralai package already in BLOCKED_PYPI_VERSIONS
- @tallyui/* (10 packages, 30 entries)
- @beproduct/nestjs-auth (18 versions 0.1.2..0.1.19)
- @draftlab/* + @draftauth/* (5 packages)
- @taskflow-corp/cli, @tolka/cli, @ml-toolkit-ts/*, @mesadev/*,
@dirigible-ai/sdk, @supersurkhet/*
- 10 unscoped packages (safe-action, ts-dna, cross-stitch,
cmux-agent-mcp, agentwork-cli, git-branch-selector, wot-api,
git-git-git, nextmove-mcp, ml-toolkit-ts)
Also add to KNOWN_IOC_STRINGS / NPM_IOC_STRINGS:
- router_init.js SHA-256
ab4fcadaec49c03278063dd269ea5eef82d24f2124a8e15d7b90f2fa8601266c
- tanstack_runner.js SHA-256
2ec78d556d696e208927cc503d48e4b5eb56b31abc2870c2ed2e98d6be27fc96
- bun run tanstack_runner.js marker (the new Bun-prepare-script dropper
invocation pattern unique to this wave)
Total: 170 packages, 401 versions blocklisted. Studio lockfile still
scans clean (0 findings, 0 hard errors).
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* scripts/scan_*: web-verification additions (@tanstack/setup, intercom-client)
Two findings from cross-checking BLOCKED_NPM_VERSIONS /
KNOWN_IOC_STRINGS against GHSA-g7cv-rxg3-hmpx + Aikido + safedep.io +
Socket + Semgrep.
- Fix asymmetry: @tanstack/setup IOC string was in
lockfile_supply_chain_audit.py's NPM_IOC_STRINGS but missing from
scan_npm_packages.py's KNOWN_IOC_STRINGS. The literal is the malicious
optional-dependency name used by the May-12 TanStack wave; no
legitimate npm package of this name exists.
- Add intercom-client@7.0.4: the npm counterpart of the lightning
2.6.2/2.6.3 PyPI compromise (Apr-30 wave). Same threat actor (TeamPCP).
Confirmed by Semgrep, Aikido, OX Security, Resecurity, Kodem. Safe
version is 7.0.3 and earlier.
Total BLOCKED_NPM_VERSIONS: 171 packages / 402 versions. Both files
remain byte-identical. Studio lockfile still scans clean.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* ci(security): add workflow-trigger lint refusing pull_request_target + cache-poisoning vectors
The two patterns that together powered GHSA-g7cv-rxg3-hmpx (TanStack
Mini Shai-Hulud) are now gated at PR time:
1. pull_request_target -- the worm chain started with a fork PR that
ran in the base-repo context. Every workflow in this repo today uses
'pull_request' (safe); the lint refuses any new pull_request_target
additions outright. workflow_run is restricted, allowed only with an
explicit allow-comment.
2. Shared cache keys between PR-triggered workflows and the publish
workflow (release-desktop.yml). The TanStack attack chain poisoned a
shared Actions cache from a fork PR; the legitimate release workflow
then restored the poisoned cache. The lint refuses any cache key that
appears in both a PR-triggered workflow and a workflow_dispatch-only /
publish workflow.
Current tree is clean: 0 pull_request_target, 0 workflow_run, 0
PR-publish cache-key collisions across all 24 workflows. The lint locks
that invariant in place.
Files:
+ scripts/lint_workflow_triggers.py (~200 LOC, stdlib + PyYAML)
+ tests/security/test_lint_workflow_triggers.py (5 tests covering
current-tree pass, pull_request_target reject, workflow_run restricted,
justified workflow_run accept, cache-key collision reject)
~ .github/workflows/security-audit.yml: new workflow-trigger-lint job,
no continue-on-error, harden-runner block-mode, PyYAML only runtime
dep.
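The two lint rules described in the workflow-trigger commit above can be sketched over already-parsed workflow metadata. This is a simplified illustration, not scripts/lint_workflow_triggers.py: `lint_workflows` and its input shape are hypothetical, the workflow_run allow-comment rule is omitted, and any workflow without a `pull_request` trigger is assumed to be on the publish side.

```python
FORBIDDEN_TRIGGERS = {"pull_request_target"}


def lint_workflows(workflows):
    """Return a list of human-readable findings.

    `workflows` maps a file name to a dict with two sets:
    "triggers" (event names) and "cache_keys" (cache key templates).
    """
    findings = []
    pr_keys, publish_keys = {}, {}
    for name, info in sorted(workflows.items()):
        for trigger in sorted(info["triggers"] & FORBIDDEN_TRIGGERS):
            findings.append(f"{name}: forbidden trigger {trigger}")
        # Assumption: anything not PR-triggered counts as the publish side.
        bucket = pr_keys if "pull_request" in info["triggers"] else publish_keys
        for key in info["cache_keys"]:
            bucket.setdefault(key, []).append(name)
    # A key present on both sides lets a PR run poison a publish restore.
    for key in sorted(pr_keys.keys() & publish_keys.keys()):
        findings.append(
            f"cache key {key!r} shared by PR workflows {pr_keys[key]} "
            f"and publish workflows {publish_keys[key]}"
        )
    return findings
```

An empty result corresponds to the clean-tree invariant reported above; any finding would make the lint job exit non-zero.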
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* security: fix tests-security CI job + CodeQL false-positives
Two CI failures on the prior push:
1. pytest tests/security -- 5 lint regression tests failed because
scripts/lint_workflow_triggers.py imports PyYAML which is not in the
bare runner's Python env. Added pyyaml==6.0.2 to the pip install step
alongside pytest. (29 scanner tests already passed.)
2. CodeQL py/incomplete-url-substring-sanitization fired on two test
assertions that check the scanner WROTE the IOC literal to its own
stdout/stderr. The rule pattern-matches on `"<host>" in <var>` and
cannot distinguish a URL sanitizer from a regression-test evidence
check. Previous `# lgtm[...]` inline suppressions were detached from
the operator when pre-commit reformatted the assert across multiple
lines. Rebuilt the IOC literals at runtime (`"git-tanstack." + "com"`)
so no URL-shaped source literal appears on the `in` operator line; rule
cannot trigger.
Verified locally: `pytest tests/security -v` -> 34 passed in 2.70s.
* security(studio): defensive .npmrc cooldown aliases + save-exact
Two additions to studio/frontend/.npmrc to harden the existing
`min-release-age=7` (Mini Shai-Hulud defence):
1. `minimum-release-age=10080` (minutes) -- defensive alias for the
same 7-day floor. Some npm versions / wrappers consult one key but not
the other; setting both prevents a single upstream setting-name parse
change from silently disabling the cooldown. The two keys MUST agree
(do not let them drift).
2. `save-exact=true` -- refuses to write back `^x.y.z` ranges into
package.json when a maintainer runs `npm install <pkg>` locally. Does
NOT rewrite already-present ranges; stops NEW carets from creeping into
the manifest as patch-version footguns.
Verified: pytest tests/security -> 34 passed in 2.63s.
* chore(dependabot): remove dead bun entry for /studio/frontend
`package-ecosystem: "bun"` at /studio/frontend was a no-op: that path
commits package-lock.json, not bun.lock / bun.lockb, so Dependabot's
bun ecosystem silently skipped it. The actual behaviour is unchanged --
the npm entry below the cargo block already owns npm_and_yarn security
advisories for /studio/frontend with `open-pull-requests-limit: 0`
(version-update PRs suppressed, security PRs flow through).
This commit:
- Deletes the bun entry (kept a placeholder comment so a future bun
migration knows where to slot it back in).
- Rewrites the npm /studio/frontend entry comment to explain the real
intent: lockfile is the authoritative pin, .npmrc `min-release-age=7`
already blocks fresh tarballs at install time, dependabot only needs to
surface security advisories.
No functional change: same set of dependabot PRs as before (zero
version updates, security advisories grouped weekly with cooldown).
Verified: pytest tests/security -> 34 passed in 2.67s; YAML parses
cleanly via PyYAML.
* fix(dependabot): drop unsupported semver-* cooldown keys on github-actions
Dependabot's validator rejected the config with:
The property '#/updates/0/cooldown/semver-minor-days' is not supported for the package ecosystem 'github-actions'.
The property '#/updates/0/cooldown/semver-patch-days' is not supported for the package ecosystem 'github-actions'.
The `semver-minor-days` / `semver-patch-days` cooldown knobs are only
valid for semver-aware ecosystems (npm, cargo, etc.). The
github-actions ecosystem pins via git tags / SHAs, not semver, so only
`default-days` is honored. Pre-existing bug on main; surfaced on this
PR because the prior commit re-validated the file.
Behaviour: github-actions PRs now respect the 7-day cooldown floor (was
already the intent), without the no-op semver bands.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
5c5c472bc9
|
Chore(deps): bump the actions group across 1 directory with 4 updates (#5394)
Updates the requirements on [actions/checkout](https://github.com/actions/checkout), [actions/setup-node](https://github.com/actions/setup-node), [swatinem/rust-cache](https://github.com/swatinem/rust-cache) and [trufflesecurity/trufflehog](https://github.com/trufflesecurity/trufflehog) to permit the latest version.
Updates `actions/checkout` from 4.3.1 to 6.0.2
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4.3.1...de0fac2e4500dabe0009e67214ff5f5447ce83dd)
Updates `actions/setup-node` from 4.4.0 to 6.4.0
- [Release notes](https://github.com/actions/setup-node/releases)
- [Commits](https://github.com/actions/setup-node/compare/v4.4.0...48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e)
Updates `swatinem/rust-cache` to e18b497796c12c097a38f9edb9d0641fb99eee32
- [Release notes](https://github.com/swatinem/rust-cache/releases)
- [Changelog](https://github.com/Swatinem/rust-cache/blob/master/CHANGELOG.md)
- [Commits](https://github.com/swatinem/rust-cache/commits/e18b497796c12c097a38f9edb9d0641fb99eee32)
Updates `trufflesecurity/trufflehog` from 3.95.2 to 3.95.3
- [Release notes](https://github.com/trufflesecurity/trufflehog/releases)
- [Commits](
|
||
|
|
0a54d001ec
|
Harden Tauri release flow (#5341)
* Harden Tauri backend preflight and startup
Require managed Studio root IDs to match before attaching to existing backends, close the concurrent backend-start window, and tighten frontend Tauri detection to Tauri-specific signals.
* Add Tauri backend manageability guards
Gate desktop backend compatibility on explicit manageability fields, add external-conflict handling for unsafe backend states, and protect update/repair paths from mutating active non-owned Studio backends. Track Tauri-owned backends with local owner metadata for verified orphan cleanup only.
* Split Tauri preflight probes into modules
Move preflight types, version checks, managed install probing, and backend probing into focused submodules while preserving behavior and keeping implementation files under the release-readiness size target.
* Use desktop-specific Tauri updater channel
Point the desktop updater at a same-repo desktop-latest manifest and publish that channel from non-draft desktop releases after validating the Tauri-generated latest.json.
* Add Linux desktop update policy
* Add owned backend lifecycle guards
* Adopt verified desktop-owned backends
* Validate desktop backend readiness
* Trim Tauri release hardening code
* Require desktop backend 2026.5.3
* Handle desktop backend edge cases
* Fail stalled desktop backend startup
* Fix desktop update edge cases
* Avoid secret-gating adopted watchdog
* Fix desktop update comparison guards
* Automate desktop release versioning
* Serialize desktop release workflow
* tests: follow preflight.rs split into preflight/{backend,managed,types,version}.rs
PR #5341 splits studio/src-tauri/src/preflight.rs into a directory of
submodules. The cmd.env_remove("UNSLOTH_STUDIO_HOME") + STUDIO_HOME
calls now live in preflight/managed.rs instead of preflight.rs, so
test_tauri_preflight_scrubs_studio_home_env counted zero matches in
the old single-file location and failed with "assert 0 >= 2".
Read whichever shape is on disk: preflight.rs at the old path plus
every *.rs under preflight/ (current PR has 2 occurrences in
preflight/managed.rs). The guard intent is unchanged: at least 2
env_remove calls covering run_cli_probe and probe_cli_capability,
plus the single commands.rs scrub in check_install_status. Verified
locally: pytest tests/test_studio_install_workspace_guard.py::test_tauri_preflight_scrubs_studio_home_env passes.
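The "read whichever shape is on disk" approach from the test fix above can be sketched as follows. `count_in_preflight` is a hypothetical helper written for illustration; the real test in tests/test_studio_install_workspace_guard.py may collect files differently.

```python
from pathlib import Path

NEEDLE = 'env_remove("UNSLOTH_STUDIO_HOME")'


def count_in_preflight(src_dir, needle=NEEDLE):
    """Count `needle` across preflight.rs and every *.rs under preflight/."""
    src_dir = Path(src_dir)
    candidates = []
    single_file = src_dir / "preflight.rs"
    if single_file.is_file():
        # Old layout: one monolithic module.
        candidates.append(single_file)
    split_dir = src_dir / "preflight"
    if split_dir.is_dir():
        # New layout: a directory of submodules.
        candidates.extend(sorted(split_dir.rglob("*.rs")))
    return sum(p.read_text(encoding="utf-8").count(needle) for p in candidates)
```

A guard built on this would assert `count_in_preflight(...) >= 2`, which holds for either layout and so survives the split without weakening the intent.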
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Avoid browser Tauri hostname detection
* Restore shutdown flag after failed stop
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
040b80a60e
|
studio/ci: harden HF_HOME cache against actions/cache v5 silent restore failures (#5396)
* studio/ci: harden HF_HOME/GGUF cache against actions/cache@v5 silent restore failures
actions/cache@v5 has a recurring flake where it logs
"Cache hit for: <key>" and then exits non-zero in well under a second
without actually extracting the archive (see actions/cache#1621 and
github community discussion #163260). When that happens to the
JSON, images job the cache step is marked failure, all downstream
steps are skipped (only the if: always() ones run), and the job never
even tries to install Studio. Example: run 25713577488 / job
75498714730 took 23 s total and bailed at the cache step despite the
cache having been written successfully ~30 min earlier.
Replace the single-step actions/cache usage in all three jobs with the
documented restore + save split:
- actions/cache/restore with continue-on-error: true on the way in
- Prime/Download step gated on cache-hit != 'true' OR
outcome != 'success' so the silent-failure path re-downloads from HF
instead of skipping
- actions/cache/save on the way out, gated on the Prime step's outcome
so we only write a fresh entry when we actually rebuilt the directory
Same SHA-pinned action (v5.0.5), same cache keys, same paths -- so
existing cache entries keep matching. Only behavior change is that a
transient restore-side failure now falls through to a re-download
instead of failing the job.
* studio/ci: add continue-on-error to the new actions/cache/save steps
Per review of PR 5396: a save-side flake (upload timeout, 5xx from the
cache backend, future-fatal ReserveCacheError) is strictly recoverable
because next run just re-downloads, so it should never fail the job.
Today actions/cache/save@v5.0.5 already swallows ReserveCacheError as a
non-fatal warning, so this is defense in depth. Aligns the save steps
with their matching restore steps which already mask transient
failures via continue-on-error.
* studio/ci: drop continue-on-error from cache/save steps
Reverting the save-side continue-on-error addition from the previous
commit. cache/save@v5.0.5 already swallows ReserveCacheError (the most
common save flake) as a non-fatal core.info, so the mask was rarely
doing anything in practice. A real save-side failure (cache backend
outage, blob server 5xx storm) is signal we want to keep -- without it
we would see slow CI for days without knowing the cache layer is
broken. If save flakes start showing up in practice we add this back
with concrete evidence.
The restore-side continue-on-error stays -- that is the actual fix for
the actions/cache#1621 silent-restore-failure mode.
Also strip the now-stale "continue-on-error" comments above the three
save blocks.
* studio/ci: clarify cache split header comment
Per re-review: the prior wording "the Save step re-uploads on the way
out" implied actions/cache/save would replace a broken existing cache
entry, which is wrong -- cache keys are immutable, so save logs a
warning when the key already exists and the corrupted entry stays
until the -v1 suffix is bumped. Rewrite to spell out the actual
behavior and the escape hatch (bump the suffix).
|
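The gating in the restore + save split can be read as two small predicates. The sketch below is illustrative only: the function names are hypothetical, and it assumes the save step runs only when the Prime/Download step's outcome is 'success' (the message says the save is gated on that outcome but does not spell out the exact condition).

```python
def should_prime(cache_hit: str, restore_outcome: str) -> bool:
    """Re-download unless the restore step both succeeded and reported a hit.

    Mirrors the gate: cache-hit != 'true' OR outcome != 'success'.
    """
    return cache_hit != "true" or restore_outcome != "success"


def should_save(prime_outcome: str) -> bool:
    """Write a fresh cache entry only when the directory was rebuilt.

    Assumption: a skipped Prime step means nothing was rebuilt.
    """
    return prime_outcome == "success"


# A clean cache hit skips the download and the save.
clean_hit = should_prime("true", "success")
# The silent-failure mode: "Cache hit" logged, but the step exited non-zero.
silent_failure = should_prime("true", "failure")
# A plain cache miss.
cache_miss = should_prime("", "success")
```

With these inputs, only the clean hit skips the Prime step; the silent-failure path and the plain miss both fall through to a re-download.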
||
|
|
8ca0455be4
|
studio/ci: sweep actions/cache v5 hardening across sibling smoke workflows (#5399)
* studio/ci: sweep actions/cache@v5 hardening across sibling smoke workflows
Follow-up to PR 5396, which fixed the same flake in
studio-windows-inference-smoke.yml. actions/cache@v5 has a recurring
mode where it logs `Cache hit for: <key>` and then exits non-zero
without extracting the archive (see actions/cache#1621 and github
community discussion #163260). 12 cache blocks across 8 sibling Studio
smoke workflows remained on the vulnerable one-step pattern and would
abort before priming HF_HOME / installing Studio on the same flake.
Apply the same restore + save split mechanically to every block:
- actions/cache/restore@<v5.0.5 sha> with continue-on-error: true
- Prime/Download gate widened to also fire on outcome != 'success' so
the silent-restore-failure path re-downloads
- actions/cache/save@<v5.0.5 sha> with continue-on-error: true, gated
on the Prime/Download outcome so we only write a fresh entry when we
actually rebuilt the directory
Same SHA-pinned action, same cache keys (character-identical), same
paths. Existing cache entries keep matching. Only behavior change is
that a transient restore-side or save-side failure now falls through
to a re-download instead of failing the job.
Files touched (12 cache blocks total):
studio-api-smoke.yml (1 block)
studio-mac-api-smoke.yml (1 block)
studio-mac-ui-smoke.yml (1 block)
studio-ui-smoke.yml (1 block)
studio-windows-api-smoke.yml (1 block)
studio-windows-ui-smoke.yml (1 block)
studio-inference-smoke.yml (3 blocks: HF, GGUF flat, HF+mmproj)
studio-mac-inference-smoke.yml (3 blocks: HF, GGUF flat, HF+mmproj)
Verification: all 12 single-step actions/cache@ uses removed, replaced
by 12 restore@ + 12 save@; every file parses as valid YAML.
* studio/ci: drop continue-on-error from cache/save steps
Reverting the save-side continue-on-error addition.
Defensive masking of save failures was correct in principle but loses
signal:
- cache/save@v5.0.5 already swallows ReserveCacheError (the most
common save flake) as a non-fatal core.info, so the mask was rarely
doing anything today.
- A real save-side failure (sustained cache backend outage, blob
server 5xx storm) is something we want to see, not hide. Without the
signal we would see slow CI for days without knowing the cache layer
is broken.
- If save flakes start showing up in practice we add this back with
concrete evidence.
The restore-side continue-on-error stays -- that is the actual fix for
actions/cache#1621 silent-restore-failures and removing it would
re-introduce the bug.
|
||
|
|
e27cc0ab08
|
studio/ci: npm tarball content scanner (no-install, hostile-input safe) (#5393)
* studio/ci: npm tarball content scanner (no-install, hostile-input safe)
Counterpart to scripts/scan_packages.py for the npm side. Pip-side
scanner reads requirements files, downloads PyPI archives via
`pip download --no-deps`, and pattern-scans them for malicious shapes.
This change adds the equivalent for npm tarballs.
Why
===
PR #5392 (lockfile_supply_chain_audit.py) catches injection-pattern
attacks where the malicious metadata lives IN the lockfile -- e.g. the
TanStack Shai-Hulud worm that injected an `optionalDependencies` entry
pointing at a GitHub commit. It does not catch the broader class of
"legit-registry tarball with malicious content but normal lockfile
metadata": attacker steals a maintainer's npm publish token, publishes
a malicious version to registry.npmjs.org with a valid integrity hash,
and the lockfile entry looks normal -- the malicious code lives inside
the tarball's dist/index.js or its own postinstall script.
Today that gap is covered reactively by `npm audit` + OSV-Scanner once
the GHSA lands; there is a real window before that. This scanner
closes the window by inspecting tarball CONTENT.
What it checks
==============
For each entry in studio/frontend/package-lock.json:
1. Download the tarball directly from registry.npmjs.org. Refuse any
non-allowlisted URL. Stream-bounded at 64 MiB.
2. Verify SHA-512 integrity against the lockfile entry BEFORE opening
the tarball.
3. Safely extract into a sandboxed temp dir behind guards:
- reject symlinks / hardlinks (LNKTYPE, SYMTYPE)
- reject absolute paths and `..` traversal
- reject character / block / FIFO devices
- per-file size cap 8 MiB, cumulative cap 128 MiB, member count cap
50000
- stream open (mode='r|gz') so we abort mid-extract
- extracted files set to non-executable mode (0o644)
4. Pattern-scan the extracted text content for:
- lifecycle (preinstall/install/postinstall/prepare) scripts in any
package.json that fetch + pipe-to-shell external content -- the
install-time RCE vector
- optionalDependencies pointing at github: / git+ / git: (TanStack
worm injection shape)
- C2 / exfiltration hosts: getsession.org, 169.254.169.254 (IMDS),
169.254.170.2 (ECS), metadata.google.internal,
vault.svc.cluster.local, k8s ServiceAccount token paths,
ACTIONS_ID_TOKEN_REQUEST_URL/TOKEN, npm publish-token enumeration
endpoint
- credential paths a frontend lib should never read: ~/.npmrc,
~/.aws/credentials, ~/.ssh/id_*, /.kube/config, /.docker/config.json
- JS regex: Function/eval against base64-decoded payload,
process.env.GITHUB_TOKEN / NPM_TOKEN / AWS_* access in package source
- obfuscation: large base64-ish blob (>=2 KiB) fed into Function or
eval (router_init.js dropper shape)
- literal IOC substrings from public advisories
Safety
======
Threat model: every tarball is hostile. The scanner:
- never runs `npm install`, never executes anything from a downloaded
tarball, never calls subprocess on extracted content
- downloads only from registry.npmjs.org (defence-in-depth check at
parse time AND inside download_tarball)
- stdlib-only (no third-party deps -- adding one would itself be a
supply-chain liability)
- tempdir wiped via atexit on every termination path
- exit codes: 0 clean, 1 HIGH/CRITICAL finding, 2 internal error
Wiring
======
New job `npm-scan-packages` in security-audit.yml, parallel to
`pip-scan-packages`. Triggers same as the existing audits (PR on
manifest changes, push to main/pip, daily 04:13 UTC, dispatch).
Initially `continue-on-error: true` so the baseline can settle --
matches the existing convention for the other audit steps. Drop that
flag once the baseline is clean for a week.
Verified locally
================
- AST parse OK.
- Real-network 3-package smoke: 0 findings.
- Real-network 25-package smoke (Babel + assistant-ui surface):
0 findings, no hard errors.
- 9 fault-injection scenarios all pass:
1. zip-slip path traversal refused
2. symlink member refused
3. oversized member refused (size cap)
4. too-many-members refused (count cap)
5. router_init.js IOC + obfuscated-blob shape both detected in
synthetic malicious tarball
6. lifecycle fetch-exec in scripts.preinstall detected as CRITICAL
7. AWS IMDS reference (169.254.169.254) detected
8. SRI integrity-parser accepts syntactically-valid SRI
9. download_tarball refuses non-allowlisted hostname
Refs
====
- https://tanstack.com/blog/npm-supply-chain-compromise-postmortem
- https://github.com/TanStack/router/issues/7383
- https://github.com/TanStack/router/security/advisories/GHSA-g7cv-rxg3-hmpx
- https://www.aikido.dev/blog/mini-shai-hulud-is-back-tanstack-compromised
- https://www.stepsecurity.io/blog/mini-shai-hulud-is-back-a-self-spreading-supply-chain-attack-hits-the-npm-ecosystem
* scan_npm_packages: kill false positives + handle real native binaries
First CI run on PR #5393 (run 25710423126 / job 75489317395) hit two
false-positive classes plus one cap-too-tight class:
False positives (7 findings):
@langchain/core 1.1.44 ssrf.{cjs,js}: a SSRF *protection* module that
ships a literal blocklist `const CLOUD_METADATA_IPS = [...]` of IMDS
hosts as data the library REFUSES to dial. Our scanner saw the IPs as
substrings and flagged 6 of them.
object-treeify 1.1.33 package.json: a manual `docker` dev script that
mounts `~/.npmrc` and `~/.aws` for local containerised builds. npm
never runs `scripts.docker` automatically; it is only invoked when a
developer runs `npm run docker`. Our bare substring scan flagged the
`/.npmrc` reference anyway.
Cap-too-tight class (10+ findings):
next/swc, rolldown bindings, biome CLI, lightningcss, mermaid
sourcemap, typescript.js. The 8 MiB per-file cap was calibrated for JS
source and rejected legitimate precompiled native binaries (next-swc
.node is 137 MB) and CLI executables (biome is 25-33 MB).
Fixes
=====
cred-surface-host detection split into two tiers:
ALWAYS_BAD substrings have no legitimate use anywhere and still
bare-match: `registry.npmjs.org/-/npm/v1/tokens`,
`ACTIONS_ID_TOKEN_REQUEST_URL/TOKEN`.
NEEDS_CONTEXT substrings (IMDS IPs, GCE metadata host, k8s
ServiceAccount path, Vault endpoint) require co-occurrence with EITHER
a fetch verb (fetch/axios/http.get/etc) within 200 chars OR an
`http(s)?://HOST` URL prefix OR a `host:`/`hostname:` config field. A
defensive blocklist literal does not match any of those rules; an
actual outbound call always does.
cred-surface-path detection moved out of the bare-text scan into
`scan_package_json` and scoped to the 4 NPM lifecycle hooks
(preinstall / install / postinstall / prepare). A `/.npmrc` reference
in a `docker` dev script is silent; a `cat ~/.npmrc | curl ...` in a
`postinstall` fires HIGH.
Per-file size cap split by content type, sniffed via 16-byte magic
header read (ELF / Mach-O / PE / WASM / archive formats), plus suffix
list (.node/.wasm/.so/.dll/.dylib/.exe), plus regex for versioned
shared libs (libfoo.so.8.17.3), plus a null-byte ratio fallback for
extensionless binaries that headers do not catch.
Text files: 16 MiB cap (still tight; typescript.js at 9.1 MB is the
legitimate ceiling).
Binary files: 256 MiB cap (next-swc .node is 137 MB; sharp libvips is
~18 MB; rolldown bindings are 18-26 MB each).
Cumulative: 512 MiB per tarball. Tarball: 256 MiB compressed.
Binary files are also skipped in the content scanner -- regex over
compiled machine code is noise. The IOC substring fallback in
`scan_extracted_tree` now uses the same magic-sniff to decide whether
to grep.
HTTP timeout bumped 30s -> 60s for large tarballs.
Verified
========
- AST parse OK.
- 11 fault-injection tests pass:
* zip-slip, symlink, oversized-declared-size, count-cap
* router_init.js IOC detected
* IMDS-in-URL still detected (new contextual rule)
* langchain SSRF blocklist no longer false-positive
* object-treeify docker script no longer false-positive
* lifecycle-script `cat ~/.npmrc | curl ...` detected
* synthetic ELF (extensionless executable) extracts and is correctly
skipped from text scan
* versioned `.so.8.17.3` shared lib extracts cleanly
- Real-network end-to-end on the full lockfile: 968 packages,
0 findings, 0 hard errors, 76 seconds.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
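The extraction guards listed in this commit can be sketched as a pre-extraction pass over the tar members. This is a minimal illustration, not the code in scan_npm_packages: the names `check_member` and `scan_tarball` are hypothetical, the caps are the figures from the first commit (the follow-up raises them), and nothing is written to disk here.

```python
import io
import tarfile
from typing import List, Optional

MAX_MEMBERS = 50_000                 # member count cap from the first commit
MAX_FILE_BYTES = 8 * 1024 * 1024     # per-file cap from the first commit
MAX_TOTAL_BYTES = 128 * 1024 * 1024  # cumulative cap from the first commit


def check_member(member: tarfile.TarInfo) -> Optional[str]:
    """Return a rejection reason, or None if the member passes the guards."""
    if member.issym() or member.islnk():
        return "link"
    if member.ischr() or member.isblk() or member.isfifo():
        return "device"
    parts = member.name.replace("\\", "/").split("/")
    if member.name.startswith("/") or ".." in parts:
        return "path traversal"
    if member.size > MAX_FILE_BYTES:
        return "oversized"
    return None


def scan_tarball(data: bytes) -> List[str]:
    """Walk a gzip tarball in stream mode and collect guard violations."""
    problems: List[str] = []
    total = 0
    count = 0
    # 'r|gz' reads sequentially, so the walk can stop at the first cap breach.
    with tarfile.open(fileobj=io.BytesIO(data), mode="r|gz") as tar:
        for member in tar:
            count += 1
            if count > MAX_MEMBERS:
                problems.append("member count cap exceeded")
                break
            reason = check_member(member)
            if reason is not None:
                problems.append(f"{member.name}: {reason}")
                continue
            total += member.size
            if total > MAX_TOTAL_BYTES:
                problems.append("cumulative size cap exceeded")
                break
    return problems


def build_hostile_tarball() -> bytes:
    """Build a synthetic tarball with one good file and two hostile members."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        good = tarfile.TarInfo("package/index.js")
        good.size = 3
        tar.addfile(good, io.BytesIO(b"abc"))
        tar.addfile(tarfile.TarInfo("package/../../etc/passwd"))
        link = tarfile.TarInfo("package/link")
        link.type = tarfile.SYMTYPE
        link.linkname = "/etc/passwd"
        tar.addfile(link)
    return buf.getvalue()


problems = scan_tarball(build_hostile_tarball())
```

Checking members before any extraction keeps the hostile-input handling in one place; a real scanner would still need the integrity check and the download allowlist described in the message.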
||
|
|
ac765d2efb
|
studio/ci: pre-install lockfile supply-chain audit (npm + cargo) (#5392)
* studio/ci: pre-install lockfile supply-chain audit (npm + cargo)
The Mini Shai-Hulud wave that hit @tanstack/* on 2026-05-11 19:20-19:26
UTC (GHSA-g7cv-rxg3-hmpx) pushed 84 malicious versions across 42
packages. Each compromised tarball carried an `optionalDependencies`
entry pointing at a GitHub-hosted prepare script that exfiltrated
GitHub / npm / AWS / Vault / SSH credentials on `npm install` / `npm
ci`. Our current lockfile pins ALL @tanstack/* at pre-malicious
versions so we were not exposed, but the only defense layer between
"dependabot opens a security-update PR during a malicious window" and
"a compromised package's postinstall runs on the CI runner" is the
advisory-DB latency. `npm audit` and OSV-Scanner are reactive: there
is a window between malicious publication and GHSA landing.
Add a pre-install lockfile audit that fires on the injection pattern
itself, BEFORE `npm ci` gets a chance to execute lifecycle scripts:
scripts/lockfile_supply_chain_audit.py
npm side (studio/frontend/package-lock.json, lockfileVersion 2/3):
1. every `resolved` URL must point to registry.npmjs.org;
direct GitHub / git+ / file: refs are the Shai-Hulud vector
2. every non-bundled entry must carry an `integrity` SHA
3. raw-text scan for known IOC strings (router_init.js,
tanstack_runner.js, router_runtime.js, @tanstack/setup,
the specific TanStack worm commit hash, getsession.org
exfiltration host, "A Mini Shai-Hulud has Appeared" marker)
4. nested `node_modules/.../node_modules/` fold-ins are
transparent -- they ride on the parent tarball's integrity
cargo side (studio/src-tauri/Cargo.lock):
5. every `source` must be the crates.io registry
6. registry crates must have a `checksum`
7. one allowlist entry: fix-path-env from
tauri-apps/fix-path-env-rs at pinned SHA c4c45d5. Any other
non-registry source -- or a bump of that pinned SHA --
re-fires the audit until reviewed + appended
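The npm-side structural checks (items 1 and 2) can be sketched roughly as follows. This is an assumption-laden illustration rather than the contents of lockfile_supply_chain_audit.py: the function name `audit_npm_lockfile` and the finding strings are hypothetical, and it relies on the lockfileVersion 2/3 layout in which `packages` maps install paths to entries carrying `resolved`, `integrity`, `link`, and `inBundle`.

```python
from typing import Dict, List

REGISTRY_PREFIX = "https://registry.npmjs.org/"


def audit_npm_lockfile(lock: Dict) -> List[str]:
    """Flag entries that resolve outside the registry or lack an integrity hash."""
    findings: List[str] = []
    for path, entry in lock.get("packages", {}).items():
        # Skip the root project entry, workspace links, and bundled fold-ins,
        # which ride on the parent tarball's integrity.
        if path == "" or entry.get("link") or entry.get("inBundle"):
            continue
        resolved = entry.get("resolved")
        if resolved is not None and not resolved.startswith(REGISTRY_PREFIX):
            findings.append(f"{path}: non-registry resolved URL")
        if "integrity" not in entry:
            findings.append(f"{path}: missing integrity")
    return findings


# Synthetic lockfile: package names and URLs are made up for the example.
sample_lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "frontend"},
        "node_modules/a": {
            "resolved": "https://registry.npmjs.org/a/-/a-1.0.0.tgz",
            "integrity": "sha512-placeholder",
        },
        "node_modules/b": {
            "resolved": "git+ssh://git@github.com/example/b.git#deadbeef",
            "integrity": "sha512-placeholder",
        },
        "node_modules/c": {
            "resolved": "https://registry.npmjs.org/c/-/c-1.0.0.tgz",
        },
    },
}

findings = audit_npm_lockfile(sample_lock)
```

Because the audit only reads the parsed lockfile, it can run before `npm ci` without executing anything from the dependency tree.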
Wire into four workflows:
.github/workflows/security-audit.yml -- new step inside the
advisory-audit job, immediately before `npm audit` so the
structural pass and the advisory-DB pass appear together in
the GitHub step summary.
.github/workflows/studio-frontend-ci.yml,
.github/workflows/wheel-smoke.yml,
.github/workflows/studio-tauri-smoke.yml -- new step immediately
BEFORE `npm ci`. If a future malicious bump lands in our lockfile,
the audit refuses and `npm ci` never runs, so no `prepare` /
`postinstall` from a compromised tarball can execute on the
runner.
Note on --ignore-scripts: every npm ci in our CI is followed directly
by `npm run build` or `tauri build`, both of which depend on package
install scripts (esbuild's native-binary postinstall, etc.). Blanket
--ignore-scripts breaks the build, so the pre-install structural
audit is the practical mitigation. The audit reads lockfiles only;
it never executes anything from them.
Verified:
- Clean state: 0 findings on the current tree (npm + cargo).
- Fault injection: synthetic `@tanstack/setup` IOC + non-registry
`resolved` URL both fire with exit code 1.
- YAML parses cleanly for all four modified workflows.
Refs:
- https://tanstack.com/blog/npm-supply-chain-compromise-postmortem
- https://github.com/TanStack/router/issues/7383
- https://github.com/TanStack/router/security/advisories/GHSA-g7cv-rxg3-hmpx
- https://www.aikido.dev/blog/mini-shai-hulud-is-back-tanstack-compromised
- https://www.stepsecurity.io/blog/mini-shai-hulud-is-back-a-self-spreading-supply-chain-attack-hits-the-npm-ecosystem
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
1794a544b5
|
ci: retry transient github.com 5xx on unsloth-zoo git fetches in CI (#5389)
Windows Studio API CI run 25676130116 / job 75374388468 failed at
"Install Studio (--local, --no-torch)" because github.com itself
returned HTTP 500 mid-clone:
remote: Internal Server Error
fatal: unable to access 'https://github.com/unslothai/unsloth-zoo/':
The requested URL returned error: 500
exit code: 128
The runner did nothing wrong. github.com served the same repo fine
seconds before and after, and adjacent commits on main were green.
Without a retry, every transient upstream blip turns one job red.
Scope the retry layer to CI workflows only, leaving install.sh and
install.ps1 unchanged so end-user installs keep their existing
behavior (a transient github.com hiccup will still surface verbatim
on a user's machine, where they can re-run interactively).
.github/workflows/{mlx-ci,version-compat-ci,consolidated-tests-ci}.yml
- inline 3-attempt retry loop around the four direct
`git clone` / `pip install git+...unsloth-zoo` invocations,
emitting GitHub Actions ::warning::/::error:: annotations so
transient hits surface in the job summary
Only kicks in for upstream failures (5xx, exit 128, network errors)
and so does not mask genuine install errors -- a malformed pip spec,
a missing dependency, a real type error in the zoo's setup.py all
still fail on the first attempt.
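The retry policy can be sketched in Python, although the workflows use an inline shell loop. Everything named here is hypothetical: `run_with_retry`, the `TRANSIENT_MARKERS` tuple, and the exact strings used to recognise an upstream failure are assumptions chosen to match the error quoted above.

```python
import time
from typing import Callable, List, Tuple

# Assumed markers of an upstream failure; the real workflow may match differently.
TRANSIENT_MARKERS = (
    "Internal Server Error",
    "The requested URL returned error: 5",
    "Could not resolve host",
)


def is_transient(exit_code: int, stderr: str) -> bool:
    """Treat a failure as transient only if its output looks like an upstream blip."""
    return exit_code != 0 and any(marker in stderr for marker in TRANSIENT_MARKERS)


def run_with_retry(
    run: Callable[[], Tuple[int, str]],
    attempts: int = 3,
    delay: float = 0.0,
    log: Callable[[str], None] = print,
) -> int:
    """Run a command up to `attempts` times, retrying only transient failures."""
    code = 0
    for attempt in range(1, attempts + 1):
        code, stderr = run()
        if code == 0:
            return 0
        if not is_transient(code, stderr):
            # Genuine install errors fail on the first attempt.
            return code
        if attempt < attempts:
            log(f"::warning::transient failure (attempt {attempt}/{attempts}), retrying")
            time.sleep(delay)
    log(f"::error::still failing after {attempts} attempts")
    return code


# Simulated command: one github.com 500, then success.
responses = [(128, "remote: Internal Server Error"), (0, "")]
calls: List[int] = []
messages: List[str] = []


def fake_clone() -> Tuple[int, str]:
    calls.append(1)
    return responses[len(calls) - 1]


result = run_with_retry(fake_clone, log=messages.append)
```

Keeping the transient check separate from the loop makes it easy to confirm that a malformed pip spec or a missing dependency is never retried.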
|
||
|
|
a6462876de
|
dependabot: group security updates and cover /studio/frontend npm advisories (#5372)
Every groups entry has an implicit applies-to: version-updates, which
means security advisories bypass the group config and open one PR per
affected package. The 11-PR backlog this week was driven by exactly
this: four /studio/src-tauri cargo advisories (rustls-webpki, tauri,
rand, openssl) opened individually instead of joining the cargo-tauri
group PR, and one /studio/frontend npm group PR (hono + ip-address)
opened outside the bun config because GitHub fires npm-package
advisories under the npm_and_yarn ecosystem regardless of which
package manager actually owns the lockfile.
Two changes:
1. Sibling groups with applies-to: security-updates for each existing
ecosystem (actions, bun, npm-oxc-validator, python, cargo-tauri). Same
patterns: ["*"] coverage, so security advisories batch into a single
PR per ecosystem per week alongside the version-update group.
2. New npm entry pointed at /studio/frontend with
open-pull-requests-limit: 0 (suppress version-update PRs; bun handles
those) but with a security-updates group so future hono-style
advisories land in one batched PR instead of one PR per package.
Doesn't retroactively regroup PRs already open; the existing 11 are
unaffected and merge as-is.
|
||
|
|
8c606a70b5
|
studio: authenticate HF downloads across Studio CI workflows (#5370)
The Mac json-images job (run 25664825326) hit the 30 min step budget
while downloading 4 GiB of GGUF assets unauthenticated. The log shows
the explicit "You are sending unauthenticated requests to the HF Hub"
warning followed by 30 min of zero progress, then job cancellation.
macos-14, ubuntu-latest, and windows-latest runners share NAT egress
IP pools across the whole GitHub Actions fleet, so the anonymous
per-IP rate limit kicks in well before the file size alone would
suggest. An authenticated token shifts the budget to per-user.
Add HF_TOKEN: secrets.HF_TOKEN to every hf download step across the
nine studio CI workflows that pull from HF. The env is scoped to the
download step only, not the job, so every other step still runs
without HF_TOKEN in its environment and the GitHub secret-masking
layer handles log scrubbing.
For the Mac json-images step specifically, the model and mmproj
downloads now run in parallel under wait, and an ls -lhL after the
wait surfaces a partial download as an obvious failure instead of a
silent 30 min timeout on the next inference/load call.
|
||
|
|
6d4e6f2514
|
CI: scope GITHUB_TOKEN permissions, add MLX CI, unblock ~60 skipped tests (#5312)
* CI: scope GITHUB_TOKEN permissions and unblock ~60 skipped tests
permissions:
- All five PR-time workflows (backend, frontend, inference smoke, tauri,
wheel) now declare permissions: contents: read at the workflow level,
matching CodeQL's default-permissions guidance and the existing pattern
in release-desktop.yml. None of these workflows write to the repo.
skipped tests:
- Repo tests (CPU) job now installs node 22 and uv, which unblocks
~60 tests that were silently skipping on CI:
- 9 tests in tests/studio/test_chat_preset_builtin_invariants.py
skipped on "node not available". Fixed in this commit; an obsolete
"unsloth_repo/" prefix in WORKDIR was also pointing the source-file
existence check at a path that no longer exists.
- tests/python/test_e2e_no_torch_sandbox.py (47), test_studio_import_no_torch.py
(29), test_tokenizers_and_torch_constraint.py (most of 42) all spawn
fresh uv venvs and self-skip when uv is missing.
- Three test_tokenizers_and_torch_constraint.py cases are deselected
because they expose a real bug in studio/backend/requirements/no-torch-runtime.txt:
the unpinned tokenizers line resolves to 0.23.1, which transformers
rejects with "tokenizers>=0.22.0,<=0.23.0 is required". Tracked
separately as a no-torch install regression.
Locally: 760 passed, 1 skipped, 23 deselected (was 694 / 67 / 23).
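The tokenizers rejection is a plain range check. As a worked illustration (the helper names are hypothetical, and real version handling would go through a proper version parser rather than this dotted-integer split):

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '0.23.1'."""
    return tuple(int(part) for part in version.split("."))


def within_range(version: str, lower: str = "0.22.0", upper: str = "0.23.0") -> bool:
    """Check lower <= version <= upper, as in tokenizers>=0.22.0,<=0.23.0."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


# The unpinned requirement resolves to 0.23.1, one patch above the ceiling.
resolved_ok = within_range("0.23.1")
ceiling_ok = within_range("0.23.0")
```

Here `resolved_ok` is False and `ceiling_ok` is True, which is why the unpinned line trips the constraint.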
* CI: add MLX CI workflow for the Studio dispatch matrix
Mirrors the three files documented in tests/studio/README.md (PR #5307)
into a dedicated workflow so MLX dispatch failures show up as their own
check on PRs rather than getting buried inside Backend CI:
- test_hardware_dispatch_matrix.py 7-profile parametrized matrix
+ 2 dispatch-priority canaries
- test_is_mlx_dispatch_gate.py AST + runtime guard on
unsloth._IS_MLX
- test_mlx_training_worker_behaviors.py worker.py contract checks
Triggers on pull_request when any of unsloth/__init__.py,
studio/backend/utils/hardware.py, studio/backend/core/training/worker.py,
or any of the three test files are touched. Runs on a Linux+CPU runner
with hardware spoofs; no Apple Silicon, real GPU, or real MLX install
required. Locally validated: 36 passed in 0.41s.
permissions: contents: read at the workflow level (matching the rest of
the PR-time CI surface).
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* ci(mlx): fix path filter that pointed at a non-existent file
The MLX CI workflow listed ``studio/backend/utils/hardware.py`` as a
path filter, but no such file exists. The actual layout is
studio/backend/utils/hardware/
__init__.py
amd.py
hardware.py
nvidia.py
vram_estimation.py
so the filter as written would never match. A reviewer modifying
``hardware/hardware.py`` (where ``detect_hardware``, ``DeviceType``,
and ``IS_ROCM`` actually live) would not trigger MLX CI, which
defeats the point of the focused PR gate.
Replace the broken filter with ``studio/backend/utils/hardware/**``
so any change in the hardware probe directory triggers MLX CI, and
add three sibling triggers that each materially affect dispatch:
- ``unsloth/_gpu_init.py``
Hosts ``from .models import *`` and the ``from .trainer import *``
chain. The trainer.py circular-import fix that landed in
``
|
||
|
|
a56c959233
|
Add Studio PR-time CI: pin enforcement, frontend, backend, wheel smoke (#5298)
* Add Studio PR-time CI: pin enforcement, frontend, backend, wheel smoke
The repo currently has no PR-time CI; only release-desktop.yml (manual) and
stale.yml (issue pinger). studio/backend/tests/ has 35 test files (~860
tests collected) that never run automatically. Frontend lint/typecheck/build
scripts exist in package.json but are not gated on PRs either. This is the
gap that let 2026.5.1 ship with the broken Studio chat-history bundle.
Adds four ubuntu-latest workflows, all CPU-only and free for public repos:
studio-pin-enforce.yml
Greps studio/frontend/package.json for caret/tilde ranges on the
@assistant-ui surface (and assistant-stream). Blocks the exact regression
vector that produced 2026.5.1 (^0.12.19 resolving to a breaking 0.12.28).
studio-frontend-ci.yml
npm ci (strict lockfile), tree-clean check after, typecheck, vite build,
bundle grep for the Studio unstable_Provider call site (<= 3 hits = OK,
>= 4 = the 2026.5.1 regression), 75 MB dist budget, biome non-blocking.
Uploads dist on failure.
studio-backend-ci.yml
Runs the existing studio/backend/tests/ suite on Python 3.10/3.11/3.12.
Excludes test_studio_api.py (live model + GGUF download) and
llama_cpp_load_progress_live (spawns a real llama.cpp). Local run on this
branch: 861 pass, 4 skipped, 5 deselected. ruff non-blocking.
wheel-smoke.yml
python -m build, then verifies the produced wheel:
- ships studio/frontend/package-lock.json
- ships studio/frontend/dist/index.html
- does NOT ship studio/frontend/node_modules/
- does NOT ship studio/frontend/bun.lock
- main JS bundle has < 4 unstable_Provider hits
Then installs the wheel into a fresh venv with a lightweight dep set and
imports studio.backend.main. Locally validated against the wheel built
from this branch.
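The pin-enforcement check described for studio-pin-enforce.yml amounts to scanning package.json for range prefixes on a small set of packages. A rough Python equivalent, assuming the check covers `dependencies` and `devDependencies` (the workflow itself uses grep; the function name and the sample version strings other than ^0.12.19 are made up):

```python
from typing import Dict, List


def unpinned_assistant_ui(package_json: Dict) -> List[str]:
    """List @assistant-ui/* and assistant-stream deps declared with ^ or ~ ranges."""
    offenders: List[str] = []
    for field in ("dependencies", "devDependencies"):
        for name, spec in package_json.get(field, {}).items():
            guarded = name.startswith("@assistant-ui/") or name == "assistant-stream"
            if guarded and spec[:1] in ("^", "~"):
                offenders.append(f"{name}@{spec}")
    return sorted(offenders)


# Synthetic manifest: only the guarded surface is checked, so react's caret is ignored.
sample_manifest = {
    "dependencies": {
        "@assistant-ui/react": "^0.12.19",
        "assistant-stream": "0.0.0",
        "react": "^0.0.0",
    }
}

offenders = unpinned_assistant_ui(sample_manifest)
```

An exact pin such as `0.12.19` passes, while `^0.12.19` is reported, which is the shape that let a breaking 0.12.28 resolve.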
Each workflow has concurrency cancellation on the same ref. biome and ruff
are gated as non-blocking until the existing accumulated drift is cleared
(~470 biome errors today); remove the bypass in a follow-up.
Notes verified locally:
- pin enforcement: PASS (carets dropped on this branch)
- frontend npm ci -> typecheck -> build -> grep -> budget: PASS
- bundle: 48 MB, hits=1
- backend pytest: 861 pass, 1 GPU-pollution failure that will not
reproduce on GPU-less runners such as ubuntu-latest
- wheel build: 13s, produces unsloth-2026.5.2-py3-none-any.whl
- wheel content sanity: all five checks PASS
* CI: install full backend dep set + refine pytest filter for CPU runners
First CI run on PR #5298 surfaced two real gaps:
1. pytest collection failed at `import yaml` in utils/models/model_config.
Locally my workspace venv had pyyaml from a transitive dependency; CI's clean Python
3.10/3.11/3.12 didn't, so collection hit ModuleNotFoundError on the very
first test module. Same blew up the wheel-smoke `from studio.backend.main
import app` step.
2. Once the import chain was complete, ~9 tests still failed because they
exercise GPU-only paths or live transformers introspection that can't run
on a GPU-less `ubuntu-latest` runner regardless of code correctness:
- TestGpuAutoSelection
- TestPreSpawnGpuResolution
- TestPerGpuFitGuardAllCounts
- TestTransformersIntrospection
- test_returns_cuda_when_cuda_available
- test_calls_cuda_cache_when_cuda
Fix:
- Backend CI installs `studio/backend/requirements/studio.txt` (the
declared backend dep set) + the extras the import chain needs but
studio.txt omits (python-multipart, sqlalchemy, cryptography, pyyaml,
jinja2, mammoth, unpdf, requests, etc.) + torch CPU wheel + transformers.
- Refine the pytest -k filter to deselect the GPU/introspection-bound
classes by name. Deselections are commented inline with the reason.
- wheel-smoke uses the same dep set so the import smoke matches.
Locally validated against the freshly-built unsloth-2026.5.2 wheel:
831 passed, 5 skipped, 35 deselected, 0 failed in 47s
Studio backend imports cleanly in a fresh venv after the wheel install.
* CI: collapse multiline pytest -k expression to a single line
YAML's | block-scalar fed the newlines verbatim into the -k argument and
pytest rejected it as 'Wrong expression passed to -k'. Same logical filter
on one line.
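The newline problem above can be reproduced without YAML or pytest. The sketch below is illustrative only: the filter text is a hypothetical stand-in built from class names listed earlier in this message, and `collapse_k_expression` is not a helper from the repository.

```python
def collapse_k_expression(expr: str) -> str:
    """Join a multi-line pytest -k expression into a single line."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, and drops empty strings.
    return " ".join(expr.split())


# Hypothetical filter, shaped like what a YAML `|` block scalar yields:
# every line break in the workflow file survives as a "\n" in the string.
multiline = """not TestGpuAutoSelection
and not TestPreSpawnGpuResolution
and not TestTransformersIntrospection
"""

single_line = collapse_k_expression(multiline)
print(single_line)
```

The workflow fix itself simply writes the expression on one line; the helper only shows that the logical filter is unchanged once the newlines are gone.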
* CI: rename jobs so the GitHub UI shows what each check actually does
Adds a per-job 'name:' to all four workflows so the PR check list reads:
Studio pin enforcement / @assistant-ui must be pinned exactly
Studio frontend CI / Frontend build + bundle sanity
Studio backend CI / Backend pytest (Python 3.10|3.11|3.12)
Studio backend CI / Backend ruff lint (non-blocking)
Wheel build + smoke / Wheel build + content sanity + import smoke
This replaces the default '<workflow> / <job-key>' form, which was opaque
('check', 'build', 'pytest (3.10)', 'ruff', 'wheel').
* CI: add Python 3.13 to backend pytest matrix
Verified locally: 831 backend tests pass under Python 3.13 with the same
filter set used for 3.10 / 3.11 / 3.12.
* CI: add Studio inference smoke + Tauri build smoke
Two new workflows. Both CPU-only, both free on `ubuntu-latest`.
studio-inference-smoke.yml
The only workflow we have that proves "Studio actually works", as opposed
to "the bundle parses" or "the imports succeed":
- runs install.sh --local --no-torch (lean Studio install)
- downloads unsloth/gemma-4-E2B-it-GGUF UD-IQ3_XXS into actions/cache
- boots Studio in api-only mode
- logs in with the bootstrap password, changes it, re-logs
- POST /api/inference/load on the GGUF
- POST /api/inference/chat/completions and asserts a non-empty
assistant response
Validated end-to-end locally on a fresh main install: model loaded,
chat completion returned `Hello!` against the same GGUF the workflow
uses.
studio-tauri-smoke.yml
PR-time variant of release-desktop.yml. Linux-only debug build
(`tauri build --debug --no-bundle`) on ubuntu-22.04. Catches
src-tauri Cargo.toml / Rust source breakage, tauri.conf.json drift,
and frontend-distDir wiring. Pinned to the same Tauri CLI version
(2.10.1) as release-desktop.yml so CLI bumps surface in CI before
they break the release pipeline. Mac and Windows desktop builds
stay manual via release-desktop.yml because they need code-signing
secrets.
* CI: use 'hf download' instead of deprecated 'huggingface-cli download'
huggingface_hub 1.13.0 dropped the huggingface-cli entrypoint. The
replacement is the 'hf' CLI shipped with the same package. Same args,
just s/huggingface-cli/hf/.
* CI: assert llama.cpp prebuilt path was used on ubuntu-latest
The inference-smoke job runs on ubuntu-latest (CPU-only, x86_64), which
is exactly the host shape that should pick up ggml-org/llama.cpp's
bin-ubuntu-x64.tar.gz prebuilt directly. If install.sh ever falls back
to a source build on this runner, the studio/setup.sh routing has
regressed and every CPU-only Linux user is paying a 3-minute compile
cost again.
Tee install.sh output to logs/install.log, then fail the job if the log
contains "falling back to source build" or is missing the success
marker "prebuilt installed and validated" / "prebuilt up to date and
validated".
Also include logs/install.log in the failure artifact so the prebuilt
diagnostics are uploaded alongside studio.log when the job fails.
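A minimal sketch of the log assertion described above, assuming the check is a plain substring scan of logs/install.log. The marker strings are the ones quoted in this message; the function name and the use of Python (rather than whatever the workflow step actually runs) are assumptions.

```python
FAILURE_MARKER = "falling back to source build"
SUCCESS_MARKERS = (
    "prebuilt installed and validated",
    "prebuilt up to date and validated",
)


def prebuilt_path_used(log_text: str) -> bool:
    """Return True only if the install log shows the prebuilt path was taken."""
    # Any mention of the source-build fallback fails the check outright.
    if FAILURE_MARKER in log_text:
        return False
    # Otherwise at least one of the two success markers must be present.
    return any(marker in log_text for marker in SUCCESS_MARKERS)


print(prebuilt_path_used("llama.cpp: prebuilt installed and validated"))
print(prebuilt_path_used("prebuilt missing, falling back to source build"))
```

A log that contains neither the fallback line nor a success marker also fails, which matches the "or is missing the success marker" condition.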
* Tighten prebuilt-assertion comment in studio-inference-smoke
* CI: switch inference-smoke model to Qwen3.5-2B UD-IQ3_XXS
Drops the Gemma 4 E2B GGUF (~2.3 GB) for unsloth/Qwen3.5-2B-GGUF
(UD-IQ3_XXS, ~890 MiB). Cache-miss download is roughly a third of
what it was, and CPU inference on ubuntu-latest finishes well
inside the 25-minute job budget.
Verified locally: load via /api/inference/load returns
status=loaded, is_gguf=true, supports_reasoning=true,
supports_tools=true; chat completion returns a non-empty assistant
message ("Hello!").
* CI: add workflow_dispatch to inference-smoke for manual cache pre-warm
* CI: fold pin-enforce grep into studio-frontend-ci, drop standalone workflow
The "@assistant-ui must be pinned exactly" check was its own ~7 second
workflow, doing a single grep on studio/frontend/package.json. Move it
into studio-frontend-ci.yml as a pre-install step (right after
checkout, before any node setup so a violation fails fast). One fewer
top-level check row on every PR, same coverage.
Add a FIXME so this step is dropped once @assistant-ui/* and
assistant-stream leave 0.x: on 1.x, caret ranges are conventional and
this becomes overzealous.
* CI: add Repo tests (CPU) job, mirroring unsloth-zoo PR #624 conftest
The top-level tests/ tree was previously not run anywhere. 23 of its
files are CPU-friendly with the right harness: pure-Python helpers,
ast walks, installer logic, and CLI shape tests. Locally validated:
302 passed, 9 skipped, 12 deselected in ~7 seconds on Python 3.12.
Three pieces:
1. tests/conftest.py -- GPU-free harness, mirrors the conftest landed
in unslothai/unsloth-zoo PR #624. Pre-loads unsloth_zoo.device_type
and unsloth.device_type under a temporarily-mocked
torch.cuda.is_available() so each module's @cache permanently
captures "cuda" and the import chain succeeds on a CPU runner.
Also stubs torch.cuda.get_device_capability /
is_bf16_supported / mem_get_info, which unsloth/__init__.py and
unsloth_zoo.temporary_patches probe at import time when
DEVICE_TYPE == "cuda". On a real accelerator the harness is
skipped and detection runs normally.
2. Two existing tests were leaking sys.modules state across the
session because they injected stubs without an __spec__ and
without restoration:
- tests/test_raw_text.py shoved a "datasets" stub into
sys.modules. transformers' import_utils later did
importlib.util.find_spec("datasets") and got
ValueError: datasets.__spec__ is None.
- tests/python/test_fast_sentence_transformer_redirect_lifecycle.py
shoved "transformers", "sentence_transformers", and
"sentence_transformers.models" stubs in. Subsequent tests
that did `import transformers` got the non-package stub.
Fix: set __spec__ on stubs, plus an autouse fixture in the
sentence-transformer test file that restores the three keys
after each test.
3. .github/workflows/studio-backend-ci.yml gains a third job,
`Repo tests (CPU)`, that installs the same dep set as the
backend-pytest matrix (Python 3.12 only -- the tests are
version-independent), exports PYTHONPATH=studio so tests/python/*
can import install_python_stack, and runs the 23-file subset
above with `-m 'not server and not e2e'`.
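The sys.modules leak in item 2 can be shown with the standard library alone. This is a sketch, not the repository's test code: the module name is a made-up placeholder, and the try/finally stands in for the autouse fixture that restores the keys after each test.

```python
import importlib.machinery
import importlib.util
import sys
import types

NAME = "hypothetical_stub_pkg"  # illustrative name, not a real package


def install_stub(name: str, with_spec: bool) -> types.ModuleType:
    """Put a stub module into sys.modules, optionally with a __spec__."""
    stub = types.ModuleType(name)  # ModuleType leaves __spec__ as None
    if with_spec:
        stub.__spec__ = importlib.machinery.ModuleSpec(name, loader=None)
    sys.modules[name] = stub
    return stub


saved = sys.modules.get(NAME)  # remember prior state so it can be restored
try:
    # A bare stub makes find_spec raise, as transformers' import_utils hit.
    install_stub(NAME, with_spec=False)
    try:
        importlib.util.find_spec(NAME)
        bare_stub_error = None
    except ValueError as exc:
        bare_stub_error = str(exc)

    # With __spec__ set, find_spec returns the spec instead of raising.
    install_stub(NAME, with_spec=True)
    spec_from_fixed_stub = importlib.util.find_spec(NAME)
finally:
    # Restore sys.modules so later imports see the original state.
    if saved is None:
        sys.modules.pop(NAME, None)
    else:
        sys.modules[NAME] = saved

print(bare_stub_error)
```

Setting `__spec__` fixes the `find_spec` failure; the restoration step is what keeps a non-package stub from shadowing the real module in later tests.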
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* CI: install unsloth_zoo for Repo CPU tests, harden conftest fallback
The CPU job at run 25422050018 broke at conftest collection: the
preload of unsloth.device_type pulled in `from unsloth_zoo.utils import
Version` and ubuntu-latest didn't have unsloth_zoo on the path because
it is an optional dep of unsloth. Two fixes:
1. Install unsloth_zoo>=2026.5.1 alongside the other deps in the Repo
tests (CPU) job (it's also what unsloth's optional `huggingface`
extra pins).
2. Wrap the body of _preload_device_type in conftest.py in a try/except
so any import failure (missing prereq, broken module, etc.) cleanly
returns False instead of aborting the entire collection. The caller
already falls back to the stub device_type module on False, so the
net behavior is "best effort: real device_type if possible, stub
otherwise" instead of "abort the test session".
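The "best effort" behaviour in fix 2 can be sketched as follows. The real `_preload_device_type` does more than import a module (it mocks torch.cuda first), so the simplified function below and its `resolve_device_type_source` caller are assumptions made for illustration.

```python
import importlib


def preload_device_type(module_name: str) -> bool:
    """Return True if the module imported, False on any import-time failure."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Missing prerequisite, broken module, etc.: tell the caller to
        # fall back instead of aborting test collection.
        return False
    return True


def resolve_device_type_source(module_name: str) -> str:
    """Hypothetical caller: use the real module if possible, else a stub."""
    return "real" if preload_device_type(module_name) else "stub"


print(resolve_device_type_source("json"))
print(resolve_device_type_source("hypothetical_missing_module_xyz"))
```

Catching `Exception` rather than only `ImportError` mirrors the stated intent that any import failure returns False.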
* kernels.utils: guard CUDA_STREAMS / XPU_STREAMS init for DEVICE_COUNT==0
When DEVICE_COUNT is 0 (CPU host: no visible NVIDIA / AMD / Intel GPU)
the dict comprehension {... for i in range(0)} was empty and the
subsequent max(_CUDA_STREAMS.keys()) raised
ValueError: max() iterable argument is empty
during module import. That made unsloth.kernels.utils unimportable on
any CPU runner, which in turn blocked all of tests/saving/**, three
top-level tests/test_*.py, and tests/qlora/test_unsloth_qlora_train_and_merge.py
from even collecting on CPU CI.
Wrap the per-device-index dict comprehension and max() machinery in
a DEVICE_COUNT > 0 guard. When DEVICE_COUNT is 0 fall back to empty
containers (CUDA_STREAMS = (), WEIGHT_BUFFERS = [], ABSMAX_BUFFERS = []).
The consumer functions further down in this module index these arrays
by device_index but only during real GPU work, so the empty fallbacks
never get touched on a CPU host.
GPU-safety verified locally: with 8 visible CUDA devices, CUDA_STREAMS
has 8 entries (identical to before this PR). With CUDA_VISIBLE_DEVICES=""
the module imports cleanly, CUDA_STREAMS is (), and the previously
blocked tests now collect (test_get_model_name passes 38 subtests,
test_resolve_model_class passes 9, test_model_registry collects all 8
parametrizations).
Same shape applied to the DEVICE_TYPE == "xpu" branch for symmetry.
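A torch-free sketch of the failure and the guard. The placeholder strings stand in for real stream handles, and `build_stream_table` is a hypothetical reduction of the module-level code, not the code in unsloth/kernels/utils.py.

```python
def build_stream_table(device_count: int) -> tuple:
    """Build a per-device table, falling back to () when no device is visible."""
    if device_count <= 0:
        # Guard: with zero devices the comprehension below is empty and
        # max() over its keys would raise ValueError.
        return ()
    streams = {i: f"stream-{i}" for i in range(device_count)}  # placeholders
    table = [None] * (max(streams.keys()) + 1)
    for index, stream in streams.items():
        table[index] = stream
    return tuple(table)


# The unguarded form fails exactly as described for a CPU host.
try:
    max({}.keys())
    unguarded_error = None
except ValueError as exc:
    unguarded_error = type(exc).__name__

print(build_stream_table(0), len(build_stream_table(8)), unguarded_error)
```

The empty tuple is safe only because, as noted above, the consumers index these containers during real GPU work and never on a CPU host.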
* CI: switch Repo tests (CPU) to auto-discovery + isolate flakes
Three changes, locally validated end-to-end (779 passed, 11 skipped,
23 deselected, 0 failed across all three steps):
1. Repo tests (CPU, auto-discovered): replace the explicit 23-file
list with `pytest tests/` plus a small set of `--ignore` and
`--deselect` flags. New tests under tests/python, tests/studio
(excluding the two state-sensitive files), and top-level
tests/test_*.py are picked up automatically with no workflow edit.
--ignore covers:
- tests/qlora and tests/saving: GPU-bound by design
- tests/utils: helpers folder, not tests
- tests/sh: shell suite handled in its own step
- two state-polluting hardware-spoof files (next step)
-m 'not server and not e2e': honours markers already declared
in tests/python/conftest.py
--deselect: test_model_registration / test_all_model_registration
hit huggingface_hub live; they belong on a network job
2. Hardware-spoof tests (state-sensitive, run in isolation):
tests/studio/test_hardware_dispatch_matrix.py and
tests/studio/test_is_mlx_dispatch_gate.py mutate module globals
in studio.backend.utils.hardware.hardware (IS_ROCM, DEVICE) via
their spoof fixtures, and the leak crosses file boundaries.
Running them in their own pytest invocation avoids polluting the
main sweep. Both pass cleanly in isolation: 28 passed, 1 skipped.
3. Shell installer tests: explicitly enumerated subset that does not
depend on install.ps1 layout (test_install_host_defaults.sh has
drifted; that's a separate follow-up).
Test fixes folded in to keep the run green:
- tests/studio/install/test_rocm_support.py::TestAmdGpuMonitoring
::test_amd_primary_gpu_with_mock now clears
HIP/ROCR/CUDA_VISIBLE_DEVICES via monkeypatch so
_first_visible_amd_gpu_id() does not short-circuit when the runner
sets CUDA_VISIBLE_DEVICES="" to suppress CUDA.
- tests/studio/test_hardware_dispatch_matrix.py::spoof_hardware
fixture now stubs torch.cuda.get_device_properties when
cuda_available is True so detect_hardware()'s device_name probe
does not call into _cuda_init() on a CPU runner.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* CI: install torchvision (CPU) so unsloth_zoo.vision_utils can import
Run 25430652224 collected three test modules that import unsloth and
crashed at unsloth_zoo/vision_utils.py:68 with
ModuleNotFoundError: No module named 'torchvision'
unsloth_zoo.vision_utils unconditionally imports torchvision at module
scope, and unsloth.models._utils pulls vision_utils in. The Repo tests
(CPU) job installed torch from the CPU index but not torchvision, so
any test that imports unsloth.models.* failed at collection.
Add torchvision<0.26 to the same pip install --index-url
https://download.pytorch.org/whl/cpu line.
* CI: install bitsandbytes (CPU build) for unsloth.models._utils import
Run 25430982243 collected three test modules that import unsloth and
crashed at unsloth/models/_utils.py:1166 with
ModuleNotFoundError: No module named 'bitsandbytes'
The bnb import there is unconditional. Recent bnb versions (>=0.45)
ship a CPU build so the wheel installs on a free Linux runner and the
import resolves; the kernels still raise on use but the module
collects, which is enough for these CPU tests.
Add 'bitsandbytes>=0.45' to the Repo tests (CPU) deps.
* CI: rename workflows + guard kernels.utils CPU-torch binding
Workflow renames (top-level `name:` keys; affects PR check rows):
Studio backend CI -> Backend CI
Studio frontend CI -> Frontend CI
Studio inference smoke -> Studio GGUF CI
Studio Tauri smoke -> Studio Tauri CI
Wheel build + smoke -> Wheel CI
Backend CI's matrix job goes from "Backend pytest (Python 3.10)" to
just "(Python 3.10)" so the GitHub UI row reads
"Backend CI / (Python 3.10)" rather than the old verbose form.
Production guard for CPU torch (run 25431126138):
unsloth/kernels/utils.py:165 was an unconditional
_gpu_getCurrentRawStream = torch._C._cuda_getCurrentRawStream
which raised AttributeError on a CPU-only torch wheel because the
compiled CUDA backend is absent. Three test modules (test_get_model_name,
test_model_registry, test_resolve_model_class) crashed at collection
because their import chain reaches this line.
Add a hasattr probe: when torch is built without CUDA, fall through to
a no-op binding that returns 0. _get_tensor_stream is only invoked
during real GPU work, so the no-op is never executed on a CPU host.
GPU-safety verified locally: with 8 visible CUDA devices the binding
still resolves to the real torch._C._cuda_getCurrentRawStream
(behaviour identical to before this PR). The XPU branch is untouched.
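The hasattr probe can be illustrated without importing torch. In this sketch two `SimpleNamespace` objects stand in for `torch._C` on a CUDA build and on a CPU-only build; the function name and the fake return value 1234 are assumptions, while the attribute name and the "returns 0" no-op come from the description above.

```python
from types import SimpleNamespace


def bind_raw_stream_getter(torch_c):
    """Pick the real raw-stream getter if present, else a no-op returning 0."""
    if hasattr(torch_c, "_cuda_getCurrentRawStream"):
        return torch_c._cuda_getCurrentRawStream

    def _no_op_stream(device_index):
        # Never expected to run on a CPU host; exists so the import succeeds.
        return 0

    return _no_op_stream


# Stand-ins for torch._C with and without the compiled CUDA backend.
cuda_build = SimpleNamespace(_cuda_getCurrentRawStream=lambda device_index: 1234)
cpu_build = SimpleNamespace()

print(bind_raw_stream_getter(cuda_build)(0))
print(bind_raw_stream_getter(cpu_build)(0))
```

On the CUDA stand-in the returned callable is the original attribute itself, which is the "behaviour identical to before" property the GPU-safety check relies on.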
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
507417579f
|
Fix Studio desktop tray installer and titlebar and bug fixes (#5179)
* fix(tauri): dedupe tray and brand nsis installer
* feat(tauri): add linux windows custom titlebar
* Fix desktop auth gate after backend startup
* Fix desktop installer assets and setup script skew
* Scope setup failure exit to Tauri installer
* fix desktop updater production channel
* fix desktop auth runtime installer regressions
* fix desktop dev cors retry
* fix tauri process generation race
* feat desktop diagnostics support report
* fix tauri apt update best effort
* Fix Windows desktop NSIS installer upgrades
* Start managed backend after desktop install
* Improve NSIS installer branding resolution
* Fix assistant-ui internal import
* Fix desktop release workflow
* Keep desktop auth retry on cached backend
---------
Co-authored-by: wasimysaid <wasimysaid@users.noreply.github.com>
|
||
|
|
a5eb2e3d50
|
Add tauri (#5144)
* add unsloth studio desktop app
* Fix review findings
- studio/src-tauri/tauri.conf.json: retarget updater to staging repo
(danielhanchen/unsloth-staging-2); switch to unslothai/unsloth on upstream merge.
- studio/src-tauri/linux/postremove.sh: drop the interactive read loop and the
/home/* iteration. Package maintainer scripts must stay non-interactive and
must not touch other users' data.
- studio/frontend/src/app/auth-guards.ts: honor tauriAutoAuth() boolean. Failed
auto-auth now redirects to /login; requireGuest/requirePasswordChangeFlow
only redirect to /chat when auth succeeds. The new early-return on failed
auth is intentional so the login / change-password flows remain reachable
when desktop auth is not yet established.
- studio/frontend/src/config/env.ts: keep fetched=false on health failure so
later calls retry instead of caching the client-side platform guess.
- studio/src-tauri/src/install.rs: pick the available system package manager
(apt-get, dnf, zypper, pacman); AppImage bundles run on non-Debian distros.
- studio/frontend/src/lib/open-link.ts + markdown-text/sources callers: return
boolean from openLink so callers only preventDefault on handled URLs; relative
hrefs now navigate natively.
- studio/frontend/src/features/settings/tabs/about-tab.tsx: fetch(apiUrl(...))
so the version request targets the backend port in desktop mode. The bare
/api/health predates the Tauri webview (blame: the earlier onboarding commit,
which ran with same-origin frontend/backend); in desktop mode the webview
origin is tauri://localhost so the bare path fails.
- install.ps1: gate the install_python_stack.py hotfix on a sentinel comment
instead of a content regex; append the sentinel after applying so reruns
are unambiguous.
- unsloth_cli/commands/studio.py _write_auth_secret: use the atomic mkstemp +
os.replace path on Windows too; chmod calls are wrapped in try/except OSError.
- studio/src-tauri/src/preflight.rs probe_existing_backends: fan out the health
probes concurrently; desktop-auth status still runs sequentially per candidate.
reqwest::Client is internally Arc-wrapped so the in-loop .clone() is a
refcount bump, not a deep clone; annotated inline.
- studio/src-tauri/src/preflight.rs run_cli_probe: wait() after kill() to reap
the child, matching probe_cli_capability.
- studio/src-tauri/src/process.rs + main.rs: add stop_backend_detached and use
it from the tray quit handler so the 5s graceful-wait does not block the
Tauri main loop. RunEvent::Exit keeps the synchronous safety-net call.
- studio/backend/main.py: drop the permissive localhost CORS regex in
api-only mode; the explicit allow_origins list is sufficient.
- .github/workflows/release-desktop.yml: drop max-parallel: 1 so platform
builds run in parallel, and lift releaseBody to an env var so the three
tauri-action invocations share one source of truth.
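The atomic mkstemp + os.replace path mentioned for `_write_auth_secret` in the list above can be sketched like this. It is an assumption-laden illustration, not the code in unsloth_cli/commands/studio.py: the function name, the temp-file prefix, and the 0o600 mode are all hypothetical.

```python
import os
import tempfile


def write_secret_atomically(path: str, secret: str) -> None:
    """Write `secret` to `path` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so os.replace stays on one
    # filesystem, which is what makes the final rename atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-secret-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(secret)
        try:
            os.chmod(tmp_path, 0o600)  # assumed mode; best effort only
        except OSError:
            pass  # chmod may be unsupported or restricted on some platforms
        os.replace(tmp_path, path)  # overwrites an existing target in one step
    except BaseException:
        # Do not leave the temp file behind if anything above failed.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

`os.replace` overwrites an existing destination on Windows as well as POSIX, which is why the same path can serve both platforms.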
* Fix review findings (loop 2)
- studio/backend/auth/storage.py update_password: clear_desktop_secret()
alongside clear_bootstrap_password() so rotating the admin password
also revokes any previously provisioned .desktop_secret. Without this,
an old local desktop credential keeps minting fresh admin tokens via
/api/auth/desktop-login after a password rotation.
- studio/src-tauri/src/desktop_auth.rs provision_desktop_auth: wrap
cmd.output().await in tokio::time::timeout(30s). DESKTOP_AUTH_LOCK is
held across the whole desktop_auth flow, and previously a hanging
`unsloth studio provision-desktop-auth` subprocess would pin the lock
indefinitely and freeze every subsequent desktop_auth call.
* Add review tests
* Consolidate review tests
Merge review-added tests into the existing studio/backend/tests/test_desktop_auth.py
(the PR's authoritative desktop-auth test file). Drops three scaffolding files under
tests/python/ in favor of five focused tests next to the tests they extend:
- test_update_password_clears_desktop_secret (runtime)
- test_update_password_on_unknown_user_leaves_desktop_secret_intact (runtime)
- test_cli_provisioning_delegates_to_storage_create_desktop_secret (source-level)
- test_cli_connect_auth_db_reads_storage_db_path (source-level)
- test_desktop_auth_provision_has_bounded_timeout (Rust source-level)
* Revert auth-guards.ts Tauri branches to unconditional form
The review loop on PR 5144 introduced a regression: the isTauri branch of
requireAuth redirected to /login when tauriAutoAuth() returned false, and
requireGuest / requirePasswordChangeFlow silently fell through on the same
condition. The Tauri desktop app authenticates via a local auto-generated
secret; it must never surface /login or /change-password to the user. A
failed auto-auth should let the startup layer retry, not expose a password
form.
Restore the three Tauri branches to the author's original unconditional
form (requireAuth: return; requireGuest / requirePasswordChangeFlow: throw
redirect({to: '/chat'})). Keep the rest of the review fixes -- the
apiUrl() fetch wrapping, authRedirect helper, and fetchAuthStatus refactor
are all legitimate improvements and are preserved.
* Revert release-desktop.yml to author's version
The review loop's workflow-file tweaks (drop max-parallel: 1, lift releaseBody
to an env var) are cosmetic. OAuth tokens cannot push workflow-file changes,
and fine-grained PATs cannot honor maintainerCanModify on a third-party fork.
Reverting the workflow file to wasimysaid's version lets the push go through
without needing a classic PAT with both repo and workflow scopes.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: Lee Jackson <130007945+Imagineer99@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
Co-authored-by: Daniel Han <unslothai@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
c3d2d58046
|
Update dependabot.yml (#4915) | ||
|
|
6872c6e850
|
Remove advanced CodeQL workflow in favor of default setup (#4584)
The repo has both the CodeQL "default setup" (configured in repo settings) and
this advanced workflow file enabled. GitHub does not allow both simultaneously,
causing all PR CI runs to fail with:
"CodeQL analyses from advanced configurations cannot be processed when the default setup is enabled"
Since the default setup already covers the same languages (Python,
JavaScript/TypeScript) with the same build-mode (none), remove the redundant
advanced workflow file.
|
||
|
|
f294161e26
|
build(deps): bump the actions group with 2 updates (#4570)
Bumps the actions group with 2 updates: [actions/checkout](https://github.com/actions/checkout) and [github/codeql-action](https://github.com/github/codeql-action).
Updates `actions/checkout` from 4 to 6
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)
Updates `github/codeql-action` from 3 to 4
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/v3...v4)
---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
- dependency-name: github/codeql-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: actions
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
||
|
|
efedbe9740
|
Feature/add dependabot and codeql security checks (#4479)
* Add CodeQL analysis workflow configuration
* Add Dependabot configuration for package updates
Configure Dependabot to check for updates in various ecosystems weekly.
* Fix dependabot.yml: bun ecosystem, missing dir, grouping for PR #4479
1. studio/frontend uses bun.lock not package-lock.json, so change npm to bun
2. Add missing studio/backend/requirements/ pip entry (consumed by
studio/setup.sh)
3. Add groups with patterns ["*"] to all pip/bun/npm entries to batch updates
and avoid 30+ individual Dependabot PRs on the first run
* Consolidate pip blocks to fix overlapping directory violation
GitHub Dependabot forbids multiple same-ecosystem entries with overlapping
directories on the same branch. The root "/" directory overlapped the 3 nested
pip dirs. Merge all 4 pip blocks into one using the `directories:` (plural) key.
Also remove redundant open-pull-requests-limit from the bun block since
grouping with patterns: ["*"] already limits PR count.
---------
Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
|
||
|
|
cd65584f19
|
Update issue template | ||
|
|
eb7637013e | Update CODEOWNERS | ||
|
|
96ff5c5f61
|
Update CODEOWNERS for studio and cli (#4266)
* Update CODEOWNERS for studio and cli
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
||
|
|
08bb85fcda | Create CODEOWNERS (#4039) | ||
|
|
a09bdb6adb | chore: Update outdated GitHub Actions version (#3936) | ||
|
|
b03b014336 | Update template.md | ||
|
|
f40fa7a0e8 | Update FUNDING.yml (#3792) | ||
|
|
23a7ac5d17 | Update FUNDING.yml (#3736) | ||
|
|
a3ed3c395d | remove pre-commit workflow (covered by pre-commit app) (#3618) | ||
|
|
d6bb89ad44 |
Formatting & bug fixes (#3563)
* Update rl.py * Fix CE Loss * Versioning * Update loader.py * Update loader.py * extract_model_type_from_config * Model types * Update loader.py * get_transformers_model_type * Update loader.py * Update loader.py * Update loader.py * Update rl.py * Update pyproject.toml * Update loader.py * Update loader.py * Update loader.py * Update loader.py * Versioning * Update _utils.py * Update _utils.py * Update _utils.py * Update _utils.py * Update vision.py * Update vision.py * Fix DataParallel * Update _utils.py * Update rl.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update mapper.py * Versioning * Update loader.py * Update loader.py * Update rl.py * Versioning * Update _utils.py * Fix auto_mapping * Update loader.py * Update loader.py * Update vision.py * Update vision.py * Update loader.py * Message * Update vision.py * Update loader.py * Update vision.py * cache_implementation * Update vision.py * Update loader.py * Update vision.py * Update vision.py * Update vision.py * Update loader.py * Update vision.py * Save max_seq_length * Update _utils.py * Update rl.py * Update vision.py * Update llama.py * Mistral3 vllm (#3349) * [WIP] use vLLM for vision language models * Update README.md Editing icon sizes * Update README.md Updating icon sizes * Update README.md (#2885) * MoE kernels AGPLv3 * versioning * Many bug fixes (#2908) * add deepseek v3 * add deepseek r1 base * add deepseek r1 zero * add deepseek distill llama * add deepseek distill models * remove redundant code when constructing model names * add mistral small to registry * rename model registration methods * rename deepseek registration methods * refactor naming for mistral and phi * add global register models * refactor model registration tests for new registry apis * add model search 
method * remove deprecated registration api * add quant type test * add registry readme * make llama registration more specific * clear registry when executing individual model registration file * more registry readme updates * Update _auto_install.py * Llama4 * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Synthetic data * Update mapper.py * Xet and Synthetic * Update synthetic.py * Update loader.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update pyproject.toml * Delete .gitignore * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update _utils.py * Update pyproject.toml * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update chat_templates.py * Seasame force float16 / float32 * Fix Seasame * Update loader.py * Update vision.py * Update vision.py * Update vision.py * Update loader.py * is_multimodal * Update loader.py * Update loader.py * Update loader.py * Update 
loader.py * Update vision.py * Update vision.py * Update vision.py * UNSLOTH_DISABLE_STATIC_GENERATION * Update vision.py * Auto vision detection * Sesame * Whisper * Update loader.py * Update loader.py * Update loader.py * Update mapper.py * Update vision.py * Update vision.py * Update vision.py * Update vision.py * Update vision.py * Update vision.py * Update loader.py * Update loader.py * Update loader.py * Update loader.py * Update _utils.py * Update rl.py * versioning * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * logging * Update pyproject.toml * Update rl.py * versioning * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * logits / temperature * Update rl_replacements.py * Update pyproject.toml * Update rl_replacements.py * Update rl_replacements.py * Debugging only * Update llama.py * Update llama.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Generic efficient GRPO * Update rl_replacements.py * Update rl_replacements.py * Remove debugging * Update rl_replacements.py * Update rl_replacements.py * Update vision.py * Update llama.py * Update rl_replacements.py * versioning * Update _utils.py * Update vision.py * Update mapper.py * Update loader.py * Update mapper.py * Update vision.py * Update loader.py * Update vision.py * Update loader.py * Update _utils.py * Update vision.py * gradient checkpointing * Gemma 3N fixes * Update loader.py * Versioning * Gemma 3N fixes * Update vision.py * Update vision.py * Update loader.py * Update vision.py * Fix setup.py * setup.py * Prints * Update setup.py * Update setup.py * Update setup.py * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update vision.py * Update vision.py * Update 
pyproject.toml * Update vision.py * Update _utils.py * Update __init__.py * Update __init__.py --------- Co-authored-by: jeromeku <jerome.ku@gmail.com> Co-authored-by: Michael Han <107991372+shimmyshimmer@users.noreply.github.com> * silienty skip falcon h1 import is transformers_version < 4.53.0 (#2912) * Dynamically adjust get_per_token_logps function and patch as well (#2911) * add intel gpu with vllm support (#2903) * [bugs] fix for casual mask (#2868) * fix for casual mask * use un_casual in sdpa * add missing mask * fix for type * Explicitly check if xformers exists for attention (#2889) * Update __init__.py * Update llama.py * if mlp doesn't exist in layer module check for feed_forward name for falcon h1 (#2913) * Move inputs to right devices. (#2919) * Move tensors to right devices * fix multi gpu for non mistral models * multi GPU RoPE for gemma2 * Finish up multi GPU inference * Make multiGPU rope a list * Remove unnecessary transfer to CPU * Remove unnecessary move to CPU * Donot move inputs to device yet will be handled separately in another PR * Move inputs to appropriate decoder device * Make device count global variable * Cleanup RoPE device code * Fixup num_gpu to device count * Cleanup device counts * Use device index for RoPE get_cache * Donot typecast * Use tuple instead of list for tensors. 
Use device index directly * fixup move to device logic * WIP VLM vLLM * Make vLLM patch a function * Add save and load lora functions * Make fast_inference setup depend on the flag * Improve fast inference patching mechanism * Make vision setting depend on checks in fastbasemodel * Check LoRA and vLLM intercompatibility for vision models * Comment pointing to vLLM LoRA check * Improve lora validation on vLLM * Error out on no vLLM and increase max lora rank * Bug fixes (#3017) * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update pyproject.toml * Delete .gitignore * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update _utils.py * Update pyproject.toml * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update synthetic.py * Update chat_templates.py * Seasame force float16 / float32 * Fix Seasame * Update loader.py * Update vision.py * Update vision.py * Update vision.py * Update loader.py * is_multimodal * Update loader.py * Update loader.py * Update loader.py * Update loader.py * Update vision.py * Update vision.py * Update vision.py * UNSLOTH_DISABLE_STATIC_GENERATION * Update vision.py * Auto vision detection * Sesame * Whisper * Update loader.py * Update loader.py * 
Update loader.py * Update mapper.py * Update vision.py * Update vision.py * Update vision.py * Update vision.py * Update vision.py * Update vision.py * Update loader.py * Update loader.py * Update loader.py * Update loader.py * Update _utils.py * Update rl.py * versioning * Update rl.py * Update rl.py * Update rl.py * Update rl.py * Update rl.py * logging * Update pyproject.toml * Update rl.py * versioning * Update rl.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * logits / temperature * Update rl_replacements.py * Update pyproject.toml * Update rl_replacements.py * Update rl_replacements.py * Debugging only * Update llama.py * Update llama.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Generic efficient GRPO * Update rl_replacements.py * Update rl_replacements.py * Remove debugging * Update rl_replacements.py * Update rl_replacements.py * Update vision.py * Update llama.py * Update rl_replacements.py * versioning * Update _utils.py * Update vision.py * Update mapper.py * Update loader.py * Update mapper.py * Update vision.py * Update loader.py * Update vision.py * Update loader.py * Update _utils.py * Update vision.py * gradient checkpointing * Gemma 3N fixes * Update loader.py * Versioning * Gemma 3N fixes * Update vision.py * Update vision.py * Update loader.py * Update vision.py * Fix setup.py * setup.py * Prints * Update setup.py * Update setup.py * Update setup.py * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update pyproject.toml * Update vision.py * Update vision.py * Update pyproject.toml * Update vision.py * Update _utils.py * Update __init__.py * Update __init__.py * Small fixes * Update vision.py * Update vision.py * versioning * Update __init__.py * Update llama.py * Update rl.py * 
Update rl.py * Update _utils.py * Update vision.py * Update vision.py * compiler stance * Update _utils.py * Update pyproject.toml * Update pyproject.toml * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Update rl_replacements.py * Revert "Revert "Add Qwen2.5-VL-32B-Instruct mapping to fix quantized model me…" (#2990) This reverts commit |
||
| | fba0bff2f4 | Remove stale bot | | |
| | b0088817cd | Update stale.yml | | |
| | 95d2bdbec3 | Create stale.yml (#2836) | | |
|
|
550f19fc0d | Delete stale.yml |