unsloth/.github/workflows/studio-windows-api-smoke.yml
Daniel Han 54a86c3514
ci: route every hf download through xet-tuned stall-retry wrapper (#5476)
Root cause of the Mac json-images 30 min timeout (run 25950714888 /
PR #5430): huggingface_hub>=1.15 deprecated `hf_transfer` and routes
every transfer through `hf-xet`. The CI step's unpinned
`pip install --upgrade huggingface_hub hf_transfer` jumped to 1.15.0
+ hf-xet 1.5.0, the 940 MB mmproj finished in ~21s, then the 3 GB
gemma-4 GGUF made it to ~46% and went completely silent for the
remaining 29 minutes -- no progress bytes, no error, no exit -- until
the job timeout fired.

This wraps every CI `hf download` in a new
`.github/scripts/hf-download-with-retry.sh`:

  * Drops the no-op `HF_HUB_ENABLE_HF_TRANSFER=1` prefix and the
    `hf_transfer` install (both are deprecated on 1.15+ and only
    emit a FutureWarning now).
  * Exports the hf-xet high-performance knobs Daniel asked for:
        HF_XET_HIGH_PERFORMANCE=1
        HF_XET_CHUNK_CACHE_SIZE_BYTES=0
        HF_XET_NUM_CONCURRENT_RANGE_GETS=64
        HF_XET_RECONSTRUCT_WRITE_SEQUENTIALLY=0
        HF_XET_CLIENT_READ_TIMEOUT=500
  * Watchdogs each attempt: if `hf download` has not exited after
    HF_DOWNLOAD_STALL_SECONDS (default 180s = 3 min), SIGTERM,
    sleep 2, SIGKILL, then loop. Retries are unbounded; the
    enclosing job's `timeout-minutes` is the real cap.
  * Optional 3rd positional `LOCAL_DIR` -- omitted lets `hf` use
    the default HF_HUB_CACHE, which is what the HF_HOME-priming
    jobs need.
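
The wrapper behaviour listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `.github/scripts/hf-download-with-retry.sh`: the function names `run_with_watchdog`, `retry_until_success`, and `hf_download_with_retry` are hypothetical, the 124 return code for a watchdog kill is an arbitrary choice, and the sketch assumes that ordinary non-zero exits are retried the same way as stalls. The timer measures wall-clock time per attempt, as the description says, rather than byte-level progress.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real hf-download-with-retry.sh may differ.

# hf-xet knobs listed in the commit message.
export HF_XET_HIGH_PERFORMANCE=1
export HF_XET_CHUNK_CACHE_SIZE_BYTES=0
export HF_XET_NUM_CONCURRENT_RANGE_GETS=64
export HF_XET_RECONSTRUCT_WRITE_SEQUENTIALLY=0
export HF_XET_CLIENT_READ_TIMEOUT=500

# Run "$@" in the background. If it is still alive after $1 seconds,
# send SIGTERM, sleep 2, send SIGKILL, and return 124 (hypothetical code).
# Otherwise return the command's own exit status.
run_with_watchdog() {
  local limit="$1"; shift
  "$@" &
  local pid=$!
  local waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$limit" ]; then
      kill -TERM "$pid" 2>/dev/null || true
      sleep 2
      kill -KILL "$pid" 2>/dev/null || true
      wait "$pid" 2>/dev/null || true
      return 124
    fi
    sleep 1
    waited=$((waited + 1))
  done
  wait "$pid"
}

# Loop until the command succeeds. Retries are unbounded on purpose:
# the enclosing job's timeout-minutes is the real cap.
retry_until_success() {
  local limit="$1"; shift
  local attempt=1
  until run_with_watchdog "$limit" "$@"; do
    echo "attempt $attempt failed or stalled; retrying" >&2
    attempt=$((attempt + 1))
  done
}

# Usage: hf_download_with_retry REPO FILE [LOCAL_DIR]
# Omitting LOCAL_DIR lets `hf` write to the default HF_HUB_CACHE.
hf_download_with_retry() {
  local repo="$1" file="$2" local_dir="${3:-}"
  set -- hf download "$repo" "$file"
  if [ -n "$local_dir" ]; then
    set -- "$@" --local-dir "$local_dir"
  fi
  retry_until_success "${HF_DOWNLOAD_STALL_SECONDS:-180}" "$@"
}
```

A call site would then look like `hf_download_with_retry "$GGUF_REPO" "$GGUF_FILE"`, with a third argument only when the job wants the file in a specific directory. Polling with `kill -0` keeps the sketch free of external tools such as `timeout`, which may not be installed on every runner image.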

19 call sites migrated across mlx-ci.yml + 9 studio-*-smoke.yml
workflows. The inline `python -c "from huggingface_hub import
hf_hub_download; ..."` block in mlx-ci.yml is also routed through
the wrapper so every hf transfer in CI gets the same treatment.

Also reverts the json-images timeout 45 -> 30 from #5475: the bump
was masking this hang, not fixing it.
2026-05-15 21:11:56 -07:00


# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved.
# Windows counterpart to studio-api-smoke.yml / studio-mac-api-smoke.yml.
# Same tests/studio/studio_api_smoke.py exercise (CORS hardening, auth
# state machine, JWT expiry, API key lifecycle, /v1/models /
# /v1/embeddings / /v1/responses, endpoint-by-endpoint auth audit) but
# on the FREE windows-latest runner. The file-mode hardening section
# (Section 6) is Linux-only and short-circuits on non-POSIX; the rest
# is platform-portable.
name: Windows Studio API CI

on:
  pull_request:
    paths:
      - 'studio/**'
      - 'unsloth/**'
      - 'unsloth_cli/**'
      - 'install.ps1'
      - 'pyproject.toml'
      - 'tests/studio/**'
      - '.github/workflows/studio-windows-api-smoke.yml'
  push:
    branches: [main, pip]
  workflow_dispatch:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

permissions:
  contents: read

jobs:
  api-smoke:
    name: Studio API & Auth Tests
    runs-on: windows-latest
    timeout-minutes: 30
    defaults:
      run:
        shell: bash
    env:
      GGUF_REPO: unsloth/gemma-3-270m-it-GGUF
      GGUF_VARIANT: UD-Q4_K_XL
      GGUF_FILE: gemma-3-270m-it-UD-Q4_K_XL.gguf
      STUDIO_PORT: '18895'
      HF_HOME: ${{ github.workspace }}/hf-cache
      # Force UTF-8 for stdio (Windows defaults to cp1252; hf
      # download prints a "✓" checkmark and crashes otherwise).
      PYTHONIOENCODING: utf-8
      PYTHONUTF8: '1'
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          persist-credentials: false

      - uses: actions/setup-node@48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e # v6.4.0
        with:
          node-version: '22'

      - uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.12'

      - name: Restore HF_HOME for ${{ env.GGUF_REPO }}
        id: cache-hf
        uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
        continue-on-error: true
        with:
          path: hf-cache
          key: ${{ runner.os }}-hf-${{ env.GGUF_REPO }}-${{ env.GGUF_VARIANT }}-v1

      - name: Prime HF_HOME with the GGUF
        id: prime-hf
        if: steps.cache-hf.outputs.cache-hit != 'true' || steps.cache-hf.outcome != 'success'
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          python -m pip install --upgrade huggingface_hub
          mkdir -p hf-cache
          bash .github/scripts/hf-download-with-retry.sh "$GGUF_REPO" "$GGUF_FILE"

      - name: Save HF_HOME for ${{ env.GGUF_REPO }}
        if: always() && steps.prime-hf.outcome == 'success'
        uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
        with:
          path: hf-cache
          key: ${{ runner.os }}-hf-${{ env.GGUF_REPO }}-${{ env.GGUF_VARIANT }}-v1

      - name: Pre-install Windows tweaks (npm 11 + Defender exclusions)
        shell: pwsh
        # See studio-windows-update-smoke.yml for the full rationale.
        # tl;dr: setup.ps1 needs npm >=11 to skip a 35 s winget Node
        # reinstall, and Defender's real-time scan dominates the
        # frontend / uv-pip-extract steps.
        run: |
          $ProgressPreference = 'SilentlyContinue'
          Write-Host "npm version before upgrade: $(npm -v)"
          npm install -g 'npm@^11' 2>&1 | Out-Host
          Write-Host "npm version after upgrade: $(npm -v)"
          # NOTE: do NOT pre-create these directories. See
          # studio-windows-update-smoke.yml for the full rationale --
          # creating an empty studio/frontend/dist trips setup.ps1's
          # mtime-based staleness check into "frontend up to date, skip
          # rebuild" and Studio boots with an empty dist directory.
          # Add-MpPreference accepts paths that do not yet exist.
          foreach ($p in @(
            "$env:USERPROFILE\.unsloth",
            "$env:USERPROFILE\AppData\Local\uv",
            "$env:GITHUB_WORKSPACE\studio\frontend\node_modules",
            "$env:GITHUB_WORKSPACE\studio\frontend\dist"
          )) {
            try {
              Add-MpPreference -ExclusionPath $p -ErrorAction Stop
              Write-Host "Defender exclusion added: $p"
            } catch {
              Write-Host "Defender exclusion skipped ($($_.Exception.Message)): $p"
            }
          }

      - name: Install Studio (--local, --no-torch)
        shell: pwsh
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          New-Item -ItemType Directory -Force -Path logs | Out-Null
          # *>&1 captures Write-Host (Information stream) output;
          # plain 2>&1 does not. setup.ps1 emits "prebuilt installed
          # and validated" via Write-Host, and we grep for that.
          $ProgressPreference = 'SilentlyContinue'
          & ./install.ps1 --local --no-torch *>&1 | Tee-Object -FilePath logs/install.log

      - name: Assert install.ps1 used the Windows llama.cpp prebuilt
        run: |
          # Filesystem-based check (setup.ps1's stream output isn't
          # captured back through this parent step's pipeline; see
          # studio-windows-ui-smoke.yml for full explanation).
          LLAMA_DIR=~/.unsloth/llama.cpp
          INFO="$LLAMA_DIR/UNSLOTH_PREBUILT_INFO.json"
          BIN="$LLAMA_DIR/build/bin/Release/llama-server.exe"
          if grep -q "falling back to source build" logs/install.log; then
            echo "::error::install.ps1 fell back to source-build llama.cpp on Windows."
            grep -E "llama-prebuilt|llama.cpp" logs/install.log | tail -60
            exit 1
          fi
          if [ ! -f "$INFO" ]; then
            echo "::error::no UNSLOTH_PREBUILT_INFO.json at $INFO."
            ls -la "$LLAMA_DIR" || true
            exit 1
          fi
          if [ ! -f "$BIN" ]; then
            echo "::error::no llama-server.exe at $BIN."
            ls -la "$LLAMA_DIR/build/bin" || true
            exit 1
          fi
          echo "install.ps1 installed the Windows prebuilt llama.cpp:"
          cat "$INFO"

      - name: Add Studio shim to GITHUB_PATH
        # install.ps1's User-PATH update doesn't propagate to a
        # running Git Bash session; export the shim dir so the
        # next `unsloth ...` invocation finds it.
        run: |
          SHIM_DIR=~/.unsloth/studio/bin
          if [ ! -f "$SHIM_DIR/unsloth.exe" ]; then
            echo "::error::unsloth.exe shim not found at $SHIM_DIR"
            ls -la ~/.unsloth/studio/ || true
            exit 1
          fi
          cygpath -w "$SHIM_DIR" >> "$GITHUB_PATH"

      - name: Patch Studio venv with full typer / pydantic dep trees
        # Belt-and-suspenders: install.ps1's --no-deps install of
        # no-torch-runtime.txt drops typer's and pydantic's runtime
        # deps unless explicitly pinned. Re-install the ones whose
        # deps don't pull torch.
        run: |
          STUDIO_PY=~/.unsloth/studio/unsloth_studio/Scripts/python.exe
          if [ ! -f "$STUDIO_PY" ]; then
            echo "::error::Studio venv python not at $STUDIO_PY"
            ls -la ~/.unsloth/studio/ || true
            exit 1
          fi
          "$STUDIO_PY" -m pip install --upgrade typer pydantic huggingface_hub

      - name: Install pyjwt for the JWT-expiry forge test
        run: python -m pip install 'pyjwt>=2.6'

      - name: Reset auth + boot Studio (API-only)
        run: |
          unsloth studio reset-password
          mkdir -p logs
          UNSLOTH_API_ONLY=1 unsloth studio -H 127.0.0.1 -p "$STUDIO_PORT" \
            > logs/studio.log 2>&1 &
          echo "STUDIO_PID=$!" >> "$GITHUB_ENV"

      - name: Wait for /api/health
        run: |
          for i in $(seq 1 180); do
            if curl -fs "http://127.0.0.1:${STUDIO_PORT}/api/health" > /tmp/health.json; then
              jq -e '.status == "healthy"' /tmp/health.json && break
            fi
            sleep 1
          done
          jq -e '.status == "healthy"' /tmp/health.json

      - name: Pass bootstrap password + rotated targets to the test
        run: |
          OLD=$(cat ~/.unsloth/studio/auth/.bootstrap_password)
          NEW="ApiSmoke-$(python -c 'import secrets; print(secrets.token_urlsafe(16))')"
          NEW2="ApiSmoke-$(python -c 'import secrets; print(secrets.token_urlsafe(16))')"
          echo "::add-mask::$OLD"
          echo "::add-mask::$NEW"
          echo "::add-mask::$NEW2"
          echo "STUDIO_OLD_PW=$OLD" >> "$GITHUB_ENV"
          echo "STUDIO_NEW_PW=$NEW" >> "$GITHUB_ENV"
          echo "STUDIO_NEW2_PW=$NEW2" >> "$GITHUB_ENV"

      - name: Run Studio API & Auth tests
        # Do NOT pin STUDIO_AUTH_DIR here. The Mac/Linux mirrors
        # hardcode runner-specific paths (/Users/runner/...,
        # /home/runner/...), but on Windows the path is
        # C:\Users\runneradmin\.unsloth\studio\auth and varies by
        # runner image. studio_api_smoke.py defaults to
        # Path.home()/".unsloth"/"studio"/"auth" when the env is
        # unset, which is correct on every OS.
        env:
          BASE_URL: http://127.0.0.1:18895
        run: python tests/studio/studio_api_smoke.py

      - name: Stop Studio
        if: always()
        run: |
          kill "${STUDIO_PID}" 2>/dev/null || true
          sleep 2

      - name: Upload API smoke logs
        if: always()
        uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
          name: windows-studio-api-smoke-log
          path: |
            logs/install.log
            logs/studio.log
          retention-days: 7