studio: stream export worker output into the export dialog (#4897)

* studio: stream export worker output into the export dialog

The Export Model dialog only showed a spinner on the "Exporting..."
button while the worker subprocess was doing the actual heavy lifting.
For Merged to 16bit and GGUF / Llama.cpp exports this meant several
minutes (or more, for large models) of opaque silence, with no way to
tell whether save_pretrained_merged, convert_hf_to_gguf.py, or
llama-quantize was making progress.

This adds a live terminal-style output panel inside the export dialog,
rendered just above the Cancel / Start Export buttons and scrollable
with auto-follow-tail. It shows stdout and stderr from both the worker
process itself and any child process it spawns (GGUF converter,
llama-quantize), coloured by stream.

Backend

- core/export/worker.py: new _setup_log_capture(resp_queue) installed
  before LogConfig.setup_logging. It saves the original stdout/stderr
  fds, creates pipes, os.dup2's the write ends onto fds 1 and 2 (so
  every child process inherits the redirected fds), and spins up two
  daemon reader threads. Each thread reads bytes from a pipe, echoes
  them back to the original fd (so the server console keeps working),
  splits on \n and \r, and forwards each line to the resp queue as
  {"type":"log","stream":"stdout|stderr","line":...,"ts":...}.
  PYTHONUNBUFFERED=1 is set so nested Python converters flush
  immediately.

- core/export/orchestrator.py:
  - Thread-safe ring buffer (collections.deque, maxlen 4000) with a
    monotonically increasing seq counter. clear_logs(),
    get_logs_since(cursor), get_current_log_seq(), is_export_active().
  - _wait_response handles rtype == "log" by appending to the buffer
    and continuing the wait loop. Status messages are also surfaced as
    a "status" stream so users see high level progress alongside raw
    subprocess output.
  - load_checkpoint, _run_export, and cleanup_memory now wrap their
    bodies with the existing self._lock (previously unused), clear the
    log buffer at the start of each op, and flip _export_active in a
    try/finally so the SSE endpoint can detect idle.
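
In outline, the buffer behaves like this minimal sketch. The standalone `LogRingBuffer` class is invented here for illustration; in the real code this state lives on the orchestrator itself.

```python
import threading
from collections import deque


class LogRingBuffer:
    """Bounded log buffer whose seq counter is never reset, so a
    reconnecting client's cursor stays valid across clears."""

    def __init__(self, maxlen: int = 4000) -> None:
        self._lock = threading.Lock()
        self._buf: deque = deque(maxlen=maxlen)  # old lines fall off the front
        self._seq = 0

    def append(self, msg: dict) -> None:
        with self._lock:
            self._seq += 1
            self._buf.append({**msg, "seq": self._seq})

    def clear_logs(self) -> None:
        with self._lock:
            self._buf.clear()          # note: _seq is deliberately NOT reset

    def get_current_log_seq(self) -> int:
        return self._seq

    def get_logs_since(self, cursor: int):
        with self._lock:
            entries = [e for e in self._buf if e["seq"] > cursor]
        return entries, (entries[-1]["seq"] if entries else cursor)
```

The returned cursor is what the SSE endpoint carries in the event `id:` field so clients can resume after a drop.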

- routes/export.py:
  - Wrapped every sync orchestrator call (load_checkpoint,
    cleanup_memory, export_merged_model, export_base_model,
    export_gguf, export_lora_adapter) in asyncio.to_thread so the
    FastAPI event loop stays free during long exports. Without this
    the new SSE endpoint could not be served concurrently with the
    blocking export POST.
  - New GET /api/export/logs/stream SSE endpoint. Honors
    Last-Event-ID and a since query param for reconnect, emits log /
    heartbeat / complete / error events, uses the id field to carry
    the log seq so clients can resume cleanly. On first connect
    without an explicit cursor it starts from the current seq so old
    lines from a previous run are not replayed.
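
The wire format is standard SSE id/event/data framing. A hedged sketch of just the log-event framing follows (the real endpoint also emits heartbeat / complete / error events and honors Last-Event-ID; `sse_frames` is an illustrative name, not the route's actual helper):

```python
import json


def sse_frames(entries):
    """Format buffered log entries as SSE frames, carrying seq in id:
    so a reconnecting client can resume from its last-seen event."""
    for e in entries:
        payload = json.dumps({"stream": e["stream"], "line": e["line"]})
        yield f"id: {e['seq']}\nevent: log\ndata: {payload}\n\n"
```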

Frontend

- features/export/api/export-api.ts: streamExportLogs() helper that
  authFetches the SSE endpoint and parses id / event / data fields
  manually (same pattern as streamTrainingProgress in train-api.ts).

- features/export/components/export-dialog.tsx:
  - Local useExportLogs(exporting) hook that opens the SSE stream when
    exporting transitions to true, accumulates up to 4000 lines in
    component state, and aborts the stream on cleanup.
  - New scrollable output panel rendered above DialogFooter, only
    shown for Merged to 16bit and GGUF / Llama.cpp (LoRA adapter is
    a fast disk write with nothing to show). Dark terminal styling
    (bg-black/85, emerald text, rose for stderr, sky for status),
    max-height 14rem, auto-scrolls to the bottom on new output but
    stops following if the user scrolls up. A small streaming / idle
    indicator is shown next to the panel title.
  - DialogContent widens from sm:max-w-lg to sm:max-w-2xl when the
    output panel is visible so the logs have room to breathe.

Verified

- Python smoke test (tests/smoke_export_log_capture.py): spawns a
  real mp.get_context("spawn") process, installs _setup_log_capture,
  confirms that parent stdout prints, parent stderr prints, AND a
  child subprocess invoked via subprocess.run (both its stdout and
  stderr) are all captured in the resp queue. Passes.
- Orchestrator log helpers tested in isolation: _append_log,
  get_logs_since (with and without a cursor), clear_logs not
  resetting seq so reconnecting clients still progress. Passes.
- routes.export imports cleanly in the studio venv and /logs/stream
  shows up in router.routes.
- bun run build: tsc -b plus vite build, no TypeScript errors.

No existing export behavior is changed. If the subprocess, the SSE
endpoint, or the frontend hook fails, the export itself still runs to
completion the same way it did before, with or without logs visible.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* export dialog: trim bootstrap noise, scope logs per screen, show realpath

Several follow-ups to the live export log work:

1. Worker bootstrap noise (transformers venv activation, Unsloth banner,
   "Top GGUF/hub models" lists, vision detection, 2k-step weight load
   bar) is dropped from the export-dialog stream. A threading.Event
   gate in worker.py defaults closed and only opens once _handle_export
   actually starts; until then the reader thread still echoes lines to
   the saved console fd for debugging but does not push them onto the
   resp_queue. The orchestrator already spawns a fresh subprocess for
   every checkpoint load, so the gate is naturally reset between runs.

2. tqdm in non-tty mode defaults to a 10s mininterval, which makes
   multi-step bars look frozen in the panel. Set TQDM_MININTERVAL=0.5
   in the worker env so any tqdm-driven progress emits more often.

3. The dialog's useExportLogs hook now also clears its line buffer
   when exportMethod or open changes, so re-opening the dialog into a
   different action's screen no longer shows the previous action's
   saved output. A useElapsedSeconds tick + "Working Xs" badge in the
   log header gives users a visible sign that long single-step phases
   (cache copies, GGUF conversion) are still running when no new lines
   are arriving.

4. ExportBackend.export_{merged,base,gguf,lora} now return
   (success, message, output_path); the worker forwards output_path on
   each export_*_done response, the orchestrator's _run_export passes
   it to routes/export.py, which surfaces it via
   ExportOperationResponse.details.output_path. The dialog's Export
   Complete screen renders the resolved on-disk realpath under "Saved
   to" so users can find their exported model directly.

* fix(cli): unpack 3-tuple return from export backend

ExportOrchestrator.export_{merged,base,gguf,lora} now return
(success, message, output_path) so the studio dialog can show
the on-disk realpath. The CLI still unpacked 2 values, so every
`unsloth export --format ...` crashed with ValueError before
reporting completion. Update the four call sites and surface
output_path via a "Saved to:" echo.
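
The shape of the fix at each call site, sketched with a hypothetical `report_export` helper:

```python
def report_export(export_fn) -> str:
    """Unpack the new 3-tuple; previously `success, message = export_fn()`
    raised ValueError once the backend grew a third element."""
    success, message, output_path = export_fn()
    out = f"{'Done' if success else 'Failed'}: {message}"
    if success and output_path:
        out += f"\nSaved to: {output_path}"   # the new realpath echo
    return out
```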

* fix(studio): anchor export log SSE cursor at run start

The export dialog SSE defaulted its cursor to get_current_log_seq()
at connect time, so any line emitted between the POST that kicks
off the export and the client opening the stream was buffered with
seqs 1..k and then skipped (seq <= cursor). Long-running exports
looked silent during their first seconds.

Snapshot _log_seq into _run_start_seq inside clear_logs() and
expose it via get_run_start_seq(). The SSE default cursor now uses
that snapshot, so every line emitted since the current run began
is reachable regardless of when the client connects. Old runs
still can't leak in because their seqs are <= the snapshot.

* fix(studio): reconnect export log SSE on stream drop

useExportLogs launched streamExportLogs once per exporting
transition and recorded any drop in .catch(). Long GGUF exports
behind a proxy with an idle kill-timeout would silently lose the
stream for the rest of the run even though the backend already
supports Last-Event-ID resume. The "retry: 3000" directive emitted
by the backend is only meaningful to native EventSource; this
hook uses a manual fetch + ReadableStream parse so it had no
effect.

Wrap streamExportLogs in a retry loop that tracks lastSeq from
ExportLogEvent.id and passes it as since on reconnect. Backoff is
exponential with jitter, capped at 5s, reset on successful open.
The loop stops on explicit backend `complete` event or on effect
cleanup.
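
The backoff schedule, sketched in Python for illustration (the actual hook is TypeScript; the function name is invented here):

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 5.0) -> float:
    """Exponential backoff with jitter, capped at `cap` seconds.
    `attempt` resets to 0 after a successful stream open."""
    delay = min(cap, base * (2 ** attempt))
    return delay * (0.5 + random.random() / 2)  # jitter in [delay/2, delay]
```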

* fix(studio): register a second command so Typer keeps `export` as a subcommand

The CLI export unpacking tests wrap `unsloth_cli.commands.export.export`
in a fresh Typer app with a single registered command. Typer flattens a
single-command app into that command, so the test's
`runner.invoke(cli_app, ["export", ckpt, out, ...])` treats the leading
`"export"` token as an unexpected extra positional argument -- every
parametrized case failed with:

    Got unexpected extra argument (.../out)

Register a harmless `noop` second command so Typer preserves subcommand
routing and the tests actually exercise the 3-tuple unpack path they
were written to guard.

Before: 4 failed
After:  4 passed

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: studio-install <studio@local.install>
Co-authored-by: Roland Tannous <115670425+rolandtannous@users.noreply.github.com>
Co-authored-by: Lee Jackson <130007945+Imagineer99@users.noreply.github.com>
Co-authored-by: Roland Tannous <rolandtannous@gravityq.ai>
Commit 7252410ccc (parent eca592effe)
Daniel Han, 2026-04-14 08:55:43 -07:00, committed by GitHub
10 changed files with 1522 additions and 117 deletions

# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved. See /studio/LICENSE.AGPL-3.0
"""
Regression tests for the export log ring-buffer cursor semantics.
Context: the live export log SSE stream has a race where the frontend
opens the SSE connection AFTER the POST that starts the export. Any
lines the worker subprocess emits during the gap between POST and SSE
connect get buffered with seqs 1..k, and then the SSE default cursor
`get_current_log_seq()` returns k -- so lines 1..k are forever
unreachable to that client.
Fix: `clear_logs()` snapshots the pre-run seq into `_run_start_seq`
(exposed via `get_run_start_seq()`), and `routes/export.py` defaults
the SSE cursor to that snapshot instead of the current seq. Every line
appended during the current run has seq strictly greater than the
snapshot, so the client sees the full run regardless of when it
connects.
These tests exercise the orchestrator-side contract only (no
subprocess, no FastAPI, no frontend). The routes-level integration
with get_run_start_seq() is a one-line edit covered by manual testing
and the frontend build.
"""
from __future__ import annotations

import sys
import types
from pathlib import Path

import pytest

# Backend root on sys.path so `from core.export.orchestrator import ...`
# and friends resolve without the studio app bootstrap.
_BACKEND_DIR = Path(__file__).resolve().parent.parent
if str(_BACKEND_DIR) not in sys.path:
    sys.path.insert(0, str(_BACKEND_DIR))

# ExportOrchestrator imports structlog and a few heavy modules at the
# top of orchestrator.py. Stub the ones we don't need in these unit
# tests so the import succeeds on machines without the full studio
# venv.
_loggers_stub = types.ModuleType("loggers")
_loggers_stub.get_logger = lambda name: __import__("logging").getLogger(name)
sys.modules.setdefault("loggers", _loggers_stub)

# structlog is only used for a module-level import; a bare stub is
# enough because we never call into it in these tests.
sys.modules.setdefault("structlog", types.ModuleType("structlog"))

# utils.paths.outputs_root is only called inside scan_checkpoints which
# we don't hit in these tests. Provide a stub module so the top-level
# import in orchestrator.py resolves.
_utils_pkg = types.ModuleType("utils")
_utils_pkg.__path__ = []  # mark as package
_utils_paths_stub = types.ModuleType("utils.paths")
_utils_paths_stub.outputs_root = lambda: Path("/tmp")
sys.modules.setdefault("utils", _utils_pkg)
sys.modules.setdefault("utils.paths", _utils_paths_stub)


@pytest.fixture
def orchestrator():
    """Fresh ExportOrchestrator with only the log-buffer state exercised."""
    from core.export.orchestrator import ExportOrchestrator

    return ExportOrchestrator()


def _append(orch, line: str, stream: str = "stdout") -> None:
    """Shortcut for simulating a worker log message."""
    orch._append_log({"type": "log", "stream": stream, "line": line, "ts": 0.0})


# ---------------------------------------------------------------------------
# clear_logs() semantics
# ---------------------------------------------------------------------------


def test_run_start_seq_is_zero_before_any_logs(orchestrator) -> None:
    """A brand-new orchestrator must report run_start_seq == 0 so a
    first SSE connection picks up every line from seq 1 onward."""
    assert orchestrator.get_run_start_seq() == 0


def test_clear_logs_snapshots_current_seq(orchestrator) -> None:
    """clear_logs() must capture _log_seq BEFORE clearing the buffer,
    so subsequent runs can anchor their SSE cursor at the snapshot."""
    _append(orchestrator, "old run line 1")
    _append(orchestrator, "old run line 2")
    _append(orchestrator, "old run line 3")
    assert orchestrator.get_current_log_seq() == 3

    orchestrator.clear_logs()

    assert orchestrator.get_run_start_seq() == 3
    assert orchestrator.get_current_log_seq() == 3  # seq counter preserved


# ---------------------------------------------------------------------------
# Race regression: SSE connects AFTER lines have been emitted
# ---------------------------------------------------------------------------


def test_sse_default_cursor_catches_all_current_run_lines(orchestrator) -> None:
    """Simulate the POST-then-SSE race: worker starts emitting lines
    immediately after clear_logs(), SSE connects several lines later.
    Using get_run_start_seq() as the default cursor MUST return every
    line emitted since clear_logs() ran.

    Pre-fix, the SSE defaulted to get_current_log_seq() at connect
    time, which would return the last-seen seq and miss lines N+1..M.
    """
    # Previous run leaves some buffered lines.
    _append(orchestrator, "previous run line A")
    _append(orchestrator, "previous run line B")

    # New run starts: orchestrator clears the buffer and snapshots seq.
    orchestrator.clear_logs()
    run_start = orchestrator.get_run_start_seq()

    # Worker emits early lines BEFORE the SSE connects.
    _append(orchestrator, "Importing Unsloth...")
    _append(orchestrator, "Loading checkpoint: /foo/bar")
    _append(orchestrator, "Starting export...")

    # SSE connects now and asks "give me everything after the run
    # start cursor".
    entries, new_cursor = orchestrator.get_logs_since(run_start)

    # All three early lines must be present. Pre-fix this was [].
    lines = [e["line"] for e in entries]
    assert lines == [
        "Importing Unsloth...",
        "Loading checkpoint: /foo/bar",
        "Starting export...",
    ]
    assert new_cursor == entries[-1]["seq"]


def test_sse_default_cursor_excludes_previous_run(orchestrator) -> None:
    """After clear_logs(), lines from the PREVIOUS run must not leak
    into the new run's SSE stream. Pre-fix this worked correctly
    (clear_logs cleared the deque); the fix must preserve it.
    """
    _append(orchestrator, "previous run line 1")
    _append(orchestrator, "previous run line 2")
    _append(orchestrator, "previous run line 3")
    assert orchestrator.get_current_log_seq() == 3

    orchestrator.clear_logs()
    run_start = orchestrator.get_run_start_seq()
    _append(orchestrator, "new run line")

    entries, _ = orchestrator.get_logs_since(run_start)
    assert [e["line"] for e in entries] == ["new run line"]


def test_clear_logs_twice_advances_run_start(orchestrator) -> None:
    """Back-to-back clear_logs() calls (e.g. cleanup -> load ->
    export in the same dialog session) must each re-anchor run_start
    at the current seq, so successive runs each start with a fresh
    low-water mark."""
    _append(orchestrator, "run 1 line a")
    _append(orchestrator, "run 1 line b")
    orchestrator.clear_logs()
    assert orchestrator.get_run_start_seq() == 2

    _append(orchestrator, "run 2 line a")
    _append(orchestrator, "run 2 line b")
    _append(orchestrator, "run 2 line c")
    orchestrator.clear_logs()
    assert orchestrator.get_run_start_seq() == 5