studio: stream export worker output into the export dialog (#4897)

* studio: stream export worker output into the export dialog

The Export Model dialog only showed a spinner on the "Exporting..."
button while the worker subprocess was doing the actual heavy lifting.
For Merged to 16bit and GGUF / Llama.cpp exports this meant several
minutes (or more, for large models) of opaque silence, with no way to
tell whether save_pretrained_merged, convert_hf_to_gguf.py, or
llama-quantize was making progress.

This adds a live terminal-style output panel inside the export dialog,
rendered just above the Cancel / Start Export buttons and scrollable
with auto-follow-tail. It shows stdout and stderr from both the worker
process itself and any child process it spawns (GGUF converter,
llama-quantize), coloured by stream.

Backend

- core/export/worker.py: new _setup_log_capture(resp_queue) installed
  before LogConfig.setup_logging. It saves the original stdout/stderr
  fds, creates pipes, os.dup2's the write ends onto fds 1 and 2 (so
  every child process inherits the redirected fds), and spins up two
  daemon reader threads. Each thread reads bytes from a pipe, echoes
  them back to the original fd (so the server console keeps working),
  splits on \n and \r, and forwards each line to the resp queue as
  {"type":"log","stream":"stdout|stderr","line":...,"ts":...}.
  PYTHONUNBUFFERED=1 is set so nested Python converters flush
  immediately.

- core/export/orchestrator.py:
  - Thread-safe ring buffer (collections.deque, maxlen 4000) with a
    monotonically increasing seq counter. clear_logs(),
    get_logs_since(cursor), get_current_log_seq(), is_export_active().
  - _wait_response handles rtype == "log" by appending to the buffer
    and continuing the wait loop. Status messages are also surfaced as
    a "status" stream so users see high level progress alongside raw
    subprocess output.
  - load_checkpoint, _run_export, and cleanup_memory now wrap their
    bodies with the existing self._lock (previously unused), clear the
    log buffer at the start of each op, and flip _export_active in a
    try/finally so the SSE endpoint can detect idle.
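
In outline, the buffer behaves like this minimal sketch. The standalone `LogRingBuffer` class is invented here for illustration; in the real code this state lives on the orchestrator itself.

```python
import threading
from collections import deque


class LogRingBuffer:
    """Bounded log buffer whose seq counter is never reset, so a
    reconnecting client's cursor stays valid across clears."""

    def __init__(self, maxlen: int = 4000) -> None:
        self._lock = threading.Lock()
        self._buf: deque = deque(maxlen=maxlen)  # old lines fall off the front
        self._seq = 0

    def append(self, msg: dict) -> None:
        with self._lock:
            self._seq += 1
            self._buf.append({**msg, "seq": self._seq})

    def clear_logs(self) -> None:
        with self._lock:
            self._buf.clear()          # note: _seq is deliberately NOT reset

    def get_current_log_seq(self) -> int:
        return self._seq

    def get_logs_since(self, cursor: int):
        with self._lock:
            entries = [e for e in self._buf if e["seq"] > cursor]
        return entries, (entries[-1]["seq"] if entries else cursor)
```

The returned cursor is what the SSE endpoint carries in the event `id:` field so clients can resume after a drop.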

- routes/export.py:
  - Wrapped every sync orchestrator call (load_checkpoint,
    cleanup_memory, export_merged_model, export_base_model,
    export_gguf, export_lora_adapter) in asyncio.to_thread so the
    FastAPI event loop stays free during long exports. Without this
    the new SSE endpoint could not be served concurrently with the
    blocking export POST.
  - New GET /api/export/logs/stream SSE endpoint. Honors
    Last-Event-ID and a since query param for reconnect, emits log /
    heartbeat / complete / error events, uses the id field to carry
    the log seq so clients can resume cleanly. On first connect
    without an explicit cursor it starts from the current seq so old
    lines from a previous run are not replayed.
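
The wire format is standard SSE id/event/data framing. A hedged sketch of just the log-event framing follows (the real endpoint also emits heartbeat / complete / error events and honors Last-Event-ID; `sse_frames` is an illustrative name, not the route's actual helper):

```python
import json


def sse_frames(entries):
    """Format buffered log entries as SSE frames, carrying seq in id:
    so a reconnecting client can resume from its last-seen event."""
    for e in entries:
        payload = json.dumps({"stream": e["stream"], "line": e["line"]})
        yield f"id: {e['seq']}\nevent: log\ndata: {payload}\n\n"
```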

Frontend

- features/export/api/export-api.ts: streamExportLogs() helper that
  authFetches the SSE endpoint and parses id / event / data fields
  manually (same pattern as streamTrainingProgress in train-api.ts).

- features/export/components/export-dialog.tsx:
  - Local useExportLogs(exporting) hook that opens the SSE stream when
    exporting transitions to true, accumulates up to 4000 lines in
    component state, and aborts the stream on cleanup.
  - New scrollable output panel rendered above DialogFooter, only
    shown for Merged to 16bit and GGUF / Llama.cpp (LoRA adapter is
    a fast disk write with nothing to show). Dark terminal styling
    (bg-black/85, emerald text, rose for stderr, sky for status),
    max-height 14rem, auto-scrolls to the bottom on new output but
    stops following if the user scrolls up. A small streaming / idle
    indicator is shown next to the panel title.
  - DialogContent widens from sm:max-w-lg to sm:max-w-2xl when the
    output panel is visible so the logs have room to breathe.

Verified

- Python smoke test (tests/smoke_export_log_capture.py): spawns a
  real mp.get_context("spawn") process, installs _setup_log_capture,
  confirms that parent stdout prints, parent stderr prints, AND a
  child subprocess invoked via subprocess.run (both its stdout and
  stderr) are all captured in the resp queue. Passes.
- Orchestrator log helpers tested in isolation: _append_log,
  get_logs_since (with and without a cursor), clear_logs not
  resetting seq so reconnecting clients still progress. Passes.
- routes.export imports cleanly in the studio venv and /logs/stream
  shows up in router.routes.
- bun run build: tsc -b plus vite build, no TypeScript errors.

No existing export behavior is changed. If the subprocess, the SSE
endpoint, or the frontend hook fails, the export itself still runs to
completion the same way it did before, with or without logs visible.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* export dialog: trim bootstrap noise, scope logs per screen, show realpath

Several follow-ups to the live export log work:

1. Worker bootstrap noise (transformers venv activation, Unsloth banner,
   "Top GGUF/hub models" lists, vision detection, 2k-step weight load
   bar) is dropped from the export-dialog stream. A threading.Event
   gate in worker.py defaults closed and only opens once _handle_export
   actually starts; until then the reader thread still echoes lines to
   the saved console fd for debugging but does not push them onto the
   resp_queue. The orchestrator already spawns a fresh subprocess for
   every checkpoint load, so the gate is naturally reset between runs.

2. tqdm in non-tty mode defaults to a 10s mininterval, which makes
   multi-step bars look frozen in the panel. Set TQDM_MININTERVAL=0.5
   in the worker env so any tqdm-driven progress emits more often.

3. The dialog's useExportLogs hook now also clears its line buffer
   when exportMethod or open changes, so re-opening the dialog into a
   different action's screen no longer shows the previous action's
   saved output. A useElapsedSeconds tick + "Working Xs" badge in the
   log header gives users a visible sign that long single-step phases
   (cache copies, GGUF conversion) are still running when no new lines
   are arriving.

4. ExportBackend.export_{merged,base,gguf,lora} now return
   (success, message, output_path); the worker forwards output_path on
   each export_*_done response, the orchestrator's _run_export passes
   it to routes/export.py, which surfaces it via
   ExportOperationResponse.details.output_path. The dialog's Export
   Complete screen renders the resolved on-disk realpath under "Saved
   to" so users can find their exported model directly.

* fix(cli): unpack 3-tuple return from export backend

ExportOrchestrator.export_{merged,base,gguf,lora} now return
(success, message, output_path) so the studio dialog can show
the on-disk realpath. The CLI still unpacked 2 values, so every
`unsloth export --format ...` crashed with ValueError before
reporting completion. Update the four call sites and surface
output_path via a "Saved to:" echo.
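
The shape of the fix at each call site, sketched with a hypothetical `report_export` helper:

```python
def report_export(export_fn) -> str:
    """Unpack the new 3-tuple; previously `success, message = export_fn()`
    raised ValueError once the backend grew a third element."""
    success, message, output_path = export_fn()
    out = f"{'Done' if success else 'Failed'}: {message}"
    if success and output_path:
        out += f"\nSaved to: {output_path}"   # the new realpath echo
    return out
```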

* fix(studio): anchor export log SSE cursor at run start

The export dialog SSE defaulted its cursor to get_current_log_seq()
at connect time, so any line emitted between the POST that kicks
off the export and the client opening the stream was buffered with
seqs 1..k and then skipped (seq <= cursor). Long-running exports
looked silent during their first seconds.

Snapshot _log_seq into _run_start_seq inside clear_logs() and
expose it via get_run_start_seq(). The SSE default cursor now uses
that snapshot, so every line emitted since the current run began
is reachable regardless of when the client connects. Old runs
still can't leak in because their seqs are <= the snapshot.

* fix(studio): reconnect export log SSE on stream drop

useExportLogs launched streamExportLogs once per exporting
transition and recorded any drop in .catch(). Long GGUF exports
behind a proxy with an idle kill-timeout would silently lose the
stream for the rest of the run even though the backend already
supports Last-Event-ID resume. The "retry: 3000" directive emitted
by the backend is only meaningful to native EventSource; this
hook uses a manual fetch + ReadableStream parse so it had no
effect.

Wrap streamExportLogs in a retry loop that tracks lastSeq from
ExportLogEvent.id and passes it as since on reconnect. Backoff is
exponential with jitter, capped at 5s, reset on successful open.
The loop stops on explicit backend `complete` event or on effect
cleanup.
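
The backoff schedule, sketched in Python for illustration (the actual hook is TypeScript; the function name is invented here):

```python
import random


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 5.0) -> float:
    """Exponential backoff with jitter, capped at `cap` seconds.
    `attempt` resets to 0 after a successful stream open."""
    delay = min(cap, base * (2 ** attempt))
    return delay * (0.5 + random.random() / 2)  # jitter in [delay/2, delay]
```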

* fix(studio): register a second command so Typer keeps `export` as a subcommand

The CLI export unpacking tests wrap `unsloth_cli.commands.export.export`
in a fresh Typer app with a single registered command. Typer flattens a
single-command app into that command, so the test's
`runner.invoke(cli_app, ["export", ckpt, out, ...])` treats the leading
`"export"` token as an unexpected extra positional argument -- every
parametrized case failed with:

    Got unexpected extra argument (.../out)

Register a harmless `noop` second command so Typer preserves subcommand
routing and the tests actually exercise the 3-tuple unpack path they
were written to guard.

Before: 4 failed
After:  4 passed

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: studio-install <studio@local.install>
Co-authored-by: Roland Tannous <115670425+rolandtannous@users.noreply.github.com>
Co-authored-by: Lee Jackson <130007945+Imagineer99@users.noreply.github.com>
Co-authored-by: Roland Tannous <rolandtannous@gravityq.ai>
Commit 7252410ccc (parent eca592effe)
Daniel Han, 2026-04-14 08:55:43 -07:00, committed by GitHub
10 changed files with 1522 additions and 117 deletions

# SPDX-License-Identifier: AGPL-3.0-only
# Copyright 2026-present the Unsloth AI Inc. team. All rights reserved. See /studio/LICENSE.AGPL-3.0
"""
Regression tests for the export log ring-buffer cursor semantics.
Context: the live export log SSE stream has a race where the frontend
opens the SSE connection AFTER the POST that starts the export. Any
lines the worker subprocess emits during the gap between POST and SSE
connect get buffered with seqs 1..k, and then the SSE default cursor
`get_current_log_seq()` returns k -- so lines 1..k are forever
unreachable to that client.
Fix: `clear_logs()` snapshots the pre-run seq into `_run_start_seq`
(exposed via `get_run_start_seq()`), and `routes/export.py` defaults
the SSE cursor to that snapshot instead of the current seq. Every line
appended during the current run has seq strictly greater than the
snapshot, so the client sees the full run regardless of when it
connects.
These tests exercise the orchestrator-side contract only (no
subprocess, no FastAPI, no frontend). The routes-level integration
with get_run_start_seq() is a one-line edit covered by manual testing
and the frontend build.
"""
from __future__ import annotations

import sys
import types
from pathlib import Path

import pytest

# Backend root on sys.path so `from core.export.orchestrator import ...`
# and friends resolve without the studio app bootstrap.
_BACKEND_DIR = Path(__file__).resolve().parent.parent
if str(_BACKEND_DIR) not in sys.path:
    sys.path.insert(0, str(_BACKEND_DIR))

# ExportOrchestrator imports structlog and a few heavy modules at the
# top of orchestrator.py. Stub the ones we don't need in these unit
# tests so the import succeeds on machines without the full studio
# venv.
_loggers_stub = types.ModuleType("loggers")
_loggers_stub.get_logger = lambda name: __import__("logging").getLogger(name)
sys.modules.setdefault("loggers", _loggers_stub)

# structlog is only used for a module-level import; a bare stub is
# enough because we never call into it in these tests.
sys.modules.setdefault("structlog", types.ModuleType("structlog"))

# utils.paths.outputs_root is only called inside scan_checkpoints which
# we don't hit in these tests. Provide a stub module so the top-level
# import in orchestrator.py resolves.
_utils_pkg = types.ModuleType("utils")
_utils_pkg.__path__ = []  # mark as package
_utils_paths_stub = types.ModuleType("utils.paths")
_utils_paths_stub.outputs_root = lambda: Path("/tmp")
sys.modules.setdefault("utils", _utils_pkg)
sys.modules.setdefault("utils.paths", _utils_paths_stub)


@pytest.fixture
def orchestrator():
    """Fresh ExportOrchestrator with only the log-buffer state exercised."""
    from core.export.orchestrator import ExportOrchestrator

    return ExportOrchestrator()


def _append(orch, line: str, stream: str = "stdout") -> None:
    """Shortcut for simulating a worker log message."""
    orch._append_log({"type": "log", "stream": stream, "line": line, "ts": 0.0})


# ---------------------------------------------------------------------------
# clear_logs() semantics
# ---------------------------------------------------------------------------


def test_run_start_seq_is_zero_before_any_logs(orchestrator) -> None:
    """A brand-new orchestrator must report run_start_seq == 0 so a
    first SSE connection picks up every line from seq 1 onward."""
    assert orchestrator.get_run_start_seq() == 0


def test_clear_logs_snapshots_current_seq(orchestrator) -> None:
    """clear_logs() must capture _log_seq BEFORE clearing the buffer,
    so subsequent runs can anchor their SSE cursor at the snapshot."""
    _append(orchestrator, "old run line 1")
    _append(orchestrator, "old run line 2")
    _append(orchestrator, "old run line 3")
    assert orchestrator.get_current_log_seq() == 3

    orchestrator.clear_logs()

    assert orchestrator.get_run_start_seq() == 3
    assert orchestrator.get_current_log_seq() == 3  # seq counter preserved


# ---------------------------------------------------------------------------
# Race regression: SSE connects AFTER lines have been emitted
# ---------------------------------------------------------------------------


def test_sse_default_cursor_catches_all_current_run_lines(orchestrator) -> None:
    """Simulate the POST-then-SSE race: worker starts emitting lines
    immediately after clear_logs(), SSE connects several lines later.
    Using get_run_start_seq() as the default cursor MUST return every
    line emitted since clear_logs() ran.

    Pre-fix, the SSE defaulted to get_current_log_seq() at connect
    time, which would return the last-seen seq and miss lines N+1..M.
    """
    # Previous run leaves some buffered lines.
    _append(orchestrator, "previous run line A")
    _append(orchestrator, "previous run line B")

    # New run starts: orchestrator clears the buffer and snapshots seq.
    orchestrator.clear_logs()
    run_start = orchestrator.get_run_start_seq()

    # Worker emits early lines BEFORE the SSE connects.
    _append(orchestrator, "Importing Unsloth...")
    _append(orchestrator, "Loading checkpoint: /foo/bar")
    _append(orchestrator, "Starting export...")

    # SSE connects now and asks "give me everything after the run
    # start cursor".
    entries, new_cursor = orchestrator.get_logs_since(run_start)

    # All three early lines must be present. Pre-fix this was [].
    lines = [e["line"] for e in entries]
    assert lines == [
        "Importing Unsloth...",
        "Loading checkpoint: /foo/bar",
        "Starting export...",
    ]
    assert new_cursor == entries[-1]["seq"]


def test_sse_default_cursor_excludes_previous_run(orchestrator) -> None:
    """After clear_logs(), lines from the PREVIOUS run must not leak
    into the new run's SSE stream. Pre-fix this worked correctly
    (clear_logs cleared the deque); the fix must preserve it.
    """
    _append(orchestrator, "previous run line 1")
    _append(orchestrator, "previous run line 2")
    _append(orchestrator, "previous run line 3")
    assert orchestrator.get_current_log_seq() == 3

    orchestrator.clear_logs()
    run_start = orchestrator.get_run_start_seq()
    _append(orchestrator, "new run line")

    entries, _ = orchestrator.get_logs_since(run_start)
    assert [e["line"] for e in entries] == ["new run line"]


def test_clear_logs_twice_advances_run_start(orchestrator) -> None:
    """Back-to-back clear_logs() calls (e.g. cleanup -> load ->
    export in the same dialog session) must each re-anchor run_start
    at the current seq, so successive runs each start with a fresh
    low-water mark."""
    _append(orchestrator, "run 1 line a")
    _append(orchestrator, "run 1 line b")
    orchestrator.clear_logs()
    assert orchestrator.get_run_start_seq() == 2

    _append(orchestrator, "run 2 line a")
    _append(orchestrator, "run 2 line b")
    _append(orchestrator, "run 2 line c")
    orchestrator.clear_logs()
    assert orchestrator.get_run_start_seq() == 5