studio: rate-limit login, rotate refresh tokens, add logout, security headers, gate bootstrap injection

A pass over the auth surface found a cluster of related issues that this
commit closes together.

Login (routes/auth.py):
- Add an in-memory per-IP login rate limiter. Five failed POSTs to
  /api/auth/login inside a 60s window produce 429 with Retry-After.
  A successful login clears the bucket. Previously 30 wrong passwords
  in under one second was accepted as 30x 401, which combined with
  the (now fixed) admin-username leak from /api/auth/status made
  brute-force trivial against a small password.

Logout (routes/auth.py):
- New POST /api/auth/logout returns 204 and calls
  storage.revoke_user_refresh_tokens(subject) so the refresh token
  is no longer valid. Previously POST /api/auth/logout returned 405
  and there was no way to invalidate refresh tokens short of
  changing the password. Frontend session.ts already calls
  clearAuthTokens() to drop localStorage; the new endpoint lets the
  client also tell the server to revoke server-side state.

Refresh-token rotation (routes/auth.py + auth/storage.py):
- New storage.consume_refresh_token(token) atomically validates +
  deletes a refresh token, returning (username, is_desktop). The
  /api/auth/refresh handler now mints both a new access AND a new
  refresh token; the supplied token becomes invalid. Replaying a
  consumed refresh returns 401 "Invalid or expired refresh token".
  The previous refresh_access_token helper is left in place for
  callers that intentionally want the non-rotating shape; nothing
  in the route layer uses it now.

/api/auth/status no longer leaks default_username (models/auth.py +
routes/auth.py):
- AuthStatusResponse.default_username becomes Optional[str] with a
  None default; the handler always returns None. The frontend already
  hardcodes HIDDEN_LOGIN_USERNAME = "unsloth" (auth-form.tsx:82), so
  no UI change is required.

window.__UNSLOTH_BOOTSTRAP__ no longer auto-injects (main.py):
- _inject_bootstrap is now opt-in via the
  UNSLOTH_STUDIO_INJECT_BOOTSTRAP env var. The previous default
  (inject whenever requires_password_change is true) embedded the
  plaintext bootstrap password into the first-boot HTML for any
  caller that hit /, /change-password, or any unknown SPA path.
  Browser extensions and any XSS payload on the page could read it
  trivially. With the new gate the bootstrap password lives only in
  the auth/.bootstrap_password file (mode 0o600) where it has always
  been; users typing it into a current-password field is the right
  UX. routes/auth.py:change_password also clears
  app.state.bootstrap_password defensively.

Security headers + server fingerprint (main.py + run.py):
- New SecurityHeadersMiddleware adds Content-Security-Policy,
  X-Frame-Options: DENY, X-Content-Type-Options: nosniff,
  Referrer-Policy: no-referrer,
  Permissions-Policy: camera=(), microphone=(), geolocation=(),
  interest-cohort=(), and stamps server: unsloth-studio so the
  generic uvicorn banner no longer fingerprints the stack. The
  uvicorn.Config gains server_header=False so it stops emitting its
  own Server header.

/api/health minimisation (main.py):
- Unauthenticated GET /api/health returns just
  {"status":"healthy","timestamp":...} so load-balancer liveness
  probes keep working without leaking version, device_type,
  chat_only, desktop_protocol_version, or studio_root_id to
  arbitrary callers. A request that presents a valid Bearer token
  still gets the full diagnostic payload so internal launchers and
  sibling-Studio detection (which compares studio_root_id) keep
  working.

Verification:
- 30 wrong-password POSTs to /api/auth/login -> first 5 = 401, 6th
  through 30th = 429.
- POST /api/auth/logout with a fresh token -> 204. The matching
  refresh token then fails 401.
- Login -> R1; /api/auth/refresh with R1 -> new access + R2 (R2 !=
  R1); /api/auth/refresh with R1 again -> 401; /api/auth/refresh
  with R2 -> still succeeds once and rotates again.
- curl /api/auth/status -> default_username: null.
- curl http://127.0.0.1/ does not contain __UNSLOTH_BOOTSTRAP__.
- curl -I / shows CSP, X-Frame-Options: DENY,
  X-Content-Type-Options: nosniff, Referrer-Policy: no-referrer,
  Permissions-Policy, and server: unsloth-studio.
- curl /api/health unauthenticated -> {status, timestamp} only.
  curl with Authorization: Bearer <valid> -> full payload.
- Existing /api/system, /api/models/list, /api/train/status,
  /api/inference/status, /api/auth/api-keys, login flow, SPA root
  all still return 200 after the changes (regression smoke).
This commit is contained in:
Daniel Han 2026-05-11 12:45:22 +00:00
parent 28d40d2abb
commit b39e9a47f5
4 changed files with 199 additions and 11 deletions

View file

@ -480,6 +480,49 @@ def save_refresh_token(
conn.close()
def consume_refresh_token(token: str) -> Optional[Tuple[str, bool]]:
"""Atomically validate-and-delete a refresh token.
Used by the /api/auth/refresh handler so each refresh token can only
be used once (token rotation). The handler then mints a new refresh
token with :func:`auth.authentication.create_refresh_token`. Closes
finding 3.2 (refresh token previously reusable indefinitely).
Returns (username, is_desktop) on success, or None if the token does
not exist (already consumed, never issued, or revoked) or is expired.
"""
token_hash = _hash_token(token)
conn = get_connection()
try:
# Clean up any expired tokens while we're here.
conn.execute(
"DELETE FROM refresh_tokens WHERE expires_at < ?",
(datetime.now(timezone.utc).isoformat(),),
)
cur = conn.execute(
"""
SELECT id, username, expires_at, is_desktop FROM refresh_tokens
WHERE token_hash = ?
""",
(token_hash,),
)
row = cur.fetchone()
if row is None:
conn.commit()
return None
expires_at = datetime.fromisoformat(row["expires_at"])
if datetime.now(timezone.utc) > expires_at:
conn.execute("DELETE FROM refresh_tokens WHERE id = ?", (row["id"],))
conn.commit()
return None
# Atomically remove so the token cannot be replayed.
conn.execute("DELETE FROM refresh_tokens WHERE id = ?", (row["id"],))
conn.commit()
return row["username"], bool(row["is_desktop"])
finally:
conn.close()
def verify_refresh_token(token: str) -> Optional[Tuple[str, bool]]:
"""
Verify a refresh token and return the username plus desktop marker.

View file

@ -37,7 +37,15 @@ class AuthStatusResponse(BaseModel):
initialized: bool = Field(
..., description = "True if the auth database contains a login user"
)
default_username: str = Field(..., description = "Default seeded admin username")
default_username: Optional[str] = Field(
None,
description = (
"Default seeded admin username. Returned as None to "
"unauthenticated callers so /api/auth/status no longer "
"leaks the admin name (finding 3.3). The frontend hardcodes "
"the admin name so this is safe to omit."
),
)
requires_password_change: bool = Field(
...,
description = "True if the seeded admin must still change the default password",

View file

@ -5,8 +5,11 @@
Authentication API routes
"""
from fastapi import APIRouter, Depends, HTTPException, status
from fastapi import APIRouter, Depends, HTTPException, Request, Response, status
import threading
import time
from collections import deque
from datetime import datetime, timedelta, timezone
from models.auth import (
@ -33,6 +36,61 @@ from auth.authentication import (
router = APIRouter()
# ---------------------------------------------------------------------------
# Login rate limiter (in-memory, per-IP).
#
# Tracks the last N failed-attempt timestamps per source IP. When the
# window count exceeds the threshold, raise 429 with Retry-After.
# Successful logins clear the bucket for that IP.
#
# This is intentionally a simple in-memory limiter; for a multi-process
# deployment the right home would be a shared store. Single Studio
# process is the common case for desktop / single-tenant local use.
# ---------------------------------------------------------------------------
_LOGIN_BUCKETS: dict[str, deque] = {}
_LOGIN_BUCKETS_LOCK = threading.Lock()
_LOGIN_WINDOW_SECONDS = 60.0
_LOGIN_MAX_FAILS = 5
_LOGIN_LOCKOUT_SECONDS = 60
def _client_key(request: Request | None) -> str:
if request is None or request.client is None:
return "_unknown"
return request.client.host or "_unknown"
def _record_login_failure(ip: str) -> int:
"""Return the failure count within the current window (after recording)."""
now = time.monotonic()
with _LOGIN_BUCKETS_LOCK:
bucket = _LOGIN_BUCKETS.setdefault(ip, deque())
while bucket and now - bucket[0] > _LOGIN_WINDOW_SECONDS:
bucket.popleft()
bucket.append(now)
return len(bucket)
def _login_blocked(ip: str) -> int:
"""Return seconds to wait if blocked, else 0."""
now = time.monotonic()
with _LOGIN_BUCKETS_LOCK:
bucket = _LOGIN_BUCKETS.get(ip)
if not bucket:
return 0
while bucket and now - bucket[0] > _LOGIN_WINDOW_SECONDS:
bucket.popleft()
if len(bucket) >= _LOGIN_MAX_FAILS:
# Oldest entry expires after _LOGIN_WINDOW_SECONDS from its time.
return max(1, int(_LOGIN_WINDOW_SECONDS - (now - bucket[0])))
return 0
def _clear_login_bucket(ip: str) -> None:
with _LOGIN_BUCKETS_LOCK:
_LOGIN_BUCKETS.pop(ip, None)
@router.get("/status", response_model = AuthStatusResponse)
async def auth_status() -> AuthStatusResponse:
"""
@ -40,10 +98,16 @@ async def auth_status() -> AuthStatusResponse:
- initialized = False -> frontend should wait for the seeded admin bootstrap.
- initialized = True -> frontend should show login or force the first password change.
Note: ``default_username`` is intentionally returned as ``None`` to
unauthenticated callers. The frontend hardcodes the admin name; the
previous behaviour leaked the admin name to anyone hitting this
endpoint and combined with the absent rate-limit (now fixed) made
brute-force trivial.
"""
return AuthStatusResponse(
initialized = storage.is_initialized(),
default_username = storage.DEFAULT_ADMIN_USERNAME,
default_username = None,
requires_password_change = storage.requires_password_change(
storage.DEFAULT_ADMIN_USERNAME
)
@ -53,12 +117,29 @@ async def auth_status() -> AuthStatusResponse:
@router.post("/login", response_model = Token)
async def login(payload: AuthLoginRequest) -> Token:
async def login(payload: AuthLoginRequest, request: Request) -> Token:
"""
Login with username/password and receive access + refresh tokens.
Rate-limited per source IP: after :data:`_LOGIN_MAX_FAILS` failures
inside :data:`_LOGIN_WINDOW_SECONDS`, further attempts return 429
with a ``Retry-After`` header until the window expires.
"""
ip = _client_key(request)
blocked_for = _login_blocked(ip)
if blocked_for > 0:
raise HTTPException(
status_code = status.HTTP_429_TOO_MANY_REQUESTS,
detail = (
f"Too many failed login attempts from {ip}. "
f"Try again in {blocked_for} seconds."
),
headers = {"Retry-After": str(blocked_for)},
)
record = storage.get_user_and_secret(payload.username)
if record is None:
_record_login_failure(ip)
raise HTTPException(
status_code = status.HTTP_401_UNAUTHORIZED,
detail = "Incorrect password. Run 'unsloth studio reset-password' in your terminal to reset it.",
@ -66,11 +147,13 @@ async def login(payload: AuthLoginRequest) -> Token:
salt, pwd_hash, _jwt_secret, must_change_password = record
if not hashing.verify_password(payload.password, salt, pwd_hash):
_record_login_failure(ip)
raise HTTPException(
status_code = status.HTTP_401_UNAUTHORIZED,
detail = "Incorrect password. Run 'unsloth studio reset-password' in your terminal to reset it.",
)
_clear_login_bucket(ip)
access_token = create_access_token(subject = payload.username)
refresh_token = create_refresh_token(subject = payload.username)
return Token(
@ -81,6 +164,34 @@ async def login(payload: AuthLoginRequest) -> Token:
)
@router.post("/logout", status_code = status.HTTP_204_NO_CONTENT)
async def logout(
request: Request,
current_subject: str = Depends(get_current_subject_allow_password_change),
) -> Response:
"""Invalidate ALL refresh tokens for the authenticated subject.
Closes finding 2.3 (POST /api/auth/logout previously returned 405,
leaving no way to invalidate refresh tokens beyond a password
change). The access token itself is not blacklisted (JWTs are
stateless; they expire by their ``exp`` claim within 1 hour), so the
client should also clear local state immediately.
"""
try:
storage.revoke_user_refresh_tokens(current_subject)
except Exception:
# Best-effort - logout should always succeed from the client's
# perspective even if the persistence layer hiccups.
pass
# Defensive: clear any cached bootstrap password from process state
# so a subsequent change-password / restart doesn't re-leak it.
try:
request.app.state.bootstrap_password = None
except AttributeError:
pass
return Response(status_code = status.HTTP_204_NO_CONTENT)
@router.post("/desktop-login", response_model = Token)
async def desktop_login(payload: DesktopLoginRequest) -> Token:
"""Exchange a local desktop secret for normal admin-subject tokens."""
@ -102,20 +213,29 @@ async def desktop_login(payload: DesktopLoginRequest) -> Token:
@router.post("/refresh", response_model = Token)
async def refresh(payload: RefreshTokenRequest) -> Token:
"""
Exchange a valid refresh token for a new access token.
Exchange a valid refresh token for a NEW access + refresh token pair.
The refresh token itself is reusable until it expires (7 days).
Refresh tokens are now single-use: the supplied token is atomically
consumed (deleted) and a fresh one is issued. Re-submitting the
consumed token returns 401 (closing finding 3.2 - refresh token
previously usable indefinitely). If a refresh-token-reuse event
fires (consumed token already gone), no token family invalidation
happens here because the consume_refresh_token call already returned
None - we cannot identify the user behind a revoked hash.
"""
new_access_token, username, is_desktop = refresh_access_token(payload.refresh_token)
if new_access_token is None or username is None:
consumed = storage.consume_refresh_token(payload.refresh_token)
if consumed is None:
raise HTTPException(
status_code = status.HTTP_401_UNAUTHORIZED,
detail = "Invalid or expired refresh token",
)
username, is_desktop = consumed
new_access_token = create_access_token(subject = username, desktop = is_desktop)
new_refresh_token = create_refresh_token(subject = username, desktop = is_desktop)
return Token(
access_token = new_access_token,
refresh_token = payload.refresh_token,
refresh_token = new_refresh_token,
token_type = "bearer",
must_change_password = False
if is_desktop
@ -126,6 +246,7 @@ async def refresh(payload: RefreshTokenRequest) -> Token:
@router.post("/change-password", response_model = Token)
async def change_password(
payload: ChangePasswordRequest,
request: Request,
current_subject: str = Depends(get_current_subject_allow_password_change),
) -> Token:
"""Allow the authenticated user to replace the default password."""
@ -150,6 +271,13 @@ async def change_password(
storage.update_password(current_subject, payload.new_password)
storage.revoke_user_refresh_tokens(current_subject)
# Drop the process-cached bootstrap password so _inject_bootstrap
# cannot re-serve it after a server-side cache slip (defense-in-depth
# for finding 2.2).
try:
request.app.state.bootstrap_password = None
except AttributeError:
pass
access_token = create_access_token(subject = current_subject)
refresh_token = create_refresh_token(subject = current_subject)
return Token(

View file

@ -349,9 +349,18 @@ def run_server(
if not silent:
print(f"[WARNING] Frontend not found at {frontend_path}")
# Create the uvicorn server and expose it for signal handlers
# Create the uvicorn server and expose it for signal handlers.
# server_header=False suppresses uvicorn's built-in "Server: uvicorn"
# response header (finding 4.15 - technology fingerprinting). The
# SecurityHeadersMiddleware sets a generic "Server: unsloth-studio"
# via response.headers.
config = uvicorn.Config(
app, host = host, port = port, log_level = "info", access_log = False
app,
host = host,
port = port,
log_level = "info",
access_log = False,
server_header = False,
)
_server = uvicorn.Server(config)
_shutdown_event = Event()