spawn/packages/cli/src/shared/install-id.ts
Ahmed Abushagur f7652de45b
feat(cli): posthog feature flags + fast_provision experiment (#3366)
* feat(cli): posthog feature flags + fast_provision experiment

Wires PostHog `/decide` into the CLI so we can A/B-test provisioning
behaviors. First experiment: `fast_provision` — for users who didn't
pass --beta or --fast manually, the `test` variant turns on
`tarball + images` by default. Hypothesis: faster provisioning →
fewer drop-offs in the "VM ready → install completed" leg of the
funnel.

What's added:

- `shared/install-id.ts` — stable per-machine UUID, persisted at
  ~/.config/spawn/.telemetry-id. Reuses telemetry's existing path
  so existing users keep their PostHog identity. Falls back to an
  ephemeral UUID on disk-write failure.
- `shared/feature-flags.ts` — hand-rolled POST to PostHog /decide
  (no SDK dep). 1.5s timeout, fail-open. On-disk cache at
  $SPAWN_HOME/feature-flags-cache.json with 1h TTL so cold starts
  don't pay the network cost. SPAWN_FEATURE_FLAGS_DISABLED=1 kill
  switch. Captures `$feature_flag_called` exposure events for both
  arms so PostHog can compute conversion.
- `shared/telemetry.ts` — moves user-id loading into install-id.ts
  so flags and events share the same `distinct_id`.
- `index.ts` — `await initFeatureFlags()` at the top of `main()`,
  then applies `fast_provision`'s `test` variant by appending
  `tarball,images` to SPAWN_BETA — but only if the user didn't
  pass --beta or --fast (those always win, so opt-out is free).

Why tarball+images and not all four (`+parallel,docker`):
clean attribution. The hypothesis is about tarball/image; if we
ship the full --fast bundle we can't tell which feature moved the
metric. Keep --fast as the user-facing power-user knob.

Tests: 14 new (install-id roundtrip + format guard, feature-flags
fetch/timeout/HTTP500/malformed/disabled/idempotent/stale-cache,
exposure-event behavior). Full suite: 2183 pass, same 4 pre-existing
failures as upstream/main.

Bumps CLI to 1.0.23.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cli): skip feature-flag fetch in pick/feedback fast path; implement real SWR

Two review-fix commits from PR feedback squashed into one:

1. Move `await initFeatureFlags()` below the `spawn pick` and
   `spawn feedback` bypass clauses in `main()`. Both commands are called
   from bash scripts and must stay fast; neither gates on a flag, so
   there's no reason to pay up to 1.5s of network latency on cold cache.

2. Implement real stale-while-revalidate in `shared/feature-flags.ts`.
   The prior implementation did a synchronous fetch on stale cache,
   which contradicted the docstring and PR description. Now:
     - fresh cache (<TTL)  → use cache, no network
     - stale cache (>=TTL) → use cache immediately, refresh in background
     - no cache            → await sync fetch (first run only)

   Adds `_awaitBackgroundRefreshForTest()` so tests can deterministically
   wait for the background refresh before asserting. Updated the existing
   "stale cache" test to verify SWR semantics (stale served first, fresh
   lands next invocation) and added a "fresh cache does not fetch" test.

All 2127 tests pass; biome clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Claude <claude@anthropic.com>
2026-04-27 17:50:56 -07:00

61 lines
1.9 KiB
TypeScript

// shared/install-id.ts — Stable per-machine identifier for PostHog bucketing.
//
// Generated lazily on first call and persisted to $SPAWN_HOME/install-id.
// Used as the PostHog `distinct_id` for telemetry events and feature-flag
// evaluation, so the same machine reliably gets the same flag variant across
// runs (per-run session UUIDs would re-bucket every invocation).
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname } from "node:path";
import { getInstallIdPath } from "./paths.js";
import { tryCatch } from "./result.js";
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;
let _cached: string | null = null;
/**
* Return the persistent install ID, creating it on first call.
* Falls back to an ephemeral UUID if the disk write fails (read-only home,
* permission errors). Never throws.
*/
export function getInstallId(): string {
if (_cached) {
return _cached;
}
const path = getInstallIdPath();
// Try to read existing
const readResult = tryCatch(() => readFileSync(path, "utf8").trim());
if (readResult.ok && UUID_RE.test(readResult.data)) {
_cached = readResult.data;
return _cached;
}
// Generate and persist
const id = crypto.randomUUID();
const writeResult = tryCatch(() => {
const dir = dirname(path);
if (!existsSync(dir)) {
mkdirSync(dir, {
recursive: true,
});
}
writeFileSync(path, id, {
mode: 0o600,
});
});
if (!writeResult.ok) {
// Disk-write failure: still return a UUID so flag evaluation works for
// this run. The user gets re-bucketed next time, but no breakage.
_cached = id;
return _cached;
}
_cached = id;
return _cached;
}
/** Test-only: reset the in-memory cache so a fresh getInstallId() reads disk. */
export function _resetInstallIdCache(): void {
_cached = null;
}