---
title: "Quick Start"
description: "Run your first benchmark evaluation in 3 steps"
sidebarTitle: "Quick Start"
---

## 1. Run Your First Benchmark

```bash
bun run src/index.ts run -p supermemory -b longmemeval -j gpt-4o -r my-first-run
```

## 2. View Results

### Option A: Web UI

```bash
bun run src/index.ts serve
```

Open [http://localhost:3000](http://localhost:3000) to see results visually.

### Option B: CLI

```bash
# Check run status
bun run src/index.ts status -r my-first-run

# View failed questions for debugging
bun run src/index.ts show-failures -r my-first-run
```

## 3. Compare Providers

Run the same benchmark across multiple providers:

```bash
bun run src/index.ts compare -p supermemory,mem0,zep -b locomo -j gpt-4o
```

## Sample Output

Each run produces a [MemScore](/memorybench/memscore) — a composite metric capturing quality, latency, and token efficiency:

```
SUMMARY:
Total Questions: 50
Correct: 36
Accuracy: 72.00%

Quality: 72%
Latency: 1250ms (avg)
Tokens: 1,823 (avg context sent to answering model)

MemScore: 72% / 1250ms / 1823tok
```

Full results are saved to `data/runs/{runId}/report.json` with detailed breakdowns by question type, latency percentiles, and per-question token counts.
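
When scripting your own comparisons, the MemScore summary line (e.g. `72% / 1250ms / 1823tok`) can be parsed programmatically. A minimal TypeScript sketch, based on the format shown in the sample output above; the function and interface names are illustrative, not part of the MemoryBench API:

```typescript
// Parse a MemScore summary line such as "72% / 1250ms / 1823tok".
// parseMemScore and MemScoreParts are illustrative names, not MemoryBench APIs.
interface MemScoreParts {
  qualityPct: number; // answer quality, percent
  latencyMs: number;  // average latency, milliseconds
  tokens: number;     // average context tokens sent to the answering model
}

function parseMemScore(line: string): MemScoreParts {
  const m = line.match(/^(\d+(?:\.\d+)?)%\s*\/\s*(\d+)ms\s*\/\s*(\d+)tok$/);
  if (!m) throw new Error(`unrecognized MemScore line: ${line}`);
  return {
    qualityPct: Number(m[1]),
    latencyMs: Number(m[2]),
    tokens: Number(m[3]),
  };
}
```

For example, `parseMemScore("72% / 1250ms / 1823tok")` yields `{ qualityPct: 72, latencyMs: 1250, tokens: 1823 }`, which makes it easy to sort or filter runs across providers.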

## What's Next

- [MemScore](/memorybench/memscore) — understand the composite metric and how to compare providers
- [CLI Reference](/memorybench/cli) — all available commands
- [Architecture](/memorybench/architecture) — how MemoryBench works under the hood