Automated Testing Design
Goal
Add automated testing for CharMemory's extraction pipeline using a 1000-message test chat fixture. Two tiers: deterministic snapshot tests for the processing pipeline, and live integration tests against a real LLM.
Current State
- Vitest with 71 passing unit tests in test/unit/ covering lib.js pure functions
- lib.js exports pure functions (parsing, serialization, escaping, format detection)
- package.json has test:snapshot and test:live scripts wired up, but no test files
- index.js has extraction pipeline logic tightly coupled to SillyTavern globals
- 1000-message JSONL test chat at /Users/davidsayed/repos/st-test-chatlog/output/
Design
Step 1: Extract pure logic from index.js into lib.js
Three new functions:
stripNonDiegetic(text) — The 5 regex operations currently inline in collectRecentMessages() (lines 2031-2036). Removes code blocks, <details> sections, markdown tables, HTML tags, collapses excessive newlines.
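A minimal sketch of what this could look like; the regexes below are plausible stand-ins, not the actual five operations from index.js:

```javascript
// Sketch of stripNonDiegetic; these regexes are stand-ins for the
// operations currently inline in collectRecentMessages().
function stripNonDiegetic(text) {
  return text
    .replace(/`{3}[\s\S]*?`{3}/g, '')             // fenced code blocks
    .replace(/<details[\s\S]*?<\/details>/gi, '') // <details> sections
    .replace(/^\|.*\|\s*$/gm, '')                 // markdown table rows
    .replace(/<[^>]+>/g, '')                      // remaining HTML tags
    .replace(/\n{3,}/g, '\n\n');                  // collapse excessive newlines
}
```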
formatChatMessages(chatArray, startIndex, endIndex) — Message filtering and formatting extracted from collectRecentMessages(). Takes a plain array of ST message objects, filters out empty/system-only messages, applies stripNonDiegetic(), returns formatted text. The caller (collectRecentMessages in index.js) handles reading from getContext() and passes the array in.
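A sketch of the extracted function, assuming SillyTavern's chat JSONL message shape (name, mes, is_system fields) and a simple "Name: text" output format:

```javascript
// Stand-in for the real stripNonDiegetic from Step 1.
const stripNonDiegetic = (text) => text.replace(/<[^>]+>/g, '');

// Sketch of formatChatMessages; the field names and output format
// are assumptions, not the actual implementation.
function formatChatMessages(chatArray, startIndex, endIndex) {
  return chatArray
    .slice(startIndex, endIndex)
    .filter((msg) => msg.mes && msg.mes.trim() && !msg.is_system)
    .map((msg) => `${msg.name}: ${stripNonDiegetic(msg.mes).trim()}`)
    .join('\n\n');
}
```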
substitutePromptTemplate(template, vars) — Template variable substitution from buildExtractionPrompt(). Replaces {{charName}}, {{charCard}}, {{existingMemories}}, {{recentMessages}}, {{participants}}. The caller handles reading the template from settings and getting the character card from ST globals.
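A sketch, assuming straightforward double-brace substitution with unknown keys left intact:

```javascript
// Sketch of substitutePromptTemplate; vars is a plain object such as
// { charName, charCard, existingMemories, recentMessages, participants }.
function substitutePromptTemplate(template, vars) {
  // Replace each {{key}} with its value; unrecognized keys pass through.
  return template.replace(/\{\{(\w+)\}\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : match
  );
}
```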
After extraction, index.js calls these lib.js functions. No behavior change.
Step 2: Snapshot tests (npm run test:snapshot)
File: test/integration/snapshot.test.js
Test fixture: Copy JSONL into test/fixtures/flux-chat.jsonl.
Tests:
- stripNonDiegetic — Feed messages containing code blocks, tables, HTML, and <details> sections. Snapshot the cleaned output.
- formatChatMessages — Load the JSONL, process chunks (messages 0-20, 20-50). Snapshot the formatted text. Verifies filtering, stripping, and formatting stability.
- substitutePromptTemplate — Build a prompt using processed messages, mock character card, empty existing memories. Snapshot the final prompt. Verifies the prompt the LLM receives is correct.
- parseMemories round-trip — Parse a sample LLM response fixture, re-serialize, verify no data loss.
All deterministic. Run in milliseconds.
Step 3: Live LLM tests (npm run test:live)
File: test/integration/live.test.js
Flow: Load JSONL → formatChatMessages → substitutePromptTemplate → call LLM → parseMemories → assert quality.
LLM backend configured via env var: TEST_LLM_URL (default: http://127.0.0.1:1234/v1). Works with LM Studio, Ollama, KoboldCpp, llama.cpp.
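The request could be built as below; the /chat/completions shape is the OpenAI-compatible API these backends share, and the helper name is hypothetical:

```javascript
// Hypothetical request builder for live.test.js.
function buildCompletionRequest(prompt, base = process.env.TEST_LLM_URL || 'http://127.0.0.1:1234/v1') {
  return {
    url: `${base}/chat/completions`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        messages: [{ role: 'user', content: prompt }],
        temperature: 0, // keep live runs as repeatable as possible
      }),
    },
  };
}
// Usage: const res = await fetch(req.url, req.options);
```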
Assertions (structural, not exact content):
- Response contains at least 1 <memory> block
- Each block has chat and date attributes
- Each block has at least 1 bullet
- No character card trait leakage (bullets don't parrot the character description)
- Total bullet count is reasonable for the input size
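The structural checks above can be sketched as a pure helper (the helper name and exact attribute syntax are assumptions):

```javascript
// Returns null if the response passes the structural checks,
// otherwise a short description of the first failure.
function checkMemoryResponse(text) {
  const blocks = text.match(/<memory\b[^>]*>[\s\S]*?<\/memory>/g) || [];
  if (blocks.length < 1) return 'no <memory> blocks';
  for (const block of blocks) {
    if (!/\bchat="[^"]*"/.test(block)) return 'missing chat attribute';
    if (!/\bdate="[^"]*"/.test(block)) return 'missing date attribute';
    if (!/^\s*-\s+\S/m.test(block)) return 'no bullets';
  }
  return null;
}
```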
File structure
test/
fixtures/
flux-chat.jsonl
unit/ (existing, unchanged)
parsing.test.js
escaping.test.js
format-detection.test.js
utils.test.js
integration/
snapshot.test.js
live.test.js
Changes to existing files
- lib.js — Add 3 exported functions
- index.js — Replace inline logic with lib.js calls (refactor, no behavior change)
- package.json — No changes needed (scripts already defined)