mirror of
https://github.com/openclaw/openclaw.git
synced 2026-04-28 06:31:11 +00:00
docs(qa): reorg, audit against code, and refresh stale content
Reorg
- Rename the architecture page title to "QA overview" (slug stays /concepts/qa-e2e-automation so inbound links keep working).
- Move "Adding a channel to QA" + scenario-helper-name reference from testing.md into qa-e2e-automation.md under "Transport adapters". Architecture belongs with the architecture page.
- Drop the duplicate live-transport coverage table from testing.md; canonical copy stays in qa-e2e-automation.md under a new "Live transport coverage" heading so qa-matrix.md can deep-link to it.
- Slim testing.md QA-specific runners section to ops only, with cross-links.

Audit (against extensions/qa-lab/src/cli.ts, qa-channel/src/config-schema.ts, and live-transport runtimes)
- qa-e2e-automation.md gains a "Command surface" table covering all 14 openclaw qa <subcommand> forms; previously only ~7 of 14 were named.
- Document missing OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT and OPENCLAW_QA_DISCORD_CAPTURE_CONTENT env vars (Matrix already had it).
- Cross-link qa coverage from the Reporting section.
- qa-channel.md completes the config-key list (enabled, name, accounts, defaultAccount were missing from the schema doc) and pollTimeoutMs range.
- Drop stale "Follow-up work" framing in qa-channel.md (provider/model matrix, scenario discovery, orchestration); all three already shipped.
- Replace "vertical slice" language with current behavior; fix misplaced debugger-UI paragraph.

Discoverability
- Add a Note callout to testing.md pointing at the three QA pages (QA overview, Matrix QA, QA channel) so maintainers landing on testing.md see the QA stack in the prologue.

Glossary entries for the renamed/new doc titles.
parent abca187df5
commit dd1a94f089
4 changed files with 165 additions and 187 deletions
|
|
@ -507,6 +507,14 @@
|
|||
"source": "Matrix QA",
|
||||
"target": "Matrix QA"
|
||||
},
|
||||
{
|
||||
"source": "QA overview",
|
||||
"target": "QA overview"
|
||||
},
|
||||
{
|
||||
"source": "QA channel",
|
||||
"target": "QA channel"
|
||||
},
|
||||
{
|
||||
"source": "Rich Output Protocol",
|
||||
"target": "富输出协议"
|
||||
|
|
|
|||
|
|
@ -7,27 +7,16 @@ read_when:
|
|||
- You are iterating on end-to-end QA automation
|
||||
---
|
||||
|
||||
`qa-channel` is a bundled synthetic message transport for automated OpenClaw QA.
|
||||
`qa-channel` is a bundled synthetic message transport for automated OpenClaw QA. It is not a production channel — it exists to exercise the same channel plugin boundary used by real transports while keeping state deterministic and fully inspectable.
|
||||
|
||||
It is not a production channel. It exists to exercise the same channel plugin
|
||||
boundary used by real transports while keeping state deterministic and fully
|
||||
inspectable.
|
||||
|
||||
## What it does today
|
||||
## What it does
|
||||
|
||||
- Slack-class target grammar:
|
||||
- `dm:<user>`
|
||||
- `channel:<room>`
|
||||
- `thread:<room>/<thread>`
|
||||
- HTTP-backed synthetic bus for:
|
||||
- inbound message injection
|
||||
- outbound transcript capture
|
||||
- thread creation
|
||||
- reactions
|
||||
- edits
|
||||
- deletes
|
||||
- search and read actions
|
||||
- Bundled host-side self-check runner that writes a Markdown report
|
||||
- HTTP-backed synthetic bus for inbound message injection, outbound transcript capture, thread creation, reactions, edits, deletes, and search/read actions.
|
||||
- Host-side self-check runner that writes a Markdown report to `.artifacts/qa-e2e/`.
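
The three target forms listed above (`dm:<user>`, `channel:<room>`, `thread:<room>/<thread>`) can be illustrated with a small parser. This is a minimal sketch, not the plugin's actual code: the `QaTarget` type, the `parseQaTarget` name, and the choice to split thread targets on the first `/` are all assumptions made for illustration.

```typescript
// Hypothetical parser for the target grammar described in the doc.
// Type and function names are illustrative, not taken from qa-channel.
type QaTarget =
  | { kind: "dm"; user: string }
  | { kind: "channel"; room: string }
  | { kind: "thread"; room: string; thread: string };

function parseQaTarget(raw: string): QaTarget | null {
  const sep = raw.indexOf(":");
  if (sep <= 0) return null; // missing or empty prefix
  const prefix = raw.slice(0, sep);
  const rest = raw.slice(sep + 1);
  if (rest.length === 0) return null; // nothing after the prefix

  switch (prefix) {
    case "dm":
      return { kind: "dm", user: rest };
    case "channel":
      return { kind: "channel", room: rest };
    case "thread": {
      // Assumption: room and thread are separated by the first slash.
      const slash = rest.indexOf("/");
      if (slash <= 0 || slash === rest.length - 1) return null;
      return {
        kind: "thread",
        room: rest.slice(0, slash),
        thread: rest.slice(slash + 1),
      };
    }
    default:
      return null; // unknown prefix
  }
}
```

A discriminated union keeps callers honest: code that routes a message has to handle the `thread` case explicitly before it can read the `thread` field.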
|
||||
|
||||
## Config
|
||||
|
||||
|
|
@ -45,68 +34,53 @@ inspectable.
|
|||
}
|
||||
```
|
||||
|
||||
Supported account keys:
|
||||
Account keys:
|
||||
|
||||
- `baseUrl`
|
||||
- `botUserId`
|
||||
- `botDisplayName`
|
||||
- `pollTimeoutMs`
|
||||
- `allowFrom`
|
||||
- `defaultTo`
|
||||
- `actions.messages`
|
||||
- `actions.reactions`
|
||||
- `actions.search`
|
||||
- `actions.threads`
|
||||
- `enabled` — master toggle for this account.
|
||||
- `name` — optional display label.
|
||||
- `baseUrl` — synthetic bus URL.
|
||||
- `botUserId` — Matrix-style bot user id used in target grammar.
|
||||
- `botDisplayName` — display name for outbound messages.
|
||||
- `pollTimeoutMs` — long-poll wait window. Integer between 100 and 30000.
|
||||
- `allowFrom` — sender allowlist (user ids or `"*"`).
|
||||
- `defaultTo` — fallback target when none is supplied.
|
||||
- `actions.messages` / `actions.reactions` / `actions.search` / `actions.threads` — per-action tool gating.
|
||||
|
||||
## Runner
|
||||
Multi-account keys at the top level:
|
||||
|
||||
Current vertical slice:
|
||||
- `accounts` — record of named per-account overrides keyed by account id.
|
||||
- `defaultAccount` — preferred account id when multiple are configured.
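
As a rough picture of how these keys fit together, the sketch below writes them down as TypeScript types. Only the key names and the `pollTimeoutMs` range come from the list above; the value types, the inclusive bounds, the placement of account keys next to `accounts`, and every example value are assumptions, so treat `config-schema.ts` as the authority.

```typescript
// Illustrative shape only. Key names follow the doc; value types are guesses.
interface QaChannelAccountConfig {
  enabled?: boolean;
  name?: string;
  baseUrl?: string;
  botUserId?: string;
  botDisplayName?: string;
  pollTimeoutMs?: number;
  allowFrom?: string[]; // user ids or "*"
  defaultTo?: string;
  actions?: {
    messages?: boolean;
    reactions?: boolean;
    search?: boolean;
    threads?: boolean;
  };
}

// Assumption: account keys may sit at the top level beside the multi-account keys.
interface QaChannelConfig extends QaChannelAccountConfig {
  accounts?: Record<string, QaChannelAccountConfig>;
  defaultAccount?: string;
}

// The doc gives "integer between 100 and 30000"; inclusive bounds are assumed.
function isValidPollTimeoutMs(value: number): boolean {
  return value >= 100 && value <= 30000 && Math.floor(value) === value;
}

// Placeholder values throughout; none of these come from the repo.
const exampleQaChannelConfig: QaChannelConfig = {
  enabled: true,
  baseUrl: "http://127.0.0.1:43123",
  botUserId: "openclaw",
  pollTimeoutMs: 1000,
  allowFrom: ["*"],
  actions: { messages: true, reactions: true, search: false, threads: true },
  accounts: {
    secondary: { botDisplayName: "QA Bot 2", pollTimeoutMs: 500 },
  },
  defaultAccount: "secondary",
};
```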
|
||||
|
||||
## Runners
|
||||
|
||||
Host-side self-check (writes a Markdown report under `.artifacts/qa-e2e/`):
|
||||
|
||||
```bash
|
||||
pnpm qa:e2e
|
||||
```
|
||||
|
||||
This now routes through the bundled `qa-lab` extension. It starts the in-repo
|
||||
QA bus, boots the bundled `qa-channel` runtime slice, runs a deterministic
|
||||
self-check, and writes a Markdown report under `.artifacts/qa-e2e/`.
|
||||
This routes through `qa-lab`, starts the in-repo QA bus, boots the bundled `qa-channel` runtime slice, and runs a deterministic self-check.
|
||||
|
||||
Private debugger UI:
|
||||
|
||||
```bash
|
||||
pnpm qa:lab:up
|
||||
```
|
||||
|
||||
That one command builds the QA site, starts the Docker-backed gateway + QA Lab
|
||||
stack, and prints the QA Lab URL. From that site you can pick scenarios, choose
|
||||
the model lane, launch individual runs, and watch results live.
|
||||
|
||||
Full repo-backed QA suite:
|
||||
Full repo-backed scenario suite:
|
||||
|
||||
```bash
|
||||
pnpm openclaw qa suite
|
||||
```
|
||||
|
||||
That launches the private QA debugger at a local URL, separate from the
|
||||
shipped Control UI bundle.
|
||||
Runs scenarios in parallel against the QA gateway lane. See [QA overview](/concepts/qa-e2e-automation) for scenarios, profiles, and provider modes.
|
||||
|
||||
## Scope
|
||||
Docker-backed QA site (gateway + QA Lab debugger UI in one stack):
|
||||
|
||||
Current scope is intentionally narrow:
|
||||
```bash
|
||||
pnpm qa:lab:up
|
||||
```
|
||||
|
||||
- bus + plugin transport
|
||||
- threaded routing grammar
|
||||
- channel-owned message actions
|
||||
- Markdown reporting
|
||||
- Docker-backed QA site with run controls
|
||||
|
||||
Follow-up work will add:
|
||||
|
||||
- provider/model matrix execution
|
||||
- richer scenario discovery
|
||||
- OpenClaw-native orchestration later
|
||||
Builds the QA site, starts the Docker-backed gateway + QA Lab stack, and prints the QA Lab URL. From there you can pick scenarios, choose the model lane, launch individual runs, and watch results live. The QA Lab debugger is separate from the shipped Control UI bundle.
|
||||
|
||||
## Related
|
||||
|
||||
- [QA overview](/concepts/qa-e2e-automation) — overall stack, transport adapters, scenario authoring
|
||||
- [Matrix QA](/concepts/qa-matrix) — example live-transport runner that drives a real channel
|
||||
- [Pairing](/channels/pairing)
|
||||
- [Groups](/channels/groups)
|
||||
- [Channels overview](/channels)
|
||||
|
|
|
|||
|
|
@ -1,10 +1,11 @@
|
|||
---
|
||||
summary: "Private QA automation shape for qa-lab, qa-channel, seeded scenarios, and protocol reports"
|
||||
summary: "QA stack overview: qa-lab, qa-channel, repo-backed scenarios, live transport lanes, transport adapters, and reporting."
|
||||
read_when:
|
||||
- Extending qa-lab or qa-channel
|
||||
- Understanding how the QA stack fits together
|
||||
- Extending qa-lab, qa-channel, or a transport adapter
|
||||
- Adding repo-backed QA scenarios
|
||||
- Building higher-realism QA automation around the Gateway dashboard
|
||||
title: "QA E2E automation"
|
||||
title: "QA overview"
|
||||
---
|
||||
|
||||
The private QA stack is meant to exercise OpenClaw in a more realistic,
|
||||
|
|
@ -16,9 +17,37 @@ Current pieces:
|
|||
reaction, edit, and delete surfaces.
|
||||
- `extensions/qa-lab`: debugger UI and QA bus for observing the transcript,
|
||||
injecting inbound messages, and exporting a Markdown report.
|
||||
- `extensions/qa-matrix`, future runner plugins: live-transport adapters that
|
||||
drive a real channel inside a child QA gateway.
|
||||
- `qa/`: repo-backed seed assets for the kickoff task and baseline QA
|
||||
scenarios.
|
||||
|
||||
## Command surface
|
||||
|
||||
Every QA flow runs under `pnpm openclaw qa <subcommand>`. Many have `pnpm qa:*`
|
||||
script aliases; both forms are supported.
|
||||
|
||||
| Command | Purpose |
|
||||
| --------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| `qa run` | Bundled QA self-check; writes a Markdown report. |
|
||||
| `qa suite`                                          | Run repo-backed scenarios against the QA gateway lane. Add `--runner multipass` to run in a disposable Linux VM.                                                       |
|
||||
| `qa coverage` | Print the markdown scenario-coverage inventory (`--json` for machine output). |
|
||||
| `qa parity-report` | Compare two `qa-suite-summary.json` files and write the agentic parity-gate report. |
|
||||
| `qa character-eval` | Run the character QA scenario across multiple live models with a judged report. See [Reporting](#reporting). |
|
||||
| `qa manual` | Run a one-off prompt against the selected provider/model lane. |
|
||||
| `qa ui` | Start the QA debugger UI and local QA bus (alias: `pnpm qa:lab:ui`). |
|
||||
| `qa docker-build-image` | Build the prebaked QA Docker image. |
|
||||
| `qa docker-scaffold` | Write a docker-compose scaffold for the QA dashboard + gateway lane. |
|
||||
| `qa up` | Build the QA site, start the Docker-backed stack, print the URL (alias: `pnpm qa:lab:up`; `:fast` variant adds `--use-prebuilt-image --bind-ui-dist --skip-ui-build`). |
|
||||
| `qa aimock` | Start only the AIMock provider server. |
|
||||
| `qa mock-openai` | Start only the scenario-aware `mock-openai` provider server. |
|
||||
| `qa credentials doctor` / `add` / `list` / `remove` | Manage the shared Convex credential pool. |
|
||||
| `qa matrix` | Live transport lane against a disposable Tuwunel homeserver. See [Matrix QA](/concepts/qa-matrix). |
|
||||
| `qa telegram` | Live transport lane against a real private Telegram group. |
|
||||
| `qa discord` | Live transport lane against a real private Discord guild channel. |
|
||||
|
||||
## Operator flow
|
||||
|
||||
The current QA operator flow is a two-pane QA site:
|
||||
|
||||
- Left: Gateway dashboard (Control UI) with the agent.
|
||||
|
|
@ -76,23 +105,7 @@ For a transport-real Matrix smoke lane, run:
|
|||
pnpm openclaw qa matrix --profile fast --fail-fast
|
||||
```
|
||||
|
||||
That lane provisions a disposable Tuwunel homeserver in Docker, registers
|
||||
temporary driver, SUT, and observer users, creates one private room, then runs
|
||||
the real Matrix plugin inside a QA gateway child. The live transport lane keeps
|
||||
the child config scoped to the transport under test, so Matrix runs without
|
||||
`qa-channel` in the child config. It writes the structured report artifacts and
|
||||
a combined stdout/stderr log into the selected Matrix QA output directory. To
|
||||
capture the outer `scripts/run-node.mjs` build/launcher output too, set
|
||||
`OPENCLAW_RUN_NODE_OUTPUT_LOG=<path>` to a repo-local log file.
|
||||
Matrix progress is printed by default. The CLI default profile is `all`, so
|
||||
plain `pnpm openclaw qa matrix` still runs the full catalog. Use `--profile
|
||||
fast` for the release-critical transport contract, or shard full coverage with
|
||||
`transport`, `media`, `e2ee-smoke`, `e2ee-deep`, and `e2ee-cli`. `--fail-fast`
|
||||
stops after the first failed scenario when you want a release gate instead of a
|
||||
full inventory. `OPENCLAW_QA_MATRIX_TIMEOUT_MS` bounds the full run,
|
||||
`OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS` can shorten no-reply quiet windows for
|
||||
CI, and `OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS` bounds cleanup so a stuck
|
||||
Docker teardown reports the exact recovery command instead of hanging.
|
||||
The full CLI reference, profile/scenario catalog, env vars, and artifact layout for this lane live in [Matrix QA](/concepts/qa-matrix). At a glance: it provisions a disposable Tuwunel homeserver in Docker, registers temporary driver/SUT/observer users, runs the real Matrix plugin inside a child QA gateway scoped to that transport (no `qa-channel`), then writes a Markdown report, JSON summary, observed-events artifact, and combined output log under `.artifacts/qa-e2e/matrix-<timestamp>/`.
|
||||
|
||||
For a transport-real Telegram smoke lane, run:
|
||||
|
||||
|
|
@ -106,7 +119,8 @@ disposable server. It requires `OPENCLAW_QA_TELEGRAM_GROUP_ID`,
|
|||
`OPENCLAW_QA_TELEGRAM_SUT_BOT_TOKEN`, plus two distinct bots in the same
|
||||
private group. The SUT bot must have a Telegram username, and bot-to-bot
|
||||
observation works best when both bots have Bot-to-Bot Communication Mode
|
||||
enabled in `@BotFather`.
|
||||
enabled in `@BotFather`. Set `OPENCLAW_QA_TELEGRAM_CAPTURE_CONTENT=1` to keep
|
||||
message bodies in observed-message artifacts (bodies are redacted by default).
|
||||
The command exits non-zero when any scenario fails. Use `--allow-failures` when
|
||||
you want artifacts without a failing exit code.
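
The exit-code rule in the two sentences above can be written as a one-line decision. The sketch below is hypothetical: `ScenarioResult`, `resolveExitCode`, and the concrete non-zero value `1` are assumptions, since the doc only says "non-zero".

```typescript
// Hypothetical model of the documented exit behavior for a live lane run.
interface ScenarioResult {
  id: string;
  passed: boolean;
}

function resolveExitCode(results: ScenarioResult[], allowFailures: boolean): number {
  const anyFailed = results.some((result) => !result.passed);
  // Artifacts are written either way; only the exit status changes.
  return anyFailed && !allowFailures ? 1 : 0;
}
```

In a release gate you would leave `--allow-failures` off so a failed scenario fails the job; in an exploratory run you would pass it to collect the report without a red build.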
|
||||
The Telegram report and summary include per-reply RTT from the driver message
|
||||
|
|
@ -133,17 +147,17 @@ driver bot controlled by the harness and a SUT bot started by the child
|
|||
OpenClaw gateway through the bundled Discord plugin. It requires
|
||||
`OPENCLAW_QA_DISCORD_GUILD_ID`, `OPENCLAW_QA_DISCORD_CHANNEL_ID`,
|
||||
`OPENCLAW_QA_DISCORD_DRIVER_BOT_TOKEN`, `OPENCLAW_QA_DISCORD_SUT_BOT_TOKEN`,
|
||||
and `OPENCLAW_QA_DISCORD_SUT_APPLICATION_ID` when using env credentials.
|
||||
and `OPENCLAW_QA_DISCORD_SUT_APPLICATION_ID` when using env credentials. Set
|
||||
`OPENCLAW_QA_DISCORD_CAPTURE_CONTENT=1` to keep message bodies in
|
||||
observed-message artifacts (bodies are redacted by default).
|
||||
The lane verifies channel mention handling and checks that the SUT bot has
|
||||
registered the native `/help` command with Discord.
|
||||
The command exits non-zero when any scenario fails. Use `--allow-failures` when
|
||||
you want artifacts without a failing exit code.
|
||||
|
||||
Live transport lanes now share one smaller contract instead of each inventing
|
||||
their own scenario list shape:
|
||||
## Live transport coverage
|
||||
|
||||
`qa-channel` remains the broad synthetic product-behavior suite and is not part
|
||||
of the live transport coverage matrix.
|
||||
Live transport lanes share one contract instead of each inventing their own scenario list shape. `qa-channel` is the broad synthetic product-behavior suite and is not part of the live transport coverage matrix.
|
||||
|
||||
| Lane | Canary | Mention gating | Allowlist block | Top-level reply | Restart resume | Thread follow-up | Thread isolation | Reaction observation | Help command | Native command registration |
|
||||
| -------- | ------ | -------------- | --------------- | --------------- | -------------- | ---------------- | ---------------- | -------------------- | ------------ | --------------------------- |
|
||||
|
|
@ -235,19 +249,79 @@ provider names.
|
|||
|
||||
## Transport adapters
|
||||
|
||||
`qa-lab` owns a generic transport seam for markdown QA scenarios.
|
||||
`qa-channel` is the first adapter on that seam, but the design target is wider:
|
||||
future real or synthetic channels should plug into the same suite runner
|
||||
instead of adding a transport-specific QA runner.
|
||||
`qa-lab` owns a generic transport seam for markdown QA scenarios. `qa-channel` is the first adapter on that seam, but the design target is wider: future real or synthetic channels should plug into the same suite runner instead of adding a transport-specific QA runner.
|
||||
|
||||
At the architecture level, the split is:
|
||||
|
||||
- `qa-lab` owns generic scenario execution, worker concurrency, artifact writing, and reporting.
|
||||
- the transport adapter owns gateway config, readiness, inbound and outbound observation, transport actions, and normalized transport state.
|
||||
- markdown scenario files under `qa/scenarios/` define the test run; `qa-lab` provides the reusable runtime surface that executes them.
|
||||
- The transport adapter owns gateway config, readiness, inbound and outbound observation, transport actions, and normalized transport state.
|
||||
- Markdown scenario files under `qa/scenarios/` define the test run; `qa-lab` provides the reusable runtime surface that executes them.
|
||||
|
||||
Maintainer-facing adoption guidance for new channel adapters lives in
|
||||
[Testing](/help/testing#adding-a-channel-to-qa).
|
||||
### Adding a channel
|
||||
|
||||
Adding a channel to the markdown QA system requires exactly two things:
|
||||
|
||||
1. A transport adapter for the channel.
|
||||
2. A scenario pack that exercises the channel contract.
|
||||
|
||||
Do not add a new top-level QA command root when the shared `qa-lab` host can own the flow.
|
||||
|
||||
`qa-lab` owns the shared host mechanics:
|
||||
|
||||
- the `openclaw qa` command root
|
||||
- suite startup and teardown
|
||||
- worker concurrency
|
||||
- artifact writing
|
||||
- report generation
|
||||
- scenario execution
|
||||
- compatibility aliases for older `qa-channel` scenarios
|
||||
|
||||
Runner plugins own the transport contract:
|
||||
|
||||
- how `openclaw qa <runner>` is mounted beneath the shared `qa` root
|
||||
- how the gateway is configured for that transport
|
||||
- how readiness is checked
|
||||
- how inbound events are injected
|
||||
- how outbound messages are observed
|
||||
- how transcripts and normalized transport state are exposed
|
||||
- how transport-backed actions are executed
|
||||
- how transport-specific reset or cleanup is handled
|
||||
|
||||
The minimum adoption bar for a new channel:
|
||||
|
||||
1. Keep `qa-lab` as the owner of the shared `qa` root.
|
||||
2. Implement the transport runner on the shared `qa-lab` host seam.
|
||||
3. Keep transport-specific mechanics inside the runner plugin or channel harness.
|
||||
4. Mount the runner as `openclaw qa <runner>` instead of registering a competing root command. Runner plugins should declare `qaRunners` in `openclaw.plugin.json` and export a matching `qaRunnerCliRegistrations` array from `runtime-api.ts`. Keep `runtime-api.ts` light; lazy CLI and runner execution should stay behind separate entrypoints.
|
||||
5. Author or adapt markdown scenarios under the themed `qa/scenarios/` directories.
|
||||
6. Use the generic scenario helpers for new scenarios.
|
||||
7. Keep existing compatibility aliases working unless the repo is doing an intentional migration.
|
||||
|
||||
The decision rule is strict:
|
||||
|
||||
- If behavior can be expressed once in `qa-lab`, put it in `qa-lab`.
|
||||
- If behavior depends on one channel transport, keep it in that runner plugin or plugin harness.
|
||||
- If a scenario needs a new capability that more than one channel can use, add a generic helper instead of a channel-specific branch in `suite.ts`.
|
||||
- If a behavior is only meaningful for one transport, keep the scenario transport-specific and make that explicit in the scenario contract.
|
||||
|
||||
### Scenario helper names
|
||||
|
||||
Preferred generic helpers for new scenarios:
|
||||
|
||||
- `waitForTransportReady`
|
||||
- `waitForChannelReady`
|
||||
- `injectInboundMessage`
|
||||
- `injectOutboundMessage`
|
||||
- `waitForTransportOutboundMessage`
|
||||
- `waitForChannelOutboundMessage`
|
||||
- `waitForNoTransportOutbound`
|
||||
- `getTransportSnapshot`
|
||||
- `readTransportMessage`
|
||||
- `readTransportTranscript`
|
||||
- `formatTransportTranscript`
|
||||
- `resetTransport`
|
||||
|
||||
Compatibility aliases remain available for existing scenarios — `waitForQaChannelReady`, `waitForOutboundMessage`, `waitForNoOutbound`, `formatConversationTranscript`, `resetBus` — but new scenario authoring should use the generic names. The aliases exist to avoid a flag-day migration, not as the model going forward.
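
One way to keep new scenarios on the generic names is a small check over the helper names a scenario uses. The sketch below is an assumption-laden illustration: the two name lists are copied from this section, but `classifyHelper` and `findCompatAliases` are hypothetical and nothing here claims that `qa-lab` ships such a check.

```typescript
// Helper names copied from the doc; the functions below are illustrative only.
const GENERIC_HELPERS: readonly string[] = [
  "waitForTransportReady",
  "waitForChannelReady",
  "injectInboundMessage",
  "injectOutboundMessage",
  "waitForTransportOutboundMessage",
  "waitForChannelOutboundMessage",
  "waitForNoTransportOutbound",
  "getTransportSnapshot",
  "readTransportMessage",
  "readTransportTranscript",
  "formatTransportTranscript",
  "resetTransport",
];

const COMPAT_ALIASES: readonly string[] = [
  "waitForQaChannelReady",
  "waitForOutboundMessage",
  "waitForNoOutbound",
  "formatConversationTranscript",
  "resetBus",
];

type HelperKind = "generic" | "compat-alias" | "unknown";

function classifyHelper(name: string): HelperKind {
  if (GENERIC_HELPERS.indexOf(name) !== -1) return "generic";
  if (COMPAT_ALIASES.indexOf(name) !== -1) return "compat-alias";
  return "unknown";
}

// Returns the compatibility aliases found in a scenario's helper list,
// in the order they were used.
function findCompatAliases(usedHelpers: string[]): string[] {
  return usedHelpers.filter((name) => classifyHelper(name) === "compat-alias");
}
```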
|
||||
|
||||
## Reporting
|
||||
|
||||
|
|
@ -259,6 +333,8 @@ The report should answer:
|
|||
- What stayed blocked
|
||||
- What follow-up scenarios are worth adding
|
||||
|
||||
For the inventory of available scenarios — useful when sizing follow-up work or wiring a new transport — run `pnpm openclaw qa coverage` (add `--json` for machine-readable output).
|
||||
|
||||
For character and style checks, run the same scenario across multiple live model
|
||||
refs and write a judged Markdown report:
|
||||
|
||||
|
|
@ -314,6 +390,7 @@ When no `--judge-model` is passed, the judges default to
|
|||
|
||||
## Related docs
|
||||
|
||||
- [Testing](/help/testing)
|
||||
- [Matrix QA](/concepts/qa-matrix)
|
||||
- [QA channel](/channels/qa-channel)
|
||||
- [Testing](/help/testing)
|
||||
- [Dashboard](/web/dashboard)
|
||||
|
|
|
|||
|
|
@ -15,6 +15,16 @@ of Docker runners. This doc is a "how we test" guide:
|
|||
- How live tests discover credentials and select models/providers.
|
||||
- How to add regressions for real-world model/provider issues.
|
||||
|
||||
<Note>
|
||||
**QA stack (qa-lab, qa-channel, live transport lanes)** is documented separately:
|
||||
|
||||
- [QA overview](/concepts/qa-e2e-automation) — architecture, command surface, scenario authoring.
|
||||
- [Matrix QA](/concepts/qa-matrix) — reference for `pnpm openclaw qa matrix`.
|
||||
- [QA channel](/channels/qa-channel) — the synthetic transport plugin used by repo-backed scenarios.
|
||||
|
||||
This page covers running the regular test suites and Docker/Parallels runners. The [QA-specific runners](#qa-specific-runners) section below lists the concrete `qa` invocations and points back to the references above.
|
||||
</Note>
|
||||
|
||||
## Quick start
|
||||
|
||||
Most days:
|
||||
|
|
@ -248,17 +258,8 @@ gh workflow run package-acceptance.yml --ref main \
|
|||
- Starts only the local AIMock provider server for direct protocol smoke
|
||||
testing.
|
||||
- `pnpm openclaw qa matrix`
|
||||
- Runs the Matrix live QA lane against a disposable Docker-backed Tuwunel homeserver.
|
||||
- This QA host is repo/dev-only today. Packaged OpenClaw installs do not ship
|
||||
`qa-lab`, so they do not expose `openclaw qa`.
|
||||
- Repo checkouts load the bundled runner directly; no separate plugin install
|
||||
step is needed.
|
||||
- Provisions three temporary Matrix users (`driver`, `sut`, `observer`) plus one private room, then starts a QA gateway child with the real Matrix plugin as the SUT transport.
|
||||
- Defaults to `--profile all`. Use `--profile fast --fail-fast` for release-critical transport proof, or `--profile transport|media|e2ee-smoke|e2ee-deep|e2ee-cli` when sharding the full catalog.
|
||||
- Uses the pinned stable Tuwunel image `ghcr.io/matrix-construct/tuwunel:v1.5.1` by default. Override with `OPENCLAW_QA_MATRIX_TUWUNEL_IMAGE` when you need to test a different image.
|
||||
- Matrix does not expose shared credential-source flags because the lane provisions disposable users locally.
|
||||
- Writes a Matrix QA report, summary, observed-events artifact, and combined stdout/stderr output log under `.artifacts/qa-e2e/...`.
|
||||
- Emits progress by default and enforces a hard run timeout with `OPENCLAW_QA_MATRIX_TIMEOUT_MS` (default 30 minutes). `OPENCLAW_QA_MATRIX_NO_REPLY_WINDOW_MS` tunes negative no-reply quiet windows, and cleanup is bounded by `OPENCLAW_QA_MATRIX_CLEANUP_TIMEOUT_MS` with failures including the recovery `docker compose ... down --remove-orphans` command.
|
||||
- Runs the Matrix live QA lane against a disposable Docker-backed Tuwunel homeserver. Source-checkout only — packaged installs do not ship `qa-lab`.
|
||||
- Full CLI, profile/scenario catalog, env vars, and artifact layout: [Matrix QA](/concepts/qa-matrix).
|
||||
- `pnpm openclaw qa telegram`
|
||||
- Runs the Telegram live QA lane against a real private group using the driver and SUT bot tokens from env.
|
||||
- Requires `OPENCLAW_QA_TELEGRAM_GROUP_ID`, `OPENCLAW_QA_TELEGRAM_DRIVER_BOT_TOKEN`, and `OPENCLAW_QA_TELEGRAM_SUT_BOT_TOKEN`. The group id must be the numeric Telegram chat id.
|
||||
|
|
@ -269,16 +270,7 @@ gh workflow run package-acceptance.yml --ref main \
|
|||
- For stable bot-to-bot observation, enable Bot-to-Bot Communication Mode in `@BotFather` for both bots and ensure the driver bot can observe group bot traffic.
|
||||
- Writes a Telegram QA report, summary, and observed-messages artifact under `.artifacts/qa-e2e/...`. Replying scenarios include RTT from driver send request to observed SUT reply.
|
||||
|
||||
Live transport lanes share one standard contract so new transports do not drift:
|
||||
|
||||
`qa-channel` remains the broad synthetic QA suite and is not part of the live
|
||||
transport coverage matrix.
|
||||
|
||||
| Lane | Canary | Mention gating | Allowlist block | Top-level reply | Restart resume | Thread follow-up | Thread isolation | Reaction observation | Help command | Native command registration |
|
||||
| -------- | ------ | -------------- | --------------- | --------------- | -------------- | ---------------- | ---------------- | -------------------- | ------------ | --------------------------- |
|
||||
| Matrix | x | x | x | x | x | x | x | x | | |
|
||||
| Telegram | x | x | | | | | | | x | |
|
||||
| Discord | x | x | | | | | | | | x |
|
||||
Live transport lanes share one standard contract so new transports do not drift; the per-lane coverage matrix lives in [QA overview → Live transport coverage](/concepts/qa-e2e-automation#live-transport-coverage). `qa-channel` is the broad synthetic suite and is not part of that matrix.
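
For readers who want the coverage matrix in a queryable form, the sketch below encodes the three rows of the table as data. The lane and capability names are taken from the table in this diff; the `LIVE_COVERAGE` constant and the `lanesCovering` helper are hypothetical and exist only for illustration.

```typescript
// Data transcribed from the live transport coverage table; names are illustrative.
type Lane = "Matrix" | "Telegram" | "Discord";

const LIVE_COVERAGE: Record<Lane, readonly string[]> = {
  Matrix: [
    "Canary",
    "Mention gating",
    "Allowlist block",
    "Top-level reply",
    "Restart resume",
    "Thread follow-up",
    "Thread isolation",
    "Reaction observation",
  ],
  Telegram: ["Canary", "Mention gating", "Help command"],
  Discord: ["Canary", "Mention gating", "Native command registration"],
};

// Lists the lanes whose row has an "x" for the given capability column.
function lanesCovering(capability: string): Lane[] {
  const lanes = Object.keys(LIVE_COVERAGE) as Lane[];
  return lanes.filter((lane) => LIVE_COVERAGE[lane].indexOf(capability) !== -1);
}
```

A query such as `lanesCovering("Thread isolation")` makes gaps visible when sizing a new transport: a capability covered by a single lane is a candidate for the next scenario pack.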
|
||||
|
||||
### Shared Telegram credentials via Convex (v1)
|
||||
|
||||
|
|
@ -360,80 +352,7 @@ Payload shape for Telegram kind:
|
|||
|
||||
### Adding a channel to QA
|
||||
|
||||
Adding a channel to the markdown QA system requires exactly two things:
|
||||
|
||||
1. A transport adapter for the channel.
|
||||
2. A scenario pack that exercises the channel contract.
|
||||
|
||||
Do not add a new top-level QA command root when the shared `qa-lab` host can
|
||||
own the flow.
|
||||
|
||||
`qa-lab` owns the shared host mechanics:
|
||||
|
||||
- the `openclaw qa` command root
|
||||
- suite startup and teardown
|
||||
- worker concurrency
|
||||
- artifact writing
|
||||
- report generation
|
||||
- scenario execution
|
||||
- compatibility aliases for older `qa-channel` scenarios
|
||||
|
||||
Runner plugins own the transport contract:
|
||||
|
||||
- how `openclaw qa <runner>` is mounted beneath the shared `qa` root
|
||||
- how the gateway is configured for that transport
|
||||
- how readiness is checked
|
||||
- how inbound events are injected
|
||||
- how outbound messages are observed
|
||||
- how transcripts and normalized transport state are exposed
|
||||
- how transport-backed actions are executed
|
||||
- how transport-specific reset or cleanup is handled
|
||||
|
||||
The minimum adoption bar for a new channel is:
|
||||
|
||||
1. Keep `qa-lab` as the owner of the shared `qa` root.
|
||||
2. Implement the transport runner on the shared `qa-lab` host seam.
|
||||
3. Keep transport-specific mechanics inside the runner plugin or channel harness.
|
||||
4. Mount the runner as `openclaw qa <runner>` instead of registering a competing root command.
|
||||
Runner plugins should declare `qaRunners` in `openclaw.plugin.json` and export a matching `qaRunnerCliRegistrations` array from `runtime-api.ts`.
|
||||
Keep `runtime-api.ts` light; lazy CLI and runner execution should stay behind separate entrypoints.
|
||||
5. Author or adapt markdown scenarios under the themed `qa/scenarios/` directories.
|
||||
6. Use the generic scenario helpers for new scenarios.
|
||||
7. Keep existing compatibility aliases working unless the repo is doing an intentional migration.
|
||||
|
||||
The decision rule is strict:
|
||||
|
||||
- If behavior can be expressed once in `qa-lab`, put it in `qa-lab`.
|
||||
- If behavior depends on one channel transport, keep it in that runner plugin or plugin harness.
|
||||
- If a scenario needs a new capability that more than one channel can use, add a generic helper instead of a channel-specific branch in `suite.ts`.
|
||||
- If a behavior is only meaningful for one transport, keep the scenario transport-specific and make that explicit in the scenario contract.
|
||||
|
||||
Preferred generic helper names for new scenarios are:
|
||||
|
||||
- `waitForTransportReady`
|
||||
- `waitForChannelReady`
|
||||
- `injectInboundMessage`
|
||||
- `injectOutboundMessage`
|
||||
- `waitForTransportOutboundMessage`
|
||||
- `waitForChannelOutboundMessage`
|
||||
- `waitForNoTransportOutbound`
|
||||
- `getTransportSnapshot`
|
||||
- `readTransportMessage`
|
||||
- `readTransportTranscript`
|
||||
- `formatTransportTranscript`
|
||||
- `resetTransport`
|
||||
|
||||
Compatibility aliases remain available for existing scenarios, including:
|
||||
|
||||
- `waitForQaChannelReady`
|
||||
- `waitForOutboundMessage`
|
||||
- `waitForNoOutbound`
|
||||
- `formatConversationTranscript`
|
||||
- `resetBus`
|
||||
|
||||
New channel work should use the generic helper names.
|
||||
Compatibility aliases exist to avoid a flag day migration, not as the model for
|
||||
new scenario authoring.
|
||||
The architecture and scenario-helper names for new channel adapters live in [QA overview → Adding a channel](/concepts/qa-e2e-automation#adding-a-channel). The minimum bar: implement the transport runner on the shared `qa-lab` host seam, declare `qaRunners` in the plugin manifest, mount as `openclaw qa <runner>`, and author scenarios under `qa/scenarios/`.
|
||||
|
||||
## Test suites (what runs where)
|
||||
|
||||
|
|
|
|||