Bundle QA skill in pip package and auto-install during setup (#4985)

Marc Kelechava 2026-03-04 17:23:50 -08:00 committed by GitHub
parent bdd1a26361
commit 58d9b980d2
7 changed files with 394 additions and 25 deletions

View file

@ -159,14 +159,17 @@ skyvern block validate --file block.json # Validate a block definition
Register Skyvern's MCP server with your AI coding tool in one command:
```bash
skyvern setup claude-code # Register with Claude Code (global)
skyvern setup claude-code # Register with Claude Code + install skills (/qa, etc.)
skyvern setup claude-code --project # Register with Claude Code (project-level)
skyvern setup claude-code --skip-skills # MCP only, no skills
skyvern setup claude-desktop # Register with Claude Desktop
skyvern setup cursor # Register with Cursor
skyvern setup windsurf # Register with Windsurf
skyvern setup codex # Register with Codex
```
`skyvern setup claude-code` writes the MCP config **and** copies bundled skills (including `/qa`) into your project's `.claude/skills/` directory. Use `--skip-skills` to opt out.
### Other
```bash
@ -190,30 +193,34 @@ The CLI and MCP server share the same underlying logic. The CLI is for humans an
## Skills
Skills are bundled reference markdown files that teach AI coding tools how to use Skyvern. They are **not** the same as MCP tools — they are documentation that an AI agent can load to learn the CLI and API.
Skills are bundled markdown files that teach AI coding tools how to use Skyvern. They ship with `pip install skyvern` and are **automatically installed** when you run `skyvern setup claude-code`.
| Skill | What it does |
|-------|-------------|
| **qa** | Reads your `git diff`, generates browser tests, runs them against your dev server, reports pass/fail with screenshots. Invoke with `/qa` in Claude Code. |
| **skyvern** | CLI reference covering all browser automation, workflow, and credential commands. |
| **testing** | Smoke tests for verifying Skyvern deployments. |
```bash
skyvern skill list # List available skills
skyvern skill show skyvern # Render a skill in the terminal
skyvern skill path skyvern # Print the absolute path to a skill file
skyvern skill path # Print the skills directory
skyvern skill copy --output ./docs # Copy all skills to a local directory
skyvern skill copy skyvern -o . # Copy a single skill
skyvern skill show qa # Render a skill in the terminal
skyvern skill copy --output .claude/skills # Copy all skills manually
skyvern skill copy qa -o .claude/skills # Copy a single skill
```
<Accordion title="Loading skills into AI tools">
Skills are plain markdown files. You can load them into any AI coding tool that supports custom instructions:
Skills are plain markdown files. Load them into any AI coding tool that supports custom instructions:
**Claude Code** — add the skill path as a custom instructions file or use `skyvern setup claude-code` which configures MCP (the richer integration path).
**Claude Code** — run `skyvern setup claude-code`, which registers the MCP server **and** installs skills into `.claude/skills/`. The `/qa` skill is immediately available.
**Codex** — copy the skill into your project's `.codex/skills/` directory:
**Codex** — copy skills into your project's `.codex/skills/` directory:
```bash
skyvern skill copy skyvern -o .codex/skills/
skyvern skill copy -o .codex/skills/
```
**Any tool** — point your tool at the file path returned by `skyvern skill path skyvern`.
**Any tool** — point your tool at the file path returned by `skyvern skill path qa`.
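For example, if your tool takes pasted instructions rather than a file path, you can print the skill contents directly (a minimal sketch using only the commands shown above):
```bash
# Dump the qa skill so it can be pasted into a custom-instructions field
cat "$(skyvern skill path qa)"
```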
</Accordion>

View file

@ -267,6 +267,33 @@ This is especially powerful during development: make a code change, then ask you
Always use `--api-key` when exposing your browser via tunnel. Without it, anyone with the ngrok URL has full browser control.
</Warning>
## QA your frontend changes
If you install the Skyvern CLI, `skyvern setup claude-code` registers the MCP server **and** installs Claude Code skills — including `/qa`, which reads your git diff, generates browser tests, and runs them against your dev server.
```bash
pip install skyvern
skyvern setup claude-code
```
Then make a frontend change and type `/qa` in Claude Code. It will:
1. Read your `git diff` and the full source of each changed file
2. Generate targeted test cases (plus regression tests for adjacent behavior)
3. Open a browser against your running dev server
4. Run the tests — navigate, click, fill forms, check the DOM
5. Report pass/fail with screenshots
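You can also point `/qa` at a specific URL or a specific behavior; the invocation forms mirror the bundled skill's quick start:
```
/qa                           # Diff-based: test what you changed
/qa http://localhost:3000     # Same, with an explicit dev server URL
/qa -- test the checkout flow # Targeted: test specific behavior
```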
For localhost testing, start a local browser first:
```bash
skyvern browser serve --tunnel
```
<Tip>
The `/qa` skill uses `skyvern_evaluate` for fast DOM assertions (~10ms each) and `skyvern_act` for natural-language interactions. Tests are generated fresh from each diff — no test files to maintain.
</Tip>
## Troubleshooting
<Accordion title="Invalid API key or 401 errors">

View file

@ -458,8 +458,10 @@ def extract_data(
# qa_test
# ---------------------------------------------------------------------------
# NOTE: This content is also maintained in .claude/skills/qa/SKILL.md
# for the /qa Claude Code skill. Keep both in sync when updating.
# NOTE: This content is maintained in three places — keep all in sync:
# 1. skyvern/cli/skills/qa/SKILL.md (bundled with pip package — canonical)
# 2. .claude/skills/qa/SKILL.md (project-local copy for this repo)
# 3. skyvern/cli/mcp_tools/prompts.py (QA_TEST_CONTENT — this file)
QA_TEST_CONTENT = """\
# QA — Test Frontend Changes in a Real Browser

View file

@ -5,6 +5,7 @@ from __future__ import annotations
import json
import os
import platform
import shutil
import sys
from pathlib import Path
from urllib.parse import urlparse
@ -14,6 +15,7 @@ from dotenv import load_dotenv
from rich.syntax import Syntax
from skyvern.cli.console import console
from skyvern.cli.skill_commands import get_skill_dirs
from skyvern.utils.env_paths import resolve_backend_env_path
# NOTE: skyvern/cli/mcp.py has older setup_*_config() helpers called from
@ -251,6 +253,47 @@ def _run_setup(
    _upsert_mcp_config(config_path, tool_name, entry, dry_run=dry_run, yes=yes)


def _install_skills(project_dir: Path, dry_run: bool = False) -> None:
    """Install bundled skills into a project's .claude/skills/ directory.

    Skips skills that already exist at the destination (non-destructive).
    """
    skills_dst = project_dir / ".claude" / "skills"
    dirs = get_skill_dirs()
    if not dirs:
        return

    installed: list[str] = []
    skipped: list[str] = []
    failed: list[str] = []
    ignore = shutil.ignore_patterns("__pycache__", "*.pyc")

    for d in dirs:
        target = skills_dst / d.name
        if target.exists():
            skipped.append(d.name)
            continue
        if not dry_run:
            try:
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copytree(d, target, ignore=ignore)
            except OSError as e:
                console.print(f"[yellow]Warning: failed to install skill '{d.name}': {e}[/yellow]")
                failed.append(d.name)
                continue
        installed.append(d.name)

    if installed:
        names = ", ".join(installed)
        if dry_run:
            console.print(f"\n[yellow]Dry run — would install skills: {names}[/yellow]")
        else:
            console.print(f"\n[green]Installed skills to {skills_dst}: {names}[/green]")
            if "qa" in installed:
                console.print("[bold]Tip:[/bold] Make a frontend change and type /qa to test it in a real browser.")
    if skipped:
        console.print(f"[dim]Skills already installed: {', '.join(skipped)}[/dim]")
# ---------------------------------------------------------------------------
# Commands
# ---------------------------------------------------------------------------
@ -288,11 +331,15 @@ def setup_claude_code(
    use_python_path: bool = _python_path_opt,
    url: str | None = _url_opt,
    project: bool = typer.Option(False, "--project", help="Write to .mcp.json in current dir instead of global config"),
    skip_skills: bool = typer.Option(False, "--skip-skills", help="Don't install Claude Code skills (e.g. /qa)"),
) -> None:
    """Register Skyvern MCP with Claude Code (remote by default)."""
    """Register Skyvern MCP with Claude Code and install skills (remote by default)."""
    config_path = Path.cwd() / ".mcp.json" if project else _claude_code_global_config_path()
    _run_setup("Claude Code", config_path, api_key, dry_run, yes, local, use_python_path, url)
    if not skip_skills:
        _install_skills(Path.cwd(), dry_run=dry_run)
@setup_app.command("cursor")
def setup_cursor(

View file

@ -19,7 +19,7 @@ SKILLS_DIR = Path(__file__).parent / "skills"
_FRONTMATTER_RE = re.compile(r"^---\n(.*?)\n---", re.DOTALL)
def _get_skill_dirs() -> list[Path]:
def get_skill_dirs() -> list[Path]:
"""Return sorted list of skill directories (those containing SKILL.md)."""
if not SKILLS_DIR.exists():
return []
@ -60,7 +60,7 @@ def _extract_description(skill_md: Path) -> str:
@skill_app.command("list")
def skill_list() -> None:
"""List all bundled skills."""
dirs = _get_skill_dirs()
dirs = get_skill_dirs()
if not dirs:
console.print("[red]No skills found in package. Re-install skyvern.[/red]")
raise typer.Exit(code=1)
@ -120,7 +120,7 @@ def skill_copy(
        shutil.copytree(src, target, dirs_exist_ok=overwrite, ignore=_ignore)
        console.print(f"[green]Copied skill '{name}' to {target.resolve()}[/green]")
    else:
        dirs = _get_skill_dirs()
        dirs = get_skill_dirs()
        if not dirs:
            console.print("[red]No skills found in package. Re-install skyvern.[/red]")
            raise typer.Exit(code=1)

View file

@ -1,21 +1,29 @@
# Skyvern Skills Package
AI-powered browser automation skill for coding agents. Bundled with `pip install skyvern`.
AI-powered browser automation skills for coding agents. Bundled with `pip install skyvern`.
## Quick Start
```bash
pip install skyvern
export SKYVERN_API_KEY="YOUR_KEY" # get one at https://app.skyvern.com
# Set up MCP + install skills in one step:
skyvern setup claude-code
```
The skill teaches CLI commands via `skyvern <command>` invocations. For richer
AI-coding-tool integration, run `skyvern setup claude-code --project` to enable
MCP (Model Context Protocol) with auto-tool-calling.
`skyvern setup claude-code` registers the Skyvern MCP server and installs these
skills into your project's `.claude/skills/` directory automatically.
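After setup you should see one directory per bundled skill (names follow the Structure section below; exact contents may vary by release):
```bash
ls .claude/skills
# qa  skyvern  testing
```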
## What's Included
A single `skyvern` skill covering all browser automation capabilities:
### qa
QA test your frontend changes in a real browser. Reads your `git diff`, generates
targeted browser tests, runs them against your local dev server, and reports
pass/fail with screenshots. Invoke with `/qa` in Claude Code.
### skyvern
CLI reference covering all browser automation capabilities:
- Browser session lifecycle (create, navigate, close)
- AI actions: act, extract, validate, screenshot
@ -26,19 +34,27 @@ A single `skyvern` skill covering all browser automation capabilities:
- Block schema discovery and validation
- Debugging with screenshot + validate loops
### testing
Smoke-test skill for verifying Skyvern deployments.
## Structure
```
qa/
SKILL.md Diff-driven frontend QA testing
skyvern/
SKILL.md Main skill file (CLI-first, all capabilities)
references/ 17 deep-dive reference files
examples/ Workflow JSON examples
testing/
SKILL.md Deployment smoke testing
```
## Install to a Project
## Manual Install
If you prefer to install skills without running setup:
```bash
# Copy skill files to your project
skyvern skill copy --output .claude/skills
skyvern skill copy --output .agents/skills
```

View file

@ -0,0 +1,270 @@
---
name: qa
description: "QA test your frontend changes in a real browser. Reads your git diff, generates targeted browser tests, runs them against your local dev server, and reports pass/fail with screenshots."
---
# QA — Test Your Frontend Changes in a Real Browser
<!-- NOTE: This content is maintained in three places — keep all in sync:
1. skyvern/cli/skills/qa/SKILL.md (bundled with pip package — canonical)
2. .claude/skills/qa/SKILL.md (project-local copy for this repo)
3. skyvern/cli/mcp_tools/prompts.py (QA_TEST_CONTENT for the MCP prompt) -->
You changed some code. This skill reads your diff, understands what UI was affected, opens
a real browser against your running dev server, and tests that your changes actually work.
## Quick Start
```
/qa # Diff-based: test what you changed
/qa http://localhost:3000 # Same, explicit URL
/qa -- test the checkout flow # Targeted: test specific behavior
```
## How It Works
1. **Read your code changes** (`git diff`) to understand what you modified
2. **Read the changed files** to understand the UI: routes, components, props, text
3. **Generate targeted test cases** based on what the code actually does
4. **Open a browser** against your running dev server
5. **Run the tests** — navigate, interact, assert, screenshot
6. **Report** pass/fail with evidence
This is NOT a generic website crawler. It tests YOUR changes specifically.
---
## Step 1: Understand the Changes
### Get the diff
```bash
# What files changed?
git diff --name-only HEAD~1 # vs last commit (if changes are committed)
git diff --name-only # vs working tree (if uncommitted)
# Full diff for context
git diff HEAD~1 # or git diff for uncommitted
```
Pick whichever diff has content. If both are empty, there's nothing to QA.
### Read the changed files
For every changed frontend file (`.tsx`, `.jsx`, `.ts`, `.js`, `.css`, `.html`):
- Read the FULL file (not just the diff) to understand the component
- Look for: route paths, component names, text labels, form fields, button labels,
API endpoints called, conditional rendering, error states
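One way to list only the changed frontend files (a minimal sketch; use the same diff range you picked above):
```bash
# Uncommitted changes; insert HEAD~1 before the -- for the last-commit case
git diff --name-only -- '*.tsx' '*.jsx' '*.ts' '*.js' '*.css' '*.html'
```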
### Classify the changes
| Change Type | What to Test |
|-------------|-------------|
| New component/page | Navigate to it, verify it renders, interact with its elements |
| Modified component | Navigate to it, verify the specific change works (new button, new text, new behavior) |
| Styling changes | Navigate, screenshot, verify layout isn't broken |
| API integration | Navigate, trigger the action, verify the API call works (check network, verify UI updates) |
| Form changes | Fill the form, submit, verify validation and success states |
| Route changes | Navigate to old and new routes, verify routing works |
| Shared component (used in many places) | Test 2-3 pages that use it |
| Bug fix | Reproduce the original bug scenario, verify it's fixed |
### Generate test cases
For each changed file, write specific test cases. Example:
If the diff shows changes to `LoginForm.tsx` adding a "Forgot password" link:
```
Test 1: Login page renders the new "Forgot password" link
- Navigate to /login
- Assert: link with text "Forgot password" exists
- Click it
- Assert: navigated to /forgot-password (or modal appeared)
Test 2: Login form still works (regression)
- Navigate to /login
- Verify email and password inputs exist
- Submit empty form, verify validation errors appear
```
**Be specific.** Don't write "verify the page works." Write "verify the 'Forgot password'
link navigates to /forgot-password."
---
## Step 2: Find the Dev Server
If the user provided a URL, use it. Otherwise, auto-detect:
```
# Try common dev server ports
# 5173 (Vite), 3000 (Next/CRA), 3001, 8080, 8000, 4200 (Angular)
```
Navigate to each until one responds. If none respond, tell the user:
"Start your dev server first, then run `/qa` again."
---
## Step 3: Connect to a Browser
```
skyvern_browser_session_create(local=true, headless=false, timeout=15)
```
Use `local=true` so it can reach `localhost`. Use `headless=false` so the user can watch.
If local fails, fall back to cloud (warn that the URL must be publicly accessible):
```
skyvern_browser_session_create(timeout=15)
```
---
## Step 4: Run the Tests
For each test case generated in Step 1:
### Navigate
```
skyvern_navigate(url="http://localhost:<port>/<route>")
```
### Health gate (after every navigate, ~10ms)
```
skyvern_evaluate(expression="(() => {
const errors = [];
const body = document.body?.innerText || '';
if (body.includes('Something went wrong')) errors.push('error_message');
if (body.includes('Cannot read properties')) errors.push('js_error_in_ui');
if (/\bundefined\b/.test(body) && !/\bif\b|\btypeof\b|\bdocument|tutorial|example/i.test(body) && body.length < 5000) errors.push('undefined_text');
if (body.includes('connection refused')) errors.push('connection_refused');
if (/sign.?in|log.?in|auth/i.test(window.location.pathname)) errors.push('auth_redirect');
if (document.querySelector('[role=\"alert\"]')) errors.push('alert_element');
if (!document.querySelector('main, [role=\"main\"], nav, header, h1, h2, [class*=\"layout\" i], [class*=\"page\" i], [class*=\"app\" i]'))
errors.push('blank_page');
return JSON.stringify({ pass: errors.length === 0, errors });
})()")
```
### Assert with DOM queries (prefer `skyvern_evaluate` — fast, deterministic)
```
# Element exists
skyvern_evaluate(expression="!!document.querySelector('a[href=\"/forgot-password\"]')")
# Text content
skyvern_evaluate(expression="document.querySelector('h1')?.textContent?.trim()")
# Element count
skyvern_evaluate(expression="document.querySelectorAll('.card').length")
# URL after navigation
skyvern_evaluate(expression="window.location.pathname")
```
### Interact (use `skyvern_act` for natural language actions)
```
skyvern_act(prompt="Click the 'Forgot password' link")
skyvern_act(prompt="Fill the email field with 'test@example.com' and click Submit")
skyvern_act(prompt="Open the dropdown menu and select 'Settings'")
```
### Visual checks (use `skyvern_validate` only when DOM queries aren't enough)
```
skyvern_validate(prompt="The login form shows email and password fields with a blue Submit button")
```
### Screenshot (after every significant action)
```
skyvern_screenshot()
```
### Failed network requests (once per page)
```
skyvern_evaluate(expression="(() => {
const entries = performance.getEntriesByType('resource').filter(e => e.responseStatus >= 400);
return JSON.stringify({ failed: entries.map(e => ({ url: e.name, status: e.responseStatus })).slice(0, 5) });
})()")
```
---
## Step 5: Report Results
```markdown
## QA Report
### Changes Tested
Files: `LoginForm.tsx`, `ForgotPassword.tsx`
Diff summary: Added "Forgot password" link to login form, new /forgot-password page
### Results
| # | Test | Result | Screenshot |
|---|------|--------|------------|
| 1 | Login page renders "Forgot password" link | PASS | screenshot_1 |
| 2 | Clicking link navigates to /forgot-password | PASS | screenshot_2 |
| 3 | Forgot password page renders form | PASS | screenshot_3 |
| 4 | Login form still works (regression) | PASS | screenshot_4 |
| 5 | Empty form shows validation errors | FAIL | screenshot_5 |
### Issues Found
1. **Empty login form submits without validation** — Submitting with no email/password
doesn't show error messages. The form submits and the page reloads.
Expected: validation errors. Screenshot: screenshot_5
### Network
- No failed requests detected
### Verdict
4/5 tests passed. 1 issue found: missing form validation on empty submit.
```
---
## Tool Selection
| What you need | Tool | Speed |
|---------------|------|-------|
| Check element exists, text, count, URL | `skyvern_evaluate` | ~10ms |
| Click, type, fill forms, multi-step interaction | `skyvern_act` | 5-30s |
| "Does this look right?" visual check | `skyvern_validate` | 15-50s |
| Get structured data from a page | `skyvern_extract` | 15-50s |
| Screenshot | `skyvern_screenshot` | ~1s |
| Wait for async content | `skyvern_wait` | varies |
**Default to `skyvern_evaluate` for assertions.** Only use `skyvern_validate` when you
can't express the check as a DOM query (visual layout, "does this look like a dashboard").
---
## Error Handling
| Problem | Action |
|---------|--------|
| No git diff found | Ask user what they want to test, fall back to explore mode |
| Dev server not running | Tell user to start it. Suggest common commands (npm run dev, etc.) |
| Auth redirect on page | Report it. Ask if they want to provide credentials or skip that route. |
| Component doesn't render | Screenshot + check console. Report with the specific error. |
| Session create fails | Try cloud fallback. Warn about URL accessibility. |
## Session Cleanup
ALWAYS close the session when done, even if errors occurred:
```
skyvern_browser_session_close()
```
---
## Fallback: Explore Mode
If there's no git diff (user just wants a general QA pass), fall back to exploring:
1. Navigate to the root URL
2. Extract nav links and page structure with `skyvern_extract`
3. Visit each major route, health gate + screenshot
4. Test interactive elements (forms, buttons, links)
5. Report findings
But the primary mode is **diff-driven**. The agent should always try to read the code
changes first.