airi/services/computer-use-mcp/chrome-extension
刘梓恒 e145ce81c1
feat(computer-use-mcp): add read-only DOM tools parity to extension bridge (#1733)
Co-authored-by-agent: Antigravity <antigravity@gemini.com>
2026-04-27 15:24:27 +08:00
..
background.js feat(computer-use-mcp): add read-only DOM tools parity to extension bridge (#1733) 2026-04-27 15:24:27 +08:00
content.js feat(computer-use-mcp): add read-only DOM tools parity to extension bridge (#1733) 2026-04-27 15:24:27 +08:00
icon16.png [1/3] feat(desktop): add desktop observation and overlay baseline (#1647) 2026-04-24 23:43:40 +08:00
icon48.png [1/3] feat(desktop): add desktop observation and overlay baseline (#1647) 2026-04-24 23:43:40 +08:00
icon128.png [1/3] feat(desktop): add desktop observation and overlay baseline (#1647) 2026-04-24 23:43:40 +08:00
manifest.json feat(stage-tamagotchi,computer-use-mcp): implement browser-native DOM action routing (#1648) 2026-04-25 00:08:22 +08:00
msg_bridge.js feat(stage-tamagotchi,stage-ui,computer-use-mcp): intro agent-owned session and ghost pointer phases (#1649) 2026-04-25 04:14:33 +08:00
README.md feat(stage-tamagotchi,stage-ui,computer-use-mcp): intro agent-owned session and ghost pointer phases (#1649) 2026-04-25 04:14:33 +08:00

AIRI Desktop Grounding — Chrome Extension

Read-only Chrome DOM observation bridge for the AIRI Desktop Grounding layer.

What it does

  • Collects interactive elements (buttons, links, inputs, etc.) from all frames in the active Chrome tab
  • Reports element positions, ARIA roles, text, and rect coordinates
  • Feeds this data into the desktop grounding snap resolver for coordinate mapping

What it does NOT do

  • No DOM mutations (no clicking, typing, scrolling on DOM elements)
  • No eval / new Function / chrome.scripting.executeScript
  • No external network requests (no Python bridge, no offscreen documents)
  • No popup UI

All user interactions are performed via real macOS OS-level input events (CGEvent) through the desktop grounding executor.

Architecture

background.js (Service Worker)
    ↕ chrome.tabs.sendMessage
msg_bridge.js (ISOLATED world)
    ↕ window.postMessage
content.js (MAIN world, window.__AIRI_DG__)

Installation (development)

  1. Open chrome://extensions/
  2. Enable "Developer mode"
  3. Click "Load unpacked"
  4. Select this chrome-extension/ directory
  5. The extension will auto-inject into all pages

Supported commands

Command Description
getActiveTab Get active tab info (id, url, title)
getAllFrames List all frames in active tab
readAllFramesDOM Collect interactive elements from all frames
findElement Find single element by CSS selector
findElements Find multiple elements by CSS selector
getClickTarget Get element center point for click targeting
getElementAttributes Get all attributes of an element

Provenance

Adapted from the repository's Chrome extension source with DOM-action methods stripped.