dom0
dom0 is a browser automation toolkit for AI agents. It drives Chrome through the DevTools Protocol (CDP) via the chrome.debugger extension API, providing stealth browser control designed to avoid detection by anti-bot systems.
Why dom0?
Traditional browser automation tools like Playwright and Puppeteer are easily detected. dom0 takes a different approach:
- Stealth by default — Uses native Chrome debugging, not injected scripts
- Accessibility-first — Works with the accessibility tree, not fragile CSS selectors
- Ref system — Elements get short aliases (@d1, @d2) that persist across interactions until the next snapshot
- Token-efficient — Outputs designed to minimize AI context usage
- Daemon architecture — Background process stays alive between commands
Quick Example
```bash
# Get page snapshot with element refs
$ dom0 snapshot
URL: https://example.com/login
Title: Login

@d1 button "Sign In"
@d2 textbox "Email"
@d3 textbox "Password"
@d4 link "Forgot password?"

# Interact using refs
$ dom0 type @d2 "user@example.com"
$ dom0 type @d3 "secret123"
$ dom0 click @d1
```
Architecture
dom0 consists of three packages:
| Package | Purpose |
|---|---|
| @bot0/dom0 | Core types and WebSocket protocol |
| @bot0/dom0-cli | CLI tool with background daemon |
| @bot0/dom0-extension | Chrome extension (MV3) |
┌─────────────────┐     WebSocket     ┌─────────────────────┐
│    dom0 CLI     │◄──────────────────│  Chrome Extension   │
│                 │                   │                     │
│  • Commands     │                   │  • chrome.debugger  │
│  • Daemon       │                   │  • Accessibility    │
│  • Output       │                   │  • CDP commands     │
└─────────────────┘                   └─────────────────────┘
Documentation
- System Architecture — How dom0 works under the hood
- CLI Reference — All 20+ commands with examples
- Extension Setup — Installing and configuring the Chrome extension
- Agent Integration — Using dom0 as an AI agent skill
Key Concepts
Element Refs
Every interactable element gets a short alias (@d1, @d2, etc.) based on the accessibility tree. These refs:
- Are assigned fresh on each snapshot
- Target elements by backendNodeId (Chrome's internal ID)
- Fall back to CSS selectors if needed
- Include role and accessible name for context
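Because refs are reassigned on every snapshot, a script may prefer to look an element up by role and accessible name rather than hard-code a ref. The following is a minimal sketch, not part of dom0: `find_ref` is a hypothetical helper, and it assumes snapshot lines have the form `@dN role "name"` as in the Quick Example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with dom0).
# Reads snapshot text on stdin and prints the first ref whose role and
# accessible name match. Exits with status 1 if nothing matches.
# Assumed line format: @d1 button "Sign In"
find_ref() {
  local role="$1" name="$2"
  awk -v role="$role" -v name="$name" '
    $1 ~ /^@d[0-9]+$/ && $2 == role {
      # Strip the ref and the role; what remains is the quoted name.
      label = $0
      sub(/^ *[^ ]+ +[^ ]+ +/, "", label)
      if (label == "\"" name "\"") { print $1; found = 1; exit }
    }
    END { exit found ? 0 : 1 }
  '
}

# Possible usage against a live page:
#   ref=$(dom0 snapshot | find_ref button "Sign In") && dom0 click "$ref"
```

Resolving the ref immediately before each action keeps the script valid even if a later snapshot numbers the elements differently.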
Background Daemon
The CLI runs a background daemon that:
- Auto-starts on first command
- Stays alive between commands (3-minute timeout)
- Maintains the WebSocket connection to the extension
- Enables sequential command execution in the same terminal
Stealth Approach
Unlike Playwright and Puppeteer, which inject JavaScript and modify browser properties, dom0:
- Uses the chrome.debugger API (native CDP)
- Operates through the accessibility tree (like a screen reader)
- Dispatches real input events (mouse, keyboard)
- Doesn't modify navigator or window properties
Installation
```bash
# Build all dom0 packages
pnpm dom0:build

# Load extension in Chrome
# 1. Go to chrome://extensions
# 2. Enable "Developer mode"
# 3. Click "Load unpacked"
# 4. Select: packages/dom0-extension/dist

# Link CLI globally (optional)
cd packages/dom0-cli && pnpm link --global

# Test connection
dom0 ping
```
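Right after loading the extension, the first `dom0 ping` may run before the extension has connected to the daemon. A small retry loop can make setup scripts less brittle. This is a sketch under one stated assumption: that `dom0 ping` exits with a non-zero status when the connection is not ready, which this README does not confirm. `wait_for_dom0` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with dom0).
# Retries `dom0 ping` up to N times, sleeping between attempts.
# ASSUMPTION: `dom0 ping` returns non-zero while the connection is down.
wait_for_dom0() {
  local attempts="${1:-5}" delay="${2:-1}" i=1
  while [ "$i" -le "$attempts" ]; do
    if dom0 ping >/dev/null 2>&1; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "dom0 did not respond after $attempts attempts" >&2
  return 1
}

# Possible usage: wait up to 10 attempts, 2 seconds apart, then take a snapshot.
#   wait_for_dom0 10 2 && dom0 snapshot
```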
Usage with AI Agents
dom0 is designed to be used as a skill by AI agents like bot0:
```markdown
## Browser Automation (dom0)

Use the dom0 CLI for web interactions.

### Workflow
1. `dom0 snapshot` — See page elements with refs
2. Interact using refs (@d1, @d2, etc.)
3. Re-snapshot after navigation

### Key Commands
- `dom0 navigate <url>` — Go to URL
- `dom0 snapshot` — Get page state
- `dom0 click @d1` — Click element
- `dom0 type @d2 "text"` — Type into input
- `dom0 screenshot` — Capture page
```
See the Agent Integration guide for complete SKILL.md instructions.
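The snapshot-act-resnapshot workflow can be wrapped so that an agent always receives fresh refs after each action. The sketch below uses only the commands listed above. `dom0_step` is a hypothetical wrapper, and re-snapshotting after every action (not only after navigation) is a conservative choice made here, not a documented requirement.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not shipped with dom0).
# Runs one dom0 action, then prints a fresh snapshot so the caller never
# reuses refs from an older page state. Stops if the action itself fails.
dom0_step() {
  dom0 "$@" || return 1
  dom0 snapshot
}

# Possible session:
#   dom0_step navigate "https://example.com/login"
#   dom0_step type @d2 "user@example.com"
#   dom0_step click @d1
```

The trade-off is one extra snapshot per action, which costs tokens; an agent that is sure the page has not changed can call `dom0` directly.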