browser.md

Browser Tool

Status: Planned

Automate web browser interactions.

Capabilities

  • Navigate to URLs
  • Click elements
  • Fill forms
  • Take screenshots
  • Extract text/data
  • Execute JavaScript
  • Handle authentication

Interface

typescript
interface BrowserTool extends Tool {
  name: "browser";
  actions: {
    // Navigation
    navigate(url: string): Promise<void>;
    goBack(): Promise<void>;
    goForward(): Promise<void>;
    reload(): Promise<void>;

    // Interaction
    click(selector: string): Promise<void>;
    type(selector: string, text: string): Promise<void>;
    select(selector: string, value: string): Promise<void>;
    hover(selector: string): Promise<void>;

    // Data extraction
    screenshot(): Promise<Buffer>;
    extractText(selector?: string): Promise<string>;
    extractHTML(selector?: string): Promise<string>;
    getUrl(): Promise<string>;

    // Waiting
    waitFor(selector: string, timeout?: number): Promise<void>;
    waitForNavigation(timeout?: number): Promise<void>;

    // Advanced
    evaluate(script: string): Promise<unknown>;
    setCookies(cookies: Cookie[]): Promise<void>;
    getCookies(): Promise<Cookie[]>;
  };
}
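To illustrate how a caller might chain these actions, here is a minimal sketch of a search workflow. The `SearchActions` type is a local subset of the planned interface, and `searchAndExtract`, the URL, the selectors, and the millisecond timeout unit are all assumptions made for the example, not part of the design.

```typescript
// Subset of the planned BrowserTool actions used by this example.
// Method signatures are copied from the interface above.
interface SearchActions {
  navigate(url: string): Promise<void>;
  type(selector: string, text: string): Promise<void>;
  click(selector: string): Promise<void>;
  waitFor(selector: string, timeout?: number): Promise<void>;
  extractText(selector?: string): Promise<string>;
}

// Hypothetical workflow: the URL and selectors are placeholders.
async function searchAndExtract(
  actions: SearchActions,
  query: string
): Promise<string> {
  await actions.navigate("https://example.com/search");
  await actions.type("input[name=q]", query);
  await actions.click("button[type=submit]");
  // Wait for results to render before reading them.
  // The timeout is assumed to be in milliseconds.
  await actions.waitFor("#results", 5000);
  return actions.extractText("#results");
}
```

Because the function depends only on the action signatures, it can be exercised against a recording fake before any real browser backend exists.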

Implementation

Primary: Playwright

  • Supports Chromium, Firefox, WebKit
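  • Promise-based API with selector-driven actions
  • Auto-waits for elements to become actionable, which suits dynamic, JavaScript-heavy pages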
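One way the planned actions could map onto Playwright is sketched below. To keep the example free of dependencies, `PageLike` declares only the handful of Playwright `Page` methods it uses (`goto`, `click`, `fill`, `innerText`, `url`); `createActions` is a hypothetical helper, and defaulting `extractText` to `body` is an assumption rather than a decided behaviour.

```typescript
// Structural subset of Playwright's Page that this sketch relies on.
// A real Playwright Page object satisfies this shape.
interface PageLike {
  goto(url: string): Promise<unknown>;
  click(selector: string): Promise<void>;
  fill(selector: string, value: string): Promise<void>;
  innerText(selector: string): Promise<string>;
  url(): string;
}

// Hypothetical adapter covering five of the planned actions.
function createActions(page: PageLike) {
  return {
    navigate: async (url: string): Promise<void> => {
      // goto() resolves with a response object; the planned API returns void.
      await page.goto(url);
    },
    click: (selector: string): Promise<void> => page.click(selector),
    // fill() sets the field's value directly rather than simulating keystrokes.
    type: (selector: string, text: string): Promise<void> =>
      page.fill(selector, text),
    // Assumption: with no selector, read the text of the whole document body.
    extractText: (selector?: string): Promise<string> =>
      page.innerText(selector ?? "body"),
    getUrl: async (): Promise<string> => page.url(),
  };
}
```

With Playwright installed, a page obtained from `chromium.launch()` followed by `browser.newPage()` could be passed straight to `createActions`.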

Alternative: Puppeteer

  • Primarily targets Chrome and Chromium (Firefox support is more recent)
  • Widely used

Browser Profiles

Each daemon maintains its own browser profile:

  • Cookies persist between sessions
  • Extensions can be installed
  • User data is isolated from the user's main browser

~/.bot0/browser/
├── chromium/
│   ├── Default/
│   └── ...
└── extensions/
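The layout above could be wired to a persistent browser context roughly as follows. This is a sketch under assumptions: `profilePaths` and `openProfile` are hypothetical helpers, paths are joined POSIX-style for brevity, and `PersistentLauncher` declares only the `launchPersistentContext` method that Playwright's browser types (such as `chromium`) expose.

```typescript
// Paths mirroring the directory layout above.
interface ProfilePaths {
  root: string;
  userDataDir: string;
  extensionsDir: string;
}

// Assumption: POSIX-style separators; a real implementation would use a path library.
function profilePaths(homeDir: string): ProfilePaths {
  const root = `${homeDir}/.bot0/browser`;
  return {
    root,
    userDataDir: `${root}/chromium`,
    extensionsDir: `${root}/extensions`,
  };
}

// Structural subset of Playwright's BrowserType (e.g. the `chromium` export).
interface PersistentLauncher {
  launchPersistentContext(
    userDataDir: string,
    options?: { headless?: boolean }
  ): Promise<unknown>;
}

// Launch a context whose cookies and storage live in userDataDir,
// so they are still there the next time the daemon starts.
function openProfile(
  launcher: PersistentLauncher,
  homeDir: string
): Promise<unknown> {
  return launcher.launchPersistentContext(profilePaths(homeDir).userDataDir, {
    headless: true,
  });
}
```

Keeping the path computation separate from the launch call makes the layout easy to test and to change without touching browser code.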

Security Considerations

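  • Never store passwords in plain text
  • Treat authenticated sessions as credentials: persisted cookies grant the same access as a login
  • Consider site-specific permissions that limit which actions are allowed on each site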
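Site-specific permissions could take the form of a per-host allowlist that is checked before each action. The sketch below is one possible shape, not a decided design: `BrowserAction`, `SitePolicy`, `isAllowed`, the deny-by-default rule, and the example hosts are all assumptions.

```typescript
type BrowserAction = "navigate" | "click" | "type" | "evaluate" | "setCookies";

// Hypothetical policy: hostname -> actions the tool may perform there.
type SitePolicy = Map<string, BrowserAction[]>;

function isAllowed(
  policy: SitePolicy,
  url: string,
  action: BrowserAction
): boolean {
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    // Unparseable URLs are denied.
    return false;
  }
  // Deny by default: hosts missing from the policy get no permissions.
  const allowed = policy.get(hostname) ?? [];
  return allowed.includes(action);
}

const policy: SitePolicy = new Map<string, BrowserAction[]>([
  ["example.com", ["navigate", "click", "type"]],
  ["docs.example.com", ["navigate"]],
]);

isAllowed(policy, "https://example.com/login", "click"); // true
isAllowed(policy, "https://docs.example.com/", "click"); // false
isAllowed(policy, "https://unknown.test/", "navigate"); // false
```

Matching on the exact hostname keeps the check simple. Whether subdomains should inherit a parent's permissions is left open here.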

Open Questions

  1. Should the daemon use the user's existing browser or a separate one?
  2. How should sites that block automation be handled?
  3. Should browser extensions be supported?