The Browser Tool enables AI-driven browser interactions. Launch browser sessions, click elements, type text, scroll pages, and capture screenshots through natural language commands.

What You’ll Learn

  • Session lifecycle: launch → interact → close
  • Browser actions: click, type, scroll
  • Use cases: UI testing, screenshots, navigation

Session Lifecycle

Every browser automation workflow follows a strict sequence:
  1. Launch - Start a browser session at a target URL
  2. Interact - Perform actions (click, type, scroll)
  3. Close - End the session to release resources
Browser state persists across actions within a session. You must close the browser before using other Verdent tools.
Each action returns a screenshot showing the current browser state. Review screenshots between actions to verify success before proceeding.
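
The launch → interact → close sequence can be pictured as a small state machine. The sketch below is only an illustration of the ordering rules described above. `BrowserSessionModel` and its method names are hypothetical and are not part of Verdent. It models which steps are allowed in which state and does not drive a real browser.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"      # no browser session; other tools may run
    ACTIVE = "active"  # session launched; only browser actions may run


class BrowserSessionModel:
    """Hypothetical model of the launch -> interact -> close ordering."""

    INTERACTIONS = {"click", "type", "scroll"}

    def __init__(self):
        self.state = State.IDLE
        self.history = []  # (action, value) pairs in the order they ran

    def launch(self, url):
        # Only one session can be active at a time.
        if self.state is State.ACTIVE:
            raise RuntimeError("a session is already active; close it first")
        self.state = State.ACTIVE
        self.history.append(("launch", url))

    def interact(self, action, value=None):
        # Interactions are only valid between launch and close.
        if self.state is not State.ACTIVE:
            raise RuntimeError("launch a session before interacting")
        if action not in self.INTERACTIONS:
            raise ValueError(f"unknown action: {action}")
        self.history.append((action, value))

    def close(self):
        if self.state is not State.ACTIVE:
            raise RuntimeError("no active session to close")
        self.state = State.IDLE
        self.history.append(("close", None))

    def other_tools_allowed(self):
        # Mirrors the rule that other tools are blocked while a session is open.
        return self.state is State.IDLE
```

In this model, calling `interact` before `launch`, or `launch` twice without a `close`, raises an error, which matches the strict sequence described above.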

Browser Actions

Launch - Start a new browser session
  • Required: target URL
  • Opens browser at 1920x1080 resolution
  • Always the first action in any workflow
Launch browser at https://example.com
Coordinates are relative to the 1920x1080 viewport. Center is approximately (960, 540). Use screenshots to estimate element positions.
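
When estimating positions from a screenshot, it can help to work in fractions of the image and convert to pixels. The helpers below are a minimal sketch, not part of the tool. The function names are hypothetical, and the valid pixel range (0 to 1919 horizontally, 0 to 1079 vertically) is an assumption based on the 1920x1080 viewport.

```python
VIEWPORT_WIDTH = 1920
VIEWPORT_HEIGHT = 1080


def viewport_center():
    """Return the center of the viewport as (x, y)."""
    return (VIEWPORT_WIDTH // 2, VIEWPORT_HEIGHT // 2)


def in_viewport(x, y):
    """Check that a coordinate falls inside the assumed pixel range."""
    return 0 <= x < VIEWPORT_WIDTH and 0 <= y < VIEWPORT_HEIGHT


def fraction_to_pixels(fx, fy):
    """Convert a position given as fractions of the screenshot (0.0 to 1.0)
    into pixel coordinates, clamped to the last valid pixel."""
    if not (0.0 <= fx <= 1.0 and 0.0 <= fy <= 1.0):
        raise ValueError("fractions must be between 0.0 and 1.0")
    x = min(round(fx * VIEWPORT_WIDTH), VIEWPORT_WIDTH - 1)
    y = min(round(fy * VIEWPORT_HEIGHT), VIEWPORT_HEIGHT - 1)
    return (x, y)
```

For example, an element that appears about a quarter of the way across and halfway down the screenshot maps to `fraction_to_pixels(0.25, 0.5)`, which gives `(480, 540)`.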

Common Use Cases

Test form submissions and navigation flows
Launch at a login page, click input fields, type credentials, submit forms, and verify results through screenshots.
Launch browser at https://app.example.com/login
Click coordinates 450,280
Type "testuser@example.com"
Click coordinates 450,340
Type "password123"
Click coordinates 500,420
Close browser
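
A script like the one above can be checked for ordering mistakes before it is used. The tool itself accepts natural language, so the parser below is only a hypothetical lint step. Its grammar is inferred from the example commands on this page and is not a Verdent specification.

```python
import re

# Command shapes assumed from the example script; not an official grammar.
PATTERNS = [
    ("launch", re.compile(r"^Launch browser at (\S+)$")),
    ("click", re.compile(r"^Click coordinates (\d+)\s*,\s*(\d+)$")),
    ("type", re.compile(r'^Type ["“](.*)["”]$')),
    ("close", re.compile(r"^Close browser$")),
]


def parse_step(line):
    """Turn one command line into an (action, value) pair."""
    line = line.strip()
    for name, pattern in PATTERNS:
        match = pattern.match(line)
        if not match:
            continue
        if name == "click":
            return (name, (int(match.group(1)), int(match.group(2))))
        if name in ("launch", "type"):
            return (name, match.group(1))
        return (name, None)
    raise ValueError(f"unrecognized step: {line!r}")


def check_script(text):
    """Parse a script and verify it starts with launch and ends with close."""
    steps = [parse_step(line) for line in text.splitlines() if line.strip()]
    if not steps or steps[0][0] != "launch":
        raise ValueError("script must start with a launch step")
    if steps[-1][0] != "close":
        raise ValueError("script must end with a close step")
    if any(name in ("launch", "close") for name, _ in steps[1:-1]):
        raise ValueError("launch and close may only appear once each")
    return steps
```

Running `check_script` on the login script above returns seven steps, starting with `("launch", "https://app.example.com/login")` and ending with `("close", None)`.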

Limitations

  • Tool exclusivity - Only browser_action can be used during active sessions
  • Coordinate-based - Requires x,y coordinates, not CSS selectors
  • Fixed resolution - Browser viewport locked at 1920x1080
  • Chrome only - The tool uses Puppeteer with Chrome/Chromium browsers
  • No persistence - Sessions don’t survive Verdent restarts
  • No WSL support - Browser Tool does not work in WSL environments
  • No saved state - Each session starts fresh without cookies or authentication
  • Single session - Only one browser session can be active at a time
Always close the browser session before using file operations, search tools, or bash commands. The browser locks other tools during active sessions.
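
If browser commands are issued from a script, one way to make sure the session is always closed is to wrap it in a context manager. The sketch below assumes a hypothetical `send` callable that delivers one natural-language command to the agent. Verdent does not document such a function, so treat it as a stand-in for however commands are actually issued.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(send, url):
    """Launch a session, hand control to the caller, and always close.

    `send` is a hypothetical callable that takes one command string.
    """
    send(f"Launch browser at {url}")
    try:
        yield send
    finally:
        # Runs even if an action inside the block raises, so the session
        # does not stay open and keep other tools locked.
        send("Close browser")
```

A usage example with a plain list standing in for the command channel:

```python
sent = []
with browser_session(sent.append, "https://example.com") as send:
    send("Click coordinates 960,540")
# sent now holds the launch, click, and close commands in that order
```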

See Also

  • Code Diff - Review and approve code changes