Browsing with Chrome Direct

Overview

Control Chrome via DevTools Protocol using the

use_browser

MCP tool. Single unified interface with auto-starting Chrome.

Announce:

"I'm using the browsing skill to control Chrome."

When to Use

Use this when:

Controlling authenticated sessions

Managing multiple tabs in running browser

Playwright MCP unavailable or excessive

Use Playwright MCP when:

Need fresh browser instances

Generating screenshots/PDFs

Prefer higher-level abstractions

Auto-Capture

Every DOM action (navigate, click, type, select, eval, keyboard_press) automatically saves:

{prefix}.png

— viewport screenshot

{prefix}.md

— page content as structured markdown

{prefix}.html

— full rendered DOM

{prefix}-console.txt

— browser console messages

Files are saved to the session directory with sequential prefixes (001-navigate, 002-click, etc.). You must check these before using extract or screenshot actions.

The use_browser Tool

Single MCP tool with action-based interface. Chrome auto-starts on first use.

Parameters:

action

(required): Operation to perform

tab_index

(optional): Tab to operate on (default: 0)

selector

(optional): CSS selector for element operations

payload

(optional): Action-specific data

timeout

(optional): Timeout in ms for await operations (default: 5000)

Actions Reference

Navigation

navigate

Navigate to URL

payload

URL string

Example:

{action: "navigate", payload: "https://example.com"}

await_element

Wait for element to appear

selector

CSS selector

timeout

Max wait time in ms

Example:

{action: "await_element", selector: ".loaded", timeout: 10000}

await_text

Wait for text to appear

payload

Text to wait for

Example:

{action: "await_text", payload: "Welcome"}

Interaction

click

Click element

selector

CSS selector

Example:

{action: "click", selector: "button.submit"}

type

Type text into input (append

\n

to submit)

selector

CSS selector

payload

Text to type

Example:

{action: "type", selector: "#email", payload: "user@example.com\n"}

select

Select dropdown option

selector

CSS selector

payload

Option value(s)

Example:

{action: "select", selector: "select[name=state]", payload: "CA"}

Extraction

extract

Get page content

payload

Format ('markdown'|'text'|'html')

selector

Optional - limit to element

Example:

{action: "extract", payload: "markdown"}

Example:

{action: "extract", payload: "text", selector: "h1"}

attr

Get element attribute

selector

CSS selector

payload

Attribute name

Example:

{action: "attr", selector: "a.download", payload: "href"}

eval

Execute JavaScript

payload

JavaScript code

Example:

{action: "eval", payload: "document.title"}

Export

screenshot

Capture screenshot of a specific element

payload

Filename

selector

Optional - screenshot specific element

Viewport screenshots are auto-captured after every DOM action. Use this only when you need a specific element.

Example:

{action: "screenshot", payload: "/tmp/chart.png", selector: ".chart"}

Tab Management

list_tabs

List all open tabs

Example:

{action: "list_tabs"}

new_tab

Create new tab

Example:

{action: "new_tab"}

close_tab

Close tab

tab_index

Tab to close

Example:

{action: "close_tab", tab_index: 2}

Browser Mode Control

show_browser

Make browser window visible (headed mode)

Example:

{action: "show_browser"}

⚠️

WARNING

Restarts Chrome, reloads pages via GET, loses POST state

hide_browser

Switch to headless mode (invisible browser)

Example:

{action: "hide_browser"}

⚠️

WARNING

Restarts Chrome, reloads pages via GET, loses POST state

browser_mode

Check current browser mode, port, and profile

Example:

{action: "browser_mode"}

Returns:

{"headless": true|false, "mode": "headless"|"headed", "running": true|false, "port": 9222, "profile": "name", "profileDir": "/path"}

Profile Management

set_profile

Change Chrome profile (must kill Chrome first)

Example:

{action: "set_profile", "payload": "browser-user"}

⚠️

WARNING

Chrome must be stopped first

get_profile

Get current profile name and directory

Example:

{action: "get_profile"}

Returns:

{"profile": "name", "profileDir": "/path"}

Default behavior

Chrome starts in

headless mode

with

"superpowers-chrome" profile

on a

dynamically allocated port

(range 9222-12111). Override with

CHROME_WS_PORT

env var or MCP

--port=N

flag.

Critical caveats when toggling modes

:

Chrome must restart

- Cannot switch headless/headed mode on running Chrome

Pages reload via GET

- All open tabs are reopened with GET requests

POST state is lost

- Form submissions, POST results, and POST-based navigation will be lost

Session state is lost

- Any client-side state (JavaScript variables, etc.) is cleared

Cookies/auth may persist

- Uses same user data directory, so logged-in sessions may survive

When to use headed mode

:

Debugging visual rendering issues

Demonstrating browser behavior to user

Testing features that only work with visible browser

Debugging issues that don't reproduce in headless mode

When to stay in headless mode

(default):

All other cases - faster, cleaner, less intrusive

Screenshots work perfectly in headless mode

Most automation works identically in both modes

Profile management

:

Profiles store persistent browser data (cookies, localStorage, extensions, auth sessions).

Profile locations

:

macOS:

~/Library/Caches/superpowers/browser-profiles/{name}/

Linux:

~/.cache/superpowers/browser-profiles/{name}/

Windows:

%LOCALAPPDATA%/superpowers/browser-profiles/{name}/

When to use separate profiles

:

Default profile ("superpowers-chrome")

General automation, shared sessions

Agent-specific profiles

Isolate different agents' browser state

Example: browser-user agent uses "browser-user" profile

Task-specific profiles

Testing with different user contexts

Example: "test-logged-in" vs "test-logged-out"

Profile data persists across

:

Chrome restarts

Mode toggles (headless ↔ headed)

System reboots (data is in cache directory)

To use a different profile

:

Kill Chrome if running:

await chromeLib.killChrome()

Set profile:

{action: "set_profile", "payload": "my-profile"}

Start Chrome: Next navigate/action will use new profile

Quick Start Pattern

Navigate and extract:

{action: "navigate", payload: "https://example.com"}

{action: "await_element", selector: "h1"}

{action: "extract", payload: "text", selector: "h1"}

Common Patterns

Fill and Submit Form

{action: "navigate", payload: "https://example.com/login"}

{action: "await_element", selector: "input[name=email]"}

{action: "type", selector: "input[name=email]", payload: "user@example.com"}

{action: "type", selector: "input[name=password]", payload: "pass123\n"}

{action: "await_text", payload: "Welcome"}

The

\n

at the end of the password submits the form.

Multi-Tab Workflow

{action: "list_tabs"}

{action: "click", tab_index: 2, selector: "a.email"}

{action: "await_element", tab_index: 2, selector: ".content"}

{action: "extract", tab_index: 2, payload: "text", selector: ".amount"}

Dynamic Content

{action: "navigate", payload: "https://example.com"}

{action: "type", selector: "input[name=q]", payload: "query"}

{action: "click", selector: "button.search"}

{action: "await_element", selector: ".results"}

{action: "extract", payload: "text", selector: ".result-title"}

Get Link Attribute

{action: "navigate", payload: "https://example.com"}

{action: "await_element", selector: "a.download"}

{action: "attr", selector: "a.download", payload: "href"}

Execute JavaScript

{action: "eval", payload: "document.querySelectorAll('a').length"}

{action: "eval", payload: "Array.from(document.querySelectorAll('a')).map(a => a.href)"}

Resize Viewport (Responsive Testing)

Use

eval

to resize the browser window for testing responsive layouts:

{action: "eval", payload: "window.resizeTo(375, 812); 'Resized to mobile'"}

{action: "eval", payload: "window.resizeTo(768, 1024); 'Resized to tablet'"}

{action: "eval", payload: "window.resizeTo(1920, 1080); 'Resized to desktop'"}

Note

This resizes the window, not device emulation. It won't change:
Device pixel ratio (retina displays)
Touch events
User-Agent string
For most responsive testing, window resize is sufficient.
Clear Cookies
Use
eval
to clear cookies accessible to JavaScript:
{action: "eval", payload: "document.cookie.split(';').forEach(c => { document.cookie = c.trim().split('=')[0] + '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/'; }); 'Cookies cleared'"}
Note: This clears cookies accessible to JavaScript. It won't clear: httpOnly cookies (server-side only) Cookies from other domains For most logout/reset scenarios, this is sufficient. Scroll Page {action: "eval", payload: "window.scrollTo(0, document.body.scrollHeight); 'Scrolled to bottom'"} {action: "eval", payload: "window.scrollTo(0, 0); 'Scrolled to top'"} {action: "eval", payload: "document.querySelector('.target').scrollIntoView(); 'Scrolled to element'"} Tips Always wait before interaction: Don't click or fill immediately after navigate - pages need time to load. // BAD - might fail if page slow {action: "navigate", payload: "https://example.com"} {action: "click", selector: "button"} // May fail! // GOOD - wait first {action: "navigate", payload: "https://example.com"} {action: "await_element", selector: "button"} {action: "click", selector: "button"} Use specific selectors: Avoid generic selectors that match multiple elements. // BAD - matches first button {action: "click", selector: "button"} // GOOD - specific {action: "click", selector: "button[type=submit]"} {action: "click", selector: "#login-button"} Submit forms with \n: Append newline to text to submit forms automatically. {action: "type", selector: "#search", payload: "query\n"} Check content first: Extract page content to verify selectors before building workflow. {action: "extract", payload: "html"} Troubleshooting Element not found: Use await_element before interaction Verify selector with extract action using 'html' format Timeout errors: Increase timeout: {timeout: 30000} for slow pages Wait for specific element instead of text Tab index out of range: Use list_tabs to get current indices Tab indices change when tabs close eval returns [object Object] : Use JSON.stringify() for complex objects: {action: "eval", payload: "JSON.stringify({name: 'test'})"} For async functions: {action: "eval", payload: "JSON.stringify(await yourAsyncFunction())"} Test Automation (Advanced) When building test automation, you have two approaches: Approach 1: use_browser MCP (Simple Tests) Best for: Single-step tests, direct Claude control during conversation { "action" : "navigate" , "payload" : "https://app.com" } { "action" : "click" , "selector" : "#test-button" } { "action" : "eval" , "payload" : "JSON.stringify({passed: document.querySelector('.success') !== null})" } Approach 2: chrome-ws CLI (Complex Tests) Best for: Multi-step test suites, standalone automation scripts Key insight : chrome-ws is the reference implementation showing proper Chrome DevTools Protocol usage. When use_browser doesn't work as expected, examine how chrome-ws handles the same operation.

Example: Automated form testing

./chrome-ws navigate

0

"https://app.com/form"

./chrome-ws fill

0

"#email"

"test@example.com"

./chrome-ws click

0

"button[type=submit]"

./chrome-ws wait-text

0

"Success"

When use_browser Fails

Check chrome-ws source code

- It shows the correct CDP pattern

Use chrome-ws to verify

- Test the same operation via CLI

Adapt the pattern

- Apply the working CDP approach to use_browser

Common Test Automation Patterns

Form validation

Fill forms, check error states

UI state testing

Click elements, verify DOM changes

Performance testing

Measure load times, capture metrics
Screenshot comparison: Capture before/after states Advanced Usage For command-line usage outside Claude Code, see COMMANDLINE-USAGE.md . For detailed examples, see EXAMPLES.md . Protocol Reference Full CDP documentation: https://chromedevtools.github.io/devtools-protocol/

安装

Example: Automated form testing