browsing

安装量: 131
排名: #6564

安装

npx skills add https://github.com/obra/superpowers-chrome --skill browsing
Browsing with Chrome Direct
Overview
Control Chrome via DevTools Protocol using the
use_browser
MCP tool. Single unified interface with auto-starting Chrome.
Announce:
"I'm using the browsing skill to control Chrome."
When to Use
Use this when:
Controlling authenticated sessions
Managing multiple tabs in running browser
Playwright MCP unavailable or excessive
Use Playwright MCP when:
Need fresh browser instances
Generating screenshots/PDFs
Prefer higher-level abstractions
Auto-Capture
Every DOM action (navigate, click, type, select, eval, keyboard_press) automatically saves:
{prefix}.png
— viewport screenshot
{prefix}.md
— page content as structured markdown
{prefix}.html
— full rendered DOM
{prefix}-console.txt
— browser console messages
Files are saved to the session directory with sequential prefixes (001-navigate, 002-click, etc.). You must check these before using extract or screenshot actions.
The use_browser Tool
Single MCP tool with action-based interface. Chrome auto-starts on first use.
Parameters:
action
(required): Operation to perform
tab_index
(optional): Tab to operate on (default: 0)
selector
(optional): CSS selector for element operations
payload
(optional): Action-specific data
timeout
(optional): Timeout in ms for await operations (default: 5000)
Actions Reference
Navigation
navigate
Navigate to URL
payload
URL string
Example:
{action: "navigate", payload: "https://example.com"}
await_element
Wait for element to appear
selector
CSS selector
timeout
Max wait time in ms
Example:
{action: "await_element", selector: ".loaded", timeout: 10000}
await_text
Wait for text to appear
payload
Text to wait for
Example:
{action: "await_text", payload: "Welcome"}
Interaction
click
Click element
selector
CSS selector
Example:
{action: "click", selector: "button.submit"}
type
Type text into input (append
\n
to submit)
selector
CSS selector
payload
Text to type
Example:
{action: "type", selector: "#email", payload: "user@example.com\n"}
select
Select dropdown option
selector
CSS selector
payload
Option value(s)
Example:
{action: "select", selector: "select[name=state]", payload: "CA"}
Extraction
extract
Get page content
payload
Format ('markdown'|'text'|'html')
selector
Optional - limit to element
Example:
{action: "extract", payload: "markdown"}
Example:
{action: "extract", payload: "text", selector: "h1"}
attr
Get element attribute
selector
CSS selector
payload
Attribute name
Example:
{action: "attr", selector: "a.download", payload: "href"}
eval
Execute JavaScript
payload
JavaScript code
Example:
{action: "eval", payload: "document.title"}
Export
screenshot
Capture screenshot of a specific element
payload
Filename
selector
Optional - screenshot specific element
Viewport screenshots are auto-captured after every DOM action. Use this only when you need a specific element.
Example:
{action: "screenshot", payload: "/tmp/chart.png", selector: ".chart"}
Tab Management
list_tabs
List all open tabs
Example:
{action: "list_tabs"}
new_tab
Create new tab
Example:
{action: "new_tab"}
close_tab
Close tab
tab_index
Tab to close
Example:
{action: "close_tab", tab_index: 2}
Browser Mode Control
show_browser
Make browser window visible (headed mode)
Example:
{action: "show_browser"}
⚠️
WARNING
Restarts Chrome, reloads pages via GET, loses POST state
hide_browser
Switch to headless mode (invisible browser)
Example:
{action: "hide_browser"}
⚠️
WARNING
Restarts Chrome, reloads pages via GET, loses POST state
browser_mode
Check current browser mode, port, and profile
Example:
{action: "browser_mode"}
Returns:
{"headless": true|false, "mode": "headless"|"headed", "running": true|false, "port": 9222, "profile": "name", "profileDir": "/path"}
Profile Management
set_profile
Change Chrome profile (must kill Chrome first)
Example:
{action: "set_profile", "payload": "browser-user"}
⚠️
WARNING
Chrome must be stopped first
get_profile
Get current profile name and directory
Example:
{action: "get_profile"}
Returns:
{"profile": "name", "profileDir": "/path"}
Default behavior
Chrome starts in
headless mode
with
"superpowers-chrome" profile
on a
dynamically allocated port
(range 9222-12111). Override with
CHROME_WS_PORT
env var or MCP
--port=N
flag.
Critical caveats when toggling modes
:
Chrome must restart
- Cannot switch headless/headed mode on running Chrome
Pages reload via GET
- All open tabs are reopened with GET requests
POST state is lost
- Form submissions, POST results, and POST-based navigation will be lost
Session state is lost
- Any client-side state (JavaScript variables, etc.) is cleared
Cookies/auth may persist
- Uses same user data directory, so logged-in sessions may survive
When to use headed mode
:
Debugging visual rendering issues
Demonstrating browser behavior to user
Testing features that only work with visible browser
Debugging issues that don't reproduce in headless mode
When to stay in headless mode
(default):
All other cases - faster, cleaner, less intrusive
Screenshots work perfectly in headless mode
Most automation works identically in both modes
Profile management
:
Profiles store persistent browser data (cookies, localStorage, extensions, auth sessions).
Profile locations
:
macOS:
~/Library/Caches/superpowers/browser-profiles/{name}/
Linux:
~/.cache/superpowers/browser-profiles/{name}/
Windows:
%LOCALAPPDATA%/superpowers/browser-profiles/{name}/
When to use separate profiles
:
Default profile ("superpowers-chrome")
General automation, shared sessions
Agent-specific profiles
Isolate different agents' browser state
Example: browser-user agent uses "browser-user" profile
Task-specific profiles
Testing with different user contexts
Example: "test-logged-in" vs "test-logged-out"
Profile data persists across
:
Chrome restarts
Mode toggles (headless ↔ headed)
System reboots (data is in cache directory)
To use a different profile
:
Kill Chrome if running:
await chromeLib.killChrome()
Set profile:
{action: "set_profile", "payload": "my-profile"}
Start Chrome: Next navigate/action will use new profile
Quick Start Pattern
Navigate and extract:
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "h1"}
{action: "extract", payload: "text", selector: "h1"}
Common Patterns
Fill and Submit Form
{action: "navigate", payload: "https://example.com/login"}
{action: "await_element", selector: "input[name=email]"}
{action: "type", selector: "input[name=email]", payload: "user@example.com"}
{action: "type", selector: "input[name=password]", payload: "pass123\n"}
{action: "await_text", payload: "Welcome"}
The
\n
at the end of the password submits the form.
Multi-Tab Workflow
{action: "list_tabs"}
{action: "click", tab_index: 2, selector: "a.email"}
{action: "await_element", tab_index: 2, selector: ".content"}
{action: "extract", tab_index: 2, payload: "text", selector: ".amount"}
Dynamic Content
{action: "navigate", payload: "https://example.com"}
{action: "type", selector: "input[name=q]", payload: "query"}
{action: "click", selector: "button.search"}
{action: "await_element", selector: ".results"}
{action: "extract", payload: "text", selector: ".result-title"}
Get Link Attribute
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "a.download"}
{action: "attr", selector: "a.download", payload: "href"}
Execute JavaScript
{action: "eval", payload: "document.querySelectorAll('a').length"}
{action: "eval", payload: "Array.from(document.querySelectorAll('a')).map(a => a.href)"}
Resize Viewport (Responsive Testing)
Use
eval
to resize the browser window for testing responsive layouts:
{action: "eval", payload: "window.resizeTo(375, 812); 'Resized to mobile'"}
{action: "eval", payload: "window.resizeTo(768, 1024); 'Resized to tablet'"}
{action: "eval", payload: "window.resizeTo(1920, 1080); 'Resized to desktop'"}
Note
This resizes the window, not device emulation. It won't change:
Device pixel ratio (retina displays)
Touch events
User-Agent string
For most responsive testing, window resize is sufficient.
Clear Cookies
Use
eval
to clear cookies accessible to JavaScript:
{action: "eval", payload: "document.cookie.split(';').forEach(c => { document.cookie = c.trim().split('=')[0] + '=;expires=Thu, 01 Jan 1970 00:00:00 GMT;path=/'; }); 'Cookies cleared'"}
Note
This clears cookies accessible to JavaScript. It won't clear: httpOnly cookies (server-side only) Cookies from other domains For most logout/reset scenarios, this is sufficient. Scroll Page {action: "eval", payload: "window.scrollTo(0, document.body.scrollHeight); 'Scrolled to bottom'"} {action: "eval", payload: "window.scrollTo(0, 0); 'Scrolled to top'"} {action: "eval", payload: "document.querySelector('.target').scrollIntoView(); 'Scrolled to element'"} Tips Always wait before interaction: Don't click or fill immediately after navigate - pages need time to load. // BAD - might fail if page slow {action: "navigate", payload: "https://example.com"} {action: "click", selector: "button"} // May fail! // GOOD - wait first {action: "navigate", payload: "https://example.com"} {action: "await_element", selector: "button"} {action: "click", selector: "button"} Use specific selectors: Avoid generic selectors that match multiple elements. // BAD - matches first button {action: "click", selector: "button"} // GOOD - specific {action: "click", selector: "button[type=submit]"} {action: "click", selector: "#login-button"} Submit forms with \n: Append newline to text to submit forms automatically. {action: "type", selector: "#search", payload: "query\n"} Check content first: Extract page content to verify selectors before building workflow. {action: "extract", payload: "html"} Troubleshooting Element not found: Use await_element before interaction Verify selector with extract action using 'html' format Timeout errors: Increase timeout: {timeout: 30000} for slow pages Wait for specific element instead of text Tab index out of range: Use list_tabs to get current indices Tab indices change when tabs close eval returns [object Object] : Use JSON.stringify() for complex objects: {action: "eval", payload: "JSON.stringify({name: 'test'})"} For async functions: {action: "eval", payload: "JSON.stringify(await yourAsyncFunction())"} Test Automation (Advanced) When building test automation, you have two approaches: Approach 1: use_browser MCP (Simple Tests) Best for: Single-step tests, direct Claude control during conversation { "action" : "navigate" , "payload" : "https://app.com" } { "action" : "click" , "selector" : "#test-button" } { "action" : "eval" , "payload" : "JSON.stringify({passed: document.querySelector('.success') !== null})" } Approach 2: chrome-ws CLI (Complex Tests) Best for: Multi-step test suites, standalone automation scripts Key insight : chrome-ws is the reference implementation showing proper Chrome DevTools Protocol usage. When use_browser doesn't work as expected, examine how chrome-ws handles the same operation.

Example: Automated form testing

./chrome-ws navigate
0
"https://app.com/form"
./chrome-ws fill
0
"#email"
"test@example.com"
./chrome-ws click
0
"button[type=submit]"
./chrome-ws wait-text
0
"Success"
When use_browser Fails
Check chrome-ws source code
- It shows the correct CDP pattern
Use chrome-ws to verify
- Test the same operation via CLI
Adapt the pattern
- Apply the working CDP approach to use_browser
Common Test Automation Patterns
Form validation
Fill forms, check error states
UI state testing
Click elements, verify DOM changes
Performance testing
Measure load times, capture metrics
Screenshot comparison
Capture before/after states Advanced Usage For command-line usage outside Claude Code, see COMMANDLINE-USAGE.md . For detailed examples, see EXAMPLES.md . Protocol Reference Full CDP documentation: https://chromedevtools.github.io/devtools-protocol/
返回排行榜