browser-use

Installs: 40
Rank: #18031

Install

npx skills add https://github.com/browser-use/browser-use --skill browser-use
Browser Automation with browser-use CLI
The browser-use command provides fast, persistent browser automation. It maintains browser sessions across commands, enabling complex multi-step workflows.
Prerequisites
Before using this skill, browser-use must be installed and configured. Run diagnostics to verify:
browser-use doctor
For more information, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md
Core Workflow
1. Navigate: browser-use open <url> - Opens the URL (starts the browser if needed)
2. Inspect: browser-use state - Returns clickable elements with indices
3. Interact: Use indices from state to interact (browser-use click 5, browser-use input 3 "text")
4. Verify: browser-use state or browser-use screenshot to confirm actions
5. Repeat: Browser stays open between commands

Browser Modes

browser-use --browser chromium open <url>

Default: headless Chromium

browser-use --browser chromium --headed open <url>

Visible Chromium window

browser-use --browser real open <url>

Real Chrome (no profile = fresh)

browser-use --browser real --profile "Default" open <url>

Real Chrome with your login sessions

browser-use --browser remote open <url>

Cloud browser

chromium: Fast, isolated, headless by default
real: Uses a real Chrome binary. Without --profile, uses a persistent but empty CLI profile at ~/.config/browseruse/profiles/cli/. With --profile "ProfileName", copies your actual Chrome profile (cookies, logins, extensions)
remote: Cloud-hosted browser with proxy support

Essential Commands

Navigation

browser-use open <url>

Navigate to URL

browser-use back

Go back

browser-use scroll down

Scroll down (--amount N for pixels)

Page State (always run state first to get element indices)

browser-use state

Get URL, title, clickable elements

browser-use screenshot

Take screenshot (base64)

browser-use screenshot path.png

Save screenshot to file

Interactions (use indices from state)

browser-use click <index>

Click element

browser-use type "text"

Type into focused element

browser-use input <index> "text"

Click element, then type

browser-use keys "Enter"

Send keyboard keys

browser-use select <index> "option"

Select dropdown option

Data Extraction

browser-use eval "document.title"

Execute JavaScript

browser-use get text <index>

Get element text

browser-use get html --selector "h1"

Get scoped HTML

Wait

browser-use wait selector "h1"

Wait for element

browser-use wait text "Success"

Wait for text

Session

browser-use sessions

List active sessions

browser-use close

Close current session

browser-use close --all

Close all sessions

AI Agent

browser-use -b remote run "task"

Run agent in cloud (async by default)

browser-use task status <id>

Check cloud task progress
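
The core workflow (navigate, inspect, interact, verify) can be chained in a small shell script. The sketch below is illustrative only: the BU variable and the core_workflow function are hypothetical names introduced here, the indices 3 and 5 are placeholders that would have to be read from real browser-use state output, and it assumes the CLI exits non-zero when a command fails.

```shell
# BU is a hypothetical override so the sketch can be dry-run with a stub;
# by default it points at the real CLI.
BU="${BU:-browser-use}"

# Navigate, inspect, interact, verify, stopping at the first failing step.
# Assumption: the CLI exits non-zero when a command fails.
# The indices 3 and 5 are placeholders; take real ones from `state`.
core_workflow() {
    url="$1"
    $BU open "$url" || return 1      # 1. Navigate
    $BU state || return 1            # 2. Inspect: list clickable elements
    $BU input 3 "text" || return 1   # 3. Interact: click element 3, then type
    $BU click 5 || return 1          #    ...then click element 5
    $BU state                        # 4. Verify the result
}
```

Calling core_workflow https://example.com would run the five commands in order. In practice the indices change from page to page, so each interaction should follow a fresh state call.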

Commands

Navigation & Tabs

browser-use open <url>

Navigate to URL

browser-use back

Go back in history

browser-use scroll down

Scroll down

browser-use scroll up

Scroll up

browser-use scroll down --amount 1000

Scroll by specific pixels (default: 500)

browser-use switch <tab>

Switch to tab by index

browser-use close-tab

Close current tab

browser-use close-tab <tab>

Close specific tab

Page State

browser-use state

Get URL, title, and clickable elements

browser-use screenshot

Take screenshot (outputs base64)

browser-use screenshot path.png

Save screenshot to file

browser-use screenshot --full path.png

Full page screenshot

Interactions

browser-use click <index>

Click element

browser-use type "text"

Type text into focused element

browser-use input <index> "text"

Click element, then type text

browser-use keys "Enter"

Send keyboard keys

browser-use keys "Control+a"

Send key combination

browser-use select <index> "option"

Select dropdown option

browser-use hover <index>

Hover over element (triggers CSS :hover)

browser-use dblclick <index>

Double-click element

browser-use rightclick <index>

Right-click element (context menu)

Use indices from browser-use state.

JavaScript & Data

browser-use eval "document.title"

Execute JavaScript, return result

browser-use get title

Get page title

browser-use get html

Get full page HTML

browser-use get html --selector "h1"

Get HTML of specific element

browser-use get text <index>

Get text content of element

browser-use get value <index>

Get value of input/textarea

browser-use get attributes <index>

Get all attributes of element

browser-use get bbox <index>

Get bounding box (x, y, width, height)

Cookies

browser-use cookies get

Get all cookies

browser-use cookies get --url <url>

Get cookies for specific URL

browser-use cookies set <name> <value>

Set a cookie

browser-use cookies set name val --domain .example.com --secure --http-only

browser-use cookies set name val --same-site Strict

SameSite: Strict, Lax, or None

browser-use cookies set name val --expires 1735689600

Expiration timestamp

browser-use cookies clear

Clear all cookies

browser-use cookies clear --url <url>

Clear cookies for specific URL

browser-use cookies export <file>

Export all cookies to JSON file

browser-use cookies export <file> --url <url>

Export cookies for specific URL

browser-use cookies import <file>

Import cookies from JSON file

Wait Conditions

browser-use wait selector "h1"

Wait for element to be visible

browser-use wait selector ".loading" --state hidden

Wait for element to disappear

browser-use wait selector "#btn" --state attached

Wait for element in DOM

browser-use wait text "Success"

Wait for text to appear

browser-use wait selector "h1" --timeout 5000

Custom timeout in ms

Python Execution

browser-use python "x = 42"

Set variable

browser-use python "print(x)"

Access variable (outputs: 42)

browser-use python "print(browser.url)"

Access browser object

browser-use python --vars

Show defined variables

browser-use python --reset

Clear Python namespace

browser-use python --file script.py

Execute Python file

The Python session maintains state across commands. The browser object provides:
browser.url, browser.title, browser.html: page info
browser.goto(url), browser.back(): navigation
browser.click(index), browser.type(text), browser.input(index, text), browser.keys(keys): interactions
browser.screenshot(path), browser.scroll(direction, amount): visual
browser.wait(seconds), browser.extract(query): utilities

Agent Tasks

Remote Mode Options

When using --browser remote, additional options are available:

Specify LLM model

browser-use -b remote run "task" --llm gpt-4o

browser-use -b remote run "task" --llm claude-sonnet-4-20250514

Proxy configuration (default: us)

browser-use -b remote run "task" --proxy-country uk

Session reuse

browser-use -b remote run "task 1" --keep-alive

Keep session alive after task

browser-use -b remote run "task 2" --session-id abc-123

Reuse existing session

Execution modes

browser-use -b remote run "task" --flash

Fast execution mode

browser-use -b remote run "task" --wait

Wait for completion (default: async)

Advanced options

browser-use -b remote run "task" --thinking

Extended reasoning mode

browser-use -b remote run "task" --no-vision

Disable vision (enabled by default)

Using a cloud profile (create session first, then run with --session-id)

browser-use session create --profile <cloud-profile-id> --keep-alive

→ returns session_id

browser-use -b remote run "task" --session-id <session-id>

Task configuration

browser-use -b remote run "task" --start-url https://example.com

Start from specific URL

browser-use -b remote run "task" --allowed-domain example.com

Restrict navigation (repeatable)

browser-use -b remote run "task" --metadata key=value

Task metadata (repeatable)

browser-use -b remote run "task" --skill-id skill-123

Enable skills (repeatable)

browser-use -b remote run "task" --secret key=value

Secret metadata (repeatable)

Structured output and evaluation

browser-use -b remote run "task" --structured-output '{"type":"object"}'

JSON schema for output

browser-use -b remote run "task" --judge

Enable judge mode

browser-use -b remote run "task" --judge-ground-truth "expected answer"

Task Management

browser-use task list

List recent tasks

browser-use task list --limit 20

Show more tasks

browser-use task list --status finished

Filter by status (finished, stopped)

browser-use task list --session <id>

Filter by session ID

browser-use task list --json

JSON output

browser-use task status <task-id>

Get task status (latest step only)

browser-use task status <task-id> -c

All steps with reasoning

browser-use task status <task-id> -v

All steps with URLs + actions

browser-use task status <task-id> --last 5

Last N steps only

browser-use task status <task-id> --step 3

Specific step number

browser-use task status <task-id> --reverse

Newest first

browser-use task stop <task-id>

Stop a running task

browser-use task logs <task-id>

Get task execution logs

Cloud Session Management

browser-use session list

List cloud sessions

browser-use session list --limit 20

Show more sessions

browser-use session list --status active

Filter by status

browser-use session list --json

JSON output

browser-use session get <session-id>

Get session details + live URL

browser-use session get <session-id> --json

browser-use session stop <session-id>

Stop a session

browser-use session stop --all

Stop all active sessions

browser-use session create

Create with defaults

browser-use session create --profile <id>

With cloud profile

browser-use session create --proxy-country uk

With geographic proxy

browser-use session create --start-url https://example.com

browser-use session create --screen-size 1920x1080

browser-use session create --keep-alive

browser-use session create --persist-memory

browser-use session share <session-id>

Create public share URL

browser-use session share <session-id> --delete

Delete public share

Tunnels

browser-use tunnel <port>

Start tunnel (returns URL)

browser-use tunnel <port>

Idempotent: running it again for the same port returns the existing URL

browser-use tunnel list

Show active tunnels

browser-use tunnel stop <port>

Stop tunnel

browser-use tunnel stop --all

Stop all tunnels
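
Because the tunnel command returns a URL, a script can capture it and hand it straight to a cloud browser. The following is a minimal sketch, not the CLI's documented interface: BU, extract_url, and open_via_tunnel are hypothetical names, and it assumes the tunnel URL appears as an https:// address somewhere in the command's standard output.

```shell
# BU is a hypothetical override for dry runs; it defaults to the real CLI.
BU="${BU:-browser-use}"

# Print the https:// URL from the first input line that contains one
# (if a line holds several, the last one on that line wins).
extract_url() {
    sed -n 's|.*\(https://[^[:space:]]*\).*|\1|p' | head -n 1
}

# Start (or reuse) a tunnel for the given port and open it in a cloud browser.
# Assumption: `tunnel <port>` prints the tunnel URL on stdout.
open_via_tunnel() {
    port="$1"
    url="$($BU tunnel "$port" | extract_url)"
    if [ -z "$url" ]; then
        echo "no tunnel URL found for port $port" >&2
        return 1
    fi
    $BU --browser remote open "$url"
}
```

For a dev server on port 3000, open_via_tunnel 3000 would create the tunnel and navigate the remote browser to it. Since the tunnel command is idempotent, calling it again should reuse the same URL.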

Session Management

browser-use sessions

List active sessions

browser-use close

Close current session

browser-use close --all

Close all sessions

Profile Management

Local Chrome Profiles (--browser real)

browser-use -b real profile list

List local Chrome profiles

browser-use -b real profile cookies "Default"

Show cookie domains in profile

Cloud Profiles (--browser remote)

browser-use -b remote profile list

List cloud profiles

browser-use -b remote profile list --page 2 --page-size 50

browser-use -b remote profile get <id>

Get profile details

browser-use -b remote profile create

Create new cloud profile

browser-use -b remote profile create --name "My Profile"

browser-use -b remote profile update <id> --name "New"

browser-use -b remote profile delete <id>

Syncing

browser-use profile sync --from "Default" --domain github.com

Domain-specific

browser-use profile sync --from "Default"

Full profile

browser-use profile sync --from "Default" --name "Custom Name"

With custom name

Server Control

browser-use server logs

View server logs

Common Workflows

Exposing Local Dev Servers

Use when you have a local dev server and need a cloud browser to reach it.

Core workflow: Start dev server → create tunnel → browse the tunnel URL remotely.

1. Start your dev server

npm run dev &

localhost:3000

2. Expose it via Cloudflare tunnel

browser-use tunnel 3000

→ url: https://abc.trycloudflare.com

3. Now the cloud browser can reach your local server

browser-use --browser remote open https://abc.trycloudflare.com
browser-use state
browser-use screenshot

Note: Tunnels are independent of browser sessions. They persist across browser-use close and can be managed separately. Cloudflared must be installed; run browser-use doctor to check.

Authenticated Browsing with Profiles

Use when a task requires browsing a site the user is already logged into (e.g. Gmail, GitHub, internal tools).

Core workflow: Check existing profiles → ask user which profile and browser mode → browse with that profile. Only sync cookies if no suitable profile exists.

Before browsing an authenticated site, the agent MUST:
1. Ask the user whether to use real (local Chrome) or remote (cloud) browser
2. List available profiles for that mode
3. Ask which profile to use
4. If no profile has the right cookies, offer to sync (see below)

Step 1: Check existing profiles

Option A: Local Chrome profiles (--browser real)

browser-use -b real profile list

→ Default: Person 1 (user@gmail.com)

→ Profile 1: Work (work@company.com)

Option B: Cloud profiles (--browser remote)

browser-use -b remote profile list

→ abc-123: "Chrome - Default (github.com)"

→ def-456: "Work profile"

Step 2: Browse with the chosen profile

Real browser — uses local Chrome with existing login sessions

browser-use --browser real --profile "Default" open https://github.com

Cloud browser — uses cloud profile with synced cookies

browser-use --browser remote --profile abc-123 open https://github.com

The user is already authenticated, so no login is needed.

Note: Cloud profile cookies can expire over time. If authentication fails, re-sync cookies from the local Chrome profile.

Step 3: Syncing cookies (only if needed)

If the user wants to use a cloud browser but no cloud profile has the right cookies, sync them from a local Chrome profile.

Before syncing, the agent MUST:
1. Ask which local Chrome profile to use
2. Ask which domain(s) to sync; do NOT default to syncing the full profile
3. Confirm before proceeding

Check what cookies a local profile has:

browser-use -b real profile cookies "Default"

→ youtube.com: 23

→ google.com: 18

→ github.com: 2

Domain-specific sync (recommended):

browser-use profile sync --from "Default" --domain github.com

Creates new cloud profile: "Chrome - Default (github.com)"

Only syncs github.com cookies

Full profile sync (use with caution):

browser-use profile sync --from "Default"

Syncs ALL cookies, including sensitive data, tracking cookies, and every session token

Only use when the user explicitly needs their entire browser state.

Fine-grained control (advanced):

Export cookies to file, manually edit, then import

browser-use --browser real --profile "Default" cookies export /tmp/cookies.json
browser-use --browser remote --profile <id> cookies import /tmp/cookies.json
Use the synced profile:
browser-use --browser remote --profile <id> open https://github.com
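
The "confirm before proceeding" rule for syncing can be made explicit in a wrapper. This is a sketch under stated assumptions: BU and confirm_and_sync are hypothetical names introduced for illustration, and the prompt wording is invented. Only the profile sync command itself comes from the text above.

```shell
# BU is a hypothetical override for dry runs; it defaults to the real CLI.
BU="${BU:-browser-use}"

# Ask for confirmation on stderr, read the answer from stdin, and only then
# run a domain-specific sync. Anything other than y or Y cancels.
confirm_and_sync() {
    profile="$1"
    domain="$2"
    printf 'Sync %s cookies from local profile "%s" to a cloud profile? [y/N] ' \
        "$domain" "$profile" >&2
    read -r answer || answer=""
    case "$answer" in
        y|Y) $BU profile sync --from "$profile" --domain "$domain" ;;
        *)   echo "Sync cancelled." >&2; return 1 ;;
    esac
}
```

For example, confirm_and_sync "Default" github.com would prompt once and sync only github.com cookies. The wrapper deliberately offers no way to run a full-profile sync, which matches the advice not to default to one.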
Running Subagents
Use cloud sessions to run autonomous browser agents in parallel.
Core workflow: Launch task(s) with run → poll with task status → collect results → clean up sessions.
Session = Agent: Each cloud session is a browser agent with its own state
Task = Work: Jobs given to an agent; an agent can run multiple tasks sequentially
Session lifecycle: Once stopped, a session cannot be revived; start a new one

Launching Tasks

Single task (async by default — returns immediately)

browser-use -b remote run "Search for AI news and summarize top 3 articles"

→ task_id: task-abc, session_id: sess-123

Parallel tasks — each gets its own session

browser-use -b remote run "Research competitor A pricing"

→ task_id: task-1, session_id: sess-a

browser-use -b remote run "Research competitor B pricing"

→ task_id: task-2, session_id: sess-b

browser-use -b remote run "Research competitor C pricing"

→ task_id: task-3, session_id: sess-c

Sequential tasks in same session (reuses cookies, login state, etc.)

browser-use -b remote run "Log into example.com" --keep-alive

→ task_id: task-1, session_id: sess-123

browser-use task status task-1

Wait for completion

browser-use -b remote run "Export settings" --session-id sess-123

→ task_id: task-2, session_id: sess-123 (same session)
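
The parallel-launch pattern above can be expressed as a loop. A minimal sketch, assuming nothing beyond the run command shown in this section: BU and launch_research are hypothetical names, and the task wording is just the example prompt from above.

```shell
# BU is a hypothetical override for dry runs; it defaults to the real CLI.
BU="${BU:-browser-use}"

# Launch one cloud task per argument. Each `run` call gets its own session
# and, being async by default, returns without waiting for the task.
launch_research() {
    for name in "$@"; do
        $BU -b remote run "Research competitor $name pricing" || return 1
    done
}
```

Calling launch_research A B C would start three tasks back to back. The task and session IDs each call returns would still need to be recorded for later polling and cleanup.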

Managing & Stopping

browser-use task list --status finished

See completed tasks

browser-use task stop task-abc

Stop a task (session may continue if --keep-alive)

browser-use session stop sess-123

Stop an entire session (terminates its tasks)

browser-use session stop --all

Stop all sessions

Monitoring

Task status is designed for token efficiency. Default output is minimal; only expand when needed:

Mode | Flag | Tokens | Use When
Default | (none) | Low | Polling progress
Compact | -c | Medium | Need full reasoning
Verbose | -v | High | Debugging actions

For long tasks (50+ steps)

browser-use task status <id> -c --last 5

Last 5 steps only

browser-use task status <id> -v --step 10

Inspect specific step

Live view: browser-use session get returns a live URL to watch the agent.
Detect stuck tasks: If cost/duration in task status stops increasing, the task is stuck; stop it and start a new agent.
Logs: browser-use task logs - only available after task completes.
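
Polling with the default (minimal) task status output can be wrapped in a loop with a poll limit. The sketch below is hedged: BU and wait_for_task are hypothetical names, and it assumes the status text contains the word "finished" or "stopped" (the two status values named under Task Management). The real output format may differ, so the patterns may need adjusting.

```shell
# BU is a hypothetical override for dry runs; it defaults to the real CLI.
BU="${BU:-browser-use}"

# Poll `task status` until the output mentions a terminal state.
# Assumption: the status text contains "finished" or "stopped".
# Prints the outcome; returns 0 = finished, 1 = stopped, 2 = gave up.
wait_for_task() {
    task_id="$1"
    interval="${2:-10}"    # seconds between polls
    max_polls="${3:-60}"   # give up after this many polls
    polls=0
    while [ "$polls" -lt "$max_polls" ]; do
        status_out="$($BU task status "$task_id")"
        case "$status_out" in
            *finished*) echo "finished"; return 0 ;;
            *stopped*)  echo "stopped";  return 1 ;;
        esac
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo "timeout"
    return 2
}
```

A call such as wait_for_task task-abc 15 40 would poll every 15 seconds for up to 40 polls. The poll limit is only a crude backstop; comparing cost or duration between polls, as described above, remains the better signal for a stuck task.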
Global Options

Option | Description
--session NAME | Use named session (default: "default")
--browser MODE | Browser mode: chromium, real, remote
--headed | Show browser window (chromium mode)
--profile NAME | Browser profile (local name or cloud ID). Works with open, session create, etc.; does NOT work with run (use --session-id instead)
--json | Output as JSON
--mcp | Run as MCP server via stdin/stdout
Session behavior: All commands without --session use the same "default" session. The browser stays open and is reused across commands. Use --session NAME to run multiple browsers in parallel.

Tips
Always run browser-use state first to see available elements and their indices
Use --headed for debugging to see what the browser is doing
Sessions persist: the browser stays open between commands
Use --json for programmatic parsing
Python variables persist across browser-use python commands within a session
CLI aliases: bu, browser, and browseruse all work identically to browser-use

Troubleshooting

Run diagnostics first: browser-use doctor

Browser won't start?

browser-use close --all

Close all sessions

browser-use --headed open <url>

Try with visible window

Element not found?

browser-use state

Check current elements

browser-use scroll down

Element might be below fold

browser-use state

Check again
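
The check, scroll, check-again routine above can be automated. This is an illustrative sketch only: BU and find_by_scrolling are hypothetical names, and it assumes the element's visible text shows up in the state output, which this document does not guarantee.

```shell
# BU is a hypothetical override for dry runs; it defaults to the real CLI.
BU="${BU:-browser-use}"

# Return 0 once `state` output contains the given text, scrolling down
# between checks; return 1 after max_scrolls unsuccessful scrolls.
# Assumption: the element's visible text appears in the `state` output.
find_by_scrolling() {
    needle="$1"
    max_scrolls="${2:-5}"
    scrolls=0
    while :; do
        if $BU state | grep -F -q -- "$needle"; then
            return 0
        fi
        [ "$scrolls" -lt "$max_scrolls" ] || return 1
        $BU scroll down >/dev/null || return 1
        scrolls=$((scrolls + 1))
    done
}
```

For instance, find_by_scrolling "Submit" 3 would check the current view and then scroll at most three times. On success, a fresh browser-use state call is still needed to read the element's index before clicking.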

Session issues?

browser-use sessions

Check active sessions

browser-use close --all

Clean slate

browser-use open <url>

Fresh start

Session reuse fails after task stop: If you stop a task and try to reuse its session, the new task may get stuck at "created" status. Create a new session instead:
browser-use session create --profile <profile-id> --keep-alive
browser-use -b remote run "new task" --session-id <new-session-id>
Task stuck at "started": Check cost with task status; if it is not increasing, the task is stuck. View the live URL with session get, then stop the task and start a new agent.
Sessions persist after tasks complete: Tasks finishing doesn't auto-stop sessions. Run browser-use session stop --all to clean up.

Cleanup

Always close the browser when done:

browser-use close

Close browser session

browser-use session stop --all

Stop cloud sessions (if any)

browser-use tunnel stop --all

Stop tunnels (if any)
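
The three cleanup commands can be bundled into one function so that none is forgotten. A minimal sketch: BU and cleanup are hypothetical names, and the `|| true` guards rest on the assumption that a command may exit non-zero when there is nothing to stop.

```shell
# BU is a hypothetical override for dry runs; it defaults to the real CLI.
BU="${BU:-browser-use}"

# Run all three cleanup commands even if one of them fails,
# e.g. because there is no cloud session or tunnel to stop.
cleanup() {
    $BU close || true
    $BU session stop --all || true
    $BU tunnel stop --all || true
}
```

In a longer script, registering the function with the shell's trap builtin (trap cleanup EXIT) would run it automatically when the script exits, including after an error.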
