remote-browser

Installs: 34
Rank: #19910

Install

npx skills add https://github.com/browser-use/browser-use --skill remote-browser

Remote Browser Automation for Sandboxed Agents

This skill is for agents running on sandboxed remote machines (cloud VMs, CI, coding agents) that need to control a browser. Install browser-use and drive a cloud browser; no local Chrome is needed.

Prerequisites

Before using this skill, browser-use must be installed and configured. Run diagnostics to verify:

browser-use doctor

For more information, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md

Core Workflow

Commands use the cloud browser:

Step 1: Start session (automatically uses remote mode)

browser-use open https://example.com

Returns: url, live_url (view the browser in real-time)

Step 2+: All subsequent commands use the existing session

browser-use state

Get page elements with indices

browser-use click 5

Click element by index

browser-use type "Hello World"

Type into focused element

browser-use input 3 "text"

Click element, then type

browser-use screenshot

Take screenshot (base64)

browser-use screenshot page.png

Save screenshot to file

Done: Close the session

browser-use close

Close browser and release resources
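The core workflow above can be wrapped in a small shell function so that the session is closed even when a step fails. This is a minimal sketch rather than part of browser-use itself: the function name smoke_check and the BROWSER_USE override variable are hypothetical, and only the open, state, screenshot, and close commands listed above are used.

```shell
# Path to the CLI. BROWSER_USE is a hypothetical override for this sketch,
# useful for pointing at a wrapper or a stub during testing.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# smoke_check URL SCREENSHOT_PATH
# Runs open -> state -> screenshot, then always closes the session.
# Returns the exit status of the first failing step (0 if all succeeded).
smoke_check() {
  url="$1"
  shot="$2"
  status=0
  "$BROWSER_USE" open "$url" \
    && "$BROWSER_USE" state \
    && "$BROWSER_USE" screenshot "$shot" \
    || status=$?
  # Close even when a step failed, so the cloud browser is released.
  "$BROWSER_USE" close
  return "$status"
}
```

A call such as smoke_check https://example.com page.png would then leave page.png behind and no open session, assuming browser-use doctor reports a working setup.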

Essential Commands

Navigation

browser-use open <url>

Navigate to URL

browser-use back

Go back

browser-use scroll down

Scroll down (--amount N for pixels)

Page State (always run state first to get element indices)

browser-use state

Get URL, title, clickable elements

browser-use screenshot

Take screenshot (base64)

browser-use screenshot path.png

Save screenshot to file

Interactions (use indices from state)

browser-use click <index>

Click element

browser-use type "text"

Type into focused element

browser-use input <index> "text"

Click element, then type

browser-use keys "Enter"

Send keyboard keys

browser-use select <index> "option"

Select dropdown option

Data Extraction

browser-use eval "document.title"

Execute JavaScript

browser-use get text <index>

Get element text

browser-use get html --selector "h1"

Get scoped HTML

Wait

browser-use wait selector "h1"

Wait for element

browser-use wait text "Success"

Wait for text

Session

browser-use close

Close browser session

AI Agent

browser-use run "task"

Run agent (async by default)

browser-use task status <id>

Check task progress

Commands

Navigation & Tabs

browser-use open <url>

Navigate to URL

browser-use back

Go back in history

browser-use scroll down

Scroll down

browser-use scroll up

Scroll up

browser-use scroll down --amount 1000

Scroll by specific pixels (default: 500)

browser-use switch <tab>

Switch tab by index

browser-use close-tab

Close current tab

browser-use close-tab <tab>

Close specific tab

Page State

browser-use state

Get URL, title, and clickable elements

browser-use screenshot

Take screenshot (base64)

browser-use screenshot path.png

Save screenshot to file

browser-use screenshot --full p.png

Full page screenshot

Interactions

browser-use click <index>

Click element

browser-use type "text"

Type into focused element

browser-use input <index> "text"

Click element, then type

browser-use keys "Enter"

Send keyboard keys

browser-use keys "Control+a"

Key combination

browser-use select <index> "option"

Select dropdown option

browser-use hover <index>

Hover over element

browser-use dblclick <index>

Double-click

browser-use rightclick <index>

Right-click

Use indices from browser-use state.

JavaScript & Data

browser-use eval "document.title"

Execute JavaScript

browser-use get title

Get page title

browser-use get html

Get page HTML

browser-use get html --selector "h1"

Scoped HTML

browser-use get text <index>

Get element text

browser-use get value <index>

Get input value

browser-use get attributes <index>

Get element attributes

browser-use get bbox <index>

Get bounding box (x, y, width, height)

Cookies

browser-use cookies get

Get all cookies

browser-use cookies get --url <url>

Get cookies for specific URL

browser-use cookies set <name> <val>

Set a cookie

browser-use cookies set name val --domain .example.com --secure

browser-use cookies set name val --same-site Strict

SameSite: Strict, Lax, None

browser-use cookies set name val --expires 1735689600

Expiration timestamp

browser-use cookies clear

Clear all cookies

browser-use cookies clear --url <url>

Clear cookies for specific URL

browser-use cookies export <file>

Export to JSON

browser-use cookies import <file>

Import from JSON

Wait Conditions

browser-use wait selector "h1"

Wait for element

browser-use wait selector ".loading" --state hidden

Wait for element to disappear

browser-use wait text "Success"

Wait for text

browser-use wait selector "#btn" --timeout 5000

Custom timeout (ms)

Python Execution

browser-use python "x = 42"

Set variable

browser-use python "print(x)"

Access variable (prints: 42)

browser-use python "print(browser.url)"

Access browser object

browser-use python --vars

Show defined variables

browser-use python --reset

Clear namespace

browser-use python --file script.py

Run Python file

The Python session maintains state across commands. The browser object provides:

browser.url, browser.title, browser.html: page info
browser.goto(url), browser.back(): navigation
browser.click(index), browser.type(text), browser.input(index, text), browser.keys(keys): interactions
browser.screenshot(path), browser.scroll(direction, amount): visual
browser.wait(seconds), browser.extract(query): utilities

Agent Tasks

browser-use run "Fill the contact form with test data"

AI agent

browser-use run "Extract all product prices" --max-steps 50

Specify LLM model

browser-use run "task" --llm gpt-4o

browser-use run "task" --llm claude-sonnet-4-20250514

Proxy configuration (default: us)

browser-use run "task" --proxy-country uk

Session reuse

browser-use run "task 1" --keep-alive

Keep session alive after task

browser-use run "task 2" --session-id abc-123

Reuse existing session

Execution modes

browser-use run "task" --flash

Fast execution mode

browser-use run "task" --wait

Wait for completion (default: async)

Advanced options

browser-use run "task" --thinking

Extended reasoning mode

browser-use run "task" --no-vision

Disable vision (enabled by default)

Using a cloud profile (create session first, then run with --session-id)

browser-use session create --profile <cloud-profile-id> --keep-alive

→ returns session_id

browser-use run "task" --session-id <session-id>

Task configuration

browser-use run "task" --start-url https://example.com

Start from specific URL

browser-use run "task" --allowed-domain example.com

Restrict navigation (repeatable)

browser-use run "task" --metadata key=value

Task metadata (repeatable)

browser-use run "task" --skill-id skill-123

Enable skills (repeatable)

browser-use run "task" --secret key=value

Secret metadata (repeatable)

Structured output and evaluation

browser-use run "task" --structured-output '{"type":"object"}'

JSON schema for output

browser-use run "task" --judge

Enable judge mode

browser-use run "task" --judge-ground-truth "answer"

Task Management

browser-use task list

List recent tasks

browser-use task list --limit 20

Show more tasks

browser-use task list --status finished

Filter by status (finished, stopped)

browser-use task list --session <id>

Filter by session ID

browser-use task list --json

JSON output

browser-use task status <task-id>

Get task status (latest step only)

browser-use task status <task-id> -c

All steps with reasoning

browser-use task status <task-id> -v

All steps with URLs + actions

browser-use task status <task-id> --last 5

Last N steps only

browser-use task status <task-id> --step 3

Specific step number

browser-use task status <task-id> --reverse

Newest first

browser-use task stop <task-id>

Stop a running task

browser-use task logs <task-id>

Get task execution logs

Cloud Session Management

browser-use session list

List cloud sessions

browser-use session list --limit 20

Show more sessions

browser-use session list --status active

Filter by status

browser-use session list --json

JSON output

browser-use session get <session-id>

Get session details + live URL

browser-use session get <session-id> --json

browser-use session stop <session-id>

Stop a session

browser-use session stop --all

Stop all active sessions

browser-use session create

Create with defaults

browser-use session create --profile <id>

With cloud profile

browser-use session create --proxy-country uk

With geographic proxy

browser-use session create --start-url https://example.com
browser-use session create --screen-size 1920x1080
browser-use session create --keep-alive
browser-use session create --persist-memory

browser-use session share <session-id>

Create public share URL

browser-use session share <session-id> --delete

Delete public share

Cloud Profile Management

browser-use profile list

List cloud profiles

browser-use profile list --page 2 --page-size 50

browser-use profile get <id>

Get profile details

browser-use profile create

Create new profile

browser-use profile create --name "My Profile"

browser-use profile update <id> --name "New Name"

browser-use profile delete <id>

Tunnels

browser-use tunnel <port>

Start tunnel (returns URL)

browser-use tunnel <port>

Idempotent - returns existing URL

browser-use tunnel list

Show active tunnels

browser-use tunnel stop <port>

Stop tunnel

browser-use tunnel stop --all

Stop all tunnels

Session Management

browser-use sessions

List active sessions

browser-use close

Close current session

browser-use close --all

Close all sessions

Common Workflows

Exposing Local Dev Servers

Use when you have a dev server on the remote machine and need the cloud browser to reach it.

Core workflow: Start dev server → create tunnel → browse the tunnel URL.

1. Start your dev server

python -m http.server 3000 &

2. Expose it via Cloudflare tunnel

browser-use tunnel 3000

→ url: https://abc.trycloudflare.com

3. Now the cloud browser can reach your local server

browser-use open https://abc.trycloudflare.com
browser-use state
browser-use screenshot

Note: Tunnels are independent of browser sessions. They persist across browser-use close and can be managed separately. cloudflared must be installed; run browser-use doctor to check.
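When scripting the tunnel workflow, the URL has to be pulled out of the tunnel command's output before it can be passed to open. The helper below is a sketch under one assumption: that the output contains a URL of the form shown in the example above (https://abc.trycloudflare.com). The name tunnel_url is hypothetical.

```shell
# tunnel_url: read `browser-use tunnel <port>` output on stdin and
# print the first https://...trycloudflare.com URL found in it.
# Prints nothing if no such URL is present.
tunnel_url() {
  sed -n 's/.*\(https:\/\/[A-Za-z0-9.-]*\.trycloudflare\.com\).*/\1/p' | head -n 1
}

# Usage (commented out because it needs browser-use and cloudflared):
#   url="$(browser-use tunnel 3000 | tunnel_url)"
#   browser-use open "$url"
```

If the CLI is run with --json, parsing the JSON output instead would likely be more robust than matching text.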
Running Subagents

Use cloud sessions to run autonomous browser agents in parallel.

Core workflow: Launch task(s) with run → poll with task status → collect results → clean up sessions.

Session = Agent: Each cloud session is a browser agent with its own state.
Task = Work: Jobs given to an agent; an agent can run multiple tasks sequentially.
Session lifecycle: Once stopped, a session cannot be revived; start a new one.

Launching Tasks

Single task (async by default — returns immediately)

browser-use run "Search for AI news and summarize top 3 articles"

→ task_id: task-abc, session_id: sess-123

Parallel tasks — each gets its own session

browser-use run "Research competitor A pricing"

→ task_id: task-1, session_id: sess-a

browser-use run "Research competitor B pricing"

→ task_id: task-2, session_id: sess-b

browser-use run "Research competitor C pricing"

→ task_id: task-3, session_id: sess-c

Sequential tasks in same session (reuses cookies, login state, etc.)

browser-use run "Log into example.com" --keep-alive

→ task_id: task-1, session_id: sess-123

browser-use task status task-1

Wait for completion

browser-use run "Export settings" --session-id sess-123

→ task_id: task-2, session_id: sess-123 (same session)
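To chain tasks in a script, the task_id and session_id have to be captured from the output of run. The sketch below assumes the output contains "name: value" pairs as in the examples above (for instance "task_id: task-1, session_id: sess-123"); the real output format may differ, and get_field is a hypothetical helper name.

```shell
# get_field NAME: read CLI output on stdin and print the value after "NAME:".
# Assumption: values are terminated by a comma, a space, or end of line.
get_field() {
  sed -n "s/.*$1: *\([^, ]*\).*/\1/p" | head -n 1
}

# Usage (commented out because it needs a configured browser-use):
#   out="$(browser-use run "Log into example.com" --keep-alive)"
#   session="$(printf '%s\n' "$out" | get_field session_id)"
#   # ...wait for the first task to finish, then reuse its session:
#   browser-use run "Export settings" --session-id "$session"
```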

Managing & Stopping

browser-use task list --status finished

See completed tasks

browser-use task stop task-abc

Stop a task (session may continue if --keep-alive)

browser-use session stop sess-123

Stop an entire session (terminates its tasks)

browser-use session stop --all

Stop all sessions

Monitoring

Task status is designed for token efficiency. Default output is minimal; expand only when needed:

Mode     Flag    Tokens  Use When
Default  (none)  Low     Polling progress
Compact  -c      Medium  Need full reasoning
Verbose  -v      High    Debugging actions

For long tasks (50+ steps)

browser-use task status <id> -c --last 5

Last 5 steps only

browser-use task status <id> -v --step 10

Inspect specific step

Live view: browser-use session get returns a live URL to watch the agent.
Detect stuck tasks: If cost/duration in task status stops increasing, the task is stuck; stop it and start a new agent.
Logs: browser-use task logs is only available after the task completes.
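The polling step can be sketched as a bounded loop around task status in its default (low-token) mode. This is an illustration under assumptions, not documented behavior: it assumes the status output contains the word "finished" or "stopped" (the two status values mentioned for task list) once the task ends, and the names wait_for_task and BROWSER_USE are hypothetical. A task whose own output text contains either word would trigger a false match.

```shell
# Path to the CLI. BROWSER_USE is a hypothetical override for this sketch.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# wait_for_task TASK_ID [INTERVAL_SECONDS] [MAX_POLLS]
# Prints the final status output and returns 0 once a terminal state is seen.
# Returns 1 if MAX_POLLS is reached first (the task may be stuck).
wait_for_task() {
  task_id="$1"
  interval="${2:-10}"
  max_polls="${3:-60}"
  polls=0
  while [ "$polls" -lt "$max_polls" ]; do
    # Default output keeps each poll cheap.
    out="$("$BROWSER_USE" task status "$task_id" 2>&1)"
    # Assumption: the status word appears verbatim in the output.
    case "$out" in
      *finished*|*stopped*)
        printf '%s\n' "$out"
        return 0
        ;;
    esac
    polls=$((polls + 1))
    sleep "$interval"
  done
  return 1
}
```

When the function returns 1, a reasonable follow-up is to check the live URL with browser-use session get and then stop the task, as described above.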
Global Options

Option          Description
--session NAME  Named session (default: "default")
--browser MODE  Browser mode (only if multiple modes installed)
--profile ID    Cloud profile ID for persistent cookies. Works with open, session create, etc.; does NOT work with run (use --session-id instead)
--json          Output as JSON
Tips

Run browser-use doctor to verify installation before starting.
Always run state first to see available elements and their indices.
Sessions persist across commands: the browser stays open until you close it.
Tunnels are independent: they persist across browser-use close.
Use --json for programmatic parsing.
tunnel is idempotent: calling it again for the same port returns the existing URL.
Troubleshooting

"Browser mode 'chromium' not installed"?
Expected for sandboxed agents: remote mode only supports cloud browsers.
Run browser-use doctor to verify configuration.

Cloud browser won't start?
Run browser-use doctor to check configuration.

Tunnel not working?
Verify cloudflared is installed: which cloudflared
Run browser-use tunnel list to check active tunnels.
Run browser-use tunnel stop and retry.

Element not found?
Run browser-use state to see current elements.
Run browser-use scroll down, then browser-use state; the element might be below the fold.

Session reuse fails after task stop:
Create a new session instead:
browser-use session create --profile <profile-id> --keep-alive
browser-use run "new task" --session-id <new-session-id>

Task stuck at "started":
Check cost with task status; if it is not increasing, the task is stuck. View the live URL with session get, then stop and start a new agent.

Sessions persist after tasks complete:
Run browser-use session stop --all to clean up.

Cleanup

Always close resources when done:

browser-use close

Close browser session

browser-use session stop --all

Stop cloud sessions (if any)

browser-use tunnel stop --all

Stop tunnels (if any)
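The three cleanup commands can be grouped into one function so that a script releases everything on exit. This is a sketch only: cleanup_all and the BROWSER_USE override are hypothetical names, and the function runs every step even if an earlier one fails.

```shell
# Path to the CLI. BROWSER_USE is a hypothetical override for this sketch.
BROWSER_USE="${BROWSER_USE:-browser-use}"

# cleanup_all: close the browser session, stop cloud sessions, stop tunnels.
# Each step runs regardless of earlier failures; returns 1 if any step failed.
cleanup_all() {
  rc=0
  "$BROWSER_USE" close || rc=1
  "$BROWSER_USE" session stop --all || rc=1
  "$BROWSER_USE" tunnel stop --all || rc=1
  return "$rc"
}
```

In a longer script, registering it once with trap cleanup_all EXIT near the top is one way to make sure it runs on both success and failure.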
