Remote Browser Automation for Sandboxed Agents

This skill is for agents running on sandboxed remote machines (cloud VMs, CI, coding agents) that need to control a browser. Install browser-use and drive a cloud browser; no local Chrome is needed.

Prerequisites

Before using this skill, browser-use must be installed and configured. Run diagnostics to verify:

browser-use doctor

For more information, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md

Core Workflow

Commands use the cloud browser:

Step 1: Start session (automatically uses remote mode)

browser-use open https://example.com

Returns: url, live_url (view the browser in real-time)

Step 2+: All subsequent commands use the existing session

browser-use state

Get page elements with indices

browser-use click 5

Click element by index

browser-use type "Hello World"

Type into focused element

browser-use input 3 "text"

Click element, then type

browser-use screenshot

Take screenshot (base64)

browser-use screenshot page.png

Save screenshot to file

Done: Close the session

browser-use close

Close browser and release resources
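For agents that drive the CLI from a script rather than an interactive shell, the workflow above can be sketched in Python. This is a minimal sketch, assuming browser-use is on PATH; the helper names workflow_commands and run_workflow are hypothetical, and only subcommands shown above are used.

```python
import subprocess


def workflow_commands(url, screenshot_path="page.png"):
    """Return the argv lists for one open, inspect, screenshot, close pass."""
    return [
        ["browser-use", "open", url],                    # Step 1: start the session
        ["browser-use", "state"],                        # list page elements with indices
        ["browser-use", "screenshot", screenshot_path],  # save a screenshot to a file
        ["browser-use", "close"],                        # release the cloud browser
    ]


def run_workflow(url, runner=subprocess.run):
    """Run each step in order and always attempt `close`, even if a step fails."""
    *steps, close = workflow_commands(url)
    outputs = []
    try:
        for argv in steps:
            result = runner(argv, capture_output=True, text=True, check=True)
            outputs.append(result.stdout)
    finally:
        # check=False: a failed close should not mask the original error.
        runner(close, capture_output=True, text=True, check=False)
    return outputs
```

Putting close in a finally block reflects the "release resources" step: a failed click or navigation should not leave a remote session running.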

Essential Commands

browser-use open <url>

Navigate to URL

browser-use back

Go back

browser-use scroll down

Scroll down (--amount N for pixels)

Page State (always run state first to get element indices)

browser-use state

Get URL, title, clickable elements

browser-use screenshot

Take screenshot (base64)

browser-use screenshot path.png

Save screenshot to file

Interactions (use indices from state)

browser-use click <index>

Click element

browser-use type "text"

Type into focused element

browser-use input <index> "text"

Click element, then type

browser-use keys "Enter"

Send keyboard keys

browser-use select <index> "option"

Select dropdown option

Data Extraction

browser-use eval "document.title"

Execute JavaScript

browser-use get text <index>

Get element text

browser-use get html --selector "h1"

Get scoped HTML

Wait

browser-use wait selector "h1"

Wait for element

browser-use wait text "Success"

Wait for text

Session

browser-use close

Close browser session

AI Agent

browser-use run "task"

Run agent (async by default)

browser-use task status <id>

Check task progress
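Because run is asynchronous by default, a caller usually has to poll task status until the task ends. The following is a minimal sketch of such a loop. The output format of task status is not specified in this document, so the status lookup is passed in as a caller-supplied function; the terminal status names finished and stopped are the ones accepted by the task list status filter, and wait_for_task is a hypothetical helper name.

```python
import time

# Status names taken from the `task list --status` filter values in this document.
TERMINAL_STATUSES = {"finished", "stopped"}


def wait_for_task(get_status, task_id, interval=2.0, timeout=300.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(task_id) until it returns a terminal status.

    get_status is supplied by the caller, for example a wrapper around
    `browser-use task status <id>` that extracts the status string.
    Raises TimeoutError if no terminal status is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = get_status(task_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Injecting sleep and clock keeps the loop testable without real delays. When blocking is acceptable, the --wait flag of run avoids polling altogether.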

Commands

Navigation & Tabs

browser-use open <url>

Navigate to URL

browser-use back

Go back in history

browser-use scroll down

Scroll down

browser-use scroll up

Scroll up

browser-use scroll down --amount 1000

Scroll by specific pixels (default: 500)

browser-use switch <tab>

Switch tab by index

browser-use close-tab

Close current tab

browser-use close-tab <tab>

Close specific tab

Page State

browser-use state

Get URL, title, and clickable elements

browser-use screenshot

Take screenshot (base64)

browser-use screenshot path.png

Save screenshot to file

browser-use screenshot --full p.png

Full page screenshot

Interactions

browser-use click <index>

Click element

browser-use type "text"

Type into focused element

browser-use input <index> "text"

Click element, then type

browser-use keys "Enter"

Send keyboard keys

browser-use keys "Control+a"

Key combination

browser-use select <index> "option"

Select dropdown option

browser-use hover <index>

Hover over element

browser-use dblclick <index>

Double-click

browser-use rightclick <index>

Right-click

Use indices from browser-use state.

JavaScript & Data

browser-use eval "document.title"

Execute JavaScript

browser-use get title

Get page title

browser-use get html

Get page HTML

browser-use get html --selector "h1"

Scoped HTML

browser-use get text <index>

Get element text

browser-use get value <index>

Get input value

browser-use get attributes <index>

Get element attributes

browser-use get bbox <index>

Get bounding box (x, y, width, height)

Cookies

browser-use cookies get

Get all cookies

browser-use cookies get --url <url>

Get cookies for specific URL

browser-use cookies set <name> <val>

Set cookie

browser-use cookies set name val --domain .example.com --secure

browser-use cookies set name val --same-site Strict

SameSite: Strict, Lax, None

browser-use cookies set name val --expires 1735689600

Expiration timestamp

browser-use cookies clear

Clear all cookies

browser-use cookies clear --url <url>

Clear cookies for specific URL

browser-use cookies export <file>

Export to JSON

browser-use cookies import <file>

Import from JSON
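The --expires example above uses 1735689600, which is 2025-01-01 00:00:00 UTC expressed as a Unix timestamp in seconds. The sketch below shows one way to derive such a value and assemble a cookies set command from the flags listed in this section. It assumes --expires expects seconds, which the example value suggests but this document does not state outright; cookie_set_argv is a hypothetical helper name.

```python
from datetime import datetime, timezone


def cookie_set_argv(name, value, expires_at=None, domain=None,
                    secure=False, same_site=None):
    """Build an argv list for `browser-use cookies set` from the documented flags."""
    argv = ["browser-use", "cookies", "set", name, value]
    if domain:
        argv += ["--domain", domain]
    if secure:
        argv.append("--secure")
    if same_site is not None:
        if same_site not in ("Strict", "Lax", "None"):
            raise ValueError(f"unsupported SameSite value: {same_site!r}")
        argv += ["--same-site", same_site]
    if expires_at is not None:
        # Assumption: --expires takes whole seconds since the Unix epoch.
        argv += ["--expires", str(int(expires_at.timestamp()))]
    return argv


# A timezone-aware datetime avoids local-time surprises on remote machines.
new_year = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(cookie_set_argv("name", "val", expires_at=new_year, same_site="Strict"))
```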

Wait Conditions

browser-use wait selector "h1"

Wait for element

browser-use wait selector ".loading" --state hidden

Wait for element to disappear

browser-use wait text "Success"

Wait for text

browser-use wait selector "#btn" --timeout 5000

Custom timeout (ms)
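Since --timeout is given in milliseconds while most scripting code tracks seconds, a small wrapper can help keep the units straight. This is a sketch only: wait_selector_argv is a hypothetical helper, and it uses just the --state hidden and --timeout flags shown above.

```python
import shlex


def wait_selector_argv(selector, hidden=False, timeout_s=None):
    """Build an argv list for `browser-use wait selector`, converting seconds to ms."""
    argv = ["browser-use", "wait", "selector", selector]
    if hidden:
        argv += ["--state", "hidden"]
    if timeout_s is not None:
        # round() avoids off-by-one results from float arithmetic (e.g. 0.57 * 1000).
        argv += ["--timeout", str(round(timeout_s * 1000))]
    return argv


# shlex.join renders the argv as a shell-safe command line for logging.
print(shlex.join(wait_selector_argv("#btn", timeout_s=5)))
```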

Python Execution

browser-use python "x = 42"

Set variable

browser-use python "print(x)"

Access variable (prints: 42)

browser-use python "print(browser.url)"

Access browser object

browser-use python --vars

Show defined variables

browser-use python --reset

Clear namespace

browser-use python --file script.py

Run Python file
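A file passed to --file might look like the sketch below. It relies only on browser.url and browser.title, which this section lists as attributes of the session's browser object. That the object is also visible as a global inside a --file script is an assumption, and summarize is a hypothetical helper.

```python
def summarize(page):
    """Return a one-line summary built from the page's title and url attributes."""
    return f"{page.title} <{page.url}>"


# Assumption: the browser-use Python session injects `browser` as a global.
# The guard keeps the file importable (and testable) outside that session.
if "browser" in globals():
    print(summarize(browser))
```

Keeping the logic in a function that takes the page object as a parameter means it can be exercised with a stand-in object before it is run against a live session.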

The Python session maintains state across commands. The browser object provides:

browser.url, browser.title, browser.html: page info
browser.goto(url), browser.back(): navigation
browser.click(index), browser.type(text), browser.input(index, text), browser.keys(keys): interactions
browser.screenshot(path), browser.scroll(direction, amount): visual
browser.wait(seconds), browser.extract(query): utilities

Agent Tasks

browser-use run "Fill the contact form with test data"

AI agent

browser-use run "Extract all product prices" --max-steps 50

Specify LLM model

browser-use run "task" --llm gpt-4o

browser-use run "task" --llm claude-sonnet-4-20250514

Proxy configuration (default: us)

browser-use run "task" --proxy-country uk

Session reuse

browser-use run "task 1" --keep-alive

Keep session alive after task

browser-use run "task 2" --session-id abc-123

Reuse existing session

Execution modes

browser-use run "task" --flash

Fast execution mode

browser-use run "task" --wait

Wait for completion (default: async)

Advanced options

browser-use run "task" --thinking

Extended reasoning mode

browser-use run "task" --no-vision

Disable vision (enabled by default)

Using a cloud profile (create session first, then run with --session-id)

browser-use session create --profile <cloud-profile-id> --keep-alive

Returns: session_id

browser-use run "task" --session-id <session-id>

Task configuration

browser-use run "task" --start-url https://example.com

Start from specific URL

browser-use run "task" --allowed-domain example.com

browser-use run "task" --metadata key=value

Task metadata (repeatable)

browser-use run "task" --skill-id skill-123

Enable skills (repeatable)

browser-use run "task" --secret key=value

Secret metadata (repeatable)
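Repeatable flags such as --metadata and --secret map naturally onto dictionaries when the command is assembled in code. The sketch below shows one way to do that. run_argv is a hypothetical helper, and because this document does not say whether --allowed-domain may be repeated, the sketch accepts a single domain.

```python
def run_argv(task, start_url=None, allowed_domain=None, metadata=None, secrets=None):
    """Build an argv list for `browser-use run` with task configuration flags."""
    argv = ["browser-use", "run", task]
    if start_url:
        argv += ["--start-url", start_url]
    if allowed_domain:
        argv += ["--allowed-domain", allowed_domain]
    # Each dictionary entry becomes its own `--metadata key=value` pair.
    for key, value in (metadata or {}).items():
        argv += ["--metadata", f"{key}={value}"]
    # Secrets use the same key=value shape; avoid logging this argv verbatim.
    for key, value in (secrets or {}).items():
        argv += ["--secret", f"{key}={value}"]
    return argv


print(run_argv("task", start_url="https://example.com",
               metadata={"run": "nightly", "team": "qa"}))
```

Passing the resulting list to subprocess.run without shell=True keeps each key=value pair a single argument, so values containing spaces need no extra quoting.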

Structured output and evaluation

browser-use run "task" --structured-output '{"type":"object"}'

JSON schema for output
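Hand-writing a JSON schema inside shell quotes gets error-prone once the schema grows beyond {"type":"object"}. A possible approach, sketched below, is to keep the schema as a Python dictionary and serialize it with json.dumps. The helper name structured_output_argv and the example schema fields are hypothetical; which schema features the CLI accepts is not specified in this document.

```python
import json


def structured_output_argv(task, schema):
    """Build an argv list for `browser-use run` with a serialized JSON schema."""
    # Compact separators reproduce the spacing-free style used in the example above.
    schema_json = json.dumps(schema, separators=(",", ":"))
    return ["browser-use", "run", task, "--structured-output", schema_json]


# Hypothetical schema for the "Extract all product prices" task.
price_schema = {
    "type": "object",
    "properties": {"prices": {"type": "array", "items": {"type": "number"}}},
    "required": ["prices"],
}
argv = structured_output_argv("Extract all product prices", price_schema)
print(argv[-1])
```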

browser-use run "task" --judge

Enable judge mode

browser-use run "task" --judge-ground-truth "answer"

Task Management

browser-use task list

List recent tasks

browser-use task list --limit 20

Show more tasks

browser-use task list --status finished

Filter by status (finished, stopped)

browser-use task list --session <id>

Filter by session ID

browser-use task list --json

JSON output