opencli-browser

Installs: 2.5K
Rank: #1462

Install

npx skills add https://github.com/jackwener/opencli --skill opencli-browser

OpenCLI Browser: Browser Automation for AI Agents

Control Chrome step-by-step via CLI. Reuses existing login sessions, so no passwords are needed.

Prerequisites

opencli doctor

Verify extension + daemon connectivity

Requires: Chrome running + OpenCLI Browser Bridge extension installed.

Critical Rules

1. ALWAYS use state to inspect the page, NEVER use screenshot. state returns structured DOM with [N] element indices, is instant, and costs zero tokens. screenshot requires vision processing and is slow. Only use screenshot when the user explicitly asks to save a visual.
2. ALWAYS use click / type / select for interaction, NEVER use eval to click or type. eval "el.click()" bypasses scrollIntoView and the CDP click pipeline, causing failures on off-screen elements. Use state to find the [N] index, then click.
3. Verify inputs with get value, not screenshots. After type, run get value to confirm.
4. Run state after every page change. After open, click (on links), or scroll, always run state to see the new elements and their indices. Never guess indices.
5. Chain commands aggressively with &&. Combine open + state, multiple type calls, and type + get value into single && chains. Each tool call has overhead; chaining cuts it.
6. eval is read-only. Use eval ONLY for data extraction ( JSON.stringify(...) ), never for clicking, typing, or navigating. Always wrap in an IIFE to avoid variable conflicts: eval "(function(){ const x = ...; return JSON.stringify(x); })()".
7. Minimize total tool calls. Plan your sequence before acting. A good task completion uses 3-5 tool calls, not 15-20. Combine open + state as one call. Combine type + type + click as one call. Only run state separately when you need to discover new indices.
8. Prefer network to discover APIs. Most sites have JSON APIs, and API-based adapters are more reliable than DOM scraping.

Command Cost Guide

| Cost | Commands | When to use |
| --- | --- | --- |
| Free & instant | state, get *, eval, network, scroll, keys | Default: use these |
| Free but changes page | open, click, type, select, back | Interaction: run state after |
| Expensive (vision tokens) | screenshot | ONLY when user needs a saved image |

Action Chaining Rules

Commands can be chained with &&. The browser persists via daemon, so chaining is safe. Always chain when possible, since fewer tool calls mean faster completion:

GOOD: open + inspect in one call (saves 1 round trip)

opencli browser open https://example.com && opencli browser state

GOOD: fill form in one call (saves 2 round trips)

opencli browser type 3 "hello" && opencli browser type 4 "world" && opencli browser click 7

GOOD: type + verify in one call

opencli browser type 5 "test@example.com" && opencli browser get value 5

GOOD: click + wait + state in one call (for page-changing clicks)

opencli browser click 12 && opencli browser wait time 1 && opencli browser state

BAD: separate calls for each action (wasteful). Don't do this when you can chain all three together:

opencli browser type 3 "hello"

opencli browser type 4 "world"

opencli browser click 7

Page-changing commands (open, back, click): always put these last in a chain, because subsequent commands see stale indices.

Rule: chain when you already know the indices. Run state separately when you need to discover indices first.
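The chaining rules above can be sketched as a small string builder. This is a hypothetical helper, not part of opencli: the names buildChain, quoteArg, and usesIndex are illustrative, and it conservatively treats every click as page-changing. It only assembles a shell command line and refuses to place an index-based command after a page-changing one.

```javascript
// Hypothetical helper (not an opencli API): builds an `&&` chain as a string.
const PAGE_CHANGING = new Set(["open", "back", "click"]);
const INDEX_BASED = new Set(["click", "type", "select"]);

// True when the step addresses an element by its [N] index.
function usesIndex(step) {
  if (INDEX_BASED.has(step[0])) return true;
  return step[0] === "get" && ["text", "value", "attributes"].includes(step[1]);
}

// Leave simple tokens bare; single-quote everything else for POSIX shells.
function quoteArg(arg) {
  const s = String(arg);
  if (/^[A-Za-z0-9_./:=-]+$/.test(s)) return s;
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// steps: array of argument arrays, e.g. [["type", 3, "hello"], ["click", 7]]
function buildChain(steps) {
  let pageChanged = false;
  const parts = steps.map((step) => {
    if (pageChanged && usesIndex(step)) {
      throw new Error('"' + step[0] + '" would use stale indices; run state separately first');
    }
    if (PAGE_CHANGING.has(step[0])) pageChanged = true;
    return ["opencli", "browser", ...step].map(quoteArg).join(" ");
  });
  return parts.join(" && ");
}

console.log(buildChain([["open", "https://example.com"], ["state"]]));
// prints: opencli browser open https://example.com && opencli browser state
```

A chain such as click followed by type is rejected, which mirrors the rule that indices must be rediscovered with state after the page changes.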
Core Workflow

1. Navigate: opencli browser open
2. Inspect: opencli browser state → elements with [N] indices
3. Interact: use indices with click, type, select, keys
4. Wait (if needed): opencli browser wait selector ".loaded" or wait text "Success"
5. Verify: opencli browser state or opencli browser get value
6. Repeat: browser stays open between commands
7. Save: write a JS adapter to ~/.opencli/clis//.js

Commands

Navigation

opencli browser open <url>

Open URL (page-changing)

opencli browser back

Go back (page-changing)

opencli browser scroll down

Scroll (up/down, --amount N)

opencli browser scroll up --amount 1000

Inspect (free & instant)

opencli browser state

Structured DOM with [N] indices — PRIMARY tool

opencli browser screenshot [ path.png ]

Save visual to file — ONLY for user deliverables

Get (free & instant)

opencli browser get title

Page title

opencli browser get url

Current URL

opencli browser get text <index>

Element text content

opencli browser get value <index>

Input/textarea value (use to verify after type)

opencli browser get html

Full page HTML

opencli browser get html --selector "h1"

Scoped HTML

opencli browser get attributes <index>

Element attributes

Interact

opencli browser click <index>

Click element [N]

opencli browser type <index> "text"

Type into element [N]

opencli browser select <index> "option"

Select dropdown

opencli browser keys "Enter"

Press key (Enter, Escape, Tab, Control+a)

Wait

Three variants; use the right one for the situation:

opencli browser wait time 3

Wait N seconds (fixed delay)

opencli browser wait selector ".loaded"

Wait until element appears in DOM

opencli browser wait selector ".spinner" --timeout 5000

With timeout (default 30s)

opencli browser wait text "Success"

Wait until text appears on page

When to wait: after open on SPAs, after a click that triggers async loading, and before eval on dynamically rendered content.

Extract (free & instant, read-only)

Use eval ONLY for reading data. Never use it to click, type, or navigate.

opencli browser eval "document.title"

opencli browser eval "JSON.stringify([...document.querySelectorAll('h2')].map(e => e.textContent))"

IMPORTANT: wrap complex logic in IIFE to avoid "already declared" errors

opencli browser eval "(function(){ const items = [...document.querySelectorAll('.item')]; return JSON.stringify(items.map(e => e.textContent)); })()"

Selector safety: always use fallback selectors, because querySelector returns null on a miss:

BAD: crashes if selector misses

opencli browser eval "document.querySelector('.title').textContent"

GOOD: fallback with || or ?.

opencli browser eval "(document.querySelector('.title') || document.querySelector('h1') || {textContent:''}).textContent"

opencli browser eval "document.querySelector('.title')?.textContent ?? 'not found'"

Network (API Discovery)

opencli browser network

Show captured API requests (auto-captured since open)

opencli browser network --detail 3

Show full response body of request #3

opencli browser network --all

Include static resources
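Looking back at the Extract rules, the IIFE wrapping can be sketched as a string builder. The helpers wrapIife, shellQuote, and evalCommand are hypothetical names, not opencli functions; they only produce the text of a command. Single-quoting for a POSIX shell is an assumption made here so that extraction code containing either quote style survives the shell unchanged.

```javascript
// Hypothetical helpers (not opencli APIs): they build strings and run nothing.

// Wrap extraction logic in an IIFE so const/let names stay local to one eval call.
function wrapIife(body) {
  return "(function(){ " + body.trim() + " })()";
}

// Single-quote a string for a POSIX shell, escaping embedded single quotes.
function shellQuote(s) {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Assemble the full command line for a read-only extraction.
function evalCommand(body) {
  return "opencli browser eval " + shellQuote(wrapIife(body));
}

console.log(evalCommand("return document.title;"));
// prints: opencli browser eval '(function(){ return document.title; })()'
```

Keeping the extraction body as a plain string makes it easy to reuse the same logic later inside an adapter.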

Sedimentation (Save as CLI)

opencli browser init hn/top

Generate adapter scaffold at ~/.opencli/clis/hn/top.js

opencli browser verify hn/top

Test the adapter (adds --limit 3 only if limit arg is defined)

- init auto-detects the domain from the active browser session (no need to specify it)
- init creates the file and populates site, name, domain, and columns from the current page
- verify runs the adapter end-to-end and prints output; if no limit arg exists in the adapter, it won't pass --limit 3

Session

opencli browser close

Close automation window

Example: Extract HN Stories

opencli browser open https://news.ycombinator.com

opencli browser state

See [1] a "Story 1", [2] a "Story 2"...

opencli browser eval "JSON.stringify([...document.querySelectorAll('.titleline a')].slice(0,5).map(a => ({title: a.textContent, url: a.href})))"

opencli browser close

Example: Fill a Form

opencli browser open https://httpbin.org/forms/post

opencli browser state

See [3] input "Customer Name", [4] input "Telephone"

opencli browser type 3 "OpenCLI" && opencli browser type 4 "555-0100"

opencli browser get value 3

Verify: "OpenCLI"

opencli browser close

Saving as Reusable CLI: Complete Workflow

Step-by-step sedimentation flow:

1. Explore the website

opencli browser open https://news.ycombinator.com

opencli browser state

Understand DOM structure

2. Discover APIs (crucial for high-quality adapters)

opencli browser eval "fetch('/api/...').then(r=>r.json())"

Trigger API calls

opencli browser network

See captured API requests

opencli browser network --detail 0

Inspect response body

3. Generate scaffold

opencli browser init hn/top

Creates ~/.opencli/clis/hn/top.js

4. Edit the adapter (fill in func logic)

- If API found: use fetch() directly (Strategy.PUBLIC or COOKIE)

- If no API: use page.evaluate() for DOM extraction (Strategy.UI)

5. Verify

opencli browser verify hn/top

Runs the adapter and shows output

6. If verify fails, edit and retry

7. Close when done

opencli browser close

Example adapter:

    // ~/.opencli/clis/hn/top.js
    import { cli, Strategy } from '@jackwener/opencli/registry';

    cli({
      site: 'hn',
      name: 'top',
      description: 'Top Hacker News stories',
      domain: 'news.ycombinator.com',
      strategy: Strategy.PUBLIC,
      browser: false,
      args: [{ name: 'limit', type: 'int', default: 5 }],
      columns: ['rank', 'title', 'score', 'url'],
      func: async (_page, kwargs) => {
        // Clamp the requested limit to the range 1..50.
        const limit = Math.min(Math.max(1, kwargs.limit ?? 5), 50);
        const resp = await fetch('https://hacker-news.firebaseio.com/v0/topstories.json');
        const ids = await resp.json();
        // Fetch the first `limit` stories in parallel.
        return Promise.all(
          ids.slice(0, limit).map(async (id, i) => {
            const item = await (await fetch(`https://hacker-news.firebaseio.com/v0/item/${id}.json`)).json();
            return { rank: i + 1, title: item.title, score: item.score, url: item.url ?? '' };
          })
        );
      },
    });

Save to ~/.opencli/clis//.js → immediately available as opencli .

Strategy Guide

| Strategy | When | browser: |
| --- | --- | --- |
| Strategy.PUBLIC | Public API, no auth | false |
| Strategy.COOKIE | Needs login cookies | true |
| Strategy.UI | Direct DOM interaction | true |

Always prefer API over UI: if you discovered an API during browsing, use fetch() directly.

Tips

- Always state first: never guess element indices, always inspect first
- Sessions persist: the browser stays open between commands, so there is no need to re-open
- Use eval for data extraction: eval "JSON.stringify(...)" is faster than multiple get calls
- Use network to find APIs: JSON APIs are more reliable than DOM scraping
- Alias: opencli op is shorthand for opencli browser

Common Pitfalls

- form.submit() fails in automation: don't use form.submit() or eval to submit forms. Navigate directly to the search URL instead:

BAD: form.submit() often silently fails

opencli browser eval "document.querySelector('form').submit()"

GOOD: construct the URL and navigate

opencli browser open "https://github.com/search?q=opencli&type=repositories"

- GitHub DOM changes frequently: prefer data-testid attributes when available; they are more stable than class names or tag structure.
- SPA pages need wait before extraction: after open or click on single-page apps, the DOM isn't ready immediately. Always wait selector or wait text before eval.
- Use state before clicking: run opencli browser state to inspect available interactive elements and their indices. Never guess indices from memory.
- evaluate runs in browser context: page.evaluate() in adapters executes inside the browser. Node.js APIs ( fs , path , process ) are NOT available. Use fetch() for network calls and DOM APIs for page data.
- Backticks in page.evaluate break JSON storage: when writing adapters that will be stored or transported as JSON, avoid template literals inside page.evaluate. Use string concatenation or function-style evaluate:

    // BAD: template literal backticks break when adapter is in JSON
    page.evaluate(`document.querySelector("${selector}")`)

    // GOOD: function-style evaluate
    page.evaluate((sel) => document.querySelector(sel), selector)

Troubleshooting

| Error | Fix |
| --- | --- |
| "Browser not connected" | Run opencli doctor |
| "attach failed: chrome-extension://" | Disable 1Password temporarily |
| Element not found | opencli browser scroll down && opencli browser state |
| Stale indices after page change | Run opencli browser state again to get fresh indices |
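As a follow-up to the selector-safety and function-style evaluate pitfalls, the sketch below combines the two: a fallback lookup written as a plain function and passed to page.evaluate with its selectors as an argument. The names firstText and readTitle are hypothetical, and passing an array as the evaluate argument is an assumption; the source only shows a single string argument.

```javascript
// Runs inside the browser via page.evaluate, so it relies only on DOM APIs
// and its own argument (no Node.js modules, no outer-scope variables).
function firstText(selectors) {
  for (const sel of selectors) {
    const el = document.querySelector(sel);
    if (el) return (el.textContent || "").trim();
  }
  return ""; // every selector missed
}

// Hypothetical adapter-side wrapper; `page` is the object an adapter's func receives.
// Assumption: page.evaluate(fn, arg) forwards `arg` to `fn`, as in the
// function-style example above.
async function readTitle(page) {
  return page.evaluate(firstText, [".title", "h1"]);
}
```

Because no template literal is involved, the function body contains no backticks and no interpolated selector strings, which sidesteps the JSON storage problem described in the pitfalls.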
