agentic-browser

Installs: 34
Rank: #19975

Install

npx skills add https://github.com/inference-sh/skills --skill agentic-browser

Agentic Browser

Browser automation for AI agents via inference.sh. Uses Playwright under the hood with a simple @e ref system for element interaction.

Quick Start

Install CLI

curl -fsSL https://cli.inference.sh | sh && infsh login

Open a page and get interactive elements

infsh app run agentic-browser --function open --input '{"url": "https://example.com"}' --session new

Core Workflow

Every browser automation follows this pattern:

1. Open - Navigate to URL, get @e refs for elements
2. Interact - Use refs to click, fill, drag, etc.
3. Re-snapshot - After navigation/changes, get fresh refs
4. Close - End session (returns video if recording)

1. Start session

RESULT=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com/login" }')
SESSION_ID=$(echo "$RESULT" | jq -r '.session_id')

Elements: @e1 [input] "Email", @e2 [input] "Password", @e3 [button] "Sign In"

2. Fill and submit

infsh app run agentic-browser --function interact --session $SESSION_ID --input '{ "action": "fill", "ref": "@e1", "text": "user@example.com" }'
infsh app run agentic-browser --function interact --session $SESSION_ID --input '{ "action": "fill", "ref": "@e2", "text": "password123" }'
infsh app run agentic-browser --function interact --session $SESSION_ID --input '{ "action": "click", "ref": "@e3" }'

3. Re-snapshot after navigation

infsh app run agentic-browser --function snapshot --session $SESSION_ID --input '{}'

4. Close when done

infsh app run agentic-browser --function close --session $SESSION_ID --input '{}'

Functions

| Function | Description |
|----------|-------------|
| open | Navigate to URL, configure browser (viewport, proxy, video recording) |
| snapshot | Re-fetch page state with @e refs after DOM changes |
| interact | Perform actions using @e refs (click, fill, drag, upload, etc.) |
| screenshot | Take page screenshot (viewport or full page) |
| execute | Run JavaScript code on the page |
| close | Close session, returns video if recording was enabled |

Interact Actions

| Action | Description | Required Fields |
|--------|-------------|-----------------|
| click | Click element | ref |
| dblclick | Double-click element | ref |
| fill | Clear and type text | ref, text |
| type | Type text (no clear) | text |
| press | Press key (Enter, Tab, etc.) | text |
| select | Select dropdown option | ref, text |
| hover | Hover over element | ref |
| check | Check checkbox | ref |
| uncheck | Uncheck checkbox | ref |
| drag | Drag and drop | ref, target_ref |
| upload | Upload file(s) | ref, file_paths |
| scroll | Scroll page | direction (up/down/left/right), scroll_amount |
| back | Go back in history | - |
| wait | Wait milliseconds | wait_ms |
| goto | Navigate to URL | url |

Element Refs

Elements are returned with @e refs:

@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
@e5 [input type="checkbox"] name="agree"

Important: Refs are invalidated after navigation. Always re-snapshot after:

- Clicking links/buttons that navigate
- Form submissions
- Dynamic content loading

Features

Video Recording

Record browser sessions for debugging or documentation:

Start with recording enabled (optionally show cursor indicator)

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "record_video": true, "show_cursor": true }' | jq -r '.session_id')

... perform actions ...

Close to get the video file

infsh app run agentic-browser --function close --session $SESSION --input '{}'

Returns: the close result, including the recorded video in its video field.

Cursor Indicator

Show a visible cursor in screenshots and video (useful for demos):

infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "show_cursor": true, "record_video": true }'

The cursor appears as a red dot that follows mouse movements and shows click feedback.

Proxy Support

Route traffic through a proxy server:

infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "proxy_url": "http://proxy.example.com:8080", "proxy_username": "user", "proxy_password": "pass" }'

File Upload

Upload files to file inputs:

infsh app run agentic-browser --function interact --session $SESSION --input '{ "action": "upload", "ref": "@e5", "file_paths": ["/path/to/file.pdf"] }'

Drag and Drop

Drag elements to targets:

infsh app run agentic-browser --function interact --session $SESSION --input '{ "action": "drag", "ref": "@e1", "target_ref": "@e2" }'

JavaScript Execution

Run custom JavaScript:

infsh app run agentic-browser --function execute --session $SESSION --input '{ "code": "document.querySelectorAll(\"h2\").length" }'

Returns:

Deep-Dive Documentation

| Reference | Description |
|-----------|-------------|
| references/commands.md | Full function reference with all options |
| references/snapshot-refs.md | Ref lifecycle, invalidation rules, troubleshooting |
| references/session-management.md | Session persistence, parallel sessions |
| references/authentication.md | Login flows, OAuth, 2FA handling |
| references/video-recording.md | Recording workflows for debugging |
| references/proxy-support.md | Proxy configuration, geo-testing |

Ready-to-Use Templates

| Template | Description |
|----------|-------------|
| templates/form-automation.sh | Form filling with validation |
| templates/authenticated-session.sh | Login once, reuse session |
| templates/capture-workflow.sh | Content extraction with screenshots |

Examples

Form Submission

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com/contact" }' | jq -r '.session_id')

Get elements: @e1 [input] "Name", @e2 [input] "Email", @e3 [textarea], @e4 [button] "Send"

infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "John Doe"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e2", "text": "john@example.com"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e3", "text": "Hello!"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "click", "ref": "@e4"}'
infsh app run agentic-browser --function snapshot --session $SESSION --input '{}'
infsh app run agentic-browser --function close --session $SESSION --input '{}'

Search and Extract

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://google.com" }' | jq -r '.session_id')
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "fill", "ref": "@e1", "text": "weather today"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "press", "text": "Enter"}'
infsh app run agentic-browser --function interact --session $SESSION --input '{"action": "wait", "wait_ms": 2000}'
infsh app run agentic-browser --function snapshot --session $SESSION --input '{}'
infsh app run agentic-browser --function close --session $SESSION --input '{}'

Screenshot with Video

SESSION=$(infsh app run agentic-browser --function open --session new --input '{ "url": "https://example.com", "record_video": true }' | jq -r '.session_id')

Take full page screenshot

infsh app run agentic-browser --function screenshot --session $SESSION --input '{ "full_page": true }'

Close and get video

RESULT=$(infsh app run agentic-browser --function close --session $SESSION --input '{}')
echo "$RESULT" | jq '.video'

Sessions

Browser state persists within a session. Always:

1. Start with --session new on first call
2. Use returned session_id for subsequent calls
3. Close session when done
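The "close session when done" rule is easy to miss when a step in the middle of a script fails. A minimal sketch of one way to enforce it, assuming `jq` is installed and that `open` returns JSON with a `session_id` field as in the examples above: run the steps inside a subshell whose EXIT trap closes the session. The helper names `open_session`, `close_session`, and `with_session` are hypothetical and not part of the infsh CLI.

```shell
# Open a new session for a URL and print its session_id.
# (open_session is a hypothetical helper name.)
open_session() {
  infsh app run agentic-browser --function open --session new \
    --input "$(jq -cn --arg url "$1" '{url: $url}')" | jq -r '.session_id'
}

# Close the session stored in $SESSION, if any.
# (close_session is a hypothetical helper name.)
close_session() {
  [ -n "${SESSION:-}" ] || return 0
  infsh app run agentic-browser --function close --session "$SESSION" --input '{}'
  SESSION=""
}

# Run a command or shell function with an open session, then always close it.
# $1 = URL, remaining arguments = the command to run; it can read $SESSION.
# The body runs in a subshell, so the EXIT trap fires when the body ends.
with_session() (
  url=$1
  shift
  SESSION=$(open_session "$url") || exit 1
  trap close_session EXIT
  "$@"
  exit $?
)
```

A script would then define its steps as a function, for example `steps() { infsh app run agentic-browser --function snapshot --session "$SESSION" --input '{}'; }`, and call `with_session "https://example.com" steps`. Because the body is a subshell, variables set inside `steps` are not visible to the caller; anything needed afterwards should be printed or written to a file.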
