Active Research
Analyze any topic, domain, or paper and generate a beautiful HTML report using Actionbook Browser, featuring SPA-aware navigation, network idle detection, batch operations, and intelligent page analysis.
Enhanced Browser Capabilities
| Capability | Description |
| --- | --- |
| Page load wait | `wait-idle`: monitors fetch/XHR until network settles |
| SPA content | `wait-fn`: wait for JS conditions before extracting |
| Page understanding | `snapshot --filter interactive --max-tokens N`: focused, budget-friendly |
| Popups blocking | `--auto-dismiss-dialogs`: auto-handle alert/confirm/prompt |
| Load speed | `--block-images`: skip images for faster text extraction |
| Page stability | `--no-animations`: freeze CSS transitions |
| Error detection | `console --level error`: check for page issues |
| Multi-step forms | `batch`: execute multiple actions in one call |
| Element debugging | `info` |
PREFERRED: One-shot fetch (I1) — handles open+wait+extract+close automatically
actionbook --block-images --rewrite-urls browser fetch
"
For interactive multi-step workflows, use explicit open:
actionbook --block-images --auto-dismiss-dialogs --no-animations --rewrite-urls browser
open
"
Single command: navigate → wait (domain-aware) → extract → close
actionbook --block-images --rewrite-urls browser fetch
"
For static pages (Wikipedia, docs, blogs), add --lite to skip browser entirely:
actionbook --rewrite-urls browser fetch
"
For accessibility tree:
actionbook --block-images --rewrite-urls browser fetch
"
Step 1: Navigate
actionbook browser
open
"
or: goto, click a link
Step 2: Wait for load (MANDATORY in v2)
actionbook browser wait-idle
Wait for fetch/XHR to settle
Step 3: Extract content
actionbook browser text [ selector ]
Extract text
OR
actionbook browser snapshot --filter interactive --max-tokens 500
Understand page structure
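The three manual steps above can be wrapped in one helper. This is a minimal sketch, not part of the Actionbook CLI: `ab` and `read_page` are hypothetical names, and the sketch assumes each `actionbook` subcommand exits non-zero on failure.

```shell
# Thin wrapper around the CLI ("ab" is a hypothetical name, handy for stubbing).
ab() { actionbook "$@"; }

# read_page URL [SELECTOR]
# Navigate, wait for the network to settle, then print the page text.
read_page() {
  url=$1
  selector=${2:-}
  ab browser open "$url" || return 1
  # Step 2 is mandatory: without it, text may come back empty on SPAs.
  ab browser wait-idle || return 1
  if [ -n "$selector" ]; then
    ab browser text "$selector"
  else
    ab browser text
  fi
}
```

Usage would look like `read_page "https://arxiv.org/search/?query=Rust&searchtype=all" "#main-container"`; the selector argument is optional.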
Why wait-idle is critical:
- SPAs (React, Vue, Next.js) load content via fetch/XHR after the initial HTML
- Without waiting, text returns empty or incomplete content
- wait-idle monitors all pending network requests and waits until the network has been quiet for 500ms

For pages that load content dynamically after the network settles:
actionbook browser wait-idle
actionbook browser wait-fn "document.querySelector('.results')"
Wait for specific element
actionbook browser text ".results"

Complete Workflow

REMINDER: Every web access in this workflow MUST use actionbook browser commands. Using curl, wget, python requests, or any other HTTP tool is strictly forbidden. The bash tool should ONLY be used for actionbook CLI commands and local file operations (json-ui render, open).

Step 1: Plan Search Strategy

Based on the topic, generate 5-8 search queries from different angles:
- Core definition / overview
- Latest developments / news
- Technical details / implementation
- Comparisons / alternatives
- Expert opinions / analysis
- Use cases / applications

Search order: ALWAYS query the Actionbook API first, then search:

| Step | Action | Why |
| --- | --- | --- |
| Step 2 (FIRST) | Query Actionbook API | Get verified selectors for arXiv, ar5iv, and other known sites BEFORE browsing. |
| Step 3 (SECOND) | arXiv Advanced Search | Use Actionbook selectors for multi-field, filtered academic search. |
| Step 4 (THIRD) | Google / Bing search | Supplement with blogs, news, code, discussions, non-academic sources. |

Step 2: Query Actionbook API for Selectors (ALWAYS DO THIS FIRST)

BEFORE browsing any URL, query Actionbook's indexed selectors.
Search for indexed actions by domain
actionbook search
"
Get detailed selectors for a specific page
actionbook get ":/ :"

Pre-indexed sites useful for research:

| Site | area_id | Key Selectors |
| --- | --- | --- |
| arXiv Advanced Search | `arxiv.org:/search/advanced:default` | 40+ selectors: field select, term input, category checkboxes, date range filters |
| ar5iv paper | `ar5iv.labs.arxiv.org:/html/{paper_id}:default` | `h1.ltx_title_document`, `div.ltx_authors`, `div.ltx_abstract`, `section.ltx_section` |
| Google Scholar | `scholar.google.com:/:default` | `#gs_hdr_tsi` (search), `#gs_hdr_tsb` (submit) |
| arXiv homepage | `arxiv.org:/:default` | Global search across 2.4M+ articles |

For any URL you plan to visit, run actionbook search "
Simple keyword search
actionbook --block-images --auto-dismiss-dialogs --no-animations browser open "https://arxiv.org/search/?query=large+language+model+agent&searchtype=all"
actionbook browser wait-idle
actionbook browser text "#main-container"

Advanced URL search with filters
searchtype: all, title, author, abstract
start: result offset (0, 50, 100, ...)
actionbook browser open "https://arxiv.org/search/?query=Rust+machine+learning&searchtype=all&start=0"
actionbook browser wait-idle
actionbook browser text "#main-container"

Search strategy: start broad, then narrow:
- First search: broad terms (e.g., "Rust" "machine learning"); aim for 50+ results
- If too few results (< 10): broaden further, remove date/category filters
- If too many results (> 200): add more specific terms, use searchtype=title
- Try 2-3 different query angles (e.g., framework names, use cases, benchmarks)

Option B: Form interaction via batch (BACKUP; use if URL search is insufficient):

Open arXiv with research flags
actionbook --block-images --auto-dismiss-dialogs --no-animations browser open "https://arxiv.org/search/advanced"
actionbook browser wait-idle

Use batch for the form: fewer round-trips, more reliable
cat << 'EOF' | actionbook browser batch --delay 150
{
  "actions": [
    {"kind": "click", "selector": "#terms-0-field"},
    {"kind": "click", "selector": "option[value='title']"},
    {"kind": "type", "selector": "#terms-0-term", "text": "large language model agent"},
    {"kind": "click", "selector": "#classification-computer_science"},
    {"kind": "click", "selector": "#date-filter_by-3"},
    {"kind": "type", "selector": "#date-from_date", "text": "2025-01-01"},
    {"kind": "type", "selector": "#date-to_date", "text": "2026-02-23"},
    {"kind": "click", "selector": "button:has-text('Search'):nth(2)"}
  ],
  "stopOnError": true
}
EOF
actionbook browser wait-idle
actionbook browser text "#main-container"
If batch form submission fails (page shows form again instead of results):
→ Fall back to Option A URL-based search immediately
→ Do NOT retry the form — it wastes time
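The URL-based search and the broaden/narrow thresholds described earlier can be captured in two small helpers. This is a sketch under assumptions: `arxiv_search_url` and `next_move` are hypothetical names, only spaces are encoded (as `+`), and the result count has to be read from the page by other means.

```shell
# arxiv_search_url QUERY [SEARCHTYPE] [START]
# Builds the search URL used in the URL-based search examples.
arxiv_search_url() {
  query=$1
  searchtype=${2:-all}   # all, title, author, abstract
  start=${3:-0}          # result offset: 0, 50, 100, ...
  # Minimal encoding: spaces become '+'. Other reserved characters are not handled.
  encoded=$(printf '%s' "$query" | tr ' ' '+')
  printf 'https://arxiv.org/search/?query=%s&searchtype=%s&start=%s\n' \
    "$encoded" "$searchtype" "$start"
}

# next_move RESULT_COUNT
# Applies the thresholds from the search strategy: < 10 broaden, > 200 narrow.
next_move() {
  count=$1
  if [ "$count" -lt 10 ]; then
    echo broaden
  elif [ "$count" -gt 200 ]; then
    echo narrow
  else
    echo keep
  fi
}
```

For example, `arxiv_search_url "Rust machine learning"` reproduces the URL in the advanced search example, and `next_move 250` suggests narrowing (for instance with `searchtype=title`).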
arXiv search capabilities (from indexed selectors, for Option B):

| Capability | Selector |
| --- | --- |
| Search field (Title/Author/Abstract) | `#terms-0-field` select |
| Search term | `#terms-0-term` input |
| Add boolean terms | button "Add another term +" |
| Filter: Computer Science | `#classification-computer_science` |
| Filter: Physics, Math, etc. | `#classification-physics`, `#classification-mathematics` |
| Date: past 12 months | `#date-filter_by-1` radio |
| Date: specific year | `#date-filter_by-2` radio + `#date-year` |
| Date: custom range | `#date-filter_by-3` radio + `#date-from_date` / `#date-to_date` |
| Show abstracts | `#abstracts-0` radio |

Step 4: Supplement with Google / Bing Search
Search via Google (with wait-idle for SPA results)
actionbook browser
open
"https://www.google.com/search?q=
Or search via Bing
actionbook browser
open
"https://www.bing.com/search?q=
Check if the page is a 404 or error page
actionbook browser wait-fn "!document.title.includes('404') && !document.title.includes('Not Found')" --timeout 3000
If timeout → page is dead, skip immediately. Do NOT retry.
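The dead-link rule above (detect fast, skip, never retry) could be wrapped as follows. This is a sketch, not Actionbook's own API: `ab`, `page_alive`, and `visit_or_skip` are hypothetical names, and it assumes `wait-fn` exits non-zero when its timeout expires.

```shell
# Thin wrapper around the CLI ("ab" is a hypothetical name, handy for stubbing).
ab() { actionbook "$@"; }

# page_alive
# Succeeds if the title does not look like an error page within 3 seconds.
# Assumption: wait-fn returns a non-zero status on timeout.
page_alive() {
  ab browser wait-fn "!document.title.includes('404') && !document.title.includes('Not Found')" --timeout 3000
}

# visit_or_skip URL
# Prints "read" if the page is usable, "skip" otherwise. Never retries.
visit_or_skip() {
  ab browser open "$1" || { echo skip; return 0; }
  ab browser wait-idle || { echo skip; return 0; }
  if page_alive; then
    echo read
  else
    echo skip
  fi
}
```

A caller would move on to the next source when `visit_or_skip` prints `skip`, keeping any snippet text already collected for that URL.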
Salvage info from Google snippets. If a URL is dead but the Google snippet had useful info:
- The snippet text you already extracted IS valid data
- Use it in the report with a note that the source is no longer available
- Search for the same content on alternative sites (archive.org, cached versions)

Use 4+ diverse search queries. Don't rely on one search angle:
- Query 1: Core topic overview (e.g., "Rust AI ecosystem 2026")
- Query 2: Specific frameworks/tools (e.g., "Candle vs Burn Rust ML framework")
- Query 3: Use cases/benchmarks (e.g., "Rust LLM inference performance benchmark")
- Query 4: Recent news/developments (e.g., "Rust machine learning latest 2026")
- Query 5: Community/ecosystem (e.g., "Rust AI agent framework comparison")

Step 5: Deep Read Sources

PREFERRED: Use browser fetch for one-shot page extraction (handles wait + extract + cleanup):
Quick text extraction (most common)
actionbook --block-images --rewrite-urls browser fetch
"
Static pages (Wikipedia, docs, blogs) — skip browser entirely
actionbook --rewrite-urls browser fetch
"
Page structure analysis
actionbook --block-images --rewrite-urls browser fetch
"
With token budget for LLM context management
actionbook --block-images --rewrite-urls browser fetch
"
MANDATORY: wait for network
actionbook browser text
Full page text (fallback)
actionbook browser text
"
Use Actionbook selector if indexed
If page content seems incomplete, debug:
Check for JS errors that might block rendering
actionbook browser console --level error
Check if a specific element exists
actionbook browser wait-fn "document.querySelector('.content')" --timeout 5000
Inspect element properties
actionbook browser info ".content"

For arXiv papers, try sources in this order:
1. arXiv abstract (most reliable) — use fetch
actionbook --block-images browser fetch
"https://arxiv.org/abs/
2. HuggingFace papers page
actionbook --block-images browser fetch
"https://huggingface.co/papers/
3. ar5iv HTML (structured, but fails on new papers) — use --lite for static HTML
actionbook browser fetch
"https://ar5iv.org/html/
NOTE: if content too short, ar5iv didn't render. Fall back.
4. GitHub repo (from search results) — use fetch
actionbook --block-images browser fetch
"
Method 1: Monorepo absolute path (most reliable if inside actionbook project)
node "$(git rev-parse --show-toplevel)/packages/json-ui/dist/cli.js" render /absolute/path/to/report.json -o /absolute/path/to/report.html
Method 2: Global install (if user ran: cd packages/json-ui && npm link)
json-ui render /absolute/path/to/report.json -o /absolute/path/to/report.html
Method 3: npx (if published to npm)
npx @actionbookdev/json-ui render /absolute/path/to/report.json -o /absolute/path/to/report.html
NEVER give up silently.
If all methods fail, tell the user:
The JSON report is saved at
macOS
open report.html
Linux
xdg-open report.html
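The three render methods and the "never give up silently" rule can be chained in one function. This is a sketch: `render_report` is a hypothetical name, and the availability checks (`command -v`, the `cli.js` file test) are assumptions about how to detect each method, not something the json-ui docs prescribe.

```shell
# render_report JSON_PATH HTML_PATH
# Tries the monorepo build, then a global install, then npx.
render_report() {
  json=$1
  html=$2

  # Method 1: monorepo checkout with a built CLI.
  if root=$(git rev-parse --show-toplevel 2>/dev/null) &&
     [ -f "$root/packages/json-ui/dist/cli.js" ]; then
    node "$root/packages/json-ui/dist/cli.js" render "$json" -o "$html" && return 0
  fi

  # Method 2: globally linked json-ui.
  if command -v json-ui >/dev/null 2>&1; then
    json-ui render "$json" -o "$html" && return 0
  fi

  # Method 3: npx (only works if the package is published).
  if command -v npx >/dev/null 2>&1; then
    npx @actionbookdev/json-ui render "$json" -o "$html" && return 0
  fi

  # Never fail silently: say where the JSON is so the user can render it later.
  echo "Could not render the report. The JSON report is saved at $json" >&2
  return 1
}
```

Both arguments should be absolute paths, as in the commands above.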
Step 10: Close Browser

Always close the browser when done:
actionbook browser close

Error Recovery Patterns

Intelligent error recovery using advanced browser capabilities:

Pattern: Page Load Failure
1. Open page
actionbook browser
open
"
2. Check for JS errors
actionbook browser console --level error
If errors found → page is broken, skip to next source
3. Check if content rendered
actionbook browser wait-fn "document.body.innerText.length > 100" --timeout 5000
If timeout → content didn't render, try fallback
Pattern: Selector Not Found
1. Use snapshot to discover actual page structure
actionbook browser snapshot --filter interactive --max-tokens 800
2. Or inspect a specific area
actionbook browser info
"
Returns: suggested selectors, visibility, tag info
3. Adjust selector and retry
Pattern: Anti-Bot Detection
1. If initial load returns CAPTCHA or access denied:
actionbook browser close
2. Reopen with stealth
actionbook --stealth --no-animations --auto-dismiss-dialogs browser
open
"
3. If still blocked, rotate fingerprint
actionbook browser fingerprint rotate --os windows
actionbook browser
open
"
1. Wait for network
actionbook browser wait-idle --idle-time 1000 --timeout 15000
2. Wait for specific element
actionbook browser wait-fn "document.querySelector('.results')" --timeout 10000
3. If still empty, check console
actionbook browser console --level error
4. Try clicking a loading trigger
actionbook browser snapshot --filter interactive --max-tokens 300
Look for "Load More", "Show Results", etc.
Full Error Handling Reference
| Error | Recovery Strategy |
| --- | --- |
| Browser fails to open | `actionbook browser status`, retry + check `console --level error` |
| Page load timeout | `wait-idle --timeout 15000`, then `console --level error` to diagnose |
| URL returns 404 | `wait-fn "!document.title.includes('404')"` to detect fast. Skip immediately, do NOT retry. Use Google snippet text as backup data. |
| arXiv form submission fails | Fall back to URL-based search: `arxiv.org/search/?query=...&searchtype=all` |
| ar5iv content truncated | Fall back to arxiv abstract + `wait-fn "document.body.innerText.length > 5000"` to verify |
| Selector not found | `snapshot --filter interactive` to discover actual structure |
| Dynamic content missing | `wait-idle` + `wait-fn` for specific conditions |
| Alert popup blocking | `--auto-dismiss-dialogs` prevents this entirely |
| Anti-bot detection | `--stealth` + `fingerprint rotate` |
| Slow media-heavy page | `--block-images` or `--block-media` for 2-5x speedup |
| CSS animation interference | `--no-animations` freezes all transitions |
| json-ui render crash | Check MetricsGrid: `suffix` / `value` must be plain strings |
| `npx json-ui` 404 | Try all 3 methods (monorepo, global, npx) |
| No search results | Start broad (50+ results), then narrow. Use 4+ query angles. |
IMPORTANT: Always run actionbook browser close before finishing, even on errors.
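One way to guarantee the close call even when a step fails is a shell `trap` on `EXIT`. This is a sketch: `ab` and `run_research` are hypothetical names, and the workflow is run in a subshell so the trap fires as soon as that workflow ends.

```shell
# Thin wrapper around the CLI ("ab" is a hypothetical name, handy for stubbing).
ab() { actionbook "$@"; }

# run_research COMMAND [ARGS...]
# Runs the given command in a subshell and closes the browser when the
# subshell exits, whether the command succeeded or failed.
run_research() (
  trap 'ab browser close' EXIT
  "$@"
)
```

A research function of your own would be passed as the argument, for example `run_research my_workflow "Rust AI ecosystem"` (where `my_workflow` is a placeholder for your own function); the exit status of the workflow is passed through.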
Feature Usage Checklist
Before finalizing research, verify you used these capabilities:
| Feature | When to Use | Check |
| --- | --- | --- |
| `browser fetch` | Read-only page extraction (preferred over open+wait+text) | Use for most page reads |
| `--lite` | Static pages (Wikipedia, docs, blogs): skip browser entirely | Add to `fetch` for static sites |
| `--rewrite-urls` | Always (avoids anti-bot on x.com, reddit) | Set in initial browser launch |
| `--wait-hint` | Domain-aware wait tuning (fast/slow/heavy) | Use with `fetch` or manual flow |
| `--session-tag` | Multi-step operations needing log correlation | Set for debugging sessions |
| `wait-idle` | After EVERY `open` / `goto` / `click` that triggers navigation | Must be used on every page |
| `--block-images` | Always (research doesn't need images) | Set in initial browser launch |
| `--auto-dismiss-dialogs` | Always (prevents blocking) | Set in initial browser launch |
| `--no-animations` | Always (stable snapshots) | Set in initial browser launch |
| `wait-fn` | When content loads asynchronously after network settles | Use on SPAs, dynamic pages |
| `console --level error` | When page content seems incomplete or broken | Use for debugging |
| `batch` | When filling multi-step forms (arXiv, Google Scholar) | Replaces 5+ sequential commands |
| `snapshot --filter interactive` | When discovering unknown page structure | Use on unindexed sites |
| `info` | | |
If wait-fn times out → content didn't render, fall back to other sources
Recommended Source Priority

| Priority | Source | What you get | Reliability |
| --- | --- | --- | --- |
| 1 | arxiv.org/abs/ | Abstract, metadata, submission history | Very high |
| 2 | huggingface.co/papers/ | Abstract, community, related models | Very high |
| 3 | GitHub repo | README, code, model zoo | High |
| 4 | HuggingFace model card | Training recipe, benchmarks | High |
| 5 | ar5iv.org/html/ | Full paper HTML | Medium |
| 6 | Google Scholar / Semantic Scholar | Citations, related work | Medium |
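For the sources in the table whose URL can be derived from a paper ID, the priority order might be sketched like this. Assumptions are marked: `paper_sources` and `looks_rendered` are hypothetical names, the ID is assumed to go at the end of each URL, and the 5000-character threshold is borrowed from the ar5iv row of the error handling reference. GitHub repos and model cards come from search results, so they are not generated here.

```shell
# paper_sources PAPER_ID
# Prints candidate URLs in priority order (assumption: the ID is appended
# directly after the path prefixes listed in the table).
paper_sources() {
  id=$1
  printf '%s\n' \
    "https://arxiv.org/abs/$id" \
    "https://huggingface.co/papers/$id" \
    "https://ar5iv.org/html/$id"
}

# looks_rendered TEXT
# Succeeds if the extracted text is longer than 5000 characters, the same
# check the error table suggests for truncated ar5iv pages.
looks_rendered() {
  [ "${#1}" -gt 5000 ]
}
```

A caller would fetch each URL in turn and fall back to the next one when `looks_rendered` fails on the extracted text.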
Other Academic Sources
- Google Scholar (scholar.google.com): Actionbook indexed
- Semantic Scholar (semanticscholar.org)
- Papers With Code (paperswithcode.com)
- Conference proceedings sites
Quality Guidelines
- Breadth: Research from at least 3-5 diverse sources
- Depth: Read full articles, not just snippets
- Accuracy: Cross-reference facts across sources
- Structure: Use appropriate json-ui components for each content type
- Attribution: Always include source links in the report
- Freshness: Prefer recent sources when relevance is equal