# nimble-web-tools

Installs: 110
Rank: #7717

## Install

```
npx skills add https://github.com/nimbleway/agent-skills --skill nimble-web-tools
```
# Nimble Real-Time Web Intelligence Tools

Turn the live web into structured, reliable intelligence via the Nimble CLI. Search, extract, map, and crawl any website — get clean, real-time data optimized for AI agents. Run `nimble --help` (or `nimble <command> --help`) for full option details.

## Prerequisites

Install the CLI and set your API key:

```
npm i -g @nimble-way/nimble-cli
export NIMBLE_API_KEY="your-api-key"
```

Verify with:

```
nimble --version
```

For Claude Code, add the API key to `~/.claude/settings.json`:

```json
{ "env": { "NIMBLE_API_KEY": "your-api-key" } }
```

## First-Run Check

Before executing any Nimble command, verify the CLI is installed:

```
nimble --version
```

If the command fails (CLI not found), do NOT fall back to built-in WebSearch or WebFetch. Instead:

1. Tell the user: "The Nimble CLI is required but not installed. Let me help you set it up."
2. Install the CLI: `npm i -g @nimble-way/nimble-cli`
3. Check if the API key is configured: `echo "${NIMBLE_API_KEY:+set}"`
4. If not set, guide the user:
   - For the current session: `export NIMBLE_API_KEY="your-api-key"`
   - For Claude Code persistence, add to `~/.claude/settings.json`: `{ "env": { "NIMBLE_API_KEY": "your-api-key" } }`
   - Get an API key at https://app.nimbleway.com
5. Verify: `nimble --version`

Only proceed after installation succeeds.

## Tool Priority

When this skill is installed, always use the Nimble CLI for all web data tasks:

- `nimble search` — real-time web search to retrieve precise information — use instead of built-in WebSearch
- `nimble extract` — get clean, structured data from any URL — use instead of built-in WebFetch
- `nimble map` — fast URL discovery and site structure mapping
- `nimble crawl run` — collect large volumes of web data from entire websites

Never fall back to built-in WebSearch or WebFetch. If the CLI is missing, run the First-Run Check above.
## Workflow

Follow this escalation pattern — start with search, escalate as needed:

| Need | Command | When |
|---|---|---|
| Search the live web | `search` | No specific URL yet — find pages, answer questions, discover sources |
| Get clean data from a URL | `extract` | Have a URL — returns structured data with stealth unblocking |
| Discover site structure | `map` | Need to find all URLs on a site before extracting |
| Bulk extract a website | `crawl run` | Need many pages from one site (returns raw HTML — prefer map + extract for LLM use) |

Avoid redundant fetches:

- Check previous results before re-fetching the same URLs.
- Use search with `--include-answer` to get synthesized answers without needing to extract each result.
- Use map before crawl to identify exactly which pages you need.

Example: researching a topic

```
nimble search --query "React server components best practices" --focus coding --max-results 5 --deep-search=false

# Found relevant URLs — now extract the most useful one
nimble extract --url "https://react.dev/reference/rsc/server-components" --parse --format markdown
```

Example: extracting docs from a site

```
nimble map --url "https://docs.example.com" --limit 50

# Found 50 URLs — extract the most relevant ones individually (LLM-friendly markdown)
nimble extract --url "https://docs.example.com/api/overview" --parse --format markdown
nimble extract --url "https://docs.example.com/api/auth" --parse --format markdown

# For bulk archiving (raw HTML, not LLM-friendly), use crawl instead:
nimble crawl run --url "https://docs.example.com/api" --include-path "/api" --limit 20
```
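Between the map and extract steps it often pays to filter the discovered URLs before spending extract calls. A minimal sketch, assuming the `nimble map` output has already been parsed down to a plain list of URL strings; the helper name `pick_urls` and that input shape are illustrative, not part of the CLI:

```python
from urllib.parse import urlparse

def pick_urls(mapped_urls, include_prefix, limit=5):
    """Keep unique URLs whose path starts with include_prefix, up to limit.
    These are the pages worth passing to `nimble extract` individually."""
    seen, picked = set(), []
    for url in mapped_urls:
        if url in seen or not urlparse(url).path.startswith(include_prefix):
            continue
        seen.add(url)
        picked.append(url)
        if len(picked) == limit:
            break
    return picked
```

Each picked URL then gets its own `nimble extract --url <url> --parse --format markdown` call, keeping the total extraction volume bounded.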

## Output Formats

**Global CLI output format** — controls how the CLI structures its output. Place before the command:

```
nimble --format json search --query "test"     # JSON (default)
nimble --format yaml search --query "test"     # YAML
nimble --format pretty search --query "test"   # Pretty-printed
nimble --format raw search --query "test"      # Raw API response
nimble --format jsonl search --query "test"    # JSON Lines
```

**Content parsing format** — controls how page content is returned. These are command-specific flags:

- `search`: `--output-format markdown` (or `plain_text`, `simplified_html`)
- `extract`: `--format markdown` (or `html`) — note: this is a content format flag on extract, not the global output format

```
# Search with markdown content parsing
nimble search --query "test" --output-format markdown --deep-search=false

# Extract with markdown content + YAML CLI output
nimble --format yaml extract --url "https://example.com" --parse --format markdown
```

Use `--transform` with GJSON syntax to extract specific fields:

```
nimble search --query "AI news" --transform "results.#.url"
```
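For intuition, the GJSON path `results.#.url` walks the `results` array and collects each element's `url`. A rough Python equivalent of that one specific path; the `results` response shape is an assumption here, and GJSON itself supports a much richer syntax:

```python
import json

def transform_results_urls(raw_json: str) -> list[str]:
    """Collect the `url` field from each entry of the top-level `results`
    array, like the GJSON path `results.#.url`."""
    data = json.loads(raw_json)
    return [item["url"] for item in data.get("results", []) if "url" in item]
```

The point of `--transform` is to do this server-side/CLI-side, so only the fields you need reach the agent's context.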
## Commands

### search

Accurate, real-time web search with 8 focus modes. AI agents search the live web to retrieve precise information. Run `nimble search --help` for all options.

**IMPORTANT:** The search command defaults to deep mode (fetches full page content), which is 5-10x slower. Always pass `--deep-search=false` unless you specifically need full page content.

Always explicitly set these parameters on every search call:

- `--deep-search=false`: Pass this on every call for fast responses (1-3s vs 5-15s). Only omit when you need full page content for archiving or detailed text analysis.
- `--include-answer`: Recommended on every research/exploration query. Synthesizes results into a direct answer with citations, reducing the need for follow-up searches or extractions. Only skip for URL-discovery-only queries where you just need links. Note: this is a premium feature (Enterprise plans). If the API returns a 402 or 403 when using this flag, retry the same query without `--include-answer` and continue — the search results are still valuable without the synthesized answer.
- `--focus`: Match to query type — `coding`, `news`, `academic`, etc. Default is `general`. See the focus selection by intent table below or `references/search-focus-modes.md` for guidance.
- `--max-results`: Default 10 — balanced speed and coverage.

```
# Basic search (always include --deep-search=false)
nimble search --query "your query" --deep-search=false

# Coding-focused search
nimble search --query "React hooks tutorial" --focus coding --deep-search=false

# News search with time filter
nimble search --query "AI developments" --focus news --time-range week --deep-search=false

# Search with AI-generated answer summary
nimble search --query "what is WebAssembly" --include-answer --deep-search=false

# Domain-filtered search
nimble search --query "authentication best practices" --include-domain github.com --include-domain stackoverflow.com --deep-search=false

# Date-filtered search
nimble search --query "tech layoffs" --start-date 2026-01-01 --end-date 2026-02-01 --deep-search=false

# Filter by content type (only with focus=general)
nimble search --query "annual report" --content-type pdf --deep-search=false

# Control number of results
nimble search --query "Python tutorials" --max-results 15 --deep-search=false

# Deep search — ONLY when you need full page content (5-15s, much slower)
nimble search --query "machine learning" --deep-search --max-results 5
```

Key options:

| Flag | Description |
|---|---|
| `--query` | Search query string (required) |
| `--deep-search=false` | Always pass this. Disables full page content fetch for 5-10x faster responses |
| `--deep-search` | Enable full page content fetch (slow, 5-15s — only when needed) |
| `--focus` | Focus mode: general, coding, news, academic, shopping, social, geo, location |
| `--max-results` | Max results to return (default 10) |
| `--include-answer` | Generate AI answer summary from results |
| `--include-domain` | Only include results from these domains (repeatable, max 50) |
| `--exclude-domain` | Exclude results from these domains (repeatable, max 50) |
| `--time-range` | Recency filter: hour, day, week, month, year |
| `--start-date` | Filter results after this date (YYYY-MM-DD) |
| `--end-date` | Filter results before this date (YYYY-MM-DD) |
| `--content-type` | Filter by type: pdf, docx, xlsx, documents, spreadsheets, presentations |
| `--output-format` | Output format: markdown, plain_text, simplified_html |
| `--country` | Country code for localized results |
| `--locale` | Locale for language settings |
| `--max-subagents` | Max parallel subagents for shopping/social/geo modes (1-10, default 3) |

Focus modes (quick reference — for detailed per-mode guidance, decision tree, and combination strategies, read `references/search-focus-modes.md`):

| Mode | Best for |
|---|---|
| general | Broad web searches (default) |
| coding | Programming docs, code examples, technical content |
| news | Current events, breaking news, recent articles |
| academic | Research papers, scholarly articles, studies |
| shopping | Product searches, price comparisons, e-commerce |
| social | People research, LinkedIn/X/YouTube profiles, community discussions |
| geo | Geographic information, regional data |
| location | Local businesses, place-specific queries |

Focus selection by intent (see `references/search-focus-modes.md` for full table):

| Query Intent | Primary Focus | Secondary (parallel) |
|---|---|---|
| Research a person | social | general |
| Research a company | general | news |
| Find code/docs | coding | — |
| Current events | news | social |
| Find a product/price | shopping | — |
| Find a place/business | location | geo |
| Find research papers | academic | — |

Performance tips:

- With `--deep-search=false` (FAST): 1-3 seconds, returns titles + snippets + URLs — use this 95% of the time
- Without the flag / `--deep-search` (SLOW): 5-15 seconds, returns full page content — only for archiving or full-text analysis
- Use `--include-answer` for quick synthesized insights — works great with fast mode
- Start with 5-10 results, increase only if needed

### extract

Scalable data collection with stealth unblocking. Get clean, real-time HTML and structured data from any URL. Supports JS rendering, browser emulation, and geolocation. Run `nimble extract --help` for all options.

**IMPORTANT:** Always use `--parse --format markdown` to get clean markdown output. Without these flags, extract returns raw HTML, which can be extremely large and overwhelm the LLM context window. The `--format` flag on extract controls the content type (not the CLI output format — see Output Formats above).

```
# Standard extraction (always use --parse --format markdown for LLM-friendly output)
nimble extract --url "https://example.com/article" --parse --format markdown

# Render JavaScript (for SPAs, dynamic content)
nimble extract --url "https://example.com/app" --render --parse --format markdown

# Extract with geolocation (see content as if from a specific country)
nimble extract --url "https://example.com" --country US --city "New York" --parse --format markdown

# Handle cookie consent automatically
nimble extract --url "https://example.com" --consent-header --parse --format markdown

# Custom browser emulation
nimble extract --url "https://example.com" --browser chrome --device desktop --os windows --parse --format markdown

# Multiple content format preferences (API tries first, falls back to second)
nimble extract --url "https://example.com" --parse --format markdown --format html
```

Key options:

| Flag | Description |
|---|---|
| `--url` | Target URL to extract (required) |
| `--parse` | Parse the response content (always use this) |
| `--format` | Content type preference: `markdown`, `html` (always use markdown for LLM-friendly output) |
| `--render` | Render JavaScript using a browser |
| `--country` | Country code for geolocation and proxy |
| `--city` | City for geolocation |
| `--state` | US state for geolocation (only when country=US) |
| `--locale` | Locale for language settings |
| `--consent-header` | Auto-handle cookie consent |
| `--browser` | Browser type to emulate |
| `--device` | Device type for emulation |
| `--os` | Operating system to emulate |
| `--driver` | Browser driver to use |
| `--method` | HTTP method (GET, POST, etc.) |
| `--headers` | Custom HTTP headers (key=value) |
| `--cookies` | Browser cookies |
| `--referrer-type` | Referrer policy |
| `--http2` | Use HTTP/2 protocol |
| `--request-timeout` | Timeout in milliseconds |
| `--tag` | User-defined tag for request tracking |
| `--browser-action` | Array of browser automation actions to execute sequentially |
| `--network-capture` | Filters for capturing network traffic |
| `--expected-status-code` | Expected HTTP status codes for successful requests |
| `--session` | Session configuration for stateful browsing |
| `--skill` | Skills or capabilities required for the request |

### map

Fast URL discovery and site structure mapping. Easily plan extraction workflows. Returns URL metadata only (URLs, titles, descriptions) — not page content. Use extract or crawl to get actual content from the discovered URLs. Run `nimble map --help` for all options.

```
# Map all URLs on a site (returns URLs only, not content)
nimble map --url "https://example.com"

# Limit number of URLs returned
nimble map --url "https://docs.example.com" --limit 100

# Include subdomains
nimble map --url "https://example.com" --domain-filter subdomains

# Use sitemap for discovery
nimble map --url "https://example.com" --sitemap auto
```

Key options:

| Flag | Description |
|---|---|
| `--url` | URL to map (required) |
| `--limit` | Max number of links to return |
| `--domain-filter` | Include subdomains in mapping |
| `--sitemap` | Use sitemap for URL discovery |
| `--country` | Country code for geolocation |
| `--locale` | Locale for language settings |

### crawl

Extract contents from entire websites in a single request. Collect large volumes of web data automatically. Crawl is async — you start a job, poll for completion, then retrieve the results. Run `nimble crawl run --help` for all options.

Crawl defaults:

| Setting | Default | Notes |
|---|---|---|
| `--sitemap` | auto | Automatically uses sitemap if available |
| `--max-discovery-depth` | 5 | How deep the crawler follows links |
| `--limit` | No limit | Always set a limit to avoid crawling entire sites |

Start a crawl:

```
# Crawl a site section (always set --limit)
nimble crawl run --url "https://docs.example.com" --limit 50

# Crawl with path filtering
nimble crawl run --url "https://example.com" --include-path "/docs" --include-path "/api" --limit 100

# Exclude paths
nimble crawl run --url "https://example.com" --exclude-path "/blog" --exclude-path "/archive" --limit 50

# Control crawl depth
nimble crawl run --url "https://example.com" --max-discovery-depth 3 --limit 50

# Allow subdomains and external links
nimble crawl run --url "https://example.com" --allow-subdomains --allow-external-links --limit 50

# Crawl entire domain (not just child paths)
nimble crawl run --url "https://example.com/docs" --crawl-entire-domain --limit 100

# Named crawl for tracking
nimble crawl run --url "https://example.com" --name "docs-crawl-feb-2026" --limit 200

# Use sitemap for discovery
nimble crawl run --url "https://example.com" --sitemap auto --limit 50
```

Key options for `crawl run`:

| Flag | Description |
|---|---|
| `--url` | URL to crawl (required) |
| `--limit` | Max pages to crawl (**always set this**) |
| `--max-discovery-depth` | Max depth based on discovery order (default 5) |
| `--include-path` | Regex patterns for URLs to include (repeatable) |
| `--exclude-path` | Regex patterns for URLs to exclude (repeatable) |
| `--allow-subdomains` | Follow links to subdomains |
| `--allow-external-links` | Follow links to external sites |
| `--crawl-entire-domain` | Follow sibling/parent URLs, not just child paths |
| `--ignore-query-parameters` | Don't re-scrape the same path with different query params |
| `--name` | Name for the crawl job |
| `--sitemap` | Use sitemap for URL discovery (default auto) |
| `--callback` | Webhook for receiving results |

Poll crawl status and retrieve results:

Crawl jobs run asynchronously. After starting a crawl, poll for completion, then retrieve content using individual task IDs (not the crawl ID):

```
# 1. Start the crawl → returns a crawl_id
nimble crawl run --url "https://docs.example.com" --limit 5
# Returns: crawl_id "abc-123"

# 2. Poll status until completed → returns individual task_ids per page
nimble crawl status --id "abc-123"
# Returns: tasks: [{ task_id: "task-456" }, { task_id: "task-789" }, ...]
# Status values: running, completed, failed, terminated

# 3. Retrieve content using INDIVIDUAL task_ids (NOT the crawl_id)
nimble tasks results --task-id "task-456"
nimble tasks results --task-id "task-789"
# ⚠️ Using the crawl_id here returns 404 — you must use the per-page task_ids from step 2
```
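The start, poll, retrieve sequence above can be sketched as a small polling loop. This is a minimal sketch: the `wait_for_crawl` helper and the JSON shape of the `crawl status` output are assumptions, and the status source is injected so the loop logic stands on its own (in practice you would wrap a subprocess call to `nimble --format json crawl status --id <crawl_id>`):

```python
import time

def wait_for_crawl(get_status, interval=15.0, max_wait=600.0, sleep=time.sleep):
    """Poll until the crawl reaches a terminal state, then return the status
    and the per-page task IDs.

    `get_status` is expected to return the parsed status report, assumed here
    to look like {"status": "...", "tasks": [{"task_id": "..."}]}.
    """
    waited = 0.0
    while waited < max_wait:
        report = get_status()
        if report["status"] in ("completed", "failed", "terminated"):
            # Fetch content per page with the INDIVIDUAL task ids, never the crawl id:
            #   nimble tasks results --task-id <task_id>
            return report["status"], [t["task_id"] for t in report.get("tasks", [])]
        sleep(interval)  # 15-30s for small crawls, 30-60s for larger ones
        waited += interval
    raise TimeoutError("crawl did not reach a terminal state within max_wait")
```

Returning the task IDs even on `failed` matches the note below: task statuses are sometimes misreported, so it is worth attempting `nimble tasks results` before giving up.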

**IMPORTANT:** `nimble tasks results` requires the individual task IDs from `crawl status` (each crawled page gets its own task ID), not the crawl job ID. Using the crawl ID will return a 404 error.

Polling guidelines:

- Poll every 15-30 seconds for small crawls (< 50 pages)
- Poll every 30-60 seconds for larger crawls (50+ pages)
- Stop polling after status is `completed`, `failed`, or `terminated`

Note: `crawl status` may occasionally misreport individual task statuses (showing "failed" for tasks that actually succeeded). If `crawl status` shows failed tasks, try retrieving their results with `nimble tasks results` before assuming failure.

List crawls:

```
# List all crawls
nimble crawl list

# Filter by status
nimble crawl list --status running

# Paginate results
nimble crawl list --limit 10
```

Cancel a crawl:

```
nimble crawl terminate --id "crawl-task-id"
```

## Best Practices

### Search Strategy

- **Always pass `--deep-search=false`** — the default is deep mode (slow). Fast mode covers 95% of use cases: URL discovery, research, comparisons, answer generation. Only use deep mode when you need full page text — archiving articles, extracting complete docs, building datasets.
- **Start with the right focus mode** — match `--focus` to your query type (see `references/search-focus-modes.md`)
- **Use `--include-answer`** — get AI-synthesized insights without extracting each result. If it returns 402/403, retry without it.
- **Filter domains** — use `--include-domain` to target authoritative sources
- **Add time filters** — use `--time-range` for time-sensitive queries

### Multi-Search Strategy

When researching a topic in depth, run 2-3 searches in parallel with:

- Different focus modes — e.g., social + general for people research
- Different query angles — e.g., "Jane Doe current job" + "Jane Doe career history" + "Jane Doe publications"

This is faster than sequential searches and gives broader coverage. Deduplicate results by URL before extracting.

### Disambiguating Common Names

When searching for a person with a common name:

- Include distinguishing context in the query: company name, job title, city
- Use `--focus social` — LinkedIn results include location and current company, making disambiguation easier
- Cross-reference results across searches to confirm you're looking at the right person

### Extraction Strategy

- **Always use `--parse --format markdown`** — returns clean markdown instead of raw HTML, preventing context window overflow
- **Try without `--render` first** — it's faster for static pages
- **Add `--render` for SPAs** — when content is loaded by JavaScript
- **Set geolocation** — use `--country` to see region-specific content

### Crawl Strategy

- **Prefer map + extract over crawl for LLM use** — crawl results return raw HTML (60-115KB per page), which overwhelms LLM context. For LLM-friendly output, use map to discover URLs, then `extract --parse --format markdown` on individual pages
- **Use crawl only for bulk archiving or data pipelines** — when you need raw content from many pages and will post-process it outside the LLM context
- **Always set `--limit`** — crawl has no default limit, so always specify one to avoid crawling entire sites
- **Use path filters** — `--include-path` and `--exclude-path` to target specific sections
- **Name your crawls** — use `--name` for easy tracking
- **Retrieve with individual task IDs** — `crawl status` returns per-page task IDs; use those (not the crawl ID) with `nimble tasks results --task-id`

## Common Recipes

### Researching a person

```
# Step 1: Run social + general in parallel for max coverage
nimble search --query "Jane Doe Head of Engineering" --focus social --deep-search=false --max-results 10 --include-answer
nimble search --query "Jane Doe Head of Engineering" --focus general --deep-search=false --max-results 10 --include-answer

# Step 2: Broaden with different query angles in parallel
nimble search --query "Jane Doe career history Acme Corp" --deep-search=false --include-answer
nimble search --query "Jane Doe publications blog articles" --deep-search=false --include-answer

# Step 3: Extract the most promising non-auth-walled URLs (skip LinkedIn — see Known Limitations)
nimble extract --url "https://www.companysite.com/team/jane-doe" --parse --format markdown
```

### Researching a company

```
# Step 1: Overview + recent news in parallel
nimble search --query "Acme Corp" --focus general --deep-search=false --include-answer
nimble search --query "Acme Corp" --focus news --time-range month --deep-search=false --include-answer

# Step 2: Extract company page
nimble extract --url "https://acme.com/about" --parse --format markdown
```

### Technical research

```
# Step 1: Find docs and code examples
nimble search --query "React Server Components migration guide" --focus coding --deep-search=false --include-answer

# Step 2: Extract the most relevant doc
nimble extract --url "https://react.dev/reference/rsc/server-components" --parse --format markdown
```

## Error Handling

| Error | Solution |
|---|---|
| `NIMBLE_API_KEY` not set | Set the environment variable: `export NIMBLE_API_KEY="your-key"` |
| 401 Unauthorized | Verify the API key is active at nimbleway.com |
| 402 / 403 with `--include-answer` | Premium feature not available on current plan. Retry the same query without `--include-answer` and continue |
| 429 Too Many Requests | Reduce request frequency or upgrade API tier |
| Timeout | Ensure `--deep-search=false` is set, reduce `--max-results`, or increase `--request-timeout` |
| No results | Try a different `--focus`, broaden the query, remove domain filters |

## Known Limitations

| Site | Issue | Workaround |
|---|---|---|
| LinkedIn profiles | Auth wall blocks extraction (returns redirect/JS, status 999) | Use `--focus social` search instead — it returns LinkedIn data directly via subagents. Do NOT try to extract LinkedIn URLs. |
| Sites behind login | Extract returns login page instead of content | No workaround — use search snippets instead |
| Heavy SPAs | Extract returns empty or minimal HTML | Add the `--render` flag to execute JavaScript before extraction |
| Crawl results | Returns raw HTML (60-115KB per page), no markdown option | Use map + `extract --parse --format markdown` on individual pages for LLM-friendly output |
| Crawl status | May misreport individual task statuses as "failed" when they actually succeeded | Always try `nimble tasks results --task-id` before assuming failure |
