brightdata-cli

安装量: 1.7K
排名: #2929

安装

npx skills add https://github.com/brightdata/skills --skill brightdata-cli

Bright Data CLI The Bright Data CLI ( brightdata or bdata ) gives you full access to Bright Data's web data platform from the terminal. It handles authentication, proxy zones, anti-bot bypass, CAPTCHA solving, and JavaScript rendering automatically — the user just needs to log in once. Installation If the CLI is not installed yet, guide the user: macOS / Linux: curl -fsSL https://cli.brightdata.com/install.sh | bash Windows or manual install (any platform): npm install -g @brightdata/cli Without installing (one-off usage): npx --yes --package @brightdata/cli brightdata < command

Requires Node.js >= 20. After install, both brightdata and bdata (shorthand) are available. First-Time Setup Before anything else, check if the user is authenticated. If they haven't logged in yet, guide them through the one-time setup:

One-time login — opens the browser for OAuth, then everything is automatic

bdata login This single command: Opens the browser for secure OAuth authentication Saves the API key locally (never needs to be entered again) Auto-creates required proxy zones ( cli_unlocker , cli_browser ) Sets default configuration After login, every subsequent command works without any manual intervention. For headless/SSH environments where no browser is available: bdata login --device For direct API key authentication (non-interactive): bdata login --api-key < key

To verify setup is complete, run: bdata config Command Reference Read references/commands.md for the full command reference with all flags, options, and examples for every command. Read references/pipelines.md for the complete list of 40+ pipeline types (Amazon, LinkedIn, Instagram, TikTok, YouTube, Reddit, and more) with their specific parameters. Quick Command Overview bdata is the shorthand for brightdata . Both work identically. Command Purpose bdata scrape Scrape any URL as markdown, HTML, JSON, or screenshot bdata search "" Search Google/Bing/Yandex with structured results bdata pipelines [params] Extract structured data from 40+ platforms bdata pipelines list List all 40+ available pipeline types bdata status Check async job status bdata zones List proxy zones bdata budget View account balance and costs bdata skill add Install AI agent skills bdata skill list List available skills bdata config View/set configuration bdata login Authenticate with Bright Data bdata version Show CLI version and system info How to Use Each Command Scraping Scrape any URL with automatic bot bypass, CAPTCHA handling, and JS rendering:

Default: returns clean markdown

bdata scrape https://example.com

Get raw HTML

bdata scrape https://example.com -f html

Get structured JSON

bdata scrape https://example.com -f json

Take a screenshot

bdata scrape https://example.com -f screenshot -o page.png

Geo-targeted scrape from the US

bdata scrape https://amazon.com --country us

Save to file

bdata scrape https://example.com -o page.md

Async mode for heavy pages

bdata scrape https://example.com --async Searching Search engines with structured JSON output (Google returns parsed organic results, ads, People Also Ask, and related searches):

Google search with formatted table

bdata search "web scraping best practices"

Get raw JSON for piping

bdata search "typescript tutorials" --json

Search Bing

bdata search "bright data pricing" --engine bing

Localized search

bdata search "restaurants berlin" --country de --language de

News search

bdata search "AI regulation" --type news

Extract just URLs

bdata search "open source tools" --json | jq -r '.organic[].link' Pipelines (Structured Data Extraction) Extract structured data from 40+ platforms. These trigger async jobs that poll until results are ready:

LinkedIn profile

bdata pipelines linkedin_person_profile "https://linkedin.com/in/username"

Amazon product

bdata pipelines amazon_product "https://amazon.com/dp/B09V3KXJPB"

Instagram profile

bdata pipelines instagram_profiles "https://instagram.com/username"

Amazon search

bdata pipelines amazon_product_search "laptop" "https://amazon.com"

YouTube comments (top 50)

bdata pipelines youtube_comments "https://youtube.com/watch?v=..." 50

Google Maps reviews (last 7 days)

bdata pipelines google_maps_reviews "https://maps.google.com/..." 7

Output as CSV

bdata pipelines amazon_product "https://amazon.com/dp/..." --format csv -o product.csv

List all available pipeline types

bdata pipelines list Checking Status For async jobs (from --async scrapes or pipelines):

Quick status check

bdata status < job-id

Wait until complete

bdata status < job-id

--wait

With custom timeout

bdata status < job-id

--wait --timeout 300 Budget & Zones

Quick account balance

bdata budget

Detailed balance with pending charges

bdata budget balance

All zones cost/bandwidth

bdata budget zones

Specific zone costs

bdata budget zone my_zone

Date range filter

bdata budget zones --from 2024 -01-01T00:00:00 --to 2024 -02-01T00:00:00

List all zones

bdata zones

Zone details

bdata zones info cli_unlocker Configuration

View all config

bdata config

Set defaults

bdata config set default_zone_unlocker my_zone bdata config set default_format json Installing AI Agent Skills

Interactive picker — choose skills and target agents

bdata skill add

Install a specific skill

bdata skill add scrape

List available skills

bdata skill list Output Modes Every command supports multiple output formats: Flag Effect (none) Human-readable formatted output with colors --json Compact JSON to stdout --pretty Indented JSON to stdout -o Write to file (format auto-detected from extension) When piped (stdout is not a TTY), colors and spinners are automatically disabled. Chaining Commands The CLI is pipe-friendly:

Search → extract first URL → scrape it

bdata search "top open source projects" --json \ | jq -r '.organic[0].link' \ | xargs bdata scrape

Scrape and view with markdown reader

bdata scrape https://docs.github.com | glow -

Amazon product data to CSV

bdata pipelines amazon_product
"https://amazon.com/dp/xxx"
--format
csv
>
product.csv
Environment Variables
These override stored configuration:
Variable
Purpose
BRIGHTDATA_API_KEY
API key (skips login entirely)
BRIGHTDATA_UNLOCKER_ZONE
Default Web Unlocker zone
BRIGHTDATA_SERP_ZONE
Default SERP zone
BRIGHTDATA_POLLING_TIMEOUT
Polling timeout in seconds
Troubleshooting
Error
Fix
CLI not found
Install with
npm i -g @brightdata/cli
or
curl -fsSL https://cli.brightdata.com/install.sh | bash
"No Web Unlocker zone specified"
bdata config set default_zone_unlocker
or re-run
bdata login
"Invalid or expired API key"
bdata login
"Access denied"
Check zone permissions in the Bright Data control panel
"Rate limit exceeded"
Wait and retry, or use
--async
for large jobs
Async job timeout
Increase with
--timeout 1200
or
BRIGHTDATA_POLLING_TIMEOUT=1200
Key Design Principles
One-time auth
After
bdata login
, everything is automatic. No tokens to manage, no keys to pass.
Zones auto-created
Login creates
cli_unlocker
and
cli_browser
zones automatically.
Smart defaults
Markdown output, auto-detected formats from file extensions, colors only in TTY.
Pipe-friendly
JSON output + jq for automation. Colors/spinners disabled in pipes.
Async support
Heavy jobs can run in background with --async + status --wait . npm package : @brightdata/cli — install globally or use via npx .
返回排行榜