sec-edgar-pipeline


Install

npx skills add https://github.com/bobmatnyc/claude-mpm-skills --skill sec-edgar-pipeline

SEC EDGAR Pipeline Overview

This pipeline is centered on edgar-analyzer and the EDGAR data sources. The core loop is: configure credentials, create a project with examples, analyze patterns, generate code, run extraction, and export reports.

Setup (Keys + User Agent)

Use the setup wizard to configure required keys:

python -m edgar_analyzer setup

or

edgar-analyzer setup

Required entries:

- OPENROUTER_API_KEY
- (Optional) JINA_API_KEY
- EDGAR user agent string ("Name email@example.com")

End-to-End CLI Workflow
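Before running the workflow, it can help to sanity-check the entries the setup wizard asks for. The sketch below is only an illustration: it assumes the keys are visible as environment variables under the names listed above, and `check_setup` is a hypothetical helper, not part of edgar-analyzer.

```python
import os
import re

# Assumption: the keys are exposed as environment variables with these names.
REQUIRED_KEYS = ["OPENROUTER_API_KEY"]
OPTIONAL_KEYS = ["JINA_API_KEY"]

# Matches the "Name email@example.com" shape: name text, whitespace, email-like token.
USER_AGENT_RE = re.compile(r"^\S.*\s\S+@\S+\.\S+$")


def check_setup(env, user_agent):
    """Return a list of problems found; an empty list means the basics look fine."""
    problems = []
    for key in REQUIRED_KEYS:
        if not env.get(key):
            problems.append(f"missing required key: {key}")
    if not USER_AGENT_RE.match(user_agent or ""):
        problems.append('user agent should look like "Name email@example.com"')
    return problems


def missing_optional(env):
    """List optional keys that are not set (informational only)."""
    return [key for key in OPTIONAL_KEYS if not env.get(key)]
```

A typical call would be `check_setup(os.environ, "Name email@example.com")`. Passing the environment in as an argument keeps the check easy to exercise without touching real credentials.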

1. Create project

edgar-analyzer project create my_project --template minimal

2. Add examples + project.yaml

projects/my_project/examples/*.json

3. Analyze examples

edgar-analyzer analyze-project projects/my_project

4. Generate extraction code

edgar-analyzer generate-code projects/my_project

5. Run extraction

edgar-analyzer run-extraction projects/my_project --output-format csv

Outputs land in projects//output/.
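The CLI steps above can also be driven from a script. This is a minimal sketch that only wraps the commands shown in this section. The helper names `workflow_commands` and `run_commands` are hypothetical, and step 2 (adding examples and project.yaml) remains a manual step between project creation and analysis.

```python
import subprocess


def workflow_commands(project_name, template="minimal", output_format="csv"):
    """Build the CLI invocations for steps 1, 3, 4 and 5 as argument lists."""
    project_dir = f"projects/{project_name}"
    return [
        ["edgar-analyzer", "project", "create", project_name, "--template", template],
        ["edgar-analyzer", "analyze-project", project_dir],
        ["edgar-analyzer", "generate-code", project_dir],
        ["edgar-analyzer", "run-extraction", project_dir, "--output-format", output_format],
    ]


def run_commands(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first failing step."""
    for cmd in commands:
        runner(cmd, check=True)
```

One way to use it: take `create, *rest = workflow_commands("my_project")`, call `run_commands([create])`, add the example files by hand, then call `run_commands(rest)`.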

EDGAR-Specific Conventions

- CIK values are 10-digit, zero-padded (e.g., 0000320193).
- Rate limit: SEC API allows 10 requests/sec. Scripts use ~0.11s delays.
- User agent is mandatory; include name + email.

Scripted Example (Apple DEF 14A)
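The EDGAR conventions (zero-padded CIK, request spacing, descriptive user agent) are easy to get wrong in any script that talks to the SEC. Here is a minimal sketch of all three using only the standard library. `format_cik`, `Throttle`, and `edgar_headers` are illustrative names, not functions from the repository's scripts.

```python
import time


def format_cik(cik):
    """Zero-pad a CIK to 10 digits, e.g. 320193 -> "0000320193"."""
    digits = str(cik).strip().lstrip("0") or "0"
    if not digits.isdigit() or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


class Throttle:
    """Space calls at least min_interval seconds apart (0.11s stays under 10/sec)."""

    def __init__(self, min_interval=0.11, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def edgar_headers(name, email):
    """Build request headers with the mandatory "Name email" user agent."""
    if "@" not in email:
        raise ValueError("user agent needs a contact email")
    return {"User-Agent": f"{name} {email}"}
```

Calling `throttle.wait()` before each request and sending `edgar_headers(...)` with it covers the rate-limit and user-agent rules. The clock and sleep functions are injectable only so the pacing logic can be checked without real delays.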

edgar/scripts/fetch_apple_def14a.py shows the direct flow:

- Fetch latest DEF 14A metadata
- Download HTML
- Parse Summary Compensation Table (SCT)
- Save raw HTML + extracted JSON + ground truth

Recipe-Driven Extraction

edgar/recipes/sct_extraction/config.yaml defines a multi-step pipeline:

- Fetch DEF 14A filings by company list
- Extract SCT tables with SCTAdapter
- Validate with sct_validator
- Write results to output/sct

Report Generation
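Turning extracted JSON results into CSV is essentially a flattening step. The sketch below shows the general idea with the standard library, assuming each result is a flat dict. `records_to_csv` is a hypothetical helper, and the field names in the usage note are placeholders, not the real schema used by the repository's scripts.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Render a list of dicts as CSV text, keeping only the listed columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        extrasaction="ignore",  # drop keys that are not report columns
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)  # missing keys become empty cells
    return buffer.getvalue()
```

For example, `records_to_csv(rows, ["company", "executive", "total"])` would produce one header line followed by one line per row. Writing to an in-memory buffer first makes the output easy to inspect before saving it to a file.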

edgar/scripts/create_csv_reports.py converts JSON results into:

- executive_compensation_.csv
- top_25_executives_.csv
- company_summary_.csv

Troubleshooting

- No filings found: confirm CIK formatting and filing type (DEF 14A vs DEF 14A/A).
- API errors: slow down requests and confirm user-agent is set.
- Extraction errors: regenerate code or use manual ground truth in POC scripts.

Related Skills

- universal/data/reporting-pipelines
- toolchains/python/testing/pytest
