ai-tech-rss-fetch

Installs: 142
Rank: #6044

Install

npx skills add https://github.com/tiangong-ai/skills --skill ai-tech-rss-fetch
AI Tech RSS Fetch
Core Goal
Subscribe to RSS/Atom sources.
Persist feed and entry metadata to SQLite.
Deduplicate entries with layered identity keys plus content fingerprints.
Keep only metadata; do not fetch full article bodies and do not summarize.
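The layered-identity idea can be sketched in a few lines of Python. This is a minimal illustration, not the logic in scripts/rss_subscribe.py: the function names, the priority order (guid, then link, then hash), the normalization, and the choice of title plus summary as fingerprint input are all assumptions. The legacy_guid key type is left out because the source does not define it.

```python
import hashlib


def content_hash(title, summary):
    """Fingerprint of whitespace- and case-normalized metadata text (assumed inputs)."""
    normalized = " ".join(f"{title} {summary}".lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def identity_keys(entry):
    """Return (key_type, key_value) pairs for one entry, strongest first."""
    keys = []
    guid = (entry.get("guid") or "").strip()
    link = (entry.get("canonical_url") or entry.get("url") or "").strip()
    if guid:
        keys.append(("guid", guid))
    if link:
        keys.append(("canonical_url", link))
    if not keys:
        # Neither guid nor link is present: fall back to a hash of the
        # remaining metadata. The skill marks such entries match_confidence=low.
        fingerprint = content_hash(entry.get("title", ""), entry.get("summary", ""))
        keys.append(("fallback_hash", fingerprint))
    return keys


def is_duplicate(entry, seen_keys):
    """An entry is a duplicate if any of its identity keys was seen before."""
    return any(key in seen_keys for key in identity_keys(entry))
```

Matching on any key, rather than only the strongest one, is what would let an entry be recognized across runs even if a feed later drops or rewrites its guid.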
Triggering Conditions
Receive a request to subscribe to RSS feeds from URLs or OPML.
Receive a request to run incremental RSS sync reliably.
Need stable metadata persistence for downstream processing.
Need dedupe-safe storage of feed items over repeated runs.
Workflow
Prepare runtime and database.
Ensure the dependency is installed: python3 -m pip install feedparser
In multi-agent runtimes, pin DB to an absolute path before any command:
export AI_RSS_DB_PATH="/absolute/path/to/workspace-rss-bot/ai_rss.db"
Initialize SQLite schema once:
python3 scripts/rss_subscribe.py init-db --db "$AI_RSS_DB_PATH"
Add feed subscriptions.
Add one feed URL:
python3 scripts/rss_subscribe.py add-feed --db "$AI_RSS_DB_PATH" --url "https://example.com/feed.xml"
Import feeds from OPML:
python3 scripts/rss_subscribe.py import-opml --db "$AI_RSS_DB_PATH" --opml assets/hn-popular-blogs-2025.opml
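For orientation, OPML import comes down to collecting the xmlUrl attribute of each outline element. The sketch below uses only the standard library. The function name and the sample document are hypothetical, and the real import-opml command may handle more cases such as titles, categories, or malformed files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real OPML files usually carry a <head> and more attributes.
SAMPLE_OPML = """<opml version="2.0"><body>
<outline text="Blogs">
  <outline type="rss" text="A" xmlUrl="https://example.com/a.xml"/>
  <outline type="rss" text="B" xmlUrl="https://example.com/b.xml"/>
</outline>
<outline type="rss" text="A again" xmlUrl="https://example.com/a.xml"/>
</body></opml>"""


def feed_urls_from_opml(opml_text):
    """Return feed URLs in document order, without repeats."""
    root = ET.fromstring(opml_text)
    urls = []
    # iter() walks nested outlines too, so folder-style groupings are covered.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url and url not in urls:
            urls.append(url)
    return urls
```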
Run incremental sync.
Fetch active feeds and store metadata:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --max-feeds 20 --max-items-per-feed 100
Optional one-feed sync:
python3 scripts/rss_subscribe.py sync --db "$AI_RSS_DB_PATH" --feed-url "https://example.com/feed.xml"
Query persisted metadata.
List feeds:
python3 scripts/rss_subscribe.py list-feeds --db "$AI_RSS_DB_PATH" --limit 50
List recent entries:
python3 scripts/rss_subscribe.py list-entries --db "$AI_RSS_DB_PATH" --limit 100
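The sync step in the workflow above caps both the number of feeds and the number of items per feed, and one failing feed is not supposed to stop the rest. A rough sketch of that control flow follows, assuming a hypothetical fetch_entries callable that maps a feed URL to a list of entries. The report structure is illustrative; only the two limits and the last_error field come from this document.

```python
def sync_feeds(feed_urls, fetch_entries, max_feeds=20, max_items_per_feed=100):
    """Sync at most max_feeds feeds, keeping at most max_items_per_feed entries each.

    fetch_entries is a hypothetical callable: url -> iterable of entries.
    """
    report = {}
    for url in list(feed_urls)[:max_feeds]:
        try:
            entries = list(fetch_entries(url))[:max_items_per_feed]
        except Exception as exc:  # network, HTTP, or parse failure
            # Record the error for this feed and move on to the next one.
            report[url] = {"fetched": 0, "last_error": str(exc)}
            continue
        report[url] = {"fetched": len(entries), "last_error": None}
    return report
```

Catching the exception per feed, instead of around the whole loop, is what keeps a single timeout from aborting the run.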
Input Requirements
Supported inputs:
RSS XML feed URLs.
OPML feed list files.
Output Contract (Metadata Only)
Persist feeds metadata to SQLite: feed_url, feed_title, site_url, etag, last_modified, status fields.
Persist entries metadata to SQLite: id, dedupe_key (compat primary identity snapshot), guid, url, canonical_url, title, author, published_at, updated_at, summary, categories, content_hash, match_confidence, timestamps.
Persist entry_identities mapping table to SQLite: entry_id, key_type, key_value, created_at.
Supported key types: guid, canonical_url, legacy_guid, fallback_hash.
Do not store generated summaries and do not create archive markdown files.
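One way the three tables could look in SQLite is sketched below. The column names follow the lists above. Everything else is an assumption: the column types, the constraints, the feeds.id column, and the use of last_error as the only status field. The unspecified entry timestamps are omitted. The schema that init-db actually creates may differ.

```python
import sqlite3

# Sketch only: types and constraints are guesses, not the skill's real schema.
SCHEMA = """
CREATE TABLE IF NOT EXISTS feeds (
    id            INTEGER PRIMARY KEY,   -- assumed surrogate key
    feed_url      TEXT NOT NULL UNIQUE,
    feed_title    TEXT,
    site_url      TEXT,
    etag          TEXT,
    last_modified TEXT,
    last_error    TEXT                   -- one of the "status fields"
);
CREATE TABLE IF NOT EXISTS entries (
    id               INTEGER PRIMARY KEY,
    dedupe_key       TEXT NOT NULL UNIQUE,
    guid             TEXT,
    url              TEXT,
    canonical_url    TEXT,
    title            TEXT,
    author           TEXT,
    published_at     TEXT,
    updated_at       TEXT,
    summary          TEXT,
    categories       TEXT,
    content_hash     TEXT,
    match_confidence TEXT
);
CREATE TABLE IF NOT EXISTS entry_identities (
    entry_id   INTEGER NOT NULL REFERENCES entries(id),
    key_type   TEXT NOT NULL CHECK (
        key_type IN ('guid', 'canonical_url', 'legacy_guid', 'fallback_hash')),
    key_value  TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    UNIQUE (key_type, key_value)
);
"""


def init_db(conn):
    """Create the tables if they do not exist yet; safe to call repeatedly."""
    conn.executescript(SCHEMA)


def find_entry_id(conn, key_type, key_value):
    """Look up an existing entry by one identity key; None if unseen."""
    row = conn.execute(
        "SELECT entry_id FROM entry_identities WHERE key_type = ? AND key_value = ?",
        (key_type, key_value),
    ).fetchone()
    return row[0] if row else None
```

The UNIQUE constraint on (key_type, key_value) is the piece that would make repeated runs dedupe-safe at the storage level: a second insert of the same identity fails instead of creating a duplicate.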
Configurable Parameters
db_path (environment variable AI_RSS_DB_PATH; an absolute path is recommended in multi-agent runtimes)
opml_path
feed_urls
max_feeds_per_run
max_items_per_feed
user_agent
seen_ttl_days
enable_conditional_get
Example config:
assets/config.example.json
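The enable_conditional_get parameter, together with the stored etag and last_modified fields, points at standard HTTP conditional requests. The sketch below shows the general mechanism only. Both function names and the default user agent string are hypothetical, and the actual script may delegate this to feedparser, whose parse function accepts etag and modified arguments.

```python
def conditional_headers(etag, last_modified, user_agent="rss-fetch-sketch/0.1"):
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {"User-Agent": user_agent}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def handle_response(status, stored_state, response_headers):
    """Return (should_parse, new_state) for one feed response.

    On 304 the feed is unchanged: skip entry parsing and keep the stored
    validators. Otherwise parse, and remember the validators the server sent.
    """
    if status == 304:
        return False, dict(stored_state)
    new_state = {
        "etag": response_headers.get("ETag"),
        "last_modified": response_headers.get("Last-Modified"),
    }
    return True, new_state
```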
Error and Boundary Handling
Feed HTTP/network failure: keep syncing other feeds and record last_error.
Feed 304 Not Modified: skip entry parsing and keep state.
Missing guid and link: use hashed fallback identity and set match_confidence=low.
Dependency missing (feedparser): return install guidance.
Final Output Checklist (Required)
core goal
trigger conditions
input requirements
metadata schema
dedupe and sync rules
command workflow
configurable parameters
error handling
Use the following simplified checklist verbatim when the user requests it:
核心目标
输入需求
触发条件
元数据模型
去重与同步规则
命令流程
可配置参数
错误处理
References
references/input-model.md
references/output-rules.md
references/time-range-rules.md
Assets
assets/hn-popular-blogs-2025.opml (candidate feed pool)
assets/config.example.json
Scripts
scripts/rss_subscribe.py