google-news

Installs: 147
Rank: #5867

Install

npx skills add https://github.com/outsharp/shipp-skills --skill google-news

# Google News RSS API

Google News is a free news aggregator that collects headlines from thousands of publishers around the world. Google exposes its feeds via public RSS 2.0 endpoints that require no authentication or API key.

## Base URL

    https://news.google.com/rss

All feed URLs are built by appending paths and query parameters to this base.

## Query Parameters

Every feed URL accepts the following query parameters to control region and language:

| Parameter | Required | Description | Example |
|---|---|---|---|
| `hl` | Yes | Interface language / locale code | `en-US`, `fr`, `de`, `ja`, `pt-BR`, `es-419` |
| `gl` | Yes | Country / geographic location (ISO 3166-1 alpha-2) | `US`, `GB`, `IN`, `DE`, `JP`, `BR` |
| `ceid` | Yes | Compound locale key in the form `{gl}:{language}` | `US:en`, `GB:en`, `DE:de`, `JP:ja`, `BR:pt-419` |

**Important:** All three parameters should be consistent. Mismatched values may return unexpected or empty results.

## Supported Locations (Validated)

The following locations have been tested and confirmed to return valid RSS feeds (HTTP 200):

| Location | `hl` | `gl` | `ceid` | Example URL |
|---|---|---|---|---|
| 🇺🇸 United States | `en-US` | `US` | `US:en` | `https://news.google.com/rss?hl=en-US&gl=US&ceid=US:en` |
| 🇬🇧 United Kingdom | `en-GB` | `GB` | `GB:en` | `https://news.google.com/rss?hl=en-GB&gl=GB&ceid=GB:en` |
| 🇮🇳 India | `en-IN` | `IN` | `IN:en` | `https://news.google.com/rss?hl=en-IN&gl=IN&ceid=IN:en` |
| 🇦🇺 Australia | `en-AU` | `AU` | `AU:en` | `https://news.google.com/rss?hl=en-AU&gl=AU&ceid=AU:en` |
| 🇨🇦 Canada | `en-CA` | `CA` | `CA:en` | `https://news.google.com/rss?hl=en-CA&gl=CA&ceid=CA:en` |
| 🇩🇪 Germany | `de` | `DE` | `DE:de` | `https://news.google.com/rss?hl=de&gl=DE&ceid=DE:de` |
| 🇫🇷 France | `fr` | `FR` | `FR:fr` | `https://news.google.com/rss?hl=fr&gl=FR&ceid=FR:fr` |
| 🇯🇵 Japan | `ja` | `JP` | `JP:ja` | `https://news.google.com/rss?hl=ja&gl=JP&ceid=JP:ja` |
| 🇧🇷 Brazil | `pt-BR` | `BR` | `BR:pt-419` | `https://news.google.com/rss?hl=pt-BR&gl=BR&ceid=BR:pt-419` |
| 🇲🇽 Mexico | `es-419` | `MX` | `MX:es-419` | `https://news.google.com/rss?hl=es-419&gl=MX&ceid=MX:es-419` |
| 🇮🇱 Israel | `en-IL` | `IL` | `IL:en` | `https://news.google.com/rss?hl=en-IL&gl=IL&ceid=IL:en` |

## Feed Types

### 1. Top Stories (Headlines)

Returns the current top stories for a given location.

URL pattern:

    https://news.google.com/rss?hl={hl}&gl={gl}&ceid={gl}:{lang}

Example (US top stories):

    https://news.google.com/rss?hl=en-US&gl=US&ceid=US:en

### 2. Topic Feeds

Returns articles for a specific news topic / section.

URL pattern:

    https://news.google.com/rss/topics/{TOPIC_ID}?hl={hl}&gl={gl}&ceid={gl}:{lang}

Known Topic IDs (English, US):

| Topic | Topic ID |
|---|---|
| World | `CAAqJggKIiBDQkFTRWdvSUwyMHZNRGx1YlY4U0FtVnVHZ0pWVXlnQVAB` |
| Nation / U.S. | `CAAqIggKIhxDQkFTRHdvSkwyMHZNRGxqTjNjU0FtVnVLQUFQAQ` |
| Business | `CAAqJggKIiBDQkFTRWdvSUwyMHZNRGx6TVdZU0FtVnVHZ0pWVXlnQVAB` |
| Technology | `CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB` |
| Entertainment | `CAAqJggKIiBDQkFTRWdvSUwyMHZNREpxYW5RU0FtVnVHZ0pWVXlnQVAB` |
| Sports | `CAAqJggKIiBDQkFTRWdvSUwyMHZNRFp1ZEdvU0FtVnVHZ0pWVXlnQVAB` |
| Science | `CAAqJggKIiBDQkFTRWdvSUwyMHZNRFp0Y1RjU0FtVnVHZ0pWVXlnQVAB` |
| Health | `CAAqIQgKIhtDQkFTRGdvSUwyMHZNR3QwTlRFU0FtVnVLQUFQAQ` |

Example (Technology news, US):

    https://news.google.com/rss/topics/CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB?hl=en-US&gl=US&ceid=US:en

**Note:** Topic IDs are base64-encoded protocol buffer strings. They can differ by language/region. The IDs above are for `en-US`. To find topic IDs for other locales, inspect the RSS link on the Google News website for that locale.

### 3. Keyword / Search Feeds

Returns articles matching a search query.

URL pattern:

    https://news.google.com/rss/search?q={query}&hl={hl}&gl={gl}&ceid={gl}:{lang}

Query modifiers:

| Modifier | Description | Example |
|---|---|---|
| `+` or space | AND (default) | `q=artificial+intelligence` |
| `OR` | OR operator | `q=Tesla+OR+SpaceX` |
| `-` | Exclude term | `q=Apple+-fruit` |
| `"..."` | Exact phrase (URL-encode the quotes) | `q=%22climate+change%22` |
| `when:7d` | Time filter: last N days/hours | `q=Bitcoin+when:7d` |
| `when:1h` | Time filter: last 1 hour | `q=breaking+news+when:1h` |
| `after:YYYY-MM-DD` | Articles after a date | `q=Olympics+after:2024-07-01` |
| `before:YYYY-MM-DD` | Articles before a date | `q=Olympics+before:2024-08-15` |
| `site:` | Restrict to a domain | `q=AI+site:reuters.com` |

Example (search for "artificial intelligence" in the last 7 days):

    https://news.google.com/rss/search?q=artificial+intelligence+when:7d&hl=en-US&gl=US&ceid=US:en

## RSS Response Format

All feeds return RSS 2.0 XML. Here is the general structure:

    <rss xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
      <channel>
        <generator>NFE/5.0</generator>
        <title>Top stories - Google News</title>
        <link>https://news.google.com/?hl=en-US&amp;gl=US&amp;ceid=US:en</link>
        <language>en-US</language>
        <webMaster>news-webmaster@google.com</webMaster>
        <copyright>...</copyright>
        <lastBuildDate>Wed, 18 Feb 2026 20:50:00 GMT</lastBuildDate>
        <item>
          <title>Article headline - Publisher Name</title>
          <link>https://news.google.com/rss/articles/...</link>
          <guid isPermaLink="true">https://news.google.com/rss/articles/...</guid>
          <pubDate>Wed, 18 Feb 2026 19:05:07 GMT</pubDate>
          <description>
            <ol>
              <li><a href="...">Article Title</a> <font color="#6f6f6f">Publisher</font></li>
              ...
            </ol>
          </description>
          <source url="https://publisher-domain.com">Publisher Name</source>
        </item>
      </channel>
    </rss>
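
As a rough sketch of how the structure above maps to code, the following uses only Python's standard library (`xml.etree.ElementTree` and `email.utils`) on an inline sample instead of a live request. The sample XML, the helper name `parse_items`, and the keys of the returned dictionaries are assumptions made for illustration; they are not part of the feed itself.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample shaped like the structure above (not real feed data).
SAMPLE = """<rss version="2.0">
  <channel>
    <title>Top stories - Google News</title>
    <item>
      <title>Article headline - Publisher Name</title>
      <link>https://news.google.com/rss/articles/...</link>
      <guid isPermaLink="true">https://news.google.com/rss/articles/...</guid>
      <pubDate>Wed, 18 Feb 2026 19:05:07 GMT</pubDate>
      <source url="https://publisher-domain.com">Publisher Name</source>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return one dict per <item>, splitting the publisher off the title."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        raw_title = item.findtext("title", default="")
        # The title ends with " - Publisher"; split once from the right so
        # dashes inside the headline itself are preserved.
        headline, sep, publisher = raw_title.rpartition(" - ")
        if not sep:  # no separator: treat the whole title as the headline
            headline, publisher = raw_title, ""
        pub_date = item.findtext("pubDate")
        source = item.find("source")
        items.append({
            "headline": headline,
            "publisher": publisher,
            "link": item.findtext("link", default=""),
            # pubDate uses the RFC 2822 date format, which email.utils parses.
            "published": parsedate_to_datetime(pub_date) if pub_date else None,
            "source_url": source.get("url") if source is not None else None,
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE):
        print(entry["published"], entry["headline"], "|", entry["publisher"])
```

To try it against a real feed, pass the downloaded response body to `parse_items` in place of `SAMPLE`.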

### Key Fields per `<item>`

| Field | Description |
|---|---|
| `<title>` | Headline text followed by ` - Publisher Name` |
| `<link>` | Google News redirect URL. Visiting it in a browser redirects to the actual article. |
| `<guid>` | Unique identifier (same as `<link>`) |
| `<pubDate>` | Publication date in RFC 2822 format |
| `<description>` | HTML snippet containing an ordered list (`<ol>`) of related/clustered articles with links and publisher names |
| `<source url="...">` | Publisher name and homepage URL |

## Common Patterns

### Fetch Top Headlines (curl + grep)

    # Requires GNU grep: -P enables Perl-compatible regex, \K drops the tag from the match.
    curl -s "https://news.google.com/rss?hl=en-US&gl=US&ceid=US:en" \
      | grep -oP '<title>\K[^<]+'

### Fetch Top Headlines (Python)

    import feedparser

    feed = feedparser.parse("https://news.google.com/rss?hl=en-US&gl=US&ceid=US:en")

    # Print the publication date, title, and link of every entry.
    for entry in feed.entries:
        print(f"{entry.published} - {entry.title}")
        print(f"  Link: {entry.link}")
        print()

### Fetch Topic Feed (curl + xmllint)

    TOPIC="CAAqJggKIiBDQkFTRWdvSUwyMHZNRGRqTVhZU0FtVnVHZ0pWVXlnQVAB"
    curl -s "https://news.google.com/rss/topics/${TOPIC}?hl=en-US&gl=US&ceid=US:en" \
      | xmllint --xpath '//item/title/text()' -

### Search for Articles (Python)

    import feedparser
    import urllib.parse

    # Percent-encode the query so spaces and ":" are safe inside the URL.
    query = urllib.parse.quote("artificial intelligence when:7d")
    url = f"https://news.google.com/rss/search?q={query}&hl=en-US&gl=US&ceid=US:en"

    feed = feedparser.parse(url)
    for entry in feed.entries[:10]:
        print(f"• {entry.title}")

### Fetch News for a Specific Location (Node.js)

    const https = require("https");
    const { parseStringPromise } = require("xml2js");

    const url = "https://news.google.com/rss?hl=en-GB&gl=GB&ceid=GB:en";

    https.get(url, (res) => {
      let data = "";
      // Collect the response body, then parse it once the stream ends.
      res.on("data", (chunk) => (data += chunk));
      res.on("end", async () => {
        const result = await parseStringPromise(data);
        const items = result.rss.channel[0].item || [];
        items.slice(0, 10).forEach((item) => {
          console.log(item.title[0]);
        });
      });
    });

### Extract Related Articles from Description (Python)

    import feedparser
    from html.parser import HTMLParser


    class RelatedParser(HTMLParser):
        """Collect the text and href of every <a> in a description snippet."""

        def __init__(self):
            super().__init__()
            self.articles = []
            self._in_a = False
            self._href = ""
            self._text = ""

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self._in_a = True
                self._href = dict(attrs).get("href", "")
                self._text = ""

        def handle_endtag(self, tag):
            if tag == "a" and self._in_a:
                self.articles.append({"title": self._text, "link": self._href})
                self._in_a = False

        def handle_data(self, data):
            if self._in_a:
                self._text += data


    feed = feedparser.parse("https://news.google.com/rss?hl=en-US&gl=US&ceid=US:en")

    for entry in feed.entries[:3]:
        print(f"\n=== {entry.title} ===")
        parser = RelatedParser()
        parser.feed(entry.description)
        for art in parser.articles:
            print(f"  • {art['title']}")
            print(f"    {art['link']}")

### Build a Multi-Region News Aggregator (Python)

    import feedparser

    # Query strings for each region, taken from the validated locations table.
    REGIONS = {
        "US": "hl=en-US&gl=US&ceid=US:en",
        "UK": "hl=en-GB&gl=GB&ceid=GB:en",
        "DE": "hl=de&gl=DE&ceid=DE:de",
        "JP": "hl=ja&gl=JP&ceid=JP:ja",
        "BR": "hl=pt-BR&gl=BR&ceid=BR:pt-419",
    }

    for region, params in REGIONS.items():
        feed = feedparser.parse(f"https://news.google.com/rss?{params}")
        print(f"\n--- {region} Top 3 ---")
        for entry in feed.entries[:3]:
            print(f"  • {entry.title}")

### Monitor a Topic with Polling (bash)

    #!/usr/bin/env bash
    # Requires GNU grep (-P). Assumes each <item> sits on a single line of the feed.
    FEED="https://news.google.com/rss/search?q=breaking+news+when:1h&hl=en-US&gl=US&ceid=US:en"
    SEEN_FILE="/tmp/gnews_seen.txt"
    touch "$SEEN_FILE"

    while true; do
      # Fetch once per cycle and emit one <item> per line, so the guid and
      # title are always read from the same item.
      curl -s "$FEED" | grep -oP '<item>.*?</item>' | while read -r item; do
        guid=$(grep -oP '<guid[^>]*>\K[^<]+' <<< "$item")
        if [ -n "$guid" ] && ! grep -qxF "$guid" "$SEEN_FILE"; then
          echo "$guid" >> "$SEEN_FILE"
          title=$(grep -oP '<title>\K[^<]+' <<< "$item" | head -1)
          echo "[NEW] $title"
        fi
      done
      sleep 120
    done

## Resolving Google News Redirect URLs

Article links in the RSS feed point to `https://news.google.com/rss/articles/...`, which redirect (HTTP 302/303) to the actual publisher URL. The commands below rely on the link answering with an HTTP redirect; if the resolved URL is still on `news.google.com`, the link was not resolved this way. To resolve the final URL:

### curl

    # -L follows redirects; -w prints the last URL reached.
    curl -Ls -o /dev/null -w '%{url_effective}' \
      "https://news.google.com/rss/articles/CBMiWkFV..."

### Python

    import requests

    response = requests.head(
        "https://news.google.com/rss/articles/CBMiWkFV...",
        allow_redirects=True,
        timeout=10,
    )
    print(response.url)  # final publisher URL

## Rate Limits

Google does not publish official rate limits for the RSS feeds. Based on community observations:

| Guideline | Recommendation |
|---|---|
| Polling interval | ≥ 60 seconds between requests for the same feed |
| Concurrent requests | Keep below ~10 concurrent connections |
| Burst behavior | Rapid bursts may trigger HTTP 429 or CAPTCHA challenges |
| User-Agent | Use a descriptive User-Agent; empty or bot-like strings may be blocked |

If you receive an HTTP 429 response, back off exponentially (e.g., 1 min → 2 min → 4 min).

## Error Handling

| HTTP Status | Meaning | Action |
|---|---|---|
| 200 | Success | Parse the RSS XML |
| 301/302 | Redirect | Follow the redirect (most HTTP clients do this automatically) |
| 404 | Feed not found | Check the URL, topic ID, or locale parameters |
| 429 | Rate limited | Back off and retry after a delay |
| 5xx | Server error | Retry with exponential backoff |

## Tips

- **No auth needed:** all feeds are fully public. Start fetching immediately.
- **Use `feedparser` in Python:** it handles RSS parsing, date normalization, and encoding edge cases.
- **Combine search modifiers:** for example, `q=Tesla+site:reuters.com+when:30d` for precise results.
- **Topic IDs are locale-specific:** an English topic ID may not work with `hl=de`. Inspect the Google News page in that locale to find the correct ID.
- **The `<description>` field is HTML:** it contains clustered/related articles as an `<ol>` list. Parse the HTML to extract multiple sources per story.
- **The `<title>` includes the publisher:** the format is `Headline text - Publisher Name`. Split on ` - ` (space-dash-space) from the right to separate them.
- **Feed results are limited:** Google typically returns ~100 items per feed. Use search with date filters to paginate through older results.
- **Respect the copyright notice:** Google's RSS feeds are intended for personal, non-commercial use in feed readers. Review Google's terms for other uses.
- **Resolve redirects lazily:** only resolve the Google redirect URL to the publisher URL when you actually need the final link. This saves requests.
- **Set a proper User-Agent:** e.g., `User-Agent: MyNewsBot/1.0 (contact@example.com)`. Some environments may get blocked without one.

## Changelog

- 0.1.0: Initial release with top stories, topic feeds, search feeds, multi-region support, and common usage patterns.