google-cloud-recipe-networking-observability

Installs: 2.4K
Rank: #2274

Install

npx skills add https://github.com/google/skills --skill google-cloud-recipe-networking-observability
Google Cloud Networking Observability Expert
🛑 Core Directive: Results First
Identify the Primary Source: Quickly determine whether the user needs firewall logs, threat logs, Cloud NAT logs, VPC Flow Logs, or metrics.
Execute & Present: Perform the minimum required query to get a direct answer.
Definitive Termination: Once you identify the requested data, regardless of the value (including 0, null, or "No traffic"), present the finding and call the finish tool in the same turn. Do NOT attempt to find "active" or "busier" resources to provide a "better" answer unless specifically instructed to troubleshoot a resource that is expected to be busy.
Log & Telemetry Overview
Threat Logs: Specialized logs from Cloud Firewall Plus and Cloud IDS that identify malicious traffic patterns (for example, SQL injection or malware) using deep packet inspection.
VPC Flow Logs: Capture a sample of IP traffic to and from network interfaces. Use for traffic analysis, volume trends, and top talkers.
Firewall Logs: Record connection attempts matched by firewall rules. Use to identify "DENY" events or verify "ALLOW" rules.
Cloud NAT Logs: Record NAT translations. Use to audit traffic going through NAT gateways or troubleshoot port exhaustion.
Networking Metrics: Aggregated time-series data for throughput, RTT (latency), and packet loss. Use for historical trends and performance monitoring.
Connectivity Tests: Static analysis tool for path diagnostics. Use to identify firewall or routing misconfigurations between endpoints.
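As a rough illustration of how one of these sources maps to a query, the sketch below composes a Cloud Logging filter for denied firewall connections and prints the gcloud command for review instead of running it. The project ID my-project is a placeholder, and the log IDs (firewall, vpc_flows, nat_flows under compute.googleapis.com) and the jsonPayload.disposition field are assumptions based on the standard log formats; confirm them against real entries in the project.

```shell
#!/bin/sh
# Hypothetical placeholder; substitute the real project ID.
PROJECT_ID="my-project"

# Build a full log name for a compute.googleapis.com log ID.
# Assumed IDs: firewall, vpc_flows, nat_flows.
log_name() {
  printf 'projects/%s/logs/compute.googleapis.com%%2F%s' "$PROJECT_ID" "$1"
}

# Filter: firewall log entries whose disposition is DENIED.
FILTER="logName=\"$(log_name firewall)\" AND jsonPayload.disposition=\"DENIED\""

# Print the command for review; it is not executed here.
CMD="gcloud logging read '${FILTER}' --project=${PROJECT_ID} --freshness=1h --limit=20 --format=json"
printf '%s\n' "$CMD"
```

Swapping firewall for vpc_flows or nat_flows in log_name gives the corresponding filter prefix for the other two log types.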
Procedures
0. Log Source Preference
ALWAYS check for BigQuery linked datasets (for example, big_query_linked_dataset, _AllLogs) before using Cloud Logging for high-volume analysis or aggregations. This is the preferred method for finding trends or top-blocking rules.
Metadata Awareness (BigQuery): Subnetworks may be configured with EXCLUDE_ALL_METADATA, causing VM names to be NULL in VPC Flow Logs. If a query by VM name returns nothing, retry using the internal IP address (jsonPayload.connection.src_ip).
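The VM-name-to-IP fallback described above might look like the following sketch, which only prints the two candidate queries. Everything specific here is an assumption: the project, dataset, VM name, and IP are placeholders, and the column names (log_id, timestamp, json_payload) follow the usual linked-dataset _AllLogs layout, which should be confirmed with bq show --schema before use.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute real values.
PROJECT_ID="my-project"
DATASET_ID="big_query_linked_dataset"
VM_NAME="web-1"
SRC_IP="10.0.0.5"

# Print a flow-count query.
# $1 = field path under json_payload, $2 = value to match.
flow_query() {
  cat <<EOF
SELECT COUNT(*) AS flow_count
FROM \`${PROJECT_ID}.${DATASET_ID}._AllLogs\`
WHERE log_id = 'compute.googleapis.com/vpc_flows'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
  AND JSON_VALUE(json_payload.$1) = '$2'
EOF
}

# First attempt: match on the VM name (NULL when metadata is excluded).
flow_query src_instance.vm_name "$VM_NAME"

# Fallback: match on the internal source IP instead.
flow_query connection.src_ip "$SRC_IP"
```

If the first query returns zero rows on a subnet with EXCLUDE_ALL_METADATA, the second query is the one expected to find the traffic.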
1. Tool Selection & Discovery
MCP Servers First: Use Cloud Monitoring MCP, BigQuery MCP, or Cloud Logging MCP.
Resource Discovery: If a user-specified resource (for example, NAT gateway, VPN tunnel) is not found in metrics/logs:
  Use run_shell_command with gcloud to list resources in the project.
  Search Cloud Logging MCP for the resource name to find correct labels.
CLI Fallback: Use gcloud or bq only if MCP servers are unavailable. DO NOT use gcloud monitoring; it is restricted. Go directly to the curl templates in metrics-analysis.md.
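For the resource discovery step, the listing commands could be assembled along these lines. This is a sketch that prints the gcloud commands without running them; the project, region, and router names are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute real values.
PROJECT_ID="my-project"
REGION="us-central1"
ROUTER="nat-router"

# Print the gcloud listing command for a given resource kind.
list_cmd() {
  case "$1" in
    routers) printf 'gcloud compute routers list --project=%s\n' "$PROJECT_ID" ;;
    nats)    printf 'gcloud compute routers nats list --router=%s --region=%s --project=%s\n' "$ROUTER" "$REGION" "$PROJECT_ID" ;;
    vpn)     printf 'gcloud compute vpn-tunnels list --project=%s\n' "$PROJECT_ID" ;;
    *)       printf 'unknown resource kind: %s\n' "$1" >&2; return 1 ;;
  esac
}

# NAT gateways live under Cloud Routers, so list routers first.
list_cmd routers
list_cmd nats
list_cmd vpn
```

Listing routers before NAT gateways reflects that a NAT gateway is addressed through its router and region, so both are needed to find the correct resource labels.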
2. Schema Verification & Error Recovery
If a BigQuery query fails with an 'Unrecognized name' error or schema mismatch:
1. Validate Schema: Run bq show --schema --format=json {project_id}:{dataset_id}.{table_id} to verify field names and casing (for example, jsonPayload versus json_payload).
2. Dry Run: Before executing a corrected query, use bq query --use_legacy_sql=false --dry_run "{query_text}" to verify field references without incurring cost or execution time.
3. Retry: Apply identified fixes to the original query and execute.
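With the placeholders filled in, the recovery sequence above could be prepared as in this sketch, which prints the two bq commands for review. The project, dataset, and table values are hypothetical. One assumption worth flagging: the printed dry-run command wraps the query in single quotes rather than double quotes, because the backticks around the table name would otherwise be treated as command substitution by the shell.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute real values.
PROJECT_ID="my-project"
DATASET_ID="big_query_linked_dataset"
TABLE_ID="_AllLogs"

# Step 1: inspect the schema to confirm field names and casing.
SCHEMA_CMD="bq show --schema --format=json ${PROJECT_ID}:${DATASET_ID}.${TABLE_ID}"

# Step 2: dry-run the corrected query (no cost, no execution).
QUERY="SELECT json_payload FROM \`${PROJECT_ID}.${DATASET_ID}.${TABLE_ID}\` LIMIT 1"
DRY_RUN_CMD="bq query --use_legacy_sql=false --dry_run '${QUERY}'"

# Print both commands for review; nothing is executed here.
printf '%s\n' "$SCHEMA_CMD"
printf '%s\n' "$DRY_RUN_CMD"
```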
3. Analysis Guides (Read Only When Needed)
For detailed SQL patterns, field definitions, and advanced troubleshooting, read
the corresponding reference file:
Threat Log Analysis: references/threat-analysis.md
VPC Flow Analysis: references/vpc-flow-analysis.md
Cloud NAT Analysis: references/cloud-nat-analysis.md
Firewall Rule Analysis: references/firewall-analysis.md
Networking Metrics: references/metrics-analysis.md
Connectivity Test Analysis: references/connectivity-tests.md
Boundaries (CRITICAL)
ALWAYS present the direct answer as soon as it is identified.
NEVER run more than 2 exploratory queries before showing results.
NEVER perform secondary verification (for example, don't check VPC flows after finding a firewall block) without explicit user permission.
ALWAYS print the generated SQL for review before execution.
ALWAYS include a link to the Flow Analyzer in the Google Cloud Console.
NEVER query a second data source (such as BigQuery logs) if the primary source (for example, Cloud Monitoring metrics) has already provided a conclusive answer.
DO NOT compare metrics and logs to "verify" accuracy unless the user specifically asks why they differ.
NO DISCREPANCY LOOPS: If Tool A provides a result (such as 80,000 counts) and Tool B provides a different result (for example, 1,000 counts), DO NOT initiate a deep dive to explain the difference. Present the result from the primary tool and STOP.
ALWAYS perform time-range calculations (such as "12 hours ago") during the first turn to save steps.
Conclusive Acceptance of Inactivity: Treat a result of "0", "0 traffic", "No data found", or "No records found" as a conclusive finding for the requested timeframe and resource. You MUST report this as the definitive state and terminate immediately.
Standardized Discovery Path: For all "Top-N" or volume-based discovery tasks (for example, "highest traffic," "most hits," "top talkers"), you MUST use BigQuery aggregation on _AllLogs datasets. Manual aggregation of individual time-series points using the Monitoring API is forbidden due to step inefficiency.
Ban on Auxiliary Scripting: Execute all data retrieval and parsing logic as direct tool calls (bq, curl, gcloud). Do NOT write or execute local shell scripts (.sh) or python files, as these introduce avoidable environment and permission errors that lead to investigation timeouts.
Discovery Efficiency: For volume analysis (for example, "how many connections" or "top IPs by bytes"), BigQuery aggregation on VPC Flow logs (_AllLogs) is the Primary Source of Truth. If BigQuery data is available, it is conclusive. Do NOT query Monitoring API to "double check" BigQuery counts.
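To make the time-range and Top-N rules concrete, here is a sketch that resolves "12 hours ago" to fixed epoch bounds up front and prints a top-talkers aggregation for review before anything is executed. The project and dataset names are placeholders, and the column and field names (log_id, timestamp, json_payload.connection.src_ip, json_payload.bytes_sent) are assumptions about the linked dataset's schema that should be checked before running the query.

```shell
#!/bin/sh
# Hypothetical placeholders; substitute real values.
PROJECT_ID="my-project"
DATASET_ID="big_query_linked_dataset"

# Print a top-talkers query over a fixed window.
# $1 = window end as epoch seconds, $2 = window length in hours.
top_talkers_sql() {
  end_epoch=$1
  start_epoch=$(( $1 - $2 * 3600 ))
  cat <<EOF
SELECT
  JSON_VALUE(json_payload.connection.src_ip) AS src_ip,
  SUM(CAST(JSON_VALUE(json_payload.bytes_sent) AS INT64)) AS total_bytes
FROM \`${PROJECT_ID}.${DATASET_ID}._AllLogs\`
WHERE log_id = 'compute.googleapis.com/vpc_flows'
  AND timestamp >= TIMESTAMP_SECONDS(${start_epoch})
  AND timestamp < TIMESTAMP_SECONDS(${end_epoch})
GROUP BY src_ip
ORDER BY total_bytes DESC
LIMIT 10
EOF
}

# Resolve "now" once, then print the SQL for the last 12 hours.
top_talkers_sql "$(date -u +%s)" 12
```

Passing the window end as an argument keeps the bounds explicit in the printed SQL, so the reviewed query and the executed query cover exactly the same interval.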