holmesgpt-skill

Installs: 34
Rank: #19976

Install

npx skills add https://github.com/julianobarbosa/claude-code-skills --skill holmesgpt-skill
HolmesGPT Skill
AI-powered troubleshooting for Kubernetes and cloud-native environments.
Overview
HolmesGPT is a CNCF Sandbox project that connects AI models with live
observability data to investigate infrastructure problems, find root
causes, and suggest remediations. It operates with read-only access and
respects RBAC permissions, making it safe for production environments.
Quick Reference
Topic            Reference
Installation     references/installation.md
Configuration    references/configuration.md
Data Sources     references/data-sources.md
Commands         references/commands.md
Troubleshooting  references/troubleshooting.md
HTTP API         references/http-api.md
Integrations     references/integrations.md
Key Features
- Root Cause Analysis: Investigates alerts and cluster issues
- Multi-Source Integration: 30+ toolsets (K8s, Prometheus, Grafana)
- Alert Integration: AlertManager, PagerDuty, OpsGenie, Jira, Slack
- Interactive Mode: Troubleshooting with /run, /show, /clear
- Custom Toolsets: Extend with proprietary tools via YAML configuration
- CI/CD Integration: Automated deployment failure investigation
Installation Quick Start
CLI (Homebrew)

    brew tap robusta-dev/homebrew-holmesgpt
    brew install holmesgpt
    export ANTHROPIC_API_KEY="your-key"  # or OPENAI_API_KEY
    holmes ask "what pods are unhealthy?"

Kubernetes (Helm)

    helm repo add robusta https://robusta-charts.storage.googleapis.com
    helm repo update
    helm install holmesgpt robusta/holmes -f values.yaml

Docker

    docker run -it --net=host \
      -e OPENAI_API_KEY="your-key" \
      -v ~/.kube/config:/root/.kube/config \
      us-central1-docker.pkg.dev/genuine-flight-317411/devel/holmes \
      ask "what pods are crashing?"

Essential Commands

    # Basic investigation
    holmes ask "what pods are unhealthy and why?"
    holmes ask "why is my deployment failing?"

    # Interactive mode
    holmes ask "investigate issue" --interactive

    # Alert investigation
    holmes investigate alertmanager --alertmanager-url http://localhost:9093
    holmes investigate pagerduty --pagerduty-api-key <KEY> --update

    # With file context
    holmes ask "summarize the key points" -f ./logs.txt

    # CI/CD integration
    holmes ask "why did deployment fail?" --destination slack --slack-token <TOKEN>
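The commands above can also be assembled from a script, for example in a CI job. The following is a minimal Python sketch, not part of HolmesGPT: `configured_providers` and `build_ask_command` are hypothetical helper names, and the only flags used are ones that appear in this README (`--interactive`, `-f`). It assumes the `holmes` binary is on `PATH` and checks only the two API-key variables named in the installation section.

```python
import shlex
from typing import Mapping, Optional

# API-key variables named in the installation section (assumed sufficient here).
PROVIDER_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY")


def configured_providers(env: Mapping[str, str]) -> list[str]:
    """Return the provider key variables that are set to a non-empty value."""
    return [name for name in PROVIDER_KEYS if env.get(name)]


def build_ask_command(
    question: str,
    *,
    interactive: bool = False,
    file: Optional[str] = None,
) -> list[str]:
    """Build an argv list for `holmes ask` using only flags shown in this README."""
    if not question.strip():
        raise ValueError("question must not be empty")
    argv = ["holmes", "ask", question]
    if interactive:
        argv.append("--interactive")
    if file is not None:
        argv += ["-f", file]
    return argv


# Print the command instead of running it, so the sketch is safe anywhere.
cmd = build_ask_command("summarize the key points", file="./logs.txt")
print(shlex.join(cmd))
```

To actually invoke the CLI, the list could be passed to `subprocess.run(cmd, check=True)` after confirming that `configured_providers(os.environ)` is non-empty. Building an argv list rather than a shell string avoids quoting problems when the question contains quotes or spaces.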

Supported AI Providers
Provider       Environment Variable  Models
Anthropic      ANTHROPIC_API_KEY     Sonnet 4, Opus 4.5
OpenAI         OPENAI_API_KEY        GPT-4.1, GPT-4o
Azure OpenAI   AZURE_API_KEY         GPT-4.1
AWS Bedrock    AWS credentials       Claude 3.5 Sonnet
Google Gemini  GEMINI_API_KEY        Gemini 1.5 Pro
Vertex AI      VERTEXAI_PROJECT      Gemini 1.5 Pro
Ollama         Local install         Llama 3.1, Mistral
Basic Helm Values Structure

    # values.yaml for Kubernetes deployment
    image:
      repository: robustadev/holmes
      tag: latest
    env:
      - name: ANTHROPIC_API_KEY
        valueFrom:
          secretKeyRef:
            name: holmesgpt-secrets
            key: anthropic-api-key

    # Model configuration
    modelList:
      sonnet:
        api_key: "{{ env.ANTHROPIC_API_KEY }}"
        model: anthropic/claude-sonnet-4-20250514
        temperature: 0

    # Toolsets to enable
    toolsets:
      kubernetes/core:
        enabled: true
      kubernetes/logs:
        enabled: true
      prometheus/metrics:
        enabled: true

    # Resources
    resources:
      requests:
        memory: "1024Mi"
        cpu: "100m"
      limits:
        memory: "1024Mi"

    # RBAC (read-only by default)
    createServiceAccount: true

Interactive Mode Commands
Command   Description
/clear    Reset context when changing topics
/run      Execute custom commands and share output with AI
/show     Display complete tool outputs
/context  Review accumulated investigation information
Custom Toolset Example

    # custom-toolset.yaml
    toolsets:
      my-custom-tool:
        description: "Custom diagnostic tool"
        tools:
          - name: check_service_health
            description: "Check health of a specific service"
            command: |
              curl -s http://{{ service_name }}.{{ namespace }}.svc.cluster.local/health
            parameters:
              - name: service_name
                description: "Name of the service"
              - name: namespace
                description: "Kubernetes namespace"

Use with:

    holmes ask "check health" -t custom-toolset.yaml

Kubernetes Annotations for Integration

    # Add to Services/Deployments for HolmesGPT context
    metadata:
      annotations:
        holmesgpt.dev/runbook: |
          This service handles payment processing.
          Common issues: database connectivity, API rate limits.
          Check: kubectl logs -l app=payment-service
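
One way to attach such a runbook to an existing workload is a JSON merge patch. The sketch below is illustrative Python, not a HolmesGPT API: `runbook_patch` is a hypothetical helper, and only the `holmesgpt.dev/runbook` annotation key is taken from the example above. It builds the JSON body that could be handed to `kubectl patch` with `--type merge -p`.

```python
import json

RUNBOOK_ANNOTATION = "holmesgpt.dev/runbook"  # key taken from the example above


def runbook_patch(runbook_text: str) -> str:
    """Return a JSON merge patch that sets the runbook annotation."""
    text = runbook_text.strip()
    if not text:
        raise ValueError("runbook text must not be empty")
    # Keep one trailing newline, matching what a YAML "|" block scalar yields.
    patch = {"metadata": {"annotations": {RUNBOOK_ANNOTATION: text + "\n"}}}
    return json.dumps(patch)


patch = runbook_patch(
    "This service handles payment processing.\n"
    "Common issues: database connectivity, API rate limits.\n"
    "Check: kubectl logs -l app=payment-service\n"
)
print(patch)
```

The printed string could then be applied with something like `kubectl patch deployment payment-service --type merge -p "$PATCH"`, where the deployment name is an assumption based on the `app=payment-service` label in the example.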
Environment Variables Reference
Variable             Description            Default
HOLMES_CONFIG_PATH   Config file path       ~/.holmes/config.yaml
HOLMES_LOG_LEVEL     Log verbosity          INFO
PROMETHEUS_URL       Prometheus server URL  -
GITHUB_TOKEN         GitHub API token       -
DATADOG_API_KEY      DataDog API key        -
CONFLUENCE_BASE_URL  Confluence URL         -
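
To make the defaults in the table concrete, here is a small Python sketch that resolves the same variables from an environment mapping. It only mirrors the table; how HolmesGPT itself reads these values is not shown in this README, and `resolve_settings` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above; "-" in the table means no default.
DEFAULTS: dict[str, Optional[str]] = {
    "HOLMES_CONFIG_PATH": "~/.holmes/config.yaml",
    "HOLMES_LOG_LEVEL": "INFO",
    "PROMETHEUS_URL": None,
    "GITHUB_TOKEN": None,
    "DATADOG_API_KEY": None,
    "CONFLUENCE_BASE_URL": None,
}


def resolve_settings(env: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Overlay environment values on the documented defaults."""
    settings = {name: env.get(name) or default for name, default in DEFAULTS.items()}
    config_path = settings["HOLMES_CONFIG_PATH"]
    if config_path:
        # Expand "~" so the path can be opened directly.
        settings["HOLMES_CONFIG_PATH"] = os.path.expanduser(config_path)
    return settings


settings = resolve_settings({"HOLMES_LOG_LEVEL": "DEBUG"})
print(settings["HOLMES_LOG_LEVEL"])
```

Passing `os.environ` instead of the literal dictionary would resolve the values for the current shell session.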
Best Practices
- Use Specific Queries: Include namespace, deployment name, symptoms
- Start with Claude Sonnet 4.0/4.5: Best accuracy for complex investigations
- Enable Relevant Toolsets: Only enable what you need to reduce noise
- Use Interactive Mode: For complex multi-step investigations
- Set Up Runbooks: Provide context for known alert types
- CI/CD Integration: Automate deployment failure analysis
Security Considerations
- HolmesGPT uses read-only access (get, list, watch only)
- Respects existing RBAC permissions
- Never modifies, creates, or deletes resources
- API keys stored in Kubernetes Secrets
- Data not used for model training
Official Resources
- Documentation: https://holmesgpt.dev/
- GitHub: https://github.com/robusta-dev/holmesgpt
- Helm Chart: https://github.com/robusta-dev/holmesgpt/tree/master/helm/holmes
- Slack Community: Cloud Native Slack