ACCESSING CLUSTERS

CRITICAL: Always prefix kubectl and flux commands with an inline KUBECONFIG assignment. Do NOT use export or &&; the variable must be set on the same command line as the command it applies to:

    # CORRECT - inline assignment
    KUBECONFIG=~/.kube/<cluster>.yaml kubectl get pods

    # WRONG - export with && breaks in some shell contexts
    export KUBECONFIG=~/.kube/<cluster>.yaml && kubectl get pods

Cluster Context

CRITICAL: Always confirm which cluster you are targeting before running commands.

| Cluster | Purpose | Kubeconfig |
|---|---|---|
| dev | Manual testing | ~/.kube/dev.yaml |
| integration | Automated testing | ~/.kube/integration.yaml |
| live | Production | ~/.kube/live.yaml |

    KUBECONFIG=~/.kube/<cluster>.yaml kubectl <command>
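The cluster table and the inline-assignment rule can be combined in a small wrapper. The following is a minimal sketch, not part of the original tooling: the function names `kubeconfig_for` and `on_cluster` are hypothetical, and it assumes the three kubeconfig paths listed in the table.

```shell
# Hypothetical helper: map a cluster name from the table to its kubeconfig path.
# Unknown names are rejected so a typo never falls through to a default context.
kubeconfig_for() {
  case "$1" in
    dev|integration|live)
      printf '%s\n' "$HOME/.kube/$1.yaml"
      ;;
    *)
      echo "unknown cluster: $1 (expected dev, integration, or live)" >&2
      return 1
      ;;
  esac
}

# Hypothetical wrapper: run one command with KUBECONFIG set inline,
# so the variable applies to that command only and is never exported.
# Note: plain sh has no local variables, so cluster and cfg are globals.
on_cluster() {
  cluster="$1"
  shift
  cfg="$(kubeconfig_for "$cluster")" || return 1
  KUBECONFIG="$cfg" "$@"
}

# Example (not executed here):
#   on_cluster dev kubectl get pods
```

Because the cluster name is always the first argument, each invocation states its target explicitly, which supports the "confirm cluster before running commands" rule.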

Accessing Internal Services

Platform services are exposed through the internal ingress gateway over HTTPS. DNS URLs are useful for browser-based access (Grafana, Hubble UI, Longhorn UI).

OAuth2 Proxy caveat: Prometheus, Alertmanager, and some other services sit behind OAuth2 Proxy. Their DNS URLs redirect to an OAuth login page, so they cannot be used for API queries via curl. Use kubectl exec or port-forward for programmatic access instead.

| Service | Live | Auth | API Access |
|---|---|---|---|
| Prometheus | https://prometheus.internal.tomnowak.work | OAuth2 Proxy | kubectl exec or port-forward |
| Alertmanager | https://alertmanager.internal.tomnowak.work | OAuth2 Proxy | kubectl exec or port-forward |
| Grafana | https://grafana.internal.tomnowak.work | Built-in auth | Browser only |
| Hubble UI | https://hubble.internal.tomnowak.work | None | Browser |
| Longhorn UI | https://longhorn.internal.tomnowak.work | None | Browser |
| Garage Admin | https://garage.internal.tomnowak.work | None | Browser |

Domain pattern: <service>.internal.<cluster>.tomnowak.work (the live cluster omits the <cluster> segment)

- live: internal.tomnowak.work
- integration: internal.integration.tomnowak.work
- dev: internal.dev.tomnowak.work

Querying Prometheus/Alertmanager API

    # Option 1: kubectl exec (quick, no setup)
    KUBECONFIG=~/.kube/<cluster>.yaml kubectl exec -n monitoring prometheus-kube-prometheus-stack-0 -c prometheus -- \
      wget -qO- 'http://localhost:9090/api/v1/query?query=up' | jq '.data.result'

    KUBECONFIG=~/.kube/<cluster>.yaml kubectl exec -n monitoring prometheus-kube-prometheus-stack-0 -c prometheus -- \
      wget -qO- 'http://localhost:9090/api/v1/alerts' | jq '.data.alerts[] | select(.state == "firing")'

    KUBECONFIG=~/.kube/<cluster>.yaml kubectl exec -n monitoring alertmanager-kube-prometheus-stack-0 -c alertmanager -- \
      wget -qO- 'http://localhost:9093/api/v2/alerts' | jq .

    # Option 2: Port-forward (for scripts and repeated queries)
    KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
    curl -s "http://localhost:9090/api/v1/query?query=up" | jq '.data.result'

Using the helper scripts:

    # Prometheus (start port-forward first; script defaults to http://localhost:9090)
    KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/prometheus-operated 9090:9090 &
    .claude/skills/prometheus/scripts/promql.sh alerts --firing
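The port-forward commands in this section start `kubectl port-forward` in the background with `&` and leave it running after the query finishes. The sketch below shows one way to pair the forward with a single command and stop it afterwards. It is an illustration under assumptions: `with_port_forward` is a hypothetical name, `KUBECTL` and `PF_WAIT` are assumed overrides (not existing settings), and the fixed sleep is a crude stand-in for a real readiness check.

```shell
# Hypothetical helper: open a port-forward, run one command, then stop the forward.
# Usage: with_port_forward <cluster> <namespace> <target> <ports> <command...>
with_port_forward() {
  cluster="$1"
  namespace="$2"
  target="$3"
  ports="$4"
  shift 4

  # Inline KUBECONFIG assignment, as required elsewhere in this document.
  # KUBECTL is an assumed override so the sketch can be exercised without a cluster.
  KUBECONFIG="$HOME/.kube/$cluster.yaml" "${KUBECTL:-kubectl}" \
    port-forward -n "$namespace" "$target" "$ports" >/dev/null &
  pf_pid=$!

  # Crude wait for the local port to open; PF_WAIT (seconds) is an assumption.
  sleep "${PF_WAIT:-2}"

  # Run the foreground command and remember its exit status.
  status=0
  "$@" || status=$?

  # Always stop the background port-forward before returning.
  kill "$pf_pid" 2>/dev/null || true
  wait "$pf_pid" 2>/dev/null || true
  return "$status"
}

# Example (not executed here):
#   with_port_forward dev monitoring svc/prometheus-operated 9090:9090 \
#     .claude/skills/prometheus/scripts/promql.sh alerts --firing
```

Stopping the forward explicitly avoids a stale process holding the local port when the next command targets a different cluster.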

    # Loki (no HTTPRoute, so it always requires port-forward)
    KUBECONFIG=~/.kube/<cluster>.yaml kubectl port-forward -n monitoring svc/loki-headless 3100:3100 &
    export LOKI_URL=http://localhost:3100
    .claude/skills/loki/scripts/logql.sh tail '{namespace="monitoring"}' --since 15m

Common kubectl Patterns

Read-only commands used during daily operations and investigations:

| Command | Purpose |
|---|---|
| kubectl get pods -n <namespace> | List pods in a namespace |
| kubectl get pods -A | List pods across all namespaces |
| kubectl describe pod <pod> -n <namespace> | Detailed pod info with events |
| kubectl logs <pod> -n <namespace> --tail=100 | Recent logs from a pod |
| kubectl logs <pod> -n <namespace> --previous | Logs from previous container instance |
| kubectl get events -n <namespace> --sort-by='.lastTimestamp' | Recent events timeline |
| kubectl top pods -n <namespace> | CPU/memory usage per pod |
| kubectl top nodes | CPU/memory usage per node |
| kubectl get ns --show-labels | Namespace labels (network policy profiles) |
| kubectl explain <resource> | API schema reference for a resource type |

Flux GitOps Commands

Status and Reconciliation

    # Check status
    KUBECONFIG=~/.kube/<cluster>.yaml flux get all
    KUBECONFIG=~/.kube/<cluster>.yaml flux get kustomizations
    KUBECONFIG=~/.kube/<cluster>.yaml flux get helmreleases -A

    # Trigger reconciliation
    KUBECONFIG=~/.kube/<cluster>.yaml flux reconcile source git flux-system
    KUBECONFIG=~/.kube/<cluster>.yaml flux reconcile kustomization <name>
    KUBECONFIG=~/.kube/<cluster>.yaml flux reconcile helmrelease <name> -n <namespace>
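The first two reconciliation commands are often run as a pair: refresh the Git source, then reconcile a Kustomization that builds from it. A minimal sketch of that sequence follows. The function name `flux_sync` and the `FLUX` override are hypothetical, and it assumes the Git source is named flux-system as in the command above.

```shell
# Hypothetical helper: refresh the Git source, then reconcile one Kustomization.
# Stops at the first failure so a broken source is not masked by the second step.
# Usage: flux_sync <cluster> <kustomization-name>
flux_sync() {
  cluster="$1"
  name="$2"
  cfg="$HOME/.kube/$cluster.yaml"

  # FLUX is an assumed override so the sketch can be exercised without a cluster.
  # KUBECONFIG is set inline on each command rather than exported.
  KUBECONFIG="$cfg" "${FLUX:-flux}" reconcile source git flux-system || return 1
  KUBECONFIG="$cfg" "${FLUX:-flux}" reconcile kustomization "$name"
}

# Example (not executed here):
#   flux_sync integration <name>
```

The flux CLI also accepts `--with-source` on `flux reconcile kustomization`, which refreshes the source as part of the same command; the two-step form above simply makes each stage and its failure visible.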

Flux Status Interpretation

| Status | Meaning | Action |
|---|---|---|
| Ready: True | Resource is reconciled and healthy | None - operating normally |
| Ready: False | Resource failed to reconcile | Check the message/reason for details |
| Stalled: True | Resource has stopped retrying after repeated failures | Suspend/resume to reset (see sre skill) |
| Suspended: True | Resource is intentionally paused | Resume when ready: flux resume <kind> <name> |
| Reconciling | Resource is actively being applied | Wait for completion |

Researching Unfamiliar Services

When investigating an unknown service, spawn a haiku agent to research its documentation:

    Task tool:
    - subagent_type: "general-purpose"
    - model: "haiku"
    - prompt: "Research [service] troubleshooting docs. Focus on:
        1. Common failure modes
        2. Health indicators
        3. Configuration gotchas
        Start with: [docs-url]"

Chart URL to docs mapping:

| Chart Source | Documentation |
|---|---|
| charts.jetstack.io | cert-manager.io/docs |
| charts.longhorn.io | longhorn.io/docs |
| grafana.github.io | grafana.com/docs |
| prometheus-community.github.io | prometheus.io/docs |

Common Confusions

BAD: Use helm list to check Helm release status.
GOOD: Use kubectl get helmrelease -A. Flux manages releases via CRDs, not the Helm CLI.

Keywords

kubernetes, kubectl, kubeconfig, flux, flux status, cluster access, internal URL, service URL, port-forward, helm release, gitops, reconciliation
