Kubernetes Operations
Expert knowledge for Kubernetes cluster management, deployment, and troubleshooting with mastery of kubectl and cloud-native patterns.
Core Expertise
Kubernetes Operations
- Workload Management: Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs
- Networking: Services, Ingress, NetworkPolicies, and DNS configuration
- Configuration & Storage: ConfigMaps, Secrets, PersistentVolumes, and PersistentVolumeClaims
- Troubleshooting: Debugging pods, analyzing logs, and inspecting cluster events
Cluster Operations Process
- Manifest First: Always prefer declarative YAML manifests for resource management
- Validate & Dry-Run: Use kubectl apply --dry-run=client to validate changes before applying them
- Inspect & Verify: After applying changes, verify with kubectl get, kubectl describe, and kubectl logs
- Monitor Health: Continuously check the status of nodes, pods, and services
- Clean Up: Ensure old or unused resources are properly garbage collected
Essential Commands
Resource management
kubectl apply -f manifest.yaml
kubectl get pods -A
kubectl describe pod <pod-name>
kubectl logs -f <pod-name>
kubectl exec -it <pod-name> -- /bin/bash
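The commands above can be chained into the validate, apply, and verify sequence from the Cluster Operations Process. The following is a minimal sketch rather than a definitive workflow: the function name `safe_apply` and its argument order are hypothetical, and it assumes `kubectl` is on the PATH and that the named context exists.

```shell
# Hypothetical helper: validate a manifest client-side, apply it, then show
# the resources it defines. Usage: safe_apply <context> <manifest.yaml>
safe_apply() {
  # Step 1: client-side dry run; stop if the manifest does not validate.
  kubectl --context="$1" apply --dry-run=client -f "$2" || return 1

  # Step 2: apply for real.
  kubectl --context="$1" apply -f "$2" || return 1

  # Step 3: show the current state of the resources named in the manifest.
  kubectl --context="$1" get -f "$2"
}
```

For example, `safe_apply staging-cluster deployment.yaml` would stop before touching the cluster if the dry run fails. A client-side dry run only checks the manifest locally; `--dry-run=server` asks the API server to validate it as well.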
Debugging
kubectl get events --sort-by='.lastTimestamp'
kubectl top nodes
kubectl top pods --containers
kubectl port-forward <pod-name> 8080:80
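When the event list is noisy, it can help to narrow it to warnings. The sketch below assumes a hypothetical helper name, `recent_warnings`, and a context-then-namespace argument order; `type=Warning` is a standard field selector for events. Note that the `kubectl top` commands above only return data when a metrics source such as metrics-server is installed in the cluster.

```shell
# Hypothetical helper: list Warning events in one namespace, sorted by their
# last timestamp. Usage: recent_warnings <context> <namespace>
recent_warnings() {
  kubectl --context="$1" -n "$2" get events \
    --field-selector type=Warning \
    --sort-by='.lastTimestamp'
}
```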
Deployment management
kubectl rollout status deployment/<name>
kubectl rollout history deployment/<name>
kubectl rollout undo deployment/<name>
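A common way to combine the rollout commands above is to wait for a rollout and undo it if it does not complete in time. This is a hedged sketch: the function name `rollout_or_undo`, the argument order, and the 120s default timeout are illustrative assumptions, not kubectl defaults.

```shell
# Hypothetical helper: wait for a Deployment rollout and roll back on failure.
# Usage: rollout_or_undo <context> <deployment-name> [timeout]
rollout_or_undo() {
  # "rollout status" exits non-zero if the rollout fails or the timeout passes.
  if kubectl --context="$1" rollout status "deployment/$2" --timeout="${3:-120s}"; then
    echo "rollout of $2 succeeded"
  else
    echo "rollout of $2 failed, rolling back" >&2
    kubectl --context="$1" rollout undo "deployment/$2"
    return 1
  fi
}
```

The function returns a non-zero status after rolling back, so a calling script or CI job can still treat the deployment as failed.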
Cluster inspection
kubectl cluster-info
kubectl get nodes -o wide
kubectl api-resources
Key Debugging Patterns
Pod Debugging
Pod inspection
kubectl describe pod <pod-name>
kubectl get pod <pod-name> -o yaml
kubectl logs <pod-name> --previous
Interactive debugging
kubectl exec -it <pod-name> -- /bin/bash
kubectl debug <pod-name> -it --image=busybox
kubectl port-forward <pod-name> 8080:80
Networking Troubleshooting
Service debugging
kubectl get svc -o wide
kubectl get endpoints
kubectl describe svc <service>
Network connectivity
kubectl run test-pod --image=busybox -it --rm -- sh
Inside the pod, use nslookup, wget, and nc to test DNS resolution and connectivity
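For a quick DNS check that does not need an interactive shell, the test pod can run a single command and exit. The following is a sketch under stated assumptions: the helper name `dns_check` and the pod name `dns-check` are hypothetical, and it assumes the cluster can pull the `busybox` image.

```shell
# Hypothetical helper: resolve a DNS name from inside the cluster using a
# throwaway busybox pod that is deleted when the command finishes.
# Usage: dns_check <context> <namespace> <dns-name>
dns_check() {
  kubectl --context="$1" -n "$2" run dns-check \
    --image=busybox --restart=Never --rm -i \
    -- nslookup "$3"
}
```

For example, `dns_check staging-cluster default kubernetes.default` looks up the API server's Service name, which exists in every cluster.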
Common Issues
CrashLoopBackOff debugging
kubectl logs <pod> --previous
kubectl describe pod <pod>
kubectl get events --field-selector involvedObject.name=<pod>
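The CrashLoopBackOff steps above can be gathered into one triage pass. This is a minimal sketch, not a definitive procedure: the function name `crash_triage`, its argument order, and the 50-line log tail are assumptions made for illustration.

```shell
# Hypothetical helper: collect the usual CrashLoopBackOff evidence for a pod.
# Usage: crash_triage <context> <namespace> <pod>
crash_triage() {
  # Per container: name, restart count, and reason for the last termination.
  kubectl --context="$1" -n "$2" get pod "$3" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.reason}{"\n"}{end}'

  # Logs from the previous (crashed) container instance.
  kubectl --context="$1" -n "$2" logs "$3" --previous --tail=50

  # Events that reference this pod, sorted by their last timestamp.
  kubectl --context="$1" -n "$2" get events \
    --field-selector "involvedObject.name=$3" --sort-by='.lastTimestamp'
}
```

A termination reason of `OOMKilled` in the first output points at memory limits, while `Error` with a non-zero exit code usually means the answer is in the previous container's logs.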
Resource constraints
kubectl top pod <pod>
kubectl describe pod <pod> | grep -A 5 Limits
State management
kubectl get all
kubectl get <resource> -o yaml
Best Practices
Context Safety (CRITICAL)
- Always specify --context explicitly in every kubectl command
- Never rely on the current context; it may have been changed by another process
- Use the kubectl --context=<context> get pods format for all operations
- This prevents accidental operations on the wrong cluster (e.g., running production commands against staging)
CORRECT: Explicit context
kubectl --context=gke_myproject_us-central1_prod get pods
kubectl --context=staging-cluster apply -f deployment.yaml
WRONG: Relying on current context
kubectl get pods  # Which cluster is this targeting?
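One way to make the explicit-context rule hard to forget is a small wrapper that refuses to run without a context argument. This is a sketch only: the name `kctx` and its usage message are hypothetical and not part of kubectl.

```shell
# Hypothetical wrapper: the first argument is always the context, so a
# command can never silently fall back to the current context.
# Usage: kctx <context> <kubectl args...>
kctx() {
  if [ "$#" -lt 2 ]; then
    echo "usage: kctx <context> <kubectl args...>" >&2
    return 2
  fi
  kctx_context="$1"
  shift
  kubectl --context="$kctx_context" "$@"
}
```

For example, `kctx staging-cluster get pods` expands to `kubectl --context=staging-cluster get pods`. The available context names can be listed with `kubectl config get-contexts`.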
Resource Definitions
- Use declarative YAML manifests
- Implement proper labels and selectors
- Define resource requests and limits
- Configure health checks (liveness/readiness probes)
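The four points above can be seen together in a single Deployment manifest. The sketch below prints one from a shell function so it can be piped into a dry run; the function name `example_deployment`, the app name `web`, the image tag, the port, and every resource figure are placeholders chosen for illustration, not recommendations.

```shell
# Hypothetical example: a Deployment with labels, a matching selector,
# resource requests/limits, and readiness/liveness probes.
example_deployment() {
  cat <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  labels:
    app: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web            # must match the pod template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
          readinessProbe:   # gates traffic until the container responds
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 5
          livenessProbe:    # restarts the container if it stops responding
            httpGet:
              path: /
              port: 80
            initialDelaySeconds: 10
EOF
}
# Validate without creating anything (the context name is a placeholder):
#   example_deployment | kubectl --context=staging-cluster apply --dry-run=client -f -
```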
Security
- Use NetworkPolicies to restrict traffic
- Implement RBAC for access control
- Store sensitive data in Secrets
- Run containers as non-root users
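As an illustration of restricting traffic with NetworkPolicies, a common starting point is a policy that denies all ingress to every pod in a namespace, with specific allowances added afterwards. This is a hedged sketch: the function name `default_deny_ingress` and the policy name are hypothetical, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicies.

```shell
# Hypothetical helper: print a default-deny-ingress NetworkPolicy for a
# namespace. Usage: default_deny_ingress <namespace>
default_deny_ingress() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: $1
spec:
  podSelector: {}      # empty selector: applies to every pod in the namespace
  policyTypes:
    - Ingress          # no ingress rules listed, so all ingress is denied
EOF
}
# Validate before applying (the context and namespace names are placeholders):
#   default_deny_ingress demo | kubectl --context=staging-cluster apply --dry-run=client -f -
```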
Monitoring
- Configure proper logging and metrics
- Set up alerts for critical conditions
- Use health checks and readiness probes
- Monitor resource usage and quotas
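A basic health check that fits the monitoring points above is to list nodes whose status is anything other than plain `Ready`. The sketch below assumes a hypothetical helper name, `not_ready_nodes`, and relies on the default column layout of `kubectl get nodes`, where the second column is STATUS.

```shell
# Hypothetical helper: print the names of nodes whose STATUS column is not
# exactly "Ready". Cordoned nodes ("Ready,SchedulingDisabled") are listed too.
# Usage: not_ready_nodes <context>
not_ready_nodes() {
  kubectl --context="$1" get nodes --no-headers |
    awk '$2 != "Ready" { print $1 }'
}
```

An empty result means every node reports `Ready`; any output is a candidate for an alert or a closer look with `kubectl describe node`.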
Agentic Optimizations
Context
Command
Pod status (structured)
kubectl get pods -n