macmiranda/kubernetes-demos

 
 

Introduction

Practice Kubernetes troubleshooting with realistic error scenarios.

Each scenario is started with a kubectl apply command. To clean up, run kubectl delete -f on the same URL.
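
Because every scenario lives under the same base URL, applying and cleaning up can be wrapped in small helpers. This is only a sketch: the function names scenario_url, run_scenario, and cleanup_scenario are made up for illustration, while the base URL is the one used in the commands throughout this README.

```shell
# Base URL shared by all manifests referenced in this README.
BASE_URL="https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main"

# scenario_url prints the full manifest URL for a path relative to the repo root.
scenario_url() {
  printf '%s/%s\n' "$BASE_URL" "$1"
}

# run_scenario applies the manifest for the given scenario path.
run_scenario() {
  kubectl apply -f "$(scenario_url "$1")"
}

# cleanup_scenario deletes everything the same manifest created.
# --ignore-not-found keeps the command from failing if it was already removed.
cleanup_scenario() {
  kubectl delete -f "$(scenario_url "$1")" --ignore-not-found
}
```

For example, run_scenario crashpod/broken.yaml starts the crashing pod scenario and cleanup_scenario crashpod/broken.yaml removes it again.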

Scenarios

Pod Issues

Crashing Pod (CrashLoopBackOff)

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/crashpod/broken.yaml
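
After a few restarts the pod's STATUS column should read CrashLoopBackOff. A minimal sketch for picking such pods out of the kubectl get pods listing follows; filter_pods_by_status is a hypothetical helper, and it assumes the default column order (NAME, READY, STATUS, ...).

```shell
# filter_pods_by_status reads `kubectl get pods --no-headers` output on stdin
# and prints the names of pods whose STATUS column equals $1.
filter_pods_by_status() {
  awk -v want="$1" '$3 == want { print $1 }'
}

# Usage (needs access to the cluster where the scenario was applied):
#   kubectl get pods --no-headers | filter_pods_by_status CrashLoopBackOff
#   kubectl logs <pod-name> --previous   # output of the last crashed container
```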

OOMKilled Pod (Out of Memory Kill)

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/oomkill/oomkill_job.yaml
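
To confirm that a container was killed for running out of memory, one option is to read the termination reason from the pod status. This is a sketch: termination_reasons is a hypothetical helper, and the pod name is a placeholder that has to be looked up with kubectl get pods first.

```shell
# termination_reasons prints, for each container of pod $1, the reason recorded
# for its current and its previous termination (e.g. "OOMKilled" or "Error").
# Fields that are not set are printed as empty strings.
termination_reasons() {
  kubectl get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{" state="}{.state.terminated.reason}{" last="}{.lastState.terminated.reason}{"\n"}{end}'
}

# Usage:
#   termination_reasons <pod-name>
```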

High CPU Throttling (CPUThrottlingHigh)

Apply the following YAML and wait 15 minutes. (CPU throttling is only an issue if it persists for a meaningful period of time. Less than 15 minutes of throttling typically does not trigger an alert.)

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/cpu_throttling/throttling.yaml
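
While waiting, throttling can be observed directly by comparing throttled CFS periods with total periods in the container's cpu.stat file. The sketch below makes several assumptions: throttle_pct is a hypothetical helper, the container image provides cat, and the file is at /sys/fs/cgroup/cpu.stat (typical for cgroup v2; on cgroup v1 it is usually /sys/fs/cgroup/cpu/cpu.stat).

```shell
# throttle_pct reads cpu.stat content on stdin and prints the percentage of
# CFS periods in which the cgroup was throttled.
throttle_pct() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", throttled * 100 / periods
      else             print "0.0"
    }'
}

# Usage (pod name is a placeholder):
#   kubectl exec <pod-name> -- cat /sys/fs/cgroup/cpu.stat | throttle_pct
```

The counters are cumulative since the container started, so the result is an average over the container's lifetime rather than a recent rate.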

Pending Pod

Apply the following YAML and wait 15 minutes. (By default, most systems only alert after pods have been pending for 15 minutes. This prevents false alarms on autoscaled clusters, where it's OK for pods to be temporarily pending.)

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/pending_pods/pending_pod_node_selector.yaml
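
There is no need to wait for an alert to see why the pod is stuck: the scheduler records its reasons as events. A small sketch, assuming the scenario runs in the default namespace unless another one is passed; pending_pods and scheduling_failures are hypothetical helper names.

```shell
# pending_pods lists pods in the Pending phase in namespace $1 (default: "default").
pending_pods() {
  kubectl get pods --namespace "${1:-default}" --field-selector=status.phase=Pending
}

# scheduling_failures lists FailedScheduling events, whose messages explain
# why no node was suitable (for example, an unmatched node selector).
scheduling_failures() {
  kubectl get events --namespace "${1:-default}" --field-selector reason=FailedScheduling
}
```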

ImagePullBackOff

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/image_pull_backoff/no_such_image.yaml

Liveness Probe

Apply the following YAML to simulate a failing liveness probe.

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/liveness_probe_fail/failing_liveness_probe.yaml
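
A failing liveness probe shows up in two places: the kubelet records Unhealthy events, and the container's restart count grows each time it is killed and restarted. The following sketch reads both; probe_failures and restart_counts are hypothetical helpers, and the pod name is a placeholder.

```shell
# probe_failures lists "Unhealthy" events in namespace $1 (default: "default").
# The kubelet emits these when a liveness or readiness probe fails.
probe_failures() {
  kubectl get events --namespace "${1:-default}" --field-selector reason=Unhealthy
}

# restart_counts prints "<container> <restarts>" for each container of pod $1.
restart_counts() {
  kubectl get pod "$1" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{" "}{.restartCount}{"\n"}{end}'
}
```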

Job Issues

Failing Job

Deploy a failing job. The job will fail after 60 seconds, then attempt to run again. After two attempts, it will fail for good.

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/job_failure/job_crash.yaml
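
To follow the two attempts without polling by hand, kubectl wait can block until the Job is marked Failed. This is a sketch under assumptions: the Job name is not stated here, so it is passed as a parameter (find it with kubectl get jobs), and the 300s default timeout is a guess that leaves room for both 60-second attempts plus retry backoff. job_summary and wait_for_job_failure are hypothetical helper names.

```shell
# job_summary prints how many pods of Job $1 have failed and succeeded so far.
job_summary() {
  kubectl get job "$1" \
    -o jsonpath='failed={.status.failed} succeeded={.status.succeeded}{"\n"}'
}

# wait_for_job_failure blocks until Job $1 has the Failed condition,
# or until the timeout $2 (default: 300s) expires.
wait_for_job_failure() {
  kubectl wait --for=condition=Failed "job/$1" --timeout="${2:-300s}"
}
```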

Helm Monitoring

Add Robusta's Helm chart repository:

helm repo add robusta https://robusta-charts.storage.googleapis.com && helm repo update

Deploy a failing release:

helm install kubewatch robusta/kubewatch --set='rbac.create=true,updateStrategy.type=Error' --namespace demo-namespace --create-namespace

Deploy a successful release:

helm upgrade kubewatch robusta/kubewatch --set='rbac.create=true' --namespace demo-namespace --create-namespace

Uninstall kubewatch:

helm uninstall kubewatch --namespace demo-namespace

Delete the test namespace:

kubectl delete namespace demo-namespace

Example: a broken Helm release, as reported by Robusta's Helm Releases Monitoring feature.
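
Before uninstalling, the state of the release can also be checked with Helm itself. A minimal sketch, using the release name and namespace from the commands above; failed_releases and release_history are hypothetical helper names, and whether the first install is actually recorded as failed depends on how the cluster rejects the invalid value.

```shell
# failed_releases lists releases in namespace $1 whose status is "failed".
failed_releases() {
  helm list --namespace "$1" --failed
}

# release_history shows every revision of release $1 in namespace $2,
# including the status of each install or upgrade.
release_history() {
  helm history "$1" --namespace "$2"
}

# Usage:
#   failed_releases demo-namespace
#   release_history kubewatch demo-namespace
```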

Other Demos

Change Tracking

Deploy a healthy pod. Then break it.

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/crashpod/healthy.yaml
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/crashpod/broken.yaml

Now audit your cluster. If someone else made this change, would you be able to pinpoint the change that broke the application?

Deployment Image Change Tracking

Create an nginx deployment. Then change the image tag to simulate an unexpected image tag change.

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/deployment_image_change/before_image_change.yaml
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/deployment_image_change/after_image_change.yaml

Did you immediately get notified about a change in the image tag? Note: You will need to configure a playbook for this to work. Instructions coming soon!
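
Without a playbook, the change can still be checked by hand: read the Deployment's image before and after the second apply and compare the tags. The sketch below is illustrative only; deployment_images and image_tag are hypothetical helpers, and the Deployment name is not stated here, so look it up with kubectl get deployments.

```shell
# deployment_images prints the container images of Deployment $1.
deployment_images() {
  kubectl get deployment "$1" \
    -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}'
}

# image_tag prints the tag of an image reference, or "latest" if none is given.
image_tag() {
  name=${1%%@*}      # drop any @sha256:... digest
  last=${name##*/}   # keep the final path component, ignoring a registry port
  case "$last" in
    *:*) printf '%s\n' "${last##*:}" ;;
    *)   printf 'latest\n' ;;
  esac
}

# Usage:
#   before=$(deployment_images <deployment-name>)   # after the first apply
#   after=$(deployment_images <deployment-name>)    # after the second apply
#   [ "$(image_tag "$before")" != "$(image_tag "$after")" ] && echo "tag changed"
```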

Ingress Port and Path Change Tracking

Create an ingress. Then change its port and path to simulate an unexpected ingress modification.

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/ingress_port_path_change/before_port_path_change.yaml
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/ingress_port_path_change/after_port_path_change.yaml

Did you immediately get notified about a change in the port number and path? Note: You will need to configure a playbook for this to work. Instructions coming soon!

Drift and Namespace Comparison

kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/namespace_drift/example.yaml

Can you quickly tell the difference between the compare1 and compare2 namespaces? What is the drift between them?
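
One manual way to answer this is to list the same fields for both namespaces and diff the results. The sketch below assumes the namespaces contain Deployments and compares only names and images, so drift in other fields or resource kinds will not show up; ns_images and ns_drift are hypothetical helper names.

```shell
# ns_images prints "<deployment> <image...>" for each Deployment in namespace $1.
ns_images() {
  kubectl get deployments --namespace "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.template.spec.containers[*].image}{"\n"}{end}' | sort
}

# ns_drift diffs the Deployment images of namespaces $1 and $2.
# Like diff, it returns 0 if they match and 1 if they differ.
ns_drift() {
  left=$(mktemp) && right=$(mktemp) || return 2
  ns_images "$1" > "$left"
  ns_images "$2" > "$right"
  diff "$left" "$right"
  status=$?
  rm -f "$left" "$right"
  return "$status"
}

# Usage:
#   ns_drift compare1 compare2
```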

High overhead of GKE Nodes

On GKE, nodes can reserve more than 50% of CPU for themselves. Users pay for CPU that is unavailable to applications.

Reproduction:

  1. Create a default GKE cluster with Autopilot disabled. Don't change any other settings.
  2. Deploy the following pod:
kubectl apply -f https://raw.githubusercontent.com/robusta-dev/kubernetes-demos/main/gke_node_allocatable/gke_issue.yaml
  3. Run kubectl get pods -o wide gke-node-allocatable-issue

The pod will be Pending. A Pod requesting 1 CPU cannot run on an empty node with 2 CPUs!
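
The size of the reservation can be measured by comparing each node's CPU capacity with its allocatable CPU. The following is a sketch: cpu_to_millicores and reserved_cpu_pct are hypothetical helper names, and the values in the usage comment are illustrative inputs, not measurements from a real cluster.

```shell
# cpu_to_millicores converts a Kubernetes CPU quantity ("2", "0.5", "940m")
# to a whole number of millicores.
cpu_to_millicores() {
  case "$1" in
    *m) printf '%s\n' "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;
  esac
}

# reserved_cpu_pct prints the share of capacity $1 that is not allocatable,
# given allocatable CPU $2.
reserved_cpu_pct() {
  cap=$(cpu_to_millicores "$1")
  alloc=$(cpu_to_millicores "$2")
  awk -v c="$cap" -v a="$alloc" 'BEGIN {
    if (c > 0) printf "%.1f\n", (c - a) * 100 / c
    else       print "0.0"
  }'
}

# Usage: list capacity and allocatable CPU per node, then feed a pair in.
#   kubectl get nodes -o custom-columns=NAME:.metadata.name,CAPACITY:.status.capacity.cpu,ALLOCATABLE:.status.allocatable.cpu
#   reserved_cpu_pct 2 940m   # prints 53.0
```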
