Kubernetes

Zero-Downtime Deployments with Rolling Updates and Readiness Probes

Beginner · 25 min to complete · 8 min read

Most Kubernetes deployments drop traffic during updates because readiness probes are misconfigured or missing. This tutorial shows you the exact configuration that eliminates downtime — and how to verify it.

Before you begin

  • kubectl configured against a running cluster
  • Basic understanding of Kubernetes Pods and Deployments

The default Kubernetes rolling update strategy sounds safe: bring up new pods before terminating old ones. But if your readiness probe isn't configured correctly, Kubernetes will send traffic to pods that aren't ready yet — and your users see errors.

This tutorial covers the complete configuration that makes rolling updates actually zero-downtime.

Why Traffic Drops During Updates

When Kubernetes updates a deployment, it follows this sequence:

  1. Create a new pod with the updated image
  2. Wait for the pod to pass its readiness probe
  3. Add the pod to the Service endpoints
  4. Terminate an old pod
  5. Repeat until all pods are updated

The problem: if there's no readiness probe, step 2 considers the pod ready the moment the container starts. Your app might need 5–10 seconds to warm up its database connections, load config, or compile templates. During that window, the pod receives traffic it can't handle.

The second problem: when a pod receives SIGTERM (step 4), it might still be handling active requests. If the app exits immediately, those requests fail.
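The graceful-shutdown half of this can be demonstrated without a cluster. The sketch below (a stand-in shell "server", not the tutorial's nginx app) traps SIGTERM, simulates draining in-flight requests, and exits cleanly instead of dying mid-request:

```shell
# Stand-in server: trap SIGTERM, drain, then exit 0 (simulated with sleep).
cat > /tmp/fake-server.sh <<'SH'
#!/bin/sh
drain() {
  echo "SIGTERM: draining"
  sleep 1            # stand-in for finishing in-flight requests
  echo "drained"
  exit 0
}
trap drain TERM
echo "serving"
while true; do sleep 0.2; done
SH

sh /tmp/fake-server.sh > /tmp/fake-server.log &
pid=$!
sleep 0.5                 # let it start "serving"
kill -TERM "$pid"
wait "$pid"               # exits 0 because the trap ran, not an abrupt kill
status=$?
cat /tmp/fake-server.log
echo "exit status: $status"
```

An app that exits the instant it receives SIGTERM is the shell equivalent of omitting the trap: `wait` would report a signal death and any "in-flight" work would be lost.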

Step 1: Deploy a Sample Application Without Probes

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: nginx:1.24
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server
  ports:
    - port: 80
      targetPort: 80
EOF

Step 2: Add Readiness and Liveness Probes

bash
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # Allow 1 extra pod during update
      maxUnavailable: 0   # Never reduce below desired count
  template:
    metadata:
      labels:
        app: api-server
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: api
          image: nginx:1.24
          ports:
            - containerPort: 80
          readinessProbe:
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 3
            successThreshold: 1
          livenessProbe:
            httpGet:
              path: /healthz
              port: 80
            initialDelaySeconds: 15
            periodSeconds: 10
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["sleep", "15"]
EOF

Let's break down each piece:

maxUnavailable: 0 — Kubernetes must never have fewer than the desired replica count available. This forces it to bring up the new pod first (maxSurge: 1) before terminating any old pod.

readinessProbe — Until this passes, the pod does not receive traffic. initialDelaySeconds: 5 gives the app 5 seconds before the first check. failureThreshold: 3 means three consecutive failures before marking the pod unready.

livenessProbe — If this fails, Kubernetes restarts the container. Set initialDelaySeconds higher than your readiness probe — you don't want the liveness probe killing a pod that's still starting up.

terminationGracePeriodSeconds: 60 — Kubernetes waits up to 60 seconds for the pod to exit after sending SIGTERM before force-killing with SIGKILL.

lifecycle.preStop: sleep 15 — When Kubernetes removes a pod from the Service endpoints, it's not instantaneous. The kube-proxy and cloud load balancer take a few seconds to propagate the change. The preStop sleep gives in-flight requests time to complete before SIGTERM is sent to your process.
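Two timing budgets fall out of these numbers, and you can check them with plain shell arithmetic (no cluster needed). First, how long a pod that starts failing can keep receiving traffic before the readiness probe marks it unready; second, how much shutdown time your app actually gets after the preStop sleep:

```shell
# Readiness lag: with periodSeconds=5 and failureThreshold=3, a pod that
# begins failing stays in the Service endpoints for up to period * failures.
period=5; failures=3
unready_lag=$(( period * failures ))
echo "a failing pod can keep receiving traffic for up to ${unready_lag}s"

# Shutdown budget: the grace period covers BOTH the preStop sleep and the
# app's own shutdown. What remains after preStop is the real exit budget.
grace=60; prestop_sleep=15
shutdown_budget=$(( grace - prestop_sleep ))
echo "after preStop, the app has ${shutdown_budget}s to exit before SIGKILL"
```

If you lengthen the preStop sleep, raise terminationGracePeriodSeconds with it, or the remaining budget shrinks and slow shutdowns get SIGKILLed mid-drain.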

Step 3: Create a Health Endpoint

nginx doesn't serve /healthz out of the box, so the probes from Step 2 will fail — and new pods will sit unready — until this step is applied. Add a simple health route using a ConfigMap:

bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-health
data:
  default.conf: |
    server {
        listen 80;
        location /healthz {
            return 200 "ok\n";
            add_header Content-Type text/plain;
        }
        location / {
            return 200 "hello\n";
        }
    }
EOF

kubectl patch deployment api-server --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/volumes", "value": [{"name": "nginx-conf", "configMap": {"name": "nginx-health"}}]},
  {"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts", "value": [{"name": "nginx-conf", "mountPath": "/etc/nginx/conf.d"}]}
]'
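If JSON patches are hard to read, this is the pod-template fragment the patch above produces (a sketch for orientation, not a full manifest):

```yaml
spec:
  volumes:
    - name: nginx-conf
      configMap:
        name: nginx-health
  containers:
    - name: api
      # ...existing image, ports, probes...
      volumeMounts:
        - name: nginx-conf
          mountPath: /etc/nginx/conf.d   # shadows the image's default.conf
```

Because this changes the pod template, the patch itself triggers a rolling update.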

Step 4: Verify Zero-Downtime During Update

Send continuous traffic in one terminal:

bash
kubectl run load --rm -it --image=busybox -- sh -c \
  'while true; do wget -qO- http://api-server/healthz || echo FAILED; sleep 0.1; done'

In another terminal, trigger an update:

bash
kubectl set image deployment/api-server api=nginx:1.25

Watch the rollout:

bash
kubectl rollout status deployment/api-server
# Waiting for deployment "api-server" rollout to finish: 1 out of 3 new replicas have been updated...
# Waiting for deployment "api-server" rollout to finish: 1 old replicas are pending termination...
# deployment "api-server" successfully rolled out

You should see no errors in the load terminal — every request returns ok.

Step 5: Roll Back if Something Goes Wrong

bash
# View rollout history
kubectl rollout history deployment/api-server

# Roll back to the previous version
kubectl rollout undo deployment/api-server

# Roll back to a specific revision
kubectl rollout undo deployment/api-server --to-revision=2

Verification Checklist

Before declaring a deployment zero-downtime capable, verify:

bash
# Readiness probe is configured
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.containers[0].readinessProbe}'

# maxUnavailable is 0
kubectl get deployment api-server -o jsonpath='{.spec.strategy.rollingUpdate}'
# {"maxSurge":1,"maxUnavailable":0}

# terminationGracePeriodSeconds is set
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.terminationGracePeriodSeconds}'
# 60

# preStop hook exists
kubectl get deployment api-server -o jsonpath='{.spec.template.spec.containers[0].lifecycle}'

Common Mistakes

Liveness probe timing out before the app starts — set initialDelaySeconds on the liveness probe to at least 2× your app's startup time. If liveness fires before the app is ready, Kubernetes restart-loops your pod indefinitely.

No preStop sleep — without it, SIGTERM fires while the load balancer still routes traffic to the pod. Even a 5-second sleep is better than nothing.

maxUnavailable: 25% (the default) — the default allows 25% of pods to be unavailable during an update. For a 4-replica deployment, that's 1 pod down while the new one starts. Fine for internal services, not for production APIs.
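The arithmetic behind that last point is worth making explicit. When maxUnavailable is given as a percentage, Kubernetes converts it to a pod count by rounding down:

```shell
# How many pods the default strategy may take down at once.
# Percentage maxUnavailable is rounded DOWN to a whole pod count.
replicas=4
max_unavailable_pct=25
allowed_down=$(( replicas * max_unavailable_pct / 100 ))
echo "pods that may be unavailable during the update: $allowed_down"
```

With 3 replicas the same math floors 0.75 down to 0 — which is why small deployments sometimes appear zero-downtime by accident, then regress when scaled up.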


We built Podscape to simplify Kubernetes workflows like this — logs, events, and cluster state in one interface, without switching tools.

Struggling with this in production?

We help teams fix these exact issues. Our engineers have deployed these patterns across production environments at scale.