Manual certificate management is a 2am pager alert waiting to happen. Someone forgets to renew, the cert expires on a Friday afternoon, and now you're scrambling to run openssl commands in production while users are seeing browser warnings. I've been there. cert-manager eliminates this entirely — it issues, renews, and rotates certificates automatically, and integrates directly with Kubernetes Ingress. This tutorial wires it up end-to-end, from a first install through wildcard certs, with the debugging steps you'll actually need.

What You'll Build

By the end of this tutorial you'll have:

cert-manager installed via Helm with CRDs managed properly
A Let's Encrypt staging ClusterIssuer for testing (much higher rate limits than production)
A Certificate issued via HTTP-01 challenge wired to a real Ingress
A production ClusterIssuer switch once staging is confirmed working
A DNS-01 challenge example for wildcard certificates using Cloudflare

The full flow takes about 45 minutes the first time. Once you have the pattern down, replicating it across domains takes 5 minutes.

Step 1: Install cert-manager

cert-manager ships as a Helm chart. The installCRDs=true flag tells Helm to install the CRDs as part of the release, so they are upgraded and removed together with the chart, which is what you want. (Chart versions from v1.15 onward rename this option to crds.enabled; the v1.14.0 chart pinned here still uses installCRDs.)

bash

helm repo add jetstack https://charts.jetstack.io
helm repo update

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.14.0 \
  --set installCRDs=true

Verify the pods are running before moving on:

bash

kubectl get pods -n cert-manager

You should see three pods all in Running state:

NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-7d75b9b5b5-xkjqp             1/1     Running   0          90s
cert-manager-cainjector-6c9f7b5b8-4frcl   1/1     Running   0          90s
cert-manager-webhook-6b9b7b5b5-z7vdm      1/1     Running   0          90s

If the webhook pod is stuck in Init or ContainerCreating, wait another 30 seconds — it needs to inject its own CA before it can serve requests. If it stays stuck, check kubectl logs -n cert-manager deployment/cert-manager-webhook.
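
If you'd rather not poll kubectl get pods by hand, the wait can be scripted. This is a minimal sketch, assuming the three default Deployment names shown in the output above; the function name wait_for_cert_manager and the 120-second default timeout are my own illustrative choices, not anything cert-manager ships.

```bash
# Block until each cert-manager Deployment has finished rolling out,
# or fail as soon as one of them misses the timeout.
# The default of 120s is an arbitrary, illustrative value.
wait_for_cert_manager() {
  wait_timeout="${1:-120s}"
  for deploy_name in cert-manager cert-manager-cainjector cert-manager-webhook; do
    kubectl rollout status "deployment/${deploy_name}" \
      --namespace cert-manager --timeout="${wait_timeout}" || return 1
  done
}
```

Call it as wait_for_cert_manager 180s right after helm install; a non-zero exit code means at least one Deployment never became ready.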

Step 2: Create a Staging ClusterIssuer

Always start with Let's Encrypt staging. Production enforces strict rate limits: 5 duplicate certificates (the same exact set of hostnames) per week, plus an hourly cap on failed validations per hostname. If you make a configuration mistake, and you will the first time, you'll burn through those limits on failed attempts. Staging's limits are far higher. The only other difference is that staging issues certificates from a fake CA that browsers don't trust, which is exactly what you want for testing.

yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: your@email.com
    privateKeySecretRef:
      name: letsencrypt-staging-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx

Apply it:

bash

kubectl apply -f clusterissuer-staging.yaml

Check that the issuer registered successfully:

bash

kubectl describe clusterissuer letsencrypt-staging

Look for a condition with Type: Ready and Status: True in the Conditions block. If you see Failed to register ACME account, check that the email is valid and the ACME server URL is correct.
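
For scripts and CI, reading the Ready condition directly is less error-prone than eyeballing describe output. A small sketch, assuming the ClusterIssuer exposes the standard Ready condition under status.conditions; the function names are hypothetical helpers of mine.

```bash
# Print the status of the Ready condition ("True", "False", or empty
# if the condition has not been set yet) for a ClusterIssuer.
issuer_ready_status() {
  kubectl get clusterissuer "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Succeed only when the ClusterIssuer reports Ready=True.
issuer_is_ready() {
  [ "$(issuer_ready_status "$1")" = "True" ]
}
```

For example: issuer_is_ready letsencrypt-staging && echo "ACME account registered".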

Step 3: Issue a Certificate via HTTP-01

There are two ways to trigger cert-manager: annotate your Ingress directly, or create a standalone Certificate resource. I'll show both — use annotations for simple cases, the Certificate resource when you need more control (multiple SANs, custom duration, etc.).

Approach 1: Ingress annotation

Add these to your existing Ingress:

yaml

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  namespace: default
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - yourdomain.com
      secretName: yourdomain-tls
  rules:
    - host: yourdomain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 80

cert-manager watches for Ingress resources with that annotation and automatically creates a Certificate resource for you.

Approach 2: Certificate resource

When you want explicit control — multiple domains, custom renewal window, or a cert not tied to a specific Ingress:

yaml

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: yourdomain-cert
  namespace: default
spec:
  secretName: yourdomain-tls
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer
  dnsNames:
    - yourdomain.com
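
The manifest above covers a single hostname. When you need several SANs or an explicit renewal window, generating the manifest keeps the pattern repeatable. The sketch below is one way to do that, under a few assumptions of mine: the helper name render_certificate, the NAME-tls naming convention for the secret, and a renewBefore of 720h (30 days) are all illustrative, so adjust them to your setup.

```bash
# Print a Certificate manifest covering one or more hostnames.
# Usage: render_certificate NAME NAMESPACE CLUSTERISSUER HOST [HOST...]
render_certificate() {
  if [ "$#" -lt 4 ]; then
    echo "usage: render_certificate NAME NAMESPACE CLUSTERISSUER HOST [HOST...]" >&2
    return 1
  fi
  cert_name="$1"; cert_ns="$2"; cert_issuer="$3"
  shift 3
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: ${cert_name}
  namespace: ${cert_ns}
spec:
  secretName: ${cert_name}-tls
  renewBefore: 720h
  issuerRef:
    name: ${cert_issuer}
    kind: ClusterIssuer
  dnsNames:
EOF
  # One quoted list entry per remaining argument (each becomes a SAN).
  for san_host in "$@"; do
    printf '    - "%s"\n' "$san_host"
  done
}
```

Pipe the output straight into kubectl, for example: render_certificate yourdomain-cert default letsencrypt-staging yourdomain.com www.yourdomain.com | kubectl apply -f -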

Apply and watch the status:

bash

kubectl apply -f certificate.yaml
kubectl get certificate -w

The READY column should flip to True within 60–90 seconds if everything is configured correctly. If it stays False, move to the next step.

Step 4: Debug the Stuck Certificate

This is where everyone gets lost. cert-manager creates a chain of resources during issuance: Certificate → CertificateRequest → Order → Challenge. If something fails, it fails at one level of this chain. You need to walk down the chain to find where.

bash

# Level 1: Certificate - shows overall status and any top-level errors
kubectl describe certificate yourdomain-cert -n default

# Level 2: CertificateRequest - shows the actual CSR and approval status
kubectl describe certificaterequest -n default

# Level 3: Order - shows the ACME order state with Let's Encrypt
kubectl describe order -n default

# Level 4: Challenge - shows the individual challenge attempt
kubectl describe challenge -n default
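
When I'm debugging, I usually want the whole chain at a glance before reading any describe output in full. Here is a rough sketch of that, assuming the cert-manager CRDs are installed so kubectl resolves the four kinds by their short names; both function names are hypothetical helpers.

```bash
# Print the kubectl table for each level of the issuance chain,
# top to bottom, in the given namespace (default: "default").
issuance_chain() {
  chain_ns="${1:-default}"
  for chain_kind in certificate certificaterequest order challenge; do
    echo "== ${chain_kind} =="
    kubectl get "${chain_kind}" --namespace "${chain_ns}" || return 1
  done
}

# One tab-separated line per Challenge: name, state, and failure reason.
challenge_reasons() {
  kubectl get challenge --namespace "${1:-default}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.state}{"\t"}{.status.reason}{"\n"}{end}'
}
```

Run issuance_chain default first to see which level is missing or not ready, then challenge_reasons default to pull out the error text.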

The Challenge resource is usually where you find the actual error. Common failure reasons:

HTTP-01 challenge URL not reachable. Let's Encrypt sends a GET request to http://yourdomain.com/.well-known/acme-challenge/<token>. If port 80 isn't publicly reachable, or your Ingress controller isn't routing that path, the challenge fails. Check it yourself:

bash

curl http://yourdomain.com/.well-known/acme-challenge/<token>

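You should get back the key authorization: a string that starts with the token, followed by a dot and the thumbprint of your ACME account key. If you get a 404 or connection refused, the Ingress isn't routing the challenge path correctly. Some Ingress controller configurations block /.well-known/ paths, so check your nginx configuration.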
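
The same check can be wrapped so it gives a clear pass or fail. A successful response body is the ACME key authorization, which begins with the token followed by a dot. This sketch assumes you pass in the token yourself (it is recorded in the Challenge resource's spec); check_http01 is a hypothetical helper name.

```bash
# Fetch the challenge URL over plain HTTP, as the ACME server would,
# and confirm the body starts with "<token>." (the key authorization).
check_http01() {
  challenge_domain="$1"; challenge_token="$2"
  challenge_body="$(curl --silent --show-error --max-time 10 \
    "http://${challenge_domain}/.well-known/acme-challenge/${challenge_token}")" || return 1
  case "$challenge_body" in
    "${challenge_token}."*) echo "ok: challenge response is being served" ;;
    *) echo "unexpected response: ${challenge_body}" >&2; return 1 ;;
  esac
}
```

Run it from outside the cluster network, for example check_http01 yourdomain.com <token>, so you see what Let's Encrypt sees rather than what an in-cluster client sees.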

ingressClassName mismatch. The ingressClassName in your ClusterIssuer solver must match the class your Ingress controller is actually using. Check with:

bash

kubectl get ingressclass
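
To compare the two sides mechanically, read the class names out of the ClusterIssuer and check each against the IngressClass list. A sketch, assuming the solver uses the http01.ingress.ingressClassName field as in the issuer above; the function names are hypothetical.

```bash
# Space-separated ingress class names referenced by a ClusterIssuer's HTTP-01 solvers.
issuer_ingress_classes() {
  kubectl get clusterissuer "$1" \
    -o jsonpath='{.spec.acme.solvers[*].http01.ingress.ingressClassName}'
}

# Space-separated names of the IngressClass objects in the cluster.
cluster_ingress_classes() {
  kubectl get ingressclass -o jsonpath='{.items[*].metadata.name}'
}

# Fail if the issuer references a class that does not exist in the cluster.
solver_class_exists() {
  for wanted_class in $(issuer_ingress_classes "$1"); do
    class_found=no
    for have_class in $(cluster_ingress_classes); do
      [ "$wanted_class" = "$have_class" ] && class_found=yes
    done
    if [ "$class_found" != yes ]; then
      echo "no IngressClass named ${wanted_class}" >&2
      return 1
    fi
  done
}
```

For example: solver_class_exists letsencrypt-staging || echo "fix ingressClassName in the ClusterIssuer".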

DNS not propagated. If you just created the DNS record for the domain, Let's Encrypt might be checking before it resolves. Wait for propagation and then delete the stuck Challenge resource to trigger a retry:

bash

kubectl delete challenge -n default <challenge-name>

cert-manager will create a new one automatically.
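
Before deleting the Challenge, it's worth confirming that the record really resolves from the outside. A minimal sketch, assuming dig is installed and that you know the public IP of your load balancer; the helper name and the choice of 1.1.1.1 as the default resolver are mine.

```bash
# Succeed when the A records for a name, as seen by a public resolver,
# include the expected IP address (exact match on a whole line).
dns_points_to() {
  dns_name="$1"; expected_ip="$2"; dns_resolver="${3:-1.1.1.1}"
  dig +short A "${dns_name}" "@${dns_resolver}" | grep -Fxq "${expected_ip}"
}
```

For example: dns_points_to yourdomain.com 203.0.113.10 && kubectl delete challenge -n default <challenge-name>. The IP here is a documentation placeholder.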

cert-manager logs. When all else fails:

bash

kubectl logs -n cert-manager deployment/cert-manager -f

The log lines are verbose but the errors are clear — look for lines containing Error or Failed.

Step 5: Switch to Production

Once your staging certificate is issued, kubectl get certificate shows READY=True. The cert itself will be from Let's Encrypt's staging CA (browsers will warn — that's expected). At this point you know the full pipeline works. Now switch to production.

yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your@email.com
    privateKeySecretRef:
      name: letsencrypt-production-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx

bash

kubectl apply -f clusterissuer-production.yaml

Update your Ingress annotation or Certificate resource to reference letsencrypt-production. Recent cert-manager releases compare the issuerRef against the issuer recorded in the secret's annotations and should reissue on their own when the two differ. If the certificate still comes from the staging CA, delete the old TLS secret to force a fresh issuance from the production CA:

bash

kubectl delete secret yourdomain-tls -n default

cert-manager detects the missing secret and triggers a new issuance immediately. Watch it:

bash

kubectl get certificate -w -n default

Within 60–90 seconds, READY flips to True and the secret is recreated with a valid, browser-trusted certificate. Your Ingress controller picks it up without any restart.

Step 6: DNS-01 for Wildcard Certificates

HTTP-01 challenges only work for exact hostnames. If you want *.yourdomain.com — so every subdomain gets TLS without individual certificates — you need DNS-01. Instead of serving a file over HTTP, cert-manager creates a DNS TXT record to prove domain ownership. This means it needs API access to your DNS provider.

I'll use Cloudflare here since it's common, but cert-manager supports Route53, Google Cloud DNS, Azure DNS, and many others via the same pattern.

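Create a Cloudflare API token with Zone:DNS:Edit and Zone:Zone:Read permissions for your domain. Then store it as a secret in the cert-manager namespace, not your app namespace: a ClusterIssuer is cluster-scoped, so by default cert-manager looks up the secrets it references in its own namespace: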

bash

kubectl create secret generic cloudflare-api-token \
  --from-literal=api-token=<your-cloudflare-api-token> \
  -n cert-manager

Create the DNS-01 ClusterIssuer:

yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your@email.com
    privateKeySecretRef:
      name: letsencrypt-dns01-key
    solvers:
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token

bash

kubectl apply -f clusterissuer-dns01.yaml

Request the wildcard certificate. List both *.yourdomain.com and yourdomain.com as separate SANs if you want the bare domain covered too, because the wildcard doesn't match the apex domain:

yaml

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-cert
  namespace: default
spec:
  secretName: wildcard-tls
  issuerRef:
    name: letsencrypt-dns01
    kind: ClusterIssuer
  dnsNames:
    - "*.yourdomain.com"
    - "yourdomain.com"

bash

kubectl apply -f wildcard-certificate.yaml
kubectl get certificate wildcard-cert -w

DNS-01 issuance takes longer than HTTP-01, often 2–3 minutes, because cert-manager creates the TXT record and waits until it is visible in DNS before asking Let's Encrypt to validate it. If it gets stuck, check the Challenge resource:

bash

kubectl describe challenge -n default

Look for errors around DNS propagation or API token permissions. A common mistake is using a Cloudflare API key (account-level) instead of an API token (zone-level) — they're different things in the Cloudflare dashboard.
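
While a DNS-01 challenge is pending, you can also look for the TXT record yourself. ACME places it at _acme-challenge.<domain>, and a wildcard name shares the record name of its base domain. The sketch below assumes dig is installed; both function names and the default resolver are illustrative choices of mine.

```bash
# Map a certificate dnsName to the TXT record name used for DNS-01.
# "*.yourdomain.com" and "yourdomain.com" both map to the same record.
acme_txt_name() {
  echo "_acme-challenge.${1#\*.}"
}

# Query a public resolver for the challenge TXT record of a dnsName.
check_acme_txt() {
  dig +short TXT "$(acme_txt_name "$1")" "@${2:-1.1.1.1}"
}
```

An empty result from check_acme_txt '*.yourdomain.com' while the Challenge is still pending usually points at the API token or the zone configuration rather than at Let's Encrypt. The record is removed again once the challenge completes, so an empty result afterwards is normal.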

Verification

After issuance, verify the certificate contents:

bash

# List all certificates across namespaces
kubectl get certificate -A

# Inspect the issued cert: check Subject, SANs, and expiry
kubectl get secret yourdomain-tls -o jsonpath='{.data.tls\.crt}' | \
  base64 -d | \
  openssl x509 -noout -text | \
  grep -E "Subject:|DNS:|Not After"

# Verify the live HTTPS endpoint
curl -v https://yourdomain.com 2>&1 | grep -E "SSL|issuer|expire"

For the production issuer, the issuer line from curl should name a Let's Encrypt intermediate CA such as R10 or R11 (the exact names change as Let's Encrypt rotates its intermediates), not the fake staging CA, whose names carry a (STAGING) prefix. If you still see the staging CA, the certificate wasn't re-issued: delete the secret again and wait.
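
The staging-versus-production check is easy to automate, because the staging hierarchy marks its CA names with a (STAGING) prefix. A sketch, assuming openssl and base64 are available locally; the two function names are hypothetical.

```bash
# Print the issuer line of the certificate stored in a TLS Secret.
secret_cert_issuer() {
  kubectl get secret "$1" --namespace "${2:-default}" \
    -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -issuer
}

# Succeed when an issuer string belongs to the Let's Encrypt staging hierarchy.
is_staging_issuer() {
  case "$1" in
    *"(STAGING)"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example: is_staging_issuer "$(secret_cert_issuer yourdomain-tls)" && echo "still a staging certificate".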

By default, cert-manager renews a certificate once two thirds of its lifetime has passed, which for Let's Encrypt's 90-day certificates means 30 days before expiry. You don't need to do anything: it handles the renewal challenge the same way it handled the initial issuance. You can verify renewal is configured by checking the Certificate status:

bash

kubectl describe certificate yourdomain-cert -n default | grep -A5 "Renewal Time"
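
The arithmetic behind that renewal time is simple enough to sanity-check. The sketch below assumes no renewBefore is set on the Certificate, so renewal happens with one third of the lifetime remaining; the helper names are mine, and the status fields read are notAfter and renewalTime.

```bash
# Day of the certificate's life on which renewal is attempted by default:
# the lifetime minus one third of it (integer days).
renewal_day() {
  echo $(( $1 - $1 / 3 ))
}

# Print expiry and scheduled renewal time for a Certificate, tab-separated.
cert_dates() {
  kubectl get certificate "$1" --namespace "${2:-default}" \
    -o jsonpath='{.status.notAfter}{"\t"}{.status.renewalTime}{"\n"}'
}
```

For a 90-day certificate, renewal_day 90 prints 60, so the renewal time reported by cert_dates yourdomain-cert should fall about 30 days before the expiry date next to it.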

Common Mistakes

Using production before verifying with staging. You will hit a configuration issue on your first attempt. Always validate with staging first, or you'll run into production's rate limits (5 duplicate certificates per week for the same set of hostnames, plus an hourly cap on failed validations) while debugging. Staging behaves the same as production during issuance; the only visible difference is the untrusted CA root.

Reusing the same secretName across Certificates. Each Certificate resource in a namespace must have a unique secretName. If two Certificates point to the same secret, they keep overwriting each other's data, which typically shows up as certificates being re-issued over and over. This is a subtle bug that only surfaces when you have multiple domains.

HTTP-01 with no port 80 exposure. If your Ingress controller only accepts HTTPS (port 443 only), HTTP-01 challenges will never complete. Let's Encrypt needs to reach port 80. Either open port 80 on your load balancer for the challenge path, or switch to DNS-01.

Not re-issuing after switching issuers. Recent cert-manager releases record the issuer in the secret's annotations and should reissue when the issuerRef changes, but don't take it on faith. After switching from staging to production, check the issuer on the live certificate, and delete the secret manually to force a fresh issuance if it still shows the staging CA.

Issuer vs ClusterIssuer namespace mismatch. Issuer is namespace-scoped — a Certificate in namespace: production can't reference an Issuer in namespace: staging. Use ClusterIssuer (cluster-scoped) unless you have a specific reason to scope issuers per namespace. Most teams use ClusterIssuer exclusively.
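
Of the mistakes above, the duplicate secretName is the one I'd check for mechanically, since it's invisible until it bites. A sketch that lists every Certificate's namespace and secretName and reports any pair that appears more than once; the function names are hypothetical.

```bash
# List "namespace/secretName" for every Certificate in the cluster.
list_certificate_secrets() {
  kubectl get certificate --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}/{.spec.secretName}{"\n"}{end}'
}

# Read "namespace/secretName" lines on stdin and print each value
# that occurs more than once.
duplicate_secret_names() {
  sort | uniq -d
}
```

Run list_certificate_secrets | duplicate_secret_names; no output means every Certificate writes to its own secret.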

Cleanup

bash

kubectl delete certificate yourdomain-cert wildcard-cert -n default
kubectl delete secret yourdomain-tls wildcard-tls -n default
kubectl delete clusterissuer letsencrypt-staging letsencrypt-production letsencrypt-dns01
kubectl delete secret cloudflare-api-token -n cert-manager

helm uninstall cert-manager -n cert-manager
kubectl delete namespace cert-manager

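Note that because the CRDs were installed as part of the Helm release, uninstalling the release also deletes them, and deleting the CRDs removes every remaining cert-manager resource (Certificates, Issuers, and so on) in the cluster. If you have other tooling depending on those CRDs, skip the Helm uninstall and clean up manually.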

From here, the natural next step is integrating cert-manager with an internal PKI for private certificates using the CA issuer type, or setting up trust-manager to distribute CA bundles across namespaces. But for public-facing services, what you've built here covers the full lifecycle — issuance, renewal, wildcard support, and the debugging path when things go wrong.

Official References

cert-manager Documentation — Official docs covering installation, issuers, certificates, and troubleshooting
ACME HTTP-01 Challenge — How cert-manager implements HTTP-01 challenges and the Ingress solver
ACME DNS-01 Challenge — DNS-01 solvers for all supported providers including Cloudflare, Route53, and Google Cloud DNS
Let's Encrypt Rate Limits — Official rate limit documentation — read before using the production ACME server
cert-manager Troubleshooting — Official debugging guide for stuck certificates and challenge failures
