Recovery
Recovering Expired Kubelet Certificates
If kubelet certificates expire and nodes can no longer communicate with the API server, use the recover-expired-kubelet-certs.yaml playbook.
Symptoms
- kubectl get nodes shows nodes as NotReady
- kubelet logs show certificate-related errors
Unable to connect to the server: x509: certificate has expired
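
To confirm that expiry is actually the cause, the certificate's end date can be compared with the current time. The following is a minimal sketch, not part of the playbook: it assumes the input is the `notAfter=...` line printed by `openssl x509 -enddate -noout -in <certificate>`, and the function name `is_expired` is hypothetical.

```python
import ssl
import time


def is_expired(enddate_line, now=None):
    """Return True if an openssl 'notAfter=...' line lies in the past.

    enddate_line: for example 'notAfter=Jan  5 09:34:43 2018 GMT'
    now: seconds since the epoch (defaults to the current time)
    """
    prefix = "notAfter="
    line = enddate_line.strip()
    if not line.startswith(prefix):
        raise ValueError("expected a line starting with 'notAfter='")
    # ssl.cert_time_to_seconds parses the 'Mon DD HH:MM:SS YYYY GMT' format.
    not_after = ssl.cert_time_to_seconds(line[len(prefix):])
    if now is None:
        now = time.time()
    return now > not_after
```

Running this against the kubelet client certificate on an affected node should return True if the symptoms above are caused by expiry; the certificate location depends on how the node was provisioned.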
Recovery Procedure
The playbook:
- Generates a bootstrap token on the first control plane
- Retrieves the CA certificate from the cluster
- Creates a bootstrap-kubelet.conf file on each affected node with:
  - Bootstrap token for authentication
  - Base64-encoded CA certificate
  - API endpoint configuration
- Restarts kubelet on each node
- Kubelet uses the bootstrap token to request new certificates via CSR signing
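
The three pieces of data that go into bootstrap-kubelet.conf can be illustrated with a short Python sketch. This is not the playbook's own template, which is not shown here. It assumes the general kubeconfig layout that kubeadm writes for TLS bootstrapping, and the function name, the `tls-bootstrap-token-user` user name, and the default cluster name are assumptions. The output is JSON, which kubeconfig parsers accept because JSON is a subset of YAML.

```python
import base64
import json
import re

# Bootstrap tokens have the form "<6 chars>.<16 chars>", lowercase alphanumerics.
TOKEN_RE = re.compile(r"[a-z0-9]{6}\.[a-z0-9]{16}")


def render_bootstrap_kubeconfig(token, ca_pem, server, cluster="kubernetes"):
    """Return bootstrap kubeconfig text for the given token, CA, and endpoint.

    token:  bootstrap token string
    ca_pem: CA certificate in PEM form, as bytes
    server: API endpoint URL, for example 'https://10.0.0.1:6443'
    """
    if not TOKEN_RE.fullmatch(token):
        raise ValueError("token does not look like a bootstrap token")
    user = "tls-bootstrap-token-user"  # assumed name, as used by kubeadm
    context = f"{user}@{cluster}"
    config = {
        "apiVersion": "v1",
        "kind": "Config",
        "clusters": [{
            "name": cluster,
            "cluster": {
                "server": server,
                # The CA certificate is embedded base64-encoded.
                "certificate-authority-data":
                    base64.b64encode(ca_pem).decode("ascii"),
            },
        }],
        "users": [{"name": user, "user": {"token": token}}],
        "contexts": [{
            "name": context,
            "context": {"cluster": cluster, "user": user},
        }],
        "current-context": context,
    }
    return json.dumps(config, indent=2)
```

For example, `render_bootstrap_kubeconfig("abcdef.0123456789abcdef", ca_bytes, "https://10.0.0.1:6443")` produces a file body in which the kubelet authenticates with the token and trusts the embedded CA; the token and address here are placeholders.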
Automatic Renewal
The playbook configures a systemd timer for monthly certificate renewal (kubernetes_manage_cert_renewal: true). This should prevent certificate expiration under normal operation.
Cluster Reset
The reset.yaml playbook performs a complete cluster teardown.
Danger
This is destructive and irreversible. It will completely destroy the cluster.
What it does:
- Runs kubeadm reset on all nodes
- Stops systemd units (kubelet, containerd)
- Removes all containerd containers
- Uninstalls Kubernetes packages (kubeadm, kubelet, kubectl)
- Cleans up directories:
  - /etc/kubernetes
  - /var/lib/etcd
  - /etc/cni
  - /var/lib/kubelet
  - /var/lib/containerd
- Removes the etcd user and group
- Optionally reboots nodes
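
After a reset, it can be useful to check that the directories listed above no longer hold any data. The sketch below is a hypothetical verification helper, not something the playbook provides. It assumes that "cleaned up" means each directory is either absent or empty, and that it runs with enough privileges to list those directories.

```python
from pathlib import Path

# Directories the reset playbook is described as cleaning up.
RESET_DIRS = (
    "/etc/kubernetes",
    "/var/lib/etcd",
    "/etc/cni",
    "/var/lib/kubelet",
    "/var/lib/containerd",
)


def leftover_dirs(root="/"):
    """Return the listed directories that still exist and are non-empty.

    root: filesystem root to check under (useful for testing on a copy).
    """
    leftovers = []
    for directory in RESET_DIRS:
        path = Path(root) / directory.lstrip("/")
        # A directory counts as a leftover only if it still has entries.
        if path.is_dir() and any(path.iterdir()):
            leftovers.append(directory)
    return leftovers
```

An empty list from `leftover_dirs()` on every node suggests the cleanup step completed; any returned path points at state that survived the reset.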