OpenShift etcd Health Check
Introduction
Control plane troubleshooting should stay evidence-driven. Check ClusterOperators, component pods, recent events, and logs before restarting anything.
Why This Matters
OpenShift administration relies on operators and cluster-scoped resources. A bad change can affect many projects, so inspect status and events before applying fixes.
Practical Examples
oc get clusteroperators
oc get pods -n openshift-etcd
oc logs -n openshift-etcd -l k8s-app=etcd -c etcd --tail=50
oc get events -n openshift-etcd --sort-by=.lastTimestamp
Example output:
NAME   VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
etcd   4.15.12   True        False         False      8d      Etcd is available.
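On a cluster with many operators, the table is easier to scan when healthy rows are filtered out. The following is a minimal sketch rather than an official tool: `flag_unhealthy_operators` is a hypothetical helper name, and it assumes the default column layout shown above with a non-empty VERSION field in every row.

```shell
# Hypothetical helper: read `oc get clusteroperators` output on stdin and
# print the name of every operator that is not AVAILABLE=True and DEGRADED=False.
# Assumes whitespace-separated columns in the default order:
# NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
flag_unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $5 != "False") { print $1 }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get clusteroperators | flag_unhealthy_operators
```

An empty result means no operator in the listing reported an availability or degradation problem at the time of the query.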
Verification
oc get co etcd kube-apiserver console
oc get pods -n openshift-etcd
oc get events -n openshift-etcd --sort-by=.lastTimestamp
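The verification commands above show a point-in-time snapshot. To block until an operator settles, `oc wait` can be used on its conditions. This is a sketch under assumptions: `verify_operator` is a hypothetical wrapper name and the 120-second default timeout is an arbitrary choice, not a recommended value.

```shell
# Hypothetical wrapper: wait until a ClusterOperator reports
# Available=True and then Degraded=False, or fail when the timeout expires.
# $1 = operator name, $2 = timeout (optional, defaults to 120s)
verify_operator() {
  name="$1"
  timeout="${2:-120s}"
  oc wait "clusteroperator/${name}" --for=condition=Available=True --timeout="$timeout" &&
    oc wait "clusteroperator/${name}" --for=condition=Degraded=False --timeout="$timeout"
}

# Usage (requires a logged-in oc session):
#   verify_operator etcd
#   verify_operator kube-apiserver 300s
```

The function returns a non-zero exit status if either wait fails, so it can gate a follow-up step in a script.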
Troubleshooting
Read the operator message first, then check the namespace where the component runs and inspect its events. Note the status of each condition: a healthy operator reports Available=True, Progressing=False, and Degraded=False.
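The condition check described above can be sketched as two small helpers. Both function names are hypothetical. The first reads the conditions of one ClusterOperator with a JSONPath template, and the second keeps only the lines that deserve attention. Progressing=True is flagged here as worth a look, although it can simply mean a rollout is under way.

```shell
# Hypothetical helper: print one "type<TAB>status<TAB>message" line
# per condition of the named ClusterOperator.
operator_conditions() {
  oc get clusteroperator "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.message}{"\n"}{end}'
}

# Hypothetical helper: keep only conditions that differ from the healthy state
# (Available=True, Progressing=False, Degraded=False).
# Other condition types, such as Upgradeable, are not evaluated.
unhealthy_conditions() {
  awk -F '\t' '
    ($1 == "Available"   && $2 != "True")  ||
    ($1 == "Degraded"    && $2 != "False") ||
    ($1 == "Progressing" && $2 != "False")
  '
}

# Usage (requires a logged-in oc session):
#   operator_conditions etcd | unhealthy_conditions
```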
Common Mistakes
- Restarting control plane pods without reading the operator message.
- Ignoring certificate or quorum warnings.
- Troubleshooting from a stale kubeconfig context.
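To guard against the stale-context mistake, it may help to print which cluster `oc` is pointed at before running anything else. A minimal sketch, with `show_target_cluster` as a hypothetical helper name:

```shell
# Hypothetical helper: show the kubeconfig context, API server URL,
# and user that subsequent oc commands will act on.
show_target_cluster() {
  printf 'context: %s\n' "$(oc config current-context)"
  printf 'server:  %s\n' "$(oc whoami --show-server)"
  printf 'user:    %s\n' "$(oc whoami)"
}

# Usage (requires a logged-in oc session):
#   show_target_cluster
```

If the server URL is not the cluster under investigation, log in again or switch context before reading any operator status.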
Quick Checklist
- Confirm the active project.
- Inspect the exact object named in the error.
- Read recent events.
- Apply one focused fix.
- Verify status after the change.
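For the "read recent events" item, filtering to warnings keeps the output short. The sketch below is one possible approach: `recent_warnings` is a hypothetical helper name, and the default of 20 lines is an arbitrary choice. Because the listing is sorted by `.lastTimestamp` in ascending order, `tail` keeps the newest entries, and the header row is dropped whenever there are more events than the requested count.

```shell
# Hypothetical helper: show the most recent Warning events in a namespace.
# $1 = namespace (defaults to openshift-etcd), $2 = number of lines to keep (defaults to 20)
recent_warnings() {
  ns="${1:-openshift-etcd}"
  count="${2:-20}"
  oc get events -n "$ns" --field-selector type=Warning --sort-by=.lastTimestamp |
    tail -n "$count"
}

# Usage (requires a logged-in oc session):
#   recent_warnings
#   recent_warnings openshift-kube-apiserver 10
```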
Summary
An etcd health check on OpenShift should be driven by cluster status, operator conditions, and component logs rather than broad restarts.