Troubleshoot OpenShift Events¶
Introduction¶
OpenShift events explain recent scheduling, image pull, probe, mount, and admission failures. Sort them by timestamp so the latest failure appears at the bottom.
Symptoms¶
Typical symptoms include failed pods, route errors, denied requests, unhealthy operators, or command errors that repeat after retries.
Common Causes¶
- Looking only at the final error and ignoring events.
- Checking the wrong project with oc.
- Changing several objects at once before confirming the current state.
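The wrong-project mistake can be guarded against with a small check before any other command runs. The following is a minimal sketch, not part of the original guide: `require_project` is a hypothetical helper name, and it assumes `oc project -q` prints only the active project name.

```shell
# Hypothetical helper: stop early when the active project is not the expected one.
require_project() {
  expected=$1
  # `oc project -q` prints just the name of the active project.
  current=$(oc project -q) || return 2
  if [ "$current" != "$expected" ]; then
    echo "active project is '$current', expected '$expected'" >&2
    return 1
  fi
}

# Usage:
#   require_project app && oc get events --sort-by=.lastTimestamp
```

Passing `-n app` on every command, as the steps in this guide do, avoids the problem in a different way; the check is mainly useful in scripts that rely on the active project.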
Step 1: Check the Current Status¶
oc get events -n app --sort-by=.lastTimestamp
oc get events -n app --field-selector type=Warning
oc describe pod web-7c9d7f6f8b-jx4mk -n app
Example output:
LAST SEEN   TYPE      REASON      OBJECT                     MESSAGE
45s         Warning   Unhealthy   pod/web-7c9d7f6f8b-jx4mk   Readiness probe failed: HTTP probe failed with statuscode: 503
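When a project produces many events, grouping the warnings by reason can show which failure dominates. The sketch below assumes the default column order shown in the example output (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE) and that the header row is suppressed with `--no-headers`; `count_warnings_by_reason` is a hypothetical helper name.

```shell
# Reads `oc get events --no-headers` rows on stdin and prints "<count> <reason>"
# for Warning events, most frequent reason first.
# Assumes column 2 is TYPE and column 3 is REASON, as in the default output.
count_warnings_by_reason() {
  awk '$2 == "Warning" { count[$3]++ }
       END { for (r in count) print count[r], r }' | sort -rn
}

# Usage:
#   oc get events -n app --no-headers | count_warnings_by_reason
```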
Step 2: Inspect Logs and Events¶
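oc logs web-7c9d7f6f8b-jx4mk -n app
oc logs web-7c9d7f6f8b-jx4mk -n app --previous
oc get events -n app --field-selector involvedObject.name=web-7c9d7f6f8b-jx4mk
oc get pods -n app -o wide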
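To decide which pods deserve a closer look at their logs, the pod listing can be filtered down to pods that are not fully ready. This is a rough sketch under the assumption that the first three columns of `oc get pods --no-headers` are NAME, READY, and STATUS; `not_ready_pods` is a hypothetical helper name.

```shell
# Reads `oc get pods --no-headers` rows on stdin and prints the names of pods
# whose READY column (for example 0/1) shows fewer ready containers than total.
# Completed pods are skipped because 0/1 is expected for them.
not_ready_pods() {
  awk '$3 != "Completed" { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage: print the last 20 log lines of every pod that is not ready.
#   oc get pods -n app --no-headers | not_ready_pods | while read -r pod; do
#     oc logs "$pod" -n app --tail=20
#   done
```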
Step 3: Verify Configuration¶
Compare the object selectors, service account, image reference, route target, or operator status with the failing symptom. In OpenShift, events often show the exact admission, scheduling, pull, SCC, or route reason.
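A selector mismatch is one comparison that can be made mechanical. The sketch below checks that every `key=value` pair of a selector appears among a pod's labels. It assumes both are given as comma-separated `key=value` strings, which is how the SELECTOR and LABELS columns of `oc get ... -o wide` and `--show-labels` print them; `labels_satisfy_selector` and the service name `web` are hypothetical.

```shell
# Succeeds when every key=value pair in the selector is present in the labels.
# Both arguments are comma-separated lists such as "app=web,tier=frontend".
# The body runs in a subshell so the IFS and globbing changes do not leak out.
labels_satisfy_selector() (
  selector=$1
  labels=$2
  set -f   # do not glob-expand label values
  IFS=,    # split the selector on commas
  for pair in $selector; do
    case ",$labels," in
      *",$pair,"*) ;;
      *) echo "missing label: $pair"; exit 1 ;;
    esac
  done
)

# Usage (assumes a service named web exists in the project):
#   selector=$(oc get service web -n app -o wide --no-headers | awk '{print $NF}')
#   labels=$(oc get pod web-7c9d7f6f8b-jx4mk -n app --no-headers --show-labels | awk '{print $NF}')
#   labels_satisfy_selector "$selector" "$labels"
```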
Step 4: Apply the Fix¶
Apply the smallest targeted fix: correct the selector, update the route or service port, link the pull secret, grant the specific RBAC or SCC permission, or repair the unhealthy operator dependency.
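Two of the fixes named above map onto single `oc` commands. The wrappers below are a sketch of how each change can be kept to one object; the function names, the secret name `registry-pull`, and the service account `web-sa` are hypothetical, and the right SCC or secret depends on what the events actually report.

```shell
# Hypothetical wrappers; each applies exactly one change.

# Let a service account pull images using an existing pull secret.
link_pull_secret() {  # args: service-account secret namespace
  oc secrets link "$1" "$2" --for=pull -n "$3"
}

# Grant one SCC to one service account rather than to a whole group.
grant_scc() {  # args: scc service-account namespace
  oc adm policy add-scc-to-user "$1" -z "$2" -n "$3"
}

# Usage:
#   link_pull_secret default registry-pull app
#   grant_scc anyuid web-sa app
```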
Step 5: Confirm the Problem Is Resolved¶
Rerun the commands from Step 1 and confirm that the pod status, the recent events, and a user-facing test all agree.
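The confirmation can be scripted as a rollout check followed by a fresh look at the warnings. This is a minimal sketch that assumes the failing pod belongs to a deployment named `web` (inferred from the pod name, not stated in this guide); `confirm_fix` is a hypothetical helper name and the 120-second timeout is arbitrary.

```shell
# Waits for the deployment to finish rolling out, then lists Warning events
# so the newest entries can be compared against the time of the fix.
confirm_fix() {  # args: deployment namespace
  oc rollout status "deployment/$1" -n "$2" --timeout=120s || return 1
  oc get events -n "$2" --field-selector type=Warning --sort-by=.lastTimestamp
}

# Usage:
#   confirm_fix web app
```

Older warnings stay in the list until they expire, so the LAST SEEN column matters more than whether the list is empty.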
Quick Checklist¶
- Confirm the active project.
- Inspect the exact object named in the error.
- Read recent events.
- Apply one focused fix.
- Verify status after the change.
Summary¶
Troubleshooting with OpenShift events means matching the symptom to the OpenShift object that owns it. Use oc get and oc describe output, events, logs, and focused verification so the fix is tied to evidence.