Events & Troubleshooting
Access Kubernetes events and container logs with descriptions and troubleshooting guidance.
What Are Events?
Kubernetes generates events as things happen in your cluster. Each event includes context and guidance:
- What happened: Description of the event
- Why it matters: Impact on your applications
- What to do: Recommended actions if needed
- Context: Related resources and timeline
Tip: Event Context Provided
Events include descriptions and troubleshooting steps to help you quickly identify and resolve issues.
Event Categories
✅ Normal Events (Good News)
These events mean things are working as expected:
What you'll see:
- "Pod successfully started on node worker-1"
- "Container image pulled and ready to run"
- "Application deployment completed successfully"
- "Storage volume attached and ready"
Why we show these: To confirm your operations completed successfully and provide an audit trail.
⚠️ Warning Events (Pay Attention)
These events indicate potential issues that haven't caused failures yet:
What you'll see (with explanations):
"Pod waiting to be scheduled"
- What it means: No node has enough resources for this pod
- What to do: Check if you need to add nodes or reduce resource requests
"Image pull is slow"
- What it means: Container image is taking a while to download
- What to do: Usually resolves itself. If persistent, check network connectivity
"Health check failing"
- What it means: Your application isn't responding to readiness probes
- What to do: Check application logs for startup issues
🔴 Error Events (Action Required)
These events indicate active problems:
What you'll see (with troubleshooting):
"Container crashed with exit code 1"
- What it means: Your application exited with an error
- What to do: Check container logs below for the error message
"Out of memory (OOMKilled)"
- What it means: Container used more memory than its limit
- What to do: Increase memory limits or optimize your application
"Cannot pull image"
- What it means: Kubernetes cannot download the container image from the registry
- What to do: Verify image name, registry access, and credentials
Event Sources
Events are generated by various Kubernetes components:
Scheduler
- Pod scheduling decisions
- Node selection
- Resource constraints
Kubelet
- Container lifecycle
- Volume operations
- Node status changes
Controller Manager
- Deployment rollouts
- ReplicaSet scaling
- Job execution
API Server
- Resource creation/deletion
- Authentication/Authorization events
Event Information
Each event contains:
- Type: Normal or Warning
- Reason: Event classification (e.g., Failed, Started, Killing)
- Message: Detailed description
- Source: Component that generated the event
- Object: Related Kubernetes resource
- Timestamp: When the event occurred
- Count: Number of occurrences
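For reference, these fields map onto the Kubernetes Event object roughly as follows (a trimmed sketch with illustrative names, not a complete manifest):
apiVersion: v1
kind: Event
metadata:
  name: web-app.17a8c2b1d4e5f6a7       # illustrative name
  namespace: default
type: Warning
reason: BackOff
message: Back-off restarting failed container
source:
  component: kubelet                   # component that generated the event
  host: worker-1
involvedObject:                        # related Kubernetes resource
  kind: Pod
  name: web-app
  namespace: default
count: 5                               # number of occurrences
firstTimestamp: "2024-01-15T10:30:00Z"
lastTimestamp: "2024-01-15T10:35:00Z"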
Event Monitoring
Resource Events
Monitor events for specific resources to track their lifecycle and identify issues.
Configuration:
clusterPirate:
  monitoring:
    resourceEventsEnabled: true
System Events
Track cluster-wide and node-level events to monitor infrastructure health.
Configuration:
clusterPirate:
  monitoring:
    systemEventsEnabled: true
Common Event Scenarios
Pod Failures
Failed Scheduling
- Reason: FailedScheduling
- Common Causes: Insufficient resources, node selectors, taints/tolerations
- Resolution: Check node capacity, adjust resource requests, verify node labels
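The usual first step is to right-size the pod's resource requests so the scheduler can place it. A minimal sketch, assuming an nginx-based pod (the name and values are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: web-app                 # illustrative name
spec:
  containers:
    - name: web
      image: nginx:1.25
      resources:
        requests:
          cpu: "250m"           # ask only for what the app needs; oversized requests cause FailedScheduling
          memory: "256Mi"
        limits:
          cpu: "500m"
          memory: "512Mi"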
Image Pull Errors
- Reason: Failed, ErrImagePull, ImagePullBackOff
- Common Causes: Invalid image name, missing credentials, network issues
- Resolution: Verify image name, check image pull secrets, test registry connectivity
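If the registry requires authentication, the pod must reference an image pull secret. A minimal sketch, assuming a secret named registry-credentials already exists in the namespace (the registry, image, and names are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: private-app                              # illustrative name
spec:
  imagePullSecrets:
    - name: registry-credentials                 # hypothetical secret of type kubernetes.io/dockerconfigjson
  containers:
    - name: app
      image: registry.example.com/team/app:1.4.2 # double-check the full image name and tag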
Container Crashes
- Reason: CrashLoopBackOff, Error
- Common Causes: Application errors, missing dependencies, configuration issues
- Resolution: Check container logs, verify environment variables, review application code
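As one way to "verify environment variables", configuration can be injected from a ConfigMap so it is declared in a single place. A sketch under the assumption that the application reads DATABASE_URL at startup (the ConfigMap and key names are hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: crashing-app                   # illustrative name
spec:
  containers:
    - name: app
      image: registry.example.com/team/app:1.4.2
      env:
        - name: DATABASE_URL           # hypothetical variable the app expects at startup
          valueFrom:
            configMapKeyRef:
              name: app-config         # hypothetical ConfigMap; a missing ConfigMap or key surfaces as a container configuration error rather than a crash
              key: database-url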
OOM Kills
- Reason: OOMKilled
- Common Causes: Insufficient memory limits, memory leaks
- Resolution: Increase memory limits, profile application memory usage, fix memory leaks
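A sketch of raising the memory limit on a Deployment whose containers are being OOMKilled; the names and values are illustrative, so profile the application before settling on numbers:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                            # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/team/api:2.0.0
          resources:
            requests:
              memory: "512Mi"          # request close to typical usage
            limits:
              memory: "1Gi"            # limit above observed peak; the container is OOMKilled when it exceeds this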
Volume Issues
Mount Failures
- Reason: FailedMount
- Common Causes: Volume not available, incorrect PVC configuration, storage class issues
- Resolution: Verify PVC exists, check storage class, ensure volume is bound
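A minimal sketch of a PVC and the pod that mounts it, assuming a storage class named standard exists and can provision volumes (all names are illustrative):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                       # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard           # assumption: this storage class exists in the cluster
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: data-app
spec:
  containers:
    - name: app
      image: nginx:1.25
      volumeMounts:
        - name: data
          mountPath: /var/lib/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc            # FailedMount events usually point at a PVC that is missing or unbound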
Volume Full
- Reason: VolumeResizeFailed
- Common Causes: Disk space exhausted, volume resize not supported
- Resolution: Clean up disk space, resize volume, migrate to larger volume
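Where the storage class supports expansion, raising the requested size on the existing PVC is usually enough. A sketch with illustrative names and sizes:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                       # the existing claim to expand
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard           # assumption: defined with allowVolumeExpansion: true
  resources:
    requests:
      storage: 20Gi                    # raised from 10Gi; the resize fails if the storage class does not support expansion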
Readiness/Liveness Failures
Probe Failures
- Reason: Unhealthy
- Common Causes: Application not ready, incorrect probe configuration, network issues
- Resolution: Check application startup time, adjust probe settings, verify endpoint availability
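A readiness probe sketch with relaxed timings for an application that needs longer to start; the path, port, and timings are assumptions to adjust for your workload:
apiVersion: v1
kind: Pod
metadata:
  name: slow-starter                   # illustrative name
spec:
  containers:
    - name: app
      image: registry.example.com/team/app:1.4.2
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /healthz               # hypothetical health endpoint
          port: 8080
        initialDelaySeconds: 15        # give the app time to start before the first probe
        periodSeconds: 10
        timeoutSeconds: 3
        failureThreshold: 3            # pod is marked not ready after this many consecutive failures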
Pod Logs
Accessing Logs
Logs are available for all containers in running and recently terminated pods.
Via Web Console:
- Navigate to cluster in portal
- Select namespace and pod
- Choose container (if multiple)
- View real-time logs
Log Features
Real-time Streaming
- Live tail of container stdout/stderr
- Automatic updates as new logs are written
Historical Logs
- Access logs from previous container runs
- View logs from terminated containers
Filtering
- Search log content
- Filter by timestamp
- Filter by log level (if structured)
Log Retention
- Active Containers: Logs available while container is running
- Terminated Containers: Logs retained based on Kubernetes configuration
- Pod Deletion: Logs are lost when pod is deleted
Use Cases
Troubleshooting Application Issues
- Check Pod Events: Identify scheduling or startup issues
- Review Container Logs: Look for application errors or exceptions
- Monitor Resource Events: Track deployment updates and rollouts
- Examine System Events: Identify infrastructure problems
Debugging Crashes
- Find OOM Events: Check for memory-related kills
- Review Exit Codes: Understand how containers terminated
- Analyze Error Patterns: Identify recurring issues
- Check Previous Logs: Review logs from failed containers
Monitoring Deployments
- Track Rollout Events: Monitor deployment progress
- Identify Pod Failures: Catch issues during updates
- Verify Configuration: Ensure correct settings applied
- Watch Resource Updates: Track replica changes
Audit Trail
- Resource Creation: Track when resources were created
- Configuration Changes: Monitor updates to workloads
- Access Events: Review authentication/authorization events
- Deletion Events: Track resource cleanup
Kubernetes Events
Access events through Kubernetes resource endpoints.
Get Resource with Events
When retrieving a specific resource, events are included in the response:
GET /v1/workspaces/{workspaceId}/observability/{observabilityInstanceId}/clusters/{clusterId}/namespaces/{namespace}/pods/{podName}
Authorization: Bearer <access-token>
Response includes:
{
  "resource": {
    /* pod details */
  },
  "events": [
    {
      "type": "Normal",
      "reason": "Started",
      "message": "Started container nginx",
      "timestamp": "2024-01-15T10:30:00Z",
      "count": 1
    }
  ]
}
Best Practices
Event Monitoring
- Enable both resource and system events for complete visibility
- Set up alerts for critical event types (OOMKilled, CrashLoopBackOff)
- Regularly review warning events to catch issues early
Log Management
- Implement structured logging in applications
- Include correlation IDs for request tracing
- Use appropriate log levels (DEBUG, INFO, WARN, ERROR)
- Avoid logging sensitive information
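As an example of structured logging with a correlation ID, a single log line might look like this (field names are illustrative):
{"timestamp": "2024-01-15T10:30:00Z", "level": "ERROR", "correlationId": "req-8f3a2c", "message": "failed to connect to database", "attempt": 3}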
Troubleshooting Workflow
- Start with events to identify the problem type
- Review pod logs for application-specific details
- Check resource configuration for misconfigurations
- Examine metrics for resource constraints
- Review cluster-wide events for infrastructure issues