UbiOps Deployments Dashboard
Grafana dashboard (dashboard.json) for monitoring UbiOps deployment pods on Kubernetes — health, resource usage, restarts, and limits. Data comes from Prometheus (kube-state-metrics + cAdvisor container_* metrics).
Variables
| Variable | Source | Purpose |
|---|---|---|
| datasource | Prometheus datasource picker | Select the Prometheus instance |
| namespace | label_values(kube_pod_info, namespace) | Namespace to scope to |
| deployment | label_values(kube_deployment_metadata_generation{namespace=$namespace}, deployment) | Deployment to inspect (defaults to all, .*) |
Pods are matched by pod=~"$deployment.*", so a deployment selection covers all of its pods.
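The matching rule can be tried out with a minimal Python sketch. It assumes `$deployment` is substituted as plain text and relies on the fact that Prometheus anchors regex matchers at both ends, which `re.fullmatch` approximates. The pod names and the helper `matches_deployment` are hypothetical, for illustration only.

```python
import re


def matches_deployment(pod: str, deployment: str) -> bool:
    """Approximate the pod=~"$deployment.*" matcher.

    Prometheus regex matchers are fully anchored, so re.fullmatch is the
    closest Python equivalent. Plain-text substitution of $deployment is
    an assumption of this sketch.
    """
    return re.fullmatch(f"{deployment}.*", pod) is not None


# Hypothetical pod names, for illustration only.
pods = ["api-7d9f8c6b5-x2k4q", "api-worker-5c7b9d-abcde", "web-6f4d8-zzzzz"]

# Selecting the deployment "api" also picks up pods of "api-worker".
selected = [p for p in pods if matches_deployment(p, "api")]

# The default value ".*" matches every pod.
all_pods = [p for p in pods if matches_deployment(p, ".*")]
```

Because this is a prefix match, a deployment whose name is a prefix of another deployment's name (here `api` and `api-worker`) will also select the other deployment's pods.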
Rows & panels
- Overview: at-a-glance stat tiles for Running / Pending / Failed pods, Restarts (1h), OOMKilled (1h), and Waiting containers.
- Resource Usage: CPU and memory working-set usage per pod over time.
- Deployment Status: desired vs. available replicas, and container restart rate.
- Resource Limits: usage vs. limits for CPU and memory (aggregate and per-pod), plus per-pod limits and % of limit (green/yellow/red, with thresholds at 70% and 90%) to spot pods approaching OOM.
- Pod Details: table of every pod with restart count and memory % of limit, sorted by restarts.
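The 70% and 90% thresholds on the % of limit panels can be sketched in Python. This is an illustration, not the dashboard's implementation: the function names are hypothetical, and it assumes a value at or above a threshold takes that threshold's color.

```python
def percent_of_limit(usage_bytes: float, limit_bytes: float) -> float:
    """Return usage as a percentage of the configured limit."""
    if limit_bytes <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * usage_bytes / limit_bytes


def threshold_color(pct: float, warn: float = 70.0, crit: float = 90.0) -> str:
    """Map a percentage to green/yellow/red.

    Assumption: a value equal to a threshold already takes the higher color.
    """
    if pct >= crit:
        return "red"
    if pct >= warn:
        return "yellow"
    return "green"


MIB = 1024 * 1024

# Hypothetical pod using 400 MiB of a 512 MiB memory limit.
pct = percent_of_limit(400 * MIB, 512 * MIB)
color = threshold_color(pct)
```

In this example the pod sits at 78.125% of its limit, which falls in the yellow band; at 480 MiB it would be at 93.75% and turn red.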
Usage
Import dashboard.json into Grafana (schema dashboard.grafana.app/v2, built on Grafana v13), then pick a datasource, namespace, and deployment. The default time range is the last 1h, with a 30s auto-refresh.
Key things to watch
- OOMKilled (1h) and Memory % of Limit — memory pressure / under-provisioned limits.
- Restarts and Container Restart Rate — crash loops.
- Pending / Failed pods — scheduling or startup problems.
- Replicas (desired vs. available) — incomplete rollouts.
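For ad-hoc checks outside Grafana, the signals listed above can be approximated with PromQL of the following shape. This is a sketch under assumptions: the expressions use standard kube-state-metrics and cAdvisor metric names but are not copied from dashboard.json, whose actual queries may differ. The helper `watch_queries`, the namespace `ubiops`, and the deployment `my-model` are hypothetical.

```python
def watch_queries(namespace: str, deployment: str = ".*") -> dict:
    """Return illustrative PromQL strings for the signals listed above."""
    # Same pod matching the dashboard describes: pod=~"$deployment.*"
    pods = f'namespace="{namespace}", pod=~"{deployment}.*"'
    deploy = f'namespace="{namespace}", deployment=~"{deployment}"'
    return {
        # Container restarts over the last hour.
        "restarts_1h": (
            f"sum(increase(kube_pod_container_status_restarts_total{{{pods}}}[1h]))"
        ),
        # Containers whose most recent termination was an OOM kill.
        # This reflects current state, not a windowed count.
        "last_terminated_oomkilled": (
            "sum(kube_pod_container_status_last_terminated_reason"
            f'{{{pods}, reason="OOMKilled"}})'
        ),
        "pending_pods": f'sum(kube_pod_status_phase{{{pods}, phase="Pending"}})',
        "failed_pods": f'sum(kube_pod_status_phase{{{pods}, phase="Failed"}})',
        # Desired minus available replicas; above zero means an incomplete rollout.
        "missing_replicas": (
            f"kube_deployment_spec_replicas{{{deploy}}}"
            f" - kube_deployment_status_replicas_available{{{deploy}}}"
        ),
        # Working-set memory as a percentage of the memory limit, per pod.
        "memory_pct_of_limit": (
            "100 * sum by (pod) (container_memory_working_set_bytes"
            f'{{{pods}, container!=""}})'
            " / sum by (pod) (kube_pod_container_resource_limits"
            f'{{{pods}, resource="memory"}})'
        ),
    }


queries = watch_queries("ubiops", "my-model")
for name, expr in queries.items():
    print(f"{name}: {expr}")
```

Each string can be pasted into the Prometheus expression browser or Grafana Explore. The function only builds the query text and does not contact Prometheus.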