Update ReadMes

kvanbezouw 2026-06-02 11:49:50 +02:00
parent 901a8c8407
commit 83e2efdb43
2 changed files with 3 additions and 9 deletions

@@ -2,8 +2,6 @@
 Grafana dashboard for monitoring [vLLM](https://github.com/vllm-project/vllm) inference servers running as UbiOps deployments — request throughput, queue depth, KV cache pressure, and token rates. Fed by the `vllm:*` Prometheus metrics that vLLM exposes.
-> **Note:** `dashboard.json` is currently empty (0 bytes) — the export did not save. These docs are reconstructed from `image.png`; re-export the dashboard to capture the panel/query definitions.
 ## Variables
 - **Data Source** — Prometheus instance.
@@ -26,12 +24,6 @@ Grafana dashboard for monitoring [vLLM](https://github.com/vllm-project/vllm) in
 - *Input Tokens Per Minute (ITPM)* — prompt token volume.
 - *Output Tokens Per Minute (OTPM)* — generated token volume.
-## Key things to watch
-- **KV Cache Usage** near 100% with rising **Requests Waiting** — the server is capacity-bound; scale up or shorten contexts.
-- **Tokens Generated/sec** / **OTPM** dropping while RPM holds — degraded decode throughput.
-- Sustained **Requests Waiting** — queue backlog and latency.
 ## Usage
 Default range in the screenshot is the last 2 days with auto-refresh. Import into Grafana, then select datasource, namespace, and deployment.

@@ -44,4 +44,6 @@ Secrets are exported empty and must be set per environment:
 Import this directory as a UbiOps project export (e.g. via
 `ubiops project_export create`), then fill in the secret environment variables
-listed above before sending requests.
+listed above before sending requests. Note that this implementation requires outbound internet access.
+When running in air-gapped environments, users can use the bring-your-own Docker image functionality