Update ReadMes

2026-06-02 11:49:50 +02:00
commit 83e2efdb43
parent 901a8c8407
2 changed files with 3 additions and 9 deletions
--- a/grafana/vllm-metrics/README.md
+++ b/grafana/vllm-metrics/README.md
@@ -2,8 +2,6 @@

 Grafana dashboard for monitoring [vLLM](https://github.com/vllm-project/vllm) inference servers running as UbiOps deployments — request throughput, queue depth, KV cache pressure, and token rates. Fed by the `vllm:*` Prometheus metrics that vLLM exposes.

-> **Note:** `dashboard.json` is currently empty (0 bytes) — the export did not save. These docs are reconstructed from `image.png`; re-export the dashboard to capture the panel/query definitions.
-
 ## Variables

 - **Data Source** — Prometheus instance.
@@ -26,12 +24,6 @@ Grafana dashboard for monitoring [vLLM](https://github.com/vllm-project/vllm) in
 - *Input Tokens Per Minute (ITPM)* — prompt token volume.
 - *Output Tokens Per Minute (OTPM)* — generated token volume.

-## Key things to watch
-
-- **KV Cache Usage** near 100% with rising **Requests Waiting** — the server is capacity-bound; scale up or shorten contexts.
-- **Tokens Generated/sec** / **OTPM** dropping while RPM holds — degraded decode throughput.
-- Sustained **Requests Waiting** — queue backlog and latency.
-
 ## Usage

 Default range in the screenshot is the last 2 days with auto-refresh. Import into Grafana, then select datasource, namespace, and deployment.
--- a/ubiops-deployments/README.md
+++ b/ubiops-deployments/README.md
@@ -44,4 +44,6 @@ Secrets are exported empty and must be set per environment:

 Import this directory as a UbiOps project export (e.g. via
 `ubiops project_export create`), then fill in the secret environment variables
-listed above before sending requests.
+listed above before sending requests. Note that this implementation requires outbound internet access.
+
+When running in air-gapped environments, users can use the bring-your-own Docker image functionality.