mindef-overdracht/ubiops-deployments
2026-06-02 11:49:50 +02:00
..
deployments Import part 17 2026-06-02 11:46:30 +02:00
info.yaml Import part 17 2026-06-02 11:46:30 +02:00
README.md Update ReadMes 2026-06-02 11:49:50 +02:00

ubiops-deployments

UbiOps export (format spec v8.0, exported 2026-06-02) bundling the deployments behind the MinDef metadata/throughput setup. All deployments are OpenAI-compatible and run in request format (supports_request_format: true) with plain input/output.

Layout

deployments/
├── deployment-gpt-oss-chat/        # the LLM serving deployment
│   └── deployment_gpt-oss-120b.yaml
├── deployments-embedder/           # embedding model
│   └── deployment_bge-m3/
└── deployments-proxies/            # OpenAI-compatible proxy deployments
    ├── deployment_llm-proxy/
    └── deployment_proxy-gpt-oss-batch-3x/

Each deployment folder holds its deployment_*.yaml (deployment config) and a versions/ folder with one *.yaml + *.zip per version (the YAML is the version config, the ZIP is the packaged code).

Deployments

Deployment Default version Purpose
gpt-oss-120b v-gpt-120b-tool-calling Serves openai/gpt-oss-120b via vLLM on a 16gb_8vcpu_rtxpro GPU instance.
bge-m3 v3 BGE-M3 embedding model.
llm-proxy v11 OpenAI-compatible proxy routing requests to UbiOps deployments.
proxy-gpt-oss-batch-3x v1 Proxy fanning batch requests across GPT-OSS instances.

Configuration

Secrets are exported empty and must be set per environment:

  • gpt-oss-120bHF_TOKEN (secret), MODEL_NAME (openai/gpt-oss-120b). The serving version also sets VLLM_USE_V1=1, GPU_MEMORY_UTILIZATION=0.90, and MAX_MODEL_LEN=125000, with a /health check on port 8000.
  • llm-proxyUBIOPS_API_TOKEN (secret).

Importing

Import this directory as a UbiOps project export (e.g. via ubiops project_export create), then fill in the secret environment variables listed above before sending requests. Note that this implementation requires outbound internet acces.

When running in airgapped environments, users can make use of the bring your own docker image functionality