This project starts from the official vLLM 0.11.0 image and trims it down to a benchmark client. The goal is to preserve the benchmark client's behavior while removing runtime content the client does not need.
Project Overview
The project keeps the official vllm/vllm-openai:v0.11.0 dependency set, benchmark CLI, tokenizer, dataset sampling, request functions, and small helper scripts.
The intended use is a benchmark-only container that targets external OpenAI-compatible services. It lets you compare service versions and run tests from a smaller runtime while keeping the command-line usage familiar from vLLM 0.11.0.
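As a rough illustration of that use case, the sketch below assembles a benchmark command aimed at an external service. It assumes the image exposes the usual vllm bench serve subcommand with the --backend, --base-url, --model, --dataset-name, and --num-prompts options as found in upstream vLLM 0.11.0. The service URL and model name are placeholders, and build_bench_command is a hypothetical helper, not part of this project.

```python
import shlex


def build_bench_command(base_url, model, num_prompts=100):
    """Return an argv list for a serving benchmark against an external service.

    Assumption: the flags below match `vllm bench serve` in vLLM 0.11.0.
    Check them against `vllm bench serve --help` inside the image.
    """
    return [
        "vllm", "bench", "serve",
        "--backend", "openai",            # talk to an OpenAI-compatible API
        "--base-url", base_url,           # external service, not a local server
        "--model", model,                 # model name the service expects
        "--dataset-name", "random",       # synthetic prompts, no dataset file
        "--num-prompts", str(num_prompts),
    ]


if __name__ == "__main__":
    # Placeholder URL and model name, for illustration only.
    cmd = build_bench_command(
        "http://inference.example.internal:8000",
        "example-org/example-model",
    )
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, which makes it straightforward to run the same benchmark against two service versions by changing only the base URL.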
Image Layout
The build keeps the benchmark entry point, the OpenAI-compatible request dependencies, the tokenizer dependencies, and the helper scripts. Server components, development tools, temporary caches, and large test files are removed from the final image.
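One way to check a trimmed image like this is a small import smoke test that reports which expected modules can no longer be located. The sketch below is an assumption about how such a check might look, not a script shipped with the project. The module list in the usage section is illustrative: vllm.entrypoints.cli.main comes from the entry point described in this document, while transformers is only a guess at a tokenizer dependency.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be located.

    Note: for dotted names, find_spec imports the parent package first,
    so a broken parent is reported as missing too.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Parent package absent or in an unusual state: treat as missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical list of modules the benchmark client is expected to need.
    expected = ["vllm.entrypoints.cli.main", "transformers"]
    absent = missing_modules(expected)
    if absent:
        print("missing:", ", ".join(absent))
    else:
        print("all expected modules found")
```

Running a check like this as the last build step would surface an over-aggressive removal before the image is published.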
The entry point maps to vllm.entrypoints.cli.main:main, so the familiar benchmark commands keep working while the image's responsibility is narrowed to benchmarking.
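The module:attribute form used for the entry point follows the standard Python console-script convention: the part before the colon is an importable module, and the part after it is an object inside that module. The sketch below shows how such a string can be resolved by hand. resolve_entry_point is a hypothetical helper written for illustration, and the usage lines use standard-library targets so that the example runs without vLLM installed.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'module:attribute' string to the object it names."""
    module_name, sep, attr_path = spec.partition(":")
    if not sep or not module_name or not attr_path:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    obj = importlib.import_module(module_name)
    # The attribute part may itself be dotted, e.g. 'path.basename'.
    for part in attr_path.split("."):
        obj = getattr(obj, part)
    return obj


if __name__ == "__main__":
    # Standard-library stand-in for the real entry point string.
    dumps = resolve_entry_point("json:dumps")
    print(dumps({"ok": True}))
    # Inside the image, the same call would be made with
    # "vllm.entrypoints.cli.main:main" (requires vLLM to be installed).
```

Resolving the string this way can also serve as a quick check that the entry point survived the trimming step.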