
update vllm dockerfile for ray #1392

Open
wedu-nvidia wants to merge 3 commits into main from wedu/update-ray

Conversation

@wedu-nvidia
Collaborator

@wedu-nvidia wedu-nvidia commented Apr 21, 2026

Summary by CodeRabbit

  • New Features

    • Multi-node deployments now automatically enable the distributed execution backend for improved scalability.
  • Chores

    • Updated container base image to a newer release.
    • Added performance-related packages to the container build to improve runtime behavior.

Signed-off-by: Wei Du <wedu@nvidia.com>
@wedu-nvidia wedu-nvidia requested a review from Kipok April 21, 2026 22:33
@coderabbitai
Contributor

coderabbitai Bot commented Apr 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: dc97fd16-7797-4c91-b08a-19f0ebc9c8f0

📥 Commits

Reviewing files that changed from the base of the PR and between 2a64753 and a3b8eed.

📒 Files selected for processing (1)
  • nemo_skills/inference/server/serve_vllm.py

📝 Walkthrough


Upgraded the vLLM Docker base image and added a new Python dependency; adjusted the vLLM server launcher to inject a Ray distributed-executor flag when running with multiple nodes unless an explicit backend flag is already provided.

Changes

dockerfiles/Dockerfile.vllm (Docker Base Image & Dependencies)
Bumped the base image from vllm/vllm-openai:v0.14.1 to vllm/vllm-openai:v0.18.1 and added pip install "ray[cgraph]" alongside the existing installs.

nemo_skills/inference/server/serve_vllm.py (vLLM Serve CLI Behavior)
When args.num_nodes > 1, detect whether --distributed-executor-backend was supplied (in exact or = form); if not, inject --distributed-executor-backend=ray into the constructed vLLM api_server command.
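
The multi-node flag injection described above can be sketched as follows. This is an illustrative sketch, not the actual serve_vllm.py code; the function name and `FLAG` constant are assumptions for the example.

```python
# Sketch of the described behavior: for multi-node runs, append the Ray
# executor backend flag unless the user already supplied one, in either
# "--flag value" or "--flag=value" form. Names are illustrative.
FLAG = "--distributed-executor-backend"

def maybe_add_ray_backend(num_nodes, extra_args):
    """Return extra_args, possibly extended with the Ray backend flag."""
    has_backend = any(a == FLAG or a.startswith(FLAG + "=") for a in extra_args)
    if num_nodes > 1 and not has_backend:
        return extra_args + [FLAG + "=ray"]
    return extra_args
```

For example, `maybe_add_ray_backend(2, [])` yields a command line that includes `--distributed-executor-backend=ray`, while a user-supplied `--distributed-executor-backend=mp` is left untouched.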

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 0.00%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)

  • Description Check ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: The title 'update vllm dockerfile for ray' accurately describes the main change: upgrading the vLLM Docker image and adding Ray support. It is concise and specific.
  • Linked Issues check ✅ Passed: Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check ✅ Passed: Check skipped because no linked issues were found for this pull request.




Signed-off-by: Wei Du <wedu@nvidia.com>
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
dockerfiles/Dockerfile.vllm (1)

2-2: Pin ray[cgraph] to a specific version to avoid dependency drift.

Line 2 installs Ray without a version pin, making rebuilds non-reproducible. Ray's transitive dependencies (particularly aiohttp) can vary across releases, introducing unexpected conflicts.

♻️ Proposed change
+ARG RAY_VERSION=2.55.0
-RUN pip install "ray[cgraph]"
+RUN pip install --no-cache-dir "ray[cgraph]==${RAY_VERSION}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dockerfiles/Dockerfile.vllm` at line 2, The Dockerfile.vllm currently runs
pip install "ray[cgraph]" without a version, which leads to non-reproducible
builds; update the RUN pip install step that references "ray[cgraph]" to pin Ray
to the project's tested version (e.g., add an explicit version specifier or use
your constraints file) so transitive deps like aiohttp remain stable, and ensure
you update any corresponding documentation/constraints (requirements.txt or
constraints.txt) and CI images to match the pinned version.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9abc8c5a-b5b4-43a3-bfa3-9ca13996b899

📥 Commits

Reviewing files that changed from the base of the PR and between 94478e5 and 2a64753.

📒 Files selected for processing (1)
  • dockerfiles/Dockerfile.vllm

Comment on lines +1 to 5
FROM vllm/vllm-openai:v0.18.1
RUN pip install "ray[cgraph]"
RUN pip install "vllm[audio]"
# Required by vLLM for Qwen-VL model family (runtime dependency, not directly imported)
RUN pip install qwen-vl-utils
Contributor


⚠️ Potential issue | 🟠 Major

Run the container as a non-root user.

No explicit USER is set, so runtime may default to root. Please set a non-root user to reduce container breakout impact.

🔒 Example hardening
 FROM vllm/vllm-openai:v0.18.1
 RUN pip install "ray[cgraph]"
 RUN pip install "vllm[audio]"
 # Required by vLLM for Qwen-VL model family (runtime dependency, not directly imported)
 RUN pip install qwen-vl-utils
+RUN useradd --create-home --uid 10001 appuser
+USER appuser
🧰 Tools
🪛 Trivy (0.69.3)

[error] 1-1: Image user should not be 'root'

Specify at least 1 USER command in Dockerfile with non-root user as argument

Rule: DS-0002

Learn more

(IaC/Dockerfile)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dockerfiles/Dockerfile.vllm` around lines 1 - 5, The Dockerfile currently
runs as root (base image FROM vllm/vllm-openai:v0.18.1) and lacks a USER
directive; add steps to create a non-root user (e.g., create a group and user,
set a home or workdir, chown any needed directories after installing packages),
switch to that user with a USER <username> instruction before finishing the
image, and ensure any runtime writable paths (WORKDIR or volumes) are owned by
that user so the container does not default to root at runtime.

Collaborator

@Kipok Kipok left a comment


Can you run GPU and Slurm tests on this?

args, unknown = parser.parse_known_args()

extra_arguments = join(unknown)
has_distributed_backend = any(
    arg == "--distributed-executor-backend" or arg.startswith("--distributed-executor-backend=")
    for arg in unknown
)
Collaborator


We shouldn't need this change, I think: the user args will always override the default anyway. So we can just check for multinode and, if it's set, always add the argument.
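
The suggested simplification could look roughly like this (a sketch under the reviewer's assumption that a later user-supplied flag overrides the default when vLLM parses the command; `build_backend_args` is a hypothetical name):

```python
# Illustrative sketch of the suggested simplification: for multi-node
# runs, always emit the Ray backend flag, with no detection step, and
# rely on any user-supplied backend flag appended later to take precedence.
def build_backend_args(num_nodes):
    if num_nodes > 1:
        return ["--distributed-executor-backend=ray"]
    return []
```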

Collaborator Author


Sure, we can get rid of this.

@Kipok
Collaborator

Kipok commented Apr 21, 2026

we could combine with upgrade to sglang and trtllm as well so that we do them all in one batch of tests

@wedu-nvidia
Collaborator Author

> we could combine with upgrade to sglang and trtllm as well so that we do them all in one batch of tests

Why did I not see sglang and trtllm dockerfiles?

Do you suggest revising these?

_containers = {
"trtllm": "nvcr.io/nvidia/tensorrt-llm/release:1.3.0rc1",
"sglang": "lmsysorg/sglang:v0.5.8",
}

@Kipok
Collaborator

Kipok commented Apr 21, 2026

yes, as well as all configs
