[docker] Restructure Dockerfile for more efficient and cache-friendly builds #30626
Conversation
Code Review
This pull request significantly refactors the Dockerfile to improve build efficiency and leverage caching more effectively. The introduction of a parallel extensions-build stage and the restructuring of the vllm-base stage to pre-install slow-changing dependencies are excellent changes that should substantially reduce incremental build times. The overall approach is well-thought-out and correctly implemented. I have one suggestion to further optimize Docker image layering, but the pull request is a great improvement overall.
Hi @amrmahdi, the pre-commit checks have failed. Please run:

uv pip install pre-commit
pre-commit install
pre-commit run --all-files

Then commit the changes and push to your branch.
Documentation preview: https://vllm--30626.org.readthedocs.build/en/30626/
As a data point, my rebase build took 16m20s: https://buildkite.com/vllm/ci/builds/43561/steps/canvas?sid=019b207f-7cc2-4327-8035-07fa0e925428
mgoin left a comment
LGTM overall, great work. Just a few quick nits to clean up.
[docker] Restructure Dockerfile for more efficient and cache-friendly builds

- Pre-install PyTorch, FlashInfer, and other slow-changing dependencies in vllm-base before installing the vLLM wheel, for better layer caching
- Add a parallel extensions-build stage for DeepGEMM and EP kernels
- Move stable packages (accelerate, bitsandbytes, etc.) earlier in the build

This allows incremental builds with Python-only changes to skip the expensive dependency installation layers.

Performance: Incremental builds with Python-only changes now complete in ~16 minutes (previously 35+ minutes).

Future work: We considered building these base stages as separate images that could be built independently and baked into CI AMIs for maximum cache reuse. However, this introduces the maintenance burden of extra pipelines and update strategies. The inline approach is simpler and can be optimized later by baking the main build image into AMIs daily to maximize layer cache reuse.

Signed-off-by: Amr Mahdi <[email protected]>
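The layering idea in the commit message can be sketched as a minimal Dockerfile. This is an illustrative sketch only: the stage names (extensions-build, vllm-base) and package names come from the PR, but the base images, file paths, and exact install commands here are assumptions, not the actual vLLM Dockerfile.

```dockerfile
# Illustrative sketch of the cache-friendly stage ordering; paths and
# base images are hypothetical.

FROM nvidia/cuda:12.8.1-devel-ubuntu22.04 AS extensions-build
# Built in parallel with the main wheel build: compile extension kernels
# (e.g. DeepGEMM, EP kernels) into wheels that vllm-base can copy in.
RUN git clone https://github.com/deepseek-ai/DeepGEMM.git /opt/deepgemm \
    && pip wheel /opt/deepgemm -w /wheels

FROM nvidia/cuda:12.8.1-devel-ubuntu22.04 AS vllm-base
# 1. Slow-changing, heavy dependencies first: these layers stay cached
#    as long as the pinned requirement files do not change.
COPY requirements/common.txt /tmp/requirements.txt
RUN pip install torch flashinfer-python -r /tmp/requirements.txt

# 2. Stable extras next, still ahead of anything that changes often.
RUN pip install accelerate bitsandbytes

# 3. The frequently rebuilt vLLM wheel goes last, so a Python-only change
#    invalidates only these final layers, not the dependency layers above.
COPY --from=extensions-build /wheels /wheels
COPY dist/vllm-*.whl /tmp/
RUN pip install /wheels/*.whl /tmp/vllm-*.whl
```

The ordering matters because Docker invalidates a layer and everything after it when its inputs change; putting the wheel install last is what lets Python-only rebuilds reuse the expensive cached layers.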
mgoin left a comment
LGTM, thanks for the nice work!
[docker] Restructure Dockerfile for more efficient and cache-friendly builds (vllm-project#30626) Signed-off-by: Amr Mahdi <[email protected]>
[docker] Restructure Dockerfile for more efficient and cache-friendly builds (vllm-project#30626) Signed-off-by: Amr Mahdi <[email protected]> Signed-off-by: Nathan Price <[email protected]>