[Doc] Refactor the DeepSeek-V3.1 tutorial. #4399
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

- If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides, as sketched below.
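For instance, assuming the repository wires its linters into pre-commit the way upstream vLLM does (an assumption — check the Contributing guide for the authoritative commands), a local lint pass might look like:

```bash
# Hypothetical local lint run; pre-commit is borrowed from upstream vLLM's
# workflow and is not confirmed by this repository's docs.
pip install pre-commit
pre-commit run --all-files
```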
Code Review
This pull request adds a comprehensive tutorial for deploying the DeepSeek-V3.1 model. While the document covers various deployment scenarios, I've found several critical errors in the provided code snippets and configurations, particularly for multi-node and prefill-decode disaggregation setups. These issues, including Python syntax errors, incorrect data parallel configurations, and inconsistent model naming, would likely prevent users from successfully following the instructions. My review provides specific corrections to address these critical problems and improve the tutorial's accuracy and usability.
```bash
local_ip="xxxx"

# [Optional] jemalloc
# If `libjemalloc.so` is installed on your machine, you can turn it on.
```
jemalloc is for better performance; please add a short description, otherwise readers may be a little confused. Thanks.
I have added a description: “jemalloc is for better performance; if `libjemalloc.so` is installed on your machine, you can turn it on.”
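For readers who want to act on that note, a minimal sketch of turning jemalloc on via `LD_PRELOAD` — the library path is an assumption, so locate your own copy first (e.g. `ldconfig -p | grep jemalloc`):

```bash
# Preloading jemalloc replaces the default allocator, which can reduce
# allocator contention and improve serving performance.
JEMALLOC_PATH=/usr/lib/aarch64-linux-gnu/libjemalloc.so  # assumed path
if [ -e "$JEMALLOC_PATH" ]; then
    export LD_PRELOAD="$JEMALLOC_PATH${LD_PRELOAD:+:$LD_PRELOAD}"
fi
```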
```markdown
### Model Weight
- `DeepSeek-V3.1` (BF16 version): [Download model weight](https://www.modelscope.cn/models/deepseek-ai/DeepSeek-V3.1)
- `DeepSeek-V3.1-w8a8` (quantized version): [Download model weight](https://www.modelscope.cn/models/Eco-Tech/DeepSeek-V3.1-w8a8). Note: change `torch_dtype` from `float16` to `bfloat16` in `config.json`.
- Quantization method: [DeepSeek-V3.1 W8A8+MTP](https://gitcode.com/Ascend/msit/blob/master/msmodelslim/example/DeepSeek/README.md#deepseek-v31-w8a8-%E6%B7%B7%E5%90%88%E9%87%8F%E5%8C%96-mtp-%E9%87%8F%E5%8C%96)
```
DeepSeek-V3.1 W8A8+MTP does not seem to have an available download URL. It would be better to upload it to ModelScope or another platform, since you mention DeepSeek-V3.1 W8A8+MTP below.
OK, we don't have MTP weights on ModelScope, so I put the quantization method here; maybe I should add more details.
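As a side note on the `config.json` change called out above, a one-line sketch of flipping the dtype — the weights directory is a placeholder:

```bash
# Change torch_dtype from float16 to bfloat16 in the downloaded weights'
# config.json; replace the path with your local DeepSeek-V3.1-w8a8 directory.
sed -i 's/"torch_dtype": "float16"/"torch_dtype": "bfloat16"/' \
    /path/to/DeepSeek-V3.1-w8a8/config.json
```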
```bash
export VLLM_ASCEND_ENABLE_FLASHCOMM1=0
export DISABLE_L2_CACHE=1

vllm serve vllm-ascend/DeepSeek-V3.1_w8a8mix_mtp \
```
In fact, if you use `xxx/xxx` as a model name, vLLM will search for it on Hugging Face (or on ModelScope if you set `VLLM_USE_MODELSCOPE`). The `vllm-ascend/xxx` prefix usually indicates a model we publish on ModelScope under `vllm-ascend`, so it's better to change this to a local path.
Changed.
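For illustration, a sketch of what the corrected invocation could look like — the local path and parallelism value below are placeholders, not values confirmed by this PR:

```bash
# Serving from a local directory stops vLLM from trying to resolve the name
# on Hugging Face / ModelScope.
vllm serve /data/models/DeepSeek-V3.1_w8a8mix_mtp \
    --served-model-name DeepSeek-V3.1 \
    --tensor-parallel-size 16
```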
```bash
export VLLM_USE_V1=1
export HCCL_BUFFSIZE=200
export PYTORCH_NPU_ALLOC_CONF=expandable_segments:True
export VLLM_ASCEND_ENABLE_MLAPO=1
```
@wangxiyuan Is `VLLM_ASCEND_ENABLE_MLAPO=1` also needed for DeepSeek-V3.1? I'm not sure it's safe here, since I remember it caused some issues with DeepSeek-V3.2-Exp on 0.11.0rc1.
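Until that is settled, one way to keep the flag easy to flip is to route it through a shell variable — `ENABLE_MLAPO` below is a hypothetical local variable for this sketch, not a vLLM setting:

```bash
# Default MLAPO on, but allow `ENABLE_MLAPO=0 ./run.sh` to disable it if the
# 0.11.0rc1 DeepSeek-V3.2-Exp issue resurfaces.
export VLLM_ASCEND_ENABLE_MLAPO=${ENABLE_MLAPO:-1}
```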
```bash
--gpu-memory-utilization 0.92 \
--speculative-config '{"num_speculative_tokens": 1, "method": "deepseek_mtp"}' \
--compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
--additional-config '{"ascend_scheduler_config":{"enabled":false},"torchair_graph_config":{"enabled":false}}'
```
The Ascend scheduler is about to be dropped from main; refer to #4498.
Done.
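For reference, a sketch of the tail of the serve command from the diff above with `ascend_scheduler_config` removed per #4498 — the model path is a placeholder:

```bash
vllm serve /data/models/DeepSeek-V3.1_w8a8mix_mtp \
    --gpu-memory-utilization 0.92 \
    --speculative-config '{"num_speculative_tokens": 1, "method": "deepseek_mtp"}' \
    --compilation-config '{"cudagraph_mode": "FULL_DECODE_ONLY"}' \
    --additional-config '{"torchair_graph_config":{"enabled":false}}'
```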
LGTM, thanks for your contribution!
### What this PR does / why we need it?

Refactor the DeepSeek-V3.1 tutorial.

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

### Does this PR introduce any user-facing change?

### How was this patch tested?

Signed-off-by: 1092626063 <[email protected]>
Signed-off-by: Che Ruan <[email protected]>