
Conversation

@Semmer2 (Contributor) commented Nov 26, 2025

What this PR does / why we need it?

Add single NPU tutorial for Qwen2.5-Omni-7B

Does this PR introduce any user-facing change?

No

How was this patch tested?

@github-actions bot added the `documentation` (Improvements or additions to documentation) label on Nov 26, 2025
@github-actions bot commented:

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to the Contributing and Testing guides.

@gemini-code-assist bot left a comment:

Code Review

This pull request adds a new tutorial for running Qwen2.5-Omni-7B on a single NPU. The documentation is well-structured, covering both offline inference and online serving. I've identified a missing dependency installation step that would prevent the offline inference example from running and have provided a suggestion to fix it.
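For context on the online-serving half of the tutorial, vLLM exposes an OpenAI-compatible server; a minimal sketch of what that step typically looks like follows. The flags, port, and curl payload below are assumptions for illustration, not the tutorial's exact commands.

```bash
# Illustrative sketch only; the tutorial's exact flags may differ.
vllm serve Qwen/Qwen2.5-Omni-7B --max-model-len 8192 --port 8000

# Once the server is up, query the OpenAI-compatible endpoint:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-Omni-7B",
       "messages": [{"role": "user", "content": "Hello!"}]}'
```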


Run the following script to execute offline inference on a single NPU:


@gemini-code-assist bot commented (severity: high):

The Python script for offline inference uses qwen_vl_utils.process_vision_info, but the qwen_vl_utils package is not installed in the Docker container by default. This will cause an ImportError when running the script. Please add a step to install this package.

Suggested change:

```bash
pip install qwen_vl_utils --extra-index-url https://download.pytorch.org/whl/cpu/
```
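For reference, here is a minimal sketch of the kind of offline-inference script this comment refers to, showing where `process_vision_info` is used. This is an illustration of the usual vLLM multimodal pattern, not the tutorial's actual code; the model path, image, and sampling settings are placeholders.

```python
# Hypothetical minimal example; mirrors the shape of a Qwen2.5-Omni
# offline-inference script, not the tutorial's exact code.
from qwen_vl_utils import process_vision_info  # needs: pip install qwen_vl_utils
from transformers import AutoProcessor
from vllm import LLM, SamplingParams

MODEL_PATH = "Qwen/Qwen2.5-Omni-7B"  # illustrative; a local path also works

# One chat turn with an image plus a text instruction.
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "demo.jpeg"},  # placeholder image path
        {"type": "text", "text": "Describe this image."},
    ],
}]

# Build the prompt string and extract the vision inputs that vLLM
# expects under multi_modal_data.
processor = AutoProcessor.from_pretrained(MODEL_PATH)
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, _ = process_vision_info(messages)

llm = LLM(model=MODEL_PATH, max_model_len=8192)
outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image_inputs}},
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

Without `qwen_vl_utils` installed, the first import raises the ImportError described above, before any inference runs.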

@Semmer2 force-pushed the QwenOmniDocMain branch 7 times, most recently from 598628d to d56cd51 on November 28, 2025 07:35

Qwen2.5-Omni is an end-to-end multimodal model designed to perceive diverse modalities, including text, images, audio, and video, while simultaneously generating text and natural speech responses in a streaming manner.

This document walks through the main verification steps for the model, including supported features, feature configuration, environment preparation, single-node and multi-node deployment, and accuracy and performance evaluation.
A reviewer (Contributor) commented:
It's better to add this model's first supported version, e.g. "The DeepSeek-V3.1 model is first supported in vllm-ascend:v0.9.1rc3".

@Semmer2 (Author) replied:
Added


You can use our official docker image; vllm-ascend v0.11.0 and later support Qwen2.5-Omni.

:::{note}
A reviewer (Contributor) commented:
Please check this note: is only aarch64 supported?

@Semmer2 (Author) replied:

Checked, sorry for the wrong info.


In addition, if you don't want to use the docker image above, you can also build everything from source:

- Install `vllm-ascend` from source, refer to [installation](../installation.md).
A reviewer (Contributor) commented:
I think you can just delete the build-from-source section. Anyone who wants to build from source must already have some experience and should check the installation page; the Docker image is the simple path for inexperienced users. You can delete the tab code as well.

@Semmer2 (Author) replied:

Modified.

::::{tab-item} A3&A2 series
:sync: A3&A2

Start the docker image on your node, refer to [using docker](../installation.md#set-up-using-docker).
A reviewer (Contributor) commented:
Provide the docker run command directly, as in "#4399".

@Semmer2 (Author) replied:

Sure, added.
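For reference, the added command likely follows the standard vllm-ascend docker run pattern shown in the installation docs. The sketch below assumes the v0.11.0 image tag and a single NPU exposed as /dev/davinci0; adjust both to your environment.

```bash
# Sketch only: image tag and NPU device index are assumptions.
export IMAGE=quay.io/ascend/vllm-ascend:v0.11.0
docker run --rm -it \
  --name vllm-ascend \
  --device /dev/davinci0 \
  --device /dev/davinci_manager \
  --device /dev/devmm_svm \
  --device /dev/hisi_hdc \
  -v /usr/local/dcmi:/usr/local/dcmi \
  -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
  -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
  -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
  -v /etc/ascend_install.info:/etc/ascend_install.info \
  -v /root/.cache:/root/.cache \
  -p 8000:8000 \
  $IMAGE bash
```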

@1092626063 (Contributor) commented:

/lgtm

@MengqingCao merged commit b747c95 into vllm-project:main on Nov 29, 2025
17 checks passed
ChenCangtao pushed a commit to ChenCangtao/vllm-ascend that referenced this pull request Dec 3, 2025
### What this PR does / why we need it?
Add single NPU tutorial for Qwen2.5-Omni-7B

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: Ting FU <[email protected]>
Mercykid-bash pushed a commit to Mercykid-bash/vllm-ascend that referenced this pull request Dec 4, 2025
### What this PR does / why we need it?
Add single NPU tutorial for Qwen2.5-Omni-7B

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

Signed-off-by: Ting FU <[email protected]>
Signed-off-by: Che Ruan <[email protected]>