Upgrade vllm commit hash to 1216 #5053
Code Review
This pull request updates the vLLM commit hash in the versioning_policy.md documentation file. I have added one suggestion to format the commit hash as a clickable link to improve the documentation's usability and help prevent potential issues for users. Additionally, for future pull requests, please provide a descriptive title and fill out the description template to give reviewers better context, as the current title is unclear and the description is empty.
| vLLM Ascend | vLLM | Python | Stable CANN | PyTorch/torch_npu |
|-------------|--------------|------------------|-------------|--------------------|
| main | 4429d934de3c5cc327b0d7aec8e473aeba38db90, v0.12.0 tag | >= 3.10, < 3.12 | 8.3.RC2 | 2.8.0 / 2.8.0 |
| main | 6063853ead105f89400770b4fe6760959796d827, v0.12.0 tag | >= 3.10, < 3.12 | 8.3.RC2 | 2.8.0 / 2.8.0 |
Hard-coding a full 40-character commit hash is error-prone. A typo could lead users to an incorrect commit, causing significant issues when they try to set up their environment based on this documentation. To improve usability and reduce this risk, I suggest making this hash a clickable link to the commit on GitHub. This allows for easy verification.
| main | 6063853ead105f89400770b4fe6760959796d827, v0.12.0 tag | >= 3.10, < 3.12 | 8.3.RC2 | 2.8.0 / 2.8.0 |
| main | [6063853e](https://github.com/vllm-project/vllm/commit/6063853ead105f89400770b4fe6760959796d827), v0.12.0 tag | >= 3.10, < 3.12 | 8.3.RC2 | 2.8.0 / 2.8.0 |
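The suggested link form can also be generated rather than typed by hand. The following is a minimal sketch, not part of this PR: `commit_link` is a hypothetical helper name, and the 8-character short hash simply mirrors the suggestion above.

```python
def commit_link(repo: str, sha: str, short_len: int = 8) -> str:
    """Render a full commit hash as a short Markdown link to the commit on GitHub.

    Hypothetical helper for illustration; `repo` is "owner/name".
    """
    sha = sha.lower()
    # Reject anything that is not a full 40-character hexadecimal SHA-1,
    # so a truncated or mistyped hash fails loudly instead of producing a bad link.
    if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
        raise ValueError(f"not a full 40-character commit hash: {sha!r}")
    return f"[{sha[:short_len]}](https://github.com/{repo}/commit/{sha})"


link = commit_link("vllm-project/vllm", "6063853ead105f89400770b4fe6760959796d827")
print(link)
```

The length check only catches malformed hashes; a well-formed but wrong hash still needs to be verified by following the link.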
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
vllm_ascend/platform.py
Outdated
    use_sparse = kwargs["attn_selector_config"].use_sparse
else:
    use_mla = kwargs.get("use_mla", args[4] if len(args) >= 5 else None)
    use_sparse = kwargs.get("use_sparse", args[5] if len(args) >= 6 else None)
use_sparse should be at index 6.
Please paste the vLLM PR that breaks vllm-ascend.
Please fix the CI error.
Signed-off-by: zxwang <[email protected]>
Signed-off-by: leo-pony <[email protected]>
I'll fix the UT later.
Fix broken UT introduced by #5053
- vLLM version: v0.12.0
- vLLM main: vllm-project/vllm@ad32e3e
Signed-off-by: wangxiyuan <[email protected]>
What this PR does / why we need it?
Upstream vLLM PRs vllm-project/vllm#30212 and vllm-project/vllm#29646 refactored the attention backend selection interface. This PR adapts vllm-ascend's get_attn_backend_cls to the new upstream interface, ensuring compatibility and reducing maintenance overhead.
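As a rough illustration of this kind of adaptation, the sketch below reads the flags from a config object when one is passed and otherwise falls back to legacy keyword or positional arguments, following the `platform.py` snippet reviewed above. It is not the merged implementation: `AttnSelectorConfig` is a hypothetical stand-in for the upstream object, `extract_attn_flags` is an invented helper name, and the positional indices are assumptions copied from the reviewed diff (a review comment above says `use_sparse` should be at index 6).

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

# Assumed legacy positional indices, copied from the reviewed diff.
# A review comment in this thread disputes the use_sparse position.
USE_MLA_POS = 4
USE_SPARSE_POS = 5


@dataclass
class AttnSelectorConfig:
    """Hypothetical stand-in for the upstream selector config (fields assumed)."""
    use_mla: Optional[bool] = None
    use_sparse: Optional[bool] = None


def extract_attn_flags(*args: Any, **kwargs: Any) -> Tuple[Optional[bool], Optional[bool]]:
    """Return (use_mla, use_sparse) from either calling convention."""
    config = kwargs.get("attn_selector_config")
    if config is not None:
        # New-style call: both flags travel inside one config object.
        return config.use_mla, config.use_sparse
    # Legacy call: prefer keywords, then fall back to positional arguments.
    use_mla = kwargs.get("use_mla", args[USE_MLA_POS] if len(args) > USE_MLA_POS else None)
    use_sparse = kwargs.get(
        "use_sparse", args[USE_SPARSE_POS] if len(args) > USE_SPARSE_POS else None
    )
    return use_mla, use_sparse
```

Keeping the indices as named constants means that a correction such as the one requested in review touches a single line.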
Does this PR introduce any user-facing change?
How was this patch tested?
Co-author: leo-pony <[email protected]>