[Bug]: qwen3-vl-235B-bf16 errors under stress testing with FULL_DECODE_ONLY + VLLM_ASCEND_ENABLE_NZ=1 #4960

@Levi-JQ

Your current environment

The output of `python collect_env.py`
Your output of above commands here

🐛 Describe the bug

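The error message is as follows:
[Image: screenshot of the error]
The w8a8 weights do not hit this error under the same configuration, so we suspect the problem is related to the NZ format.
Checks confirming that the NZ format is involved:
(1) On the v0.11.0-dev branch, adding PR #4495 resolves the error.
(2) Setting `export VLLM_ASCEND_ENABLE_NZ=0` also resolves it.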
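
For anyone launching the server from a Python script rather than a shell, workaround (2) can be sketched as below. This is only an illustration: the helper name `env_with_nz_disabled` is hypothetical, and the actual launch command is deliberately left out because it depends on the deployment.

```python
import os


def env_with_nz_disabled(base_env=None):
    """Return a copy of the environment with VLLM_ASCEND_ENABLE_NZ forced to "0".

    Hypothetical helper for illustration; it only mirrors
    `export VLLM_ASCEND_ENABLE_NZ=0` from workaround (2).
    """
    # Copy so the caller's mapping (or os.environ) is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["VLLM_ASCEND_ENABLE_NZ"] = "0"
    return env


if __name__ == "__main__":
    env = env_with_nz_disabled()
    print(env["VLLM_ASCEND_ENABLE_NZ"])
```

The returned mapping can be passed as `env=` to `subprocess.run(...)` when starting the serving process, so the setting is in place before the process starts.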

We suspect that FULL_DECODE_ONLY conflicts with VLLM_ASCEND_ENABLE_NZ=1; the error is raised at the PA operator.

Labels: bug
