
Commit 9766cf9

fix qwen3vl mrope op (#4484)
### What this PR does / why we need it?
The Qwen2.5-VL mrope precision problem will be solved once this PR is merged.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Tested on G8600 with the textVQA dataset.

- vLLM version: v0.11.2
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.2

---------

Signed-off-by: 李少鹏 <[email protected]>
Co-authored-by: wangxiyuan <[email protected]>
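For context on the `[16, 24, 24]` value that appears in the patch: mrope (multimodal rotary position embedding) partitions the rotary dimension pairs into sections driven by the temporal, height, and width position components, and `[16, 24, 24]` is the section layout this kernel supports (16 + 24 + 24 = 64 rotary dim pairs). The sketch below is a pure-Python illustration of how such a section list partitions the dimensions; the function name and structure are illustrative, not vLLM's actual code.

```python
# Illustrative sketch (not vLLM code): how an mrope_section list assigns
# each rotary dim pair to a position component (0=temporal, 1=height, 2=width).

def split_by_mrope_section(rotary_dim_pairs, mrope_section):
    """Return, for each rotary dim pair, which position component drives it."""
    assert sum(mrope_section) == rotary_dim_pairs, \
        "sections must cover all rotary dim pairs"
    assignment = []
    for component, width in enumerate(mrope_section):
        assignment.extend([component] * width)
    return assignment

# The layout checked by the patched guard: 64 rotary dim pairs.
layout = split_by_mrope_section(64, [16, 24, 24])
assert layout[:16] == [0] * 16    # first 16 pairs use temporal positions
assert layout[16:40] == [1] * 24  # next 24 pairs use height positions
assert layout[40:] == [2] * 24    # last 24 pairs use width positions
```

A model whose head configuration yields a different section list would fail the guard in the diff below and take the generic fallback path instead.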
1 parent 3c3c9a5 commit 9766cf9

File tree: 1 file changed, +2 −6 lines


vllm_ascend/ops/rotary_embedding.py

Lines changed: 2 additions & 6 deletions
@@ -24,7 +24,6 @@
 from vllm.model_executor.layers.rotary_embedding import (
     DeepseekScalingRotaryEmbedding, MRotaryEmbedding, RotaryEmbedding,
     YaRNScalingRotaryEmbedding)
-from vllm.platforms import CpuArchEnum
 
 from vllm_ascend.platform import NPUPlatform
 from vllm_ascend.utils import (AscendDeviceType, enable_custom_op,
@@ -411,10 +410,7 @@ def forward_oot(
         query: torch.Tensor,
         key: torch.Tensor,
     ):
-        # TODO: This judgment will be removed once the mrope precision issue is fixed
-        if self.mrope_section != [
-                16, 24, 24
-        ] or NPUPlatform.get_cpu_architecture() == CpuArchEnum.X86:
+        if self.mrope_section != [16, 24, 24]:
             return super().forward_oot(positions, query, key)
 
         import torch_npu
@@ -429,7 +425,7 @@ def forward_oot(
         self.cos_sin_cache = self.cos_sin_cache.to(  # type: ignore
             query.dtype)  # type: ignore
 
-        query, key = torch_npu.npu_mrope(positions,
+        query, key = torch_npu.npu_mrope(positions.contiguous(),
                                          query.contiguous(),
                                          key.contiguous(),
                                          self.cos_sin_cache.contiguous(),
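The diff makes two changes to `forward_oot`: the X86 CPU-architecture fallback is dropped (only the `mrope_section != [16, 24, 24]` check remains as the guard for the fused NPU kernel), and `positions` is now made contiguous before being passed to `torch_npu.npu_mrope`, matching the `.contiguous()` calls already applied to `query`, `key`, and the cos/sin cache. A minimal sketch of the resulting dispatch logic, with hypothetical names standing in for the real method:

```python
# Sketch of the post-commit dispatch in forward_oot-style code.
# Function and constant names are illustrative, not vLLM's actual code.

SUPPORTED_SECTION = [16, 24, 24]  # the only layout the fused kernel handles

def choose_mrope_path(mrope_section):
    """Return which implementation the guard would select."""
    if mrope_section != SUPPORTED_SECTION:
        return "generic"    # falls back to super().forward_oot(...)
    return "npu_fused"      # torch_npu.npu_mrope(...) on contiguous tensors

assert choose_mrope_path([16, 24, 24]) == "npu_fused"
assert choose_mrope_path([24, 20, 20]) == "generic"
```

Note that after this commit the fused path is taken on all CPU architectures, whereas previously X86 hosts were forced onto the generic path regardless of the section layout.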
