|
# Feature Support

The feature support principle of vLLM Ascend is: **aligned with vLLM**. We are also actively collaborating with the community to accelerate support.

You can check the [support status of the vLLM V1 Engine][v1_user_guide]. Below is the feature support status of vLLM Ascend; short usage sketches for several of these features follow the legend:

| Feature                       | vLLM V0 Engine | vLLM V1 Engine | Next Step                                                                    |
|-------------------------------|----------------|----------------|------------------------------------------------------------------------------|
| Chunked Prefill               | 🚧 WIP         | 🚧 WIP         | Functional; waiting for the CANN 8.1 nnal package release                    |
| Automatic Prefix Caching      | 🚧 WIP         | 🚧 WIP         | Functional; waiting for the CANN 8.1 nnal package release                    |
| LoRA                          | 🟢 Functional  | 🚧 WIP         | [vllm-ascend#396][multilora]; CI needed; working on V1 support               |
| Prompt adapter                | 🔴 No plan     | 🟡 Planned     | Planned for 2025.06.30                                                       |
| Speculative decoding          | 🟢 Functional  | 🚧 WIP         | CI needed; working on V1 support                                             |
| Pooling                       | 🟢 Functional  | 🟢 Functional  | CI needed; adapting more models; V1 support relies on vLLM                   |
| Enc-dec                       | 🔴 No plan     | 🟡 Planned     | Planned for 2025.06.30                                                       |
| Multi Modality                | 🟢 Functional  | 🟢 Functional  | [Tutorial][multimodal]; optimizing and adapting more models                  |
| LogProbs                      | 🟢 Functional  | 🟢 Functional  | CI needed                                                                    |
| Prompt logProbs               | 🟢 Functional  | 🟢 Functional  | CI needed                                                                    |
| Async output                  | 🟢 Functional  | 🟢 Functional  | CI needed                                                                    |
| Multi step scheduler          | 🟢 Functional  | 🔴 Deprecated  | [vllm#8779][v1_rfc]; replaced by the [vLLM V1 scheduler][v1_scheduler]       |
| Best of                       | 🟢 Functional  | 🔴 Deprecated  | [vllm#13361][best_of]; CI needed                                             |
| Beam search                   | 🟢 Functional  | 🟢 Functional  | CI needed                                                                    |
| Guided Decoding               | 🟢 Functional  | 🟢 Functional  | [vllm-ascend#177][guided_decoding]                                           |
| Tensor Parallel               | 🟢 Functional  | 🟢 Functional  | CI needed                                                                    |
| Pipeline Parallel             | 🟢 Functional  | 🟢 Functional  | CI needed                                                                    |
| Expert Parallel               | 🔴 No plan     | 🟢 Functional  | CI needed; no plan for V0 support                                            |
| Data Parallel                 | 🔴 No plan     | 🟢 Functional  | CI needed; no plan for V0 support                                            |
| Prefill Decode Disaggregation | 🟢 Functional  | 🟢 Functional  | 1P1D available; working on xPyD and V1 support                               |
| Quantization                  | 🟢 Functional  | 🟢 Functional  | W8A8 available; CI needed; working on support for more quantization methods  |
| Graph Mode                    | 🔴 No plan     | 🟢 Functional  | Waiting for the CANN 8.1 nnal package release                                |
| Sleep Mode                    | 🟢 Functional  | 🟢 Functional  | level=1 available; CI needed; working on V1 support                          |

- 🟢 Functional: Fully operational, with ongoing optimizations.
- 🚧 WIP: Under active development.
- 🟡 Planned: Scheduled for future implementation (some may have open PRs/RFCs).
- 🔴 No plan / Deprecated: No plan for V0 support, or deprecated by vLLM V1.
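
Most 🟢 Functional rows above are driven through vLLM's standard offline API, so the remaining work is CI coverage rather than new interfaces. A minimal sketch touching the Tensor Parallel, LogProbs, and Prompt logProbs rows (the model name and parallel degree are placeholders, not vllm-ascend requirements):

```python
from vllm import LLM, SamplingParams

# Shard the model across two devices with tensor parallelism.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", tensor_parallel_size=2)

# Return log probabilities for the top-5 candidates of each generated
# token, and for every prompt token.
params = SamplingParams(max_tokens=64, logprobs=5, prompt_logprobs=5)

for output in llm.generate(["The capital of France is"], params):
    print(output.outputs[0].text)
    print(output.outputs[0].logprobs)  # per-step top-5 log probabilities
    print(output.prompt_logprobs)      # per-prompt-token log probabilities
```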
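
For the LoRA row ([vllm-ascend#396][multilora]), adapters are selected per request on top of a shared base model. A sketch using vLLM's `LoRARequest` (base model, adapter name, and adapter path are placeholders):

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# LoRA must be enabled when the engine is constructed.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", enable_lora=True)

outputs = llm.generate(
    ["Write a SQL query that counts users per country."],
    SamplingParams(max_tokens=64),
    # (adapter name, unique integer id, adapter path) -- all placeholders
    lora_request=LoRARequest("sql_adapter", 1, "/path/to/sql_lora"),
)
print(outputs[0].outputs[0].text)
```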
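
The Speculative decoding row covers draft-model speculation on the V0 engine: a small draft model proposes a few tokens per step and the target model verifies them in a single pass. A sketch using the V0-era engine arguments (both model names are placeholders; the argument surface was reworked for V1):

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",                # target model (placeholder)
    speculative_model="Qwen/Qwen2.5-0.5B-Instruct",  # draft model (placeholder)
    num_speculative_tokens=5,  # draft tokens proposed and verified per step
)
print(llm.generate(["Once upon a time"], SamplingParams(max_tokens=32))[0].outputs[0].text)
```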
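
Multi Modality goes through vLLM's multi-modal input dictionary, and the linked [tutorial][multimodal] covers the Ascend-specific setup. A single-image sketch (the model and the prompt template are placeholders; the template must match whichever model you deploy):

```python
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2-VL-7B-Instruct")  # placeholder multimodal model

# The prompt must follow the chosen model's chat/vision template.
prompt = ("<|im_start|>user\n<|vision_start|><|image_pad|><|vision_end|>"
          "Describe this image.<|im_end|>\n<|im_start|>assistant\n")
outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": Image.open("example.jpg")}},
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```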
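
For the Guided Decoding row, recent vLLM releases expose structured output through `GuidedDecodingParams`; the Ascend-specific status is tracked in [vllm-ascend#177][guided_decoding]. A sketch, assuming a vLLM version that ships this API (model and schema are placeholders):

```python
from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")  # placeholder model

# Constrain generation to valid instances of a JSON schema.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["city", "population"],
}
params = SamplingParams(
    max_tokens=128,
    guided_decoding=GuidedDecodingParams(json=schema),
)
print(llm.generate(["Describe Paris as JSON:"], params)[0].outputs[0].text)
```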
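
Sleep Mode with level=1 releases device memory between serving phases (weights are offloaded to host memory and the KV cache is discarded) while keeping the engine alive, which is useful in RLHF-style loops. A sketch, assuming a vLLM build with sleep-mode support (the model name is a placeholder):

```python
from vllm import LLM, SamplingParams

# Sleep mode must be requested at engine construction time.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", enable_sleep_mode=True)
print(llm.generate(["Hi"], SamplingParams(max_tokens=8))[0].outputs[0].text)

llm.sleep(level=1)  # offload weights, drop the KV cache, free device memory
# ... run other device work here (e.g. a training step) ...
llm.wake_up()       # reload weights and resume serving
```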

[v1_user_guide]: https://docs.vllm.ai/en/latest/getting_started/v1_user_guide.html
[multimodal]: https://vllm-ascend.readthedocs.io/en/latest/tutorials/single_npu_multimodal.html
[best_of]: https://github.com/vllm-project/vllm/issues/13361
[guided_decoding]: https://github.com/vllm-project/vllm-ascend/issues/177
[v1_scheduler]: https://github.com/vllm-project/vllm/blob/main/vllm/v1/core/sched/scheduler.py
[v1_rfc]: https://github.com/vllm-project/vllm/issues/8779
[multilora]: https://github.com/vllm-project/vllm-ascend/issues/396