[WIP]Add Func: npugraph_batch_size auto-adjust to different model #713

chris668899 · 2025-04-28T14:34:27Z

What this PR does / why we need it?

This PR adds a new capability: npugraph_batch_size is adjusted dynamically for each model. Before this PR, the npugraph_batch_sizes passed from vllm to vllm-ascend were always too large, which could cause an error when running different models, with the message: "The resources are insufficient".
With this PR, the code dynamically adjusts npugraph_batch_sizes based on the model's hidden_layer_nums and the parallel config. For example:
a. for Qwen2.5-7B, the npugraph_batch_size list has 33 entries in total;
b. for Qwen2.5-72B, the npugraph_batch_size list has 11 entries in total.
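
To illustrate the idea, here is a minimal sketch of how a candidate list of batch sizes could be trimmed as the layer count and parallelism grow. It is not the code in this PR: the function name `trim_graph_batch_sizes`, the `parallel_factor` parameter, the `resource_budget` value and the cost model (cost per captured size proportional to `(num_hidden_layers + 1) * parallel_factor`) are all assumptions made for illustration, and the sketch is not expected to reproduce the 33 and 11 counts above.

```python
from typing import List


def trim_graph_batch_sizes(
    candidate_sizes: List[int],
    num_hidden_layers: int,
    parallel_factor: int = 1,
    resource_budget: int = 1800,  # hypothetical budget, not a real vllm-ascend constant
) -> List[int]:
    """Return an evenly spaced subset of candidate_sizes that fits the budget."""
    if num_hidden_layers <= 0 or parallel_factor <= 0:
        raise ValueError("num_hidden_layers and parallel_factor must be positive")

    # Assumed cost model: every captured batch size costs
    # (num_hidden_layers + 1) * parallel_factor units of the budget.
    cost_per_size = (num_hidden_layers + 1) * parallel_factor
    max_sizes = max(1, resource_budget // cost_per_size)

    sizes = sorted(set(candidate_sizes))
    if len(sizes) <= max_sizes:
        return sizes  # everything fits, nothing to trim
    if max_sizes == 1:
        return sizes[:1]  # keep only the smallest batch size

    # Integer arithmetic picks max_sizes distinct indices and always keeps
    # the smallest (index 0) and the largest (index len(sizes) - 1) entry.
    last = len(sizes) - 1
    indices = [i * last // (max_sizes - 1) for i in range(max_sizes)]
    return [sizes[i] for i in indices]


# Illustrative candidate list; the layer counts below are example inputs.
candidates = [1, 2, 4] + [8 * i for i in range(1, 65)]
small_model = trim_graph_batch_sizes(candidates, num_hidden_layers=28)
large_model = trim_graph_batch_sizes(candidates, num_hidden_layers=80)
print(len(candidates), len(small_model), len(large_model))
```

Under this assumed cost model, a deeper model or a larger parallel factor leaves room for fewer captured batch sizes, while the smallest and largest candidates are kept whenever at least two sizes fit.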

github-actions bot added the module:tests label Apr 29, 2025

chris668899 force-pushed the main branch from 8f7b199 to 46e8acd Compare April 29, 2025 14:34

chris668899 closed this Apr 29, 2025

chris668899 force-pushed the main branch from e5bdac5 to b917361 Compare April 29, 2025 14:44
