-
-
Notifications
You must be signed in to change notification settings - Fork 12k
[Bugfix] Fix RequestOutput miss lora_request #30636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Jee Jee Li <[email protected]>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request addresses a bug where the lora_request information was missing from the RequestOutput. The changes correctly propagate the full LoRARequest object through the RequestState to the final RequestOutput. The fix is validated by an updated test in tests/lora/test_llama_tp.py, which now asserts that the lora_request is correctly populated in the output. The implementation is clean and effectively resolves the issue.
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
|
Trying to fix this here too: #30744 but this makes sense in the interim; thanks! |
|
@LucasWilkinson Yeah, my changes are mainly aimed at minimizing edge-case OOM issues. |
Since `LoRARequest` is part of the public API - e.g. as a parameter to `LLM.generate()` or since vllm-project#30636 a member of `RequestOutput` - this warning looks outdated: ```python class LoRARequest(...): """ Request for a LoRA adapter. Note that this class should be used internally. For online serving, it is recommended to not allow users to use this class but instead provide another layer of abstraction to prevent users from accessing unauthorized LoRA adapters. ... """ ``` Indeed, `LoRAReqest` seems to have been part of the public API since it was added in vllm-project#1804 Signed-off-by: Mark McLoughlin <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]> Signed-off-by: Nathan Price <[email protected]>
Purpose
Test Plan
LoRA tests in CI , and we also can test it locally by the following script:
Test Result
test_llama_tp.pyin CI should pass correctly, the script should print information as followEssential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.