-
Notifications
You must be signed in to change notification settings - Fork 654
[main][Bugfix] Remove the ZMQ communication setup on the D node #4926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[main][Bugfix] Remove the ZMQ communication setup on the D node #4926
Conversation
Signed-off-by: SlightwindSec <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly removes the ZeroMQ communication setup on the D node in ascend_store_connector.py to prevent the LookupKeyServer from starting on a consumer node. However, the changes in pool_scheduler.py introduce a critical bug. The conditional initialization of LookupKeyClient is incorrect and will lead to a runtime AttributeError when a consumer node is configured to load data, as self.client will be None but its lookup method is still called. I've provided a critical review comment to fix this logic and a follow-up comment to remove the type: ignore which was masking this bug.
| self.client = LookupKeyClient( | ||
| vllm_config) if self.kv_role != "kv_consumer" else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition for initializing LookupKeyClient is incorrect. If self.kv_role == "kv_consumer" and self.consumer_is_to_load is True, self.client will be None. However, get_num_new_matched_tokens will then attempt to call self.client.lookup, leading to a runtime AttributeError. The client should only be None when it's a consumer that is not configured to load.
| self.client = LookupKeyClient( | |
| vllm_config) if self.kv_role != "kv_consumer" else None | |
| if self.kv_role == "kv_consumer" and not self.consumer_is_to_load: | |
| self.client = None | |
| else: | |
| self.client = LookupKeyClient(vllm_config) |
| num_external_hit_tokens = self.client.lookup( # type: ignore[union-attr] | ||
| token_len, request.block_hashes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the suggested fix for the self.client initialization, this type: ignore comment is no longer necessary and should be removed to improve code clarity. The code path that reaches this line will always have self.client initialized.
num_external_hit_tokens = self.client.lookup(token_len, request.block_hashes)|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
In the PD separation scenario, the D node does not need to perform get operations, and therefore does not need to create ZeroMQ (ZMQ) communication.