-
Notifications
You must be signed in to change notification settings - Fork 655
[0.11.0][Bugfix] Remove the ZMQ communication setup on the D node #4916
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[0.11.0][Bugfix] Remove the ZMQ communication setup on the D node #4916
Conversation
Signed-off-by: SlightwindSec <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request attempts to conditionally initialize the Mooncake lookup client and server. However, the conditions introduced are too restrictive and lead to incorrect behavior. Specifically, the MooncakeLookupServer is not started for the kv_both role, and the MooncakeLookupClient is not created in cases where it's needed (e.g., kv_both role, or kv_consumer with loading enabled), which will cause a runtime crash. I've provided critical feedback and suggestions to fix these issues.
| self.client = MooncakeLookupClient( | ||
| vllm_config) if self.kv_role == "kv_producer" else None | ||
| self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get( | ||
| "consumer_is_to_load", False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition for creating MooncakeLookupClient is too restrictive and will cause a NoneType error. The client is needed for lookups when the kv_role is kv_producer or kv_both, or when it's kv_consumer and consumer_is_to_load is true. The current logic only creates the client for kv_producer, which will cause a crash in other cases where a lookup is attempted.
To fix this, consumer_is_to_load should be initialized before self.client, and the condition for client creation should be corrected.
| self.client = MooncakeLookupClient( | |
| vllm_config) if self.kv_role == "kv_producer" else None | |
| self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get( | |
| "consumer_is_to_load", False) | |
| self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get( | |
| "consumer_is_to_load", False) | |
| if self.kv_role != "kv_consumer" or self.consumer_is_to_load: | |
| self.client = MooncakeLookupClient(vllm_config) | |
| else: | |
| self.client = None |
|
|
||
| assert self.connector_worker is not None | ||
| if vllm_config.parallel_config.rank == 0: | ||
| if vllm_config.parallel_config.rank == 0 and self.kv_role == "kv_producer": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition to start MooncakeLookupServer is too restrictive. It should be started if the role is kv_producer or kv_both. With the current change, a node with kv_both role will not start the server, which is incorrect as it's also a producer.
| if vllm_config.parallel_config.rank == 0 and self.kv_role == "kv_producer": | |
| if vllm_config.parallel_config.rank == 0 and self.kv_role != "kv_consumer": |
|
please update the commit message and point that which PR is picked from main. |
|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
Signed-off-by: SlightwindSec <[email protected]>
Signed-off-by: SlightwindSec <[email protected]>
done |
In the PD separation scenario, the D node does not need to perform get operations, and therefore does not need to create ZeroMQ (ZMQ) communication.