[0.11.0][Bugfix] Remove the ZMQ communication setup on the D node #4916
@@ -45,7 +45,7 @@
         )

         assert self.connector_worker is not None
-        if vllm_config.parallel_config.rank == 0:
+        if vllm_config.parallel_config.rank == 0 and self.kv_role == "kv_producer":
             self.lookup_server = MooncakeLookupServer(
                 self.connector_worker, vllm_config, self.use_layerwise)

@@ -160,9 +160,10 @@
 class MooncakeStoreConnectorV1Scheduler:

     def __init__(self, vllm_config: "VllmConfig", use_layerwise):
-        self.client = MooncakeLookupClient(vllm_config)
         self.use_layerwise = use_layerwise
         self.kv_role = vllm_config.kv_transfer_config.kv_role
+        self.client = MooncakeLookupClient(
+            vllm_config) if self.kv_role == "kv_producer" else None
         self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get(
             "consumer_is_to_load", False)
Comment on lines 165 to 168
Contributor
The condition for creating [...] To fix this, [...]
Suggested change: [...]
         self.load_async = vllm_config.kv_transfer_config.kv_connector_extra_config.get(
@@ -207,7 +208,7 @@
         else:
             token_ids = torch.tensor(request.prompt_token_ids)

         num_external_hit_tokens = self.client.lookup(token_ids)
Check failure on line 211 in vllm_ascend/distributed/mooncake/mooncake_store_connector_v1.py

         if num_external_hit_tokens == request.num_tokens:
             num_external_hit_tokens -= 1

The condition to start `MooncakeLookupServer` is too restrictive. It should be started if the role is `kv_producer` or `kv_both`. With the current change, a node with the `kv_both` role will not start the server, which is incorrect because it is also a producer.
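
A minimal sketch of the role check this comment asks for, not the actual patch. `is_producer`, `FakeLookupClient`, and `SchedulerSketch` are hypothetical names used only for illustration, and returning 0 hits when no client exists is an assumption about the desired behavior on a consumer-only node. The `None` guard is one plausible way to address the check failure reported on line 211, assuming that failure comes from `self.client` now being optional.

```python
from typing import Optional, Sequence

# Roles that act as a KV producer, per the review comment above.
PRODUCER_ROLES = frozenset({"kv_producer", "kv_both"})


def is_producer(kv_role: str) -> bool:
    """Return True when the node should run the lookup server/client."""
    return kv_role in PRODUCER_ROLES


class FakeLookupClient:
    """Hypothetical stand-in for MooncakeLookupClient (not the real API)."""

    def lookup(self, token_ids: Sequence[int]) -> int:
        # Pretend every token is already present in the external store.
        return len(token_ids)


class SchedulerSketch:
    """Illustrative scheduler that only builds a client on producer nodes."""

    def __init__(self, kv_role: str) -> None:
        self.kv_role = kv_role
        self.client: Optional[FakeLookupClient] = (
            FakeLookupClient() if is_producer(kv_role) else None
        )

    def num_external_hit_tokens(self, token_ids: Sequence[int]) -> int:
        # Guard the optional client so a consumer-only node never
        # dereferences None (assumed behavior: report zero hits).
        if self.client is None:
            return 0
        return self.client.lookup(token_ids)
```

With a predicate like this, both the server start-up condition and the client construction could share one definition of "producer", so a `kv_both` node is not skipped by either check.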