Merged
Changes from 1 commit
@@ -45,7 +45,7 @@
)

assert self.connector_worker is not None
- if vllm_config.parallel_config.rank == 0:
+ if vllm_config.parallel_config.rank == 0 and self.kv_role == "kv_producer":
Contributor review comment, marked high:

The condition to start MooncakeLookupServer is too restrictive. It should be started if the role is kv_producer or kv_both. With the current change, a node with kv_both role will not start the server, which is incorrect as it's also a producer.

Suggested change
- if vllm_config.parallel_config.rank == 0 and self.kv_role == "kv_producer":
+ if vllm_config.parallel_config.rank == 0 and self.kv_role != "kv_consumer":
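
A quick way to see the difference between the two conditions is to enumerate the roles. The sketch below is illustrative only: `pr_condition`, `suggested_condition`, and `differing_roles` are hypothetical helpers, not code from this PR, and it assumes the three role strings named in this review are the only values `kv_role` takes.

```python
# Hypothetical helpers for illustration; none of these functions exist in the PR.
# Assumption: kv_role is always one of the three strings below.
KV_ROLES = ("kv_producer", "kv_consumer", "kv_both")


def pr_condition(rank: int, kv_role: str) -> bool:
    """Server-start condition as written in this commit."""
    return rank == 0 and kv_role == "kv_producer"


def suggested_condition(rank: int, kv_role: str) -> bool:
    """Server-start condition proposed in the review comment."""
    return rank == 0 and kv_role != "kv_consumer"


def differing_roles(rank: int = 0) -> list[str]:
    """Roles for which the two conditions disagree on the given rank."""
    return [
        role for role in KV_ROLES
        if pr_condition(rank, role) != suggested_condition(rank, role)
    ]


if __name__ == "__main__":
    # Shows which roles are affected by the choice of condition on rank 0.
    print(differing_roles())
```

Under that assumption the two conditions differ only for `kv_both` on rank 0, which is the case the comment describes.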

self.lookup_server = MooncakeLookupServer(
self.connector_worker, vllm_config, self.use_layerwise)

@@ -160,9 +160,10 @@
class MooncakeStoreConnectorV1Scheduler:

def __init__(self, vllm_config: "VllmConfig", use_layerwise):
- self.client = MooncakeLookupClient(vllm_config)
self.use_layerwise = use_layerwise
self.kv_role = vllm_config.kv_transfer_config.kv_role
+ self.client = MooncakeLookupClient(
+     vllm_config) if self.kv_role == "kv_producer" else None
self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get(
"consumer_is_to_load", False)
Contributor review comment on lines 165 to 168, marked critical:

The condition for creating MooncakeLookupClient is too restrictive and will cause a NoneType error. The client is needed for lookups when the kv_role is kv_producer or kv_both, or when it's kv_consumer and consumer_is_to_load is true. The current logic only creates the client for kv_producer, which will cause a crash in other cases where a lookup is attempted.

To fix this, consumer_is_to_load should be initialized before self.client, and the condition for client creation should be corrected.

Suggested change
- self.client = MooncakeLookupClient(
-     vllm_config) if self.kv_role == "kv_producer" else None
- self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get(
-     "consumer_is_to_load", False)
+ self.consumer_is_to_load = vllm_config.kv_transfer_config.kv_connector_extra_config.get(
+     "consumer_is_to_load", False)
+ if self.kv_role != "kv_consumer" or self.consumer_is_to_load:
+     self.client = MooncakeLookupClient(vllm_config)
+ else:
+     self.client = None
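
The suggested logic can be exercised in isolation. The following is a minimal sketch, not the connector's code: `StubLookupClient` and `build_client` are hypothetical names, the stub stands in for `MooncakeLookupClient` (whose constructor is not shown on this page), and a plain dict stands in for `kv_connector_extra_config`.

```python
from typing import Optional


class StubLookupClient:
    """Hypothetical stand-in for MooncakeLookupClient."""


def build_client(kv_role: str, extra_config: dict) -> Optional[StubLookupClient]:
    """Create a lookup client only for roles that perform lookups.

    Mirrors the reviewer's suggestion: read consumer_is_to_load first,
    then decide whether a client is needed.
    """
    consumer_is_to_load = extra_config.get("consumer_is_to_load", False)
    if kv_role != "kv_consumer" or consumer_is_to_load:
        return StubLookupClient()
    return None


if __name__ == "__main__":
    for role in ("kv_producer", "kv_both", "kv_consumer"):
        for flag in (False, True):
            client = build_client(role, {"consumer_is_to_load": flag})
            print(role, flag, client is not None)
```

Reading the flag before the branch matters here because the branch depends on it; in the committed order the attribute is assigned only after `self.client` has already been decided.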

self.load_async = vllm_config.kv_transfer_config.kv_connector_extra_config.get(
@@ -207,7 +208,7 @@
else:
token_ids = torch.tensor(request.prompt_token_ids)

num_external_hit_tokens = self.client.lookup(token_ids)

Check failure on line 211 in vllm_ascend/distributed/mooncake/mooncake_store_connector_v1.py

GitHub Actions / lint / pre-commit

Item "None" of "MooncakeLookupClient | None" has no attribute "lookup" [union-attr]

Check failure on line 211 in vllm_ascend/distributed/mooncake/mooncake_store_connector_v1.py

GitHub Actions / lint / pre-commit

Item "None" of "Optional[MooncakeLookupClient]" has no attribute "lookup" [union-attr]
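
The `union-attr` error is reported because `self.client` is now typed as optional while `lookup` is called on it unconditionally. One common way to satisfy the type checker is to narrow the attribute with an explicit `None` check before the call. The sketch below shows only that pattern; `StubLookupClient` and `SchedulerSketch` are hypothetical, and returning 0 when no client exists is an assumption for illustration, not necessarily how this PR resolves the failure.

```python
from typing import Optional, Sequence


class StubLookupClient:
    """Hypothetical stand-in for MooncakeLookupClient."""

    def lookup(self, token_ids: Sequence[int]) -> int:
        # Placeholder behavior: pretend every token is an external hit.
        return len(token_ids)


class SchedulerSketch:
    """Hypothetical, reduced version of a scheduler holding an optional client."""

    def __init__(self, client: Optional[StubLookupClient]) -> None:
        self.client = client

    def external_hits(self, token_ids: Sequence[int]) -> int:
        # The explicit None check narrows Optional[StubLookupClient] to
        # StubLookupClient, so the call below type-checks.
        if self.client is None:
            # Assumption: with no client there is nothing to look up.
            return 0
        return self.client.lookup(token_ids)
```

An `assert self.client is not None` before the call narrows the type in the same way, but it turns a missing client into a runtime failure instead of a fallback value.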

if num_external_hit_tokens == request.num_tokens:
num_external_hit_tokens -= 1