Skip to content

Conversation

@SlightwindSec
Copy link
Contributor

@SlightwindSec SlightwindSec commented Dec 11, 2025

In the PD separation scenario, the D node does not need to perform get operations, and therefore does not need to create ZeroMQ (ZMQ) communication.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request correctly removes the ZeroMQ communication setup on the D node in ascend_store_connector.py to prevent the LookupKeyServer from starting on a consumer node. However, the changes in pool_scheduler.py introduce a critical bug. The conditional initialization of LookupKeyClient is incorrect and will lead to a runtime AttributeError when a consumer node is configured to load data, as self.client will be None but its lookup method is still called. I've provided a critical review comment to fix this logic and a follow-up comment to remove the type: ignore which was masking this bug.

Comment on lines +29 to +30
self.client = LookupKeyClient(
vllm_config) if self.kv_role != "kv_consumer" else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The condition for initializing LookupKeyClient is incorrect. If self.kv_role == "kv_consumer" and self.consumer_is_to_load is True, self.client will be None. However, get_num_new_matched_tokens will then attempt to call self.client.lookup, leading to a runtime AttributeError. The client should only be None when it's a consumer that is not configured to load.

Suggested change
self.client = LookupKeyClient(
vllm_config) if self.kv_role != "kv_consumer" else None
if self.kv_role == "kv_consumer" and not self.consumer_is_to_load:
self.client = None
else:
self.client = LookupKeyClient(vllm_config)

Comment on lines +78 to +79
num_external_hit_tokens = self.client.lookup( # type: ignore[union-attr]
token_len, request.block_hashes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

With the suggested fix for the self.client initialization, this type: ignore comment is no longer necessary and should be removed to improve code clarity. The code path that reaches this line will always have self.client initialized.

        num_external_hit_tokens = self.client.lookup(token_len, request.block_hashes)

@github-actions
Copy link

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:‌‌

  • A PR should do only one thing, smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests ‌to ensure it works and is not broken by other future PRs.
  • Write the commit message by fulfilling the PR description to help reviewer and future developers understand.

If CI fails, you can run linting and testing checks locally according Contributing and Testing.

@wangxiyuan wangxiyuan merged commit b8a317c into vllm-project:main Dec 12, 2025
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants