-
Notifications
You must be signed in to change notification settings - Fork 654
PCP/DCP 适配mooncake layerwise connector #4924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
此 PR 旨在为 mooncake layerwise connector 适配 PCP/DCP。改动范围较广,并引入了处理分布式 KV 缓存传输的复杂逻辑。我发现了一些严重问题,涉及不正确的端口计算、竞争条件和错误的数据切片,这些问题可能导致通信失败或行为不正确。为了确保实现的正确性,应解决这些问题。
| remote_block_offset = 0 | ||
| for local_kv_id in range(len(remote_handshake_port_list)): | ||
| num_blocks_to_push = local_block_nums[local_kv_id] | ||
| local_block_ids_list.append( | ||
| meta.local_block_ids[:num_blocks_to_push]) | ||
| remote_block_ids_list.append( | ||
| meta.remote_block_ids[remote_block_offset:remote_block_offset+ | ||
| num_blocks_to_push]) | ||
| remote_block_offset += num_blocks_to_push |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
切分 meta.local_block_ids 的逻辑不正确。在每次循环迭代中,它都重复地从列表的开头进行切片(meta.local_block_ids[:num_blocks_to_push])。这将导致所有切分都使用相同的初始本地块集合,从而导致不正确的数据传输。您应该使用一个偏移量来在每次迭代中切分 meta.local_block_ids 的正确部分,类似于 remote_block_offset 用于 meta.remote_block_ids 的方式。
remote_block_offset = 0
local_block_offset = 0
for local_kv_id in range(len(remote_handshake_port_list)):
num_blocks_to_push = local_block_nums[local_kv_id]
local_block_ids_list.append(
meta.local_block_ids[local_block_offset:local_block_offset +
num_blocks_to_push])
remote_block_ids_list.append(
meta.remote_block_ids[remote_block_offset:remote_block_offset +
num_blocks_to_push])
remote_block_offset += num_blocks_to_push
local_block_offset += num_blocks_to_push|
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according Contributing and Testing. |
b77ab28 to
ee4baa3
Compare
What this PR does / why we need it?
Does this PR introduce any user-facing change?
How was this patch tested?