fix: Make test_kbucket_split_behavior more reliable #1033

acul71 · 2025-11-10T19:05:52Z

Problem

The test test_kbucket_split_behavior in tests/core/kad_dht/test_unit_routing_table.py was failing intermittently in CI/CD, as reported in discussion #1023 and issue #1032.

Root Cause

The mock_host fixture in the TestRoutingTable class didn't properly mock host.new_stream(). When a bucket is full, the KBucket.add_peer() method tries to ping the oldest peer using _ping_peer(), which calls host.new_stream(). Without proper mocking, the behavior was unpredictable, causing intermittent test failures.

Solution

Mock host.new_stream() in the TestRoutingTable.mock_host fixture to raise an exception, simulating unresponsive peers. This makes the test deterministic and matches real production behavior where new_stream() fails for unreachable peers.

Changes

Added host.new_stream = AsyncMock(side_effect=Exception("Mock stream error")) to the mock_host fixture
This ensures ping operations fail consistently, allowing peers to be added when buckets are full
Matches real production behavior where unresponsive peers are replaced

Testing

✅ Test passes consistently (verified with multiple runs)
✅ All other tests in the file still pass (16/16)
✅ No linting errors

- Add peer ID validation to prevent peer ID spoofing attacks - Reject signed peer records where peer ID doesn't match sender - Optimize peer ID comparison (direct comparison instead of string conversion) - Add comprehensive test for peer ID validation - Update docstrings with security information - Add newsfragment for issue #958 This addresses CVE-2023-40583 equivalent vulnerability.

Fix flaky test by ensuring mock_host.new_stream raises an exception so that ping operations fail consistently. This allows peers to be added when buckets are full, which is necessary for the test to pass. The test was failing intermittently in CI/CD because the ping behavior was unpredictable when new_stream was not properly mocked. This simulates real production behavior where new_stream() fails for unreachable peers, allowing the routing table to replace unresponsive peers as per Kademlia protocol specifications. Fixes #1032

acul71 · 2025-11-10T19:06:04Z

Hi @seetadev and @sumanjeet0012,

I've investigated the flaky test issue you mentioned in discussion #1023 regarding test_kbucket_split_behavior failing intermittently in CI/CD.

Root Cause

The test was flaky because the mock_host fixture in TestRoutingTable didn't properly mock host.new_stream(). When a bucket is full, the code tries to ping the oldest peer, and without proper mocking, the behavior was unpredictable.

Solution

I've created a fix that mocks new_stream() to raise an exception, simulating unresponsive peers. This:

Makes the test deterministic and reliable
Matches real production behavior where new_stream() fails for unreachable peers
Allows the routing table to correctly replace unresponsive peers (as per Kademlia protocol)

PR Created

I've opened PR #1033 with the fix. The test now passes consistently, and all other tests remain unaffected.

Could you please review when you have a chance? The fix is minimal and focused on making the test more reliable.

Thanks!

seetadev · 2025-11-11T21:26:30Z

Excellent work, @acul71 👏

This is a clean, well-reasoned fix to one of the more persistent CI/CD issues we’ve been seeing with test_kbucket_split_behavior. Your analysis of the root cause — particularly the insight that mock_host.new_stream() wasn’t properly mocked and was introducing nondeterministic behavior — is spot on.

The solution of explicitly mocking new_stream() to raise an exception makes perfect sense. It brings the test behavior closer to real-world scenarios, where peers can be unreachable and ping attempts fail. This approach not only stabilizes the test but also improves the accuracy of how the routing table logic is validated against actual Kademlia behavior.

It’s great to see that you verified consistency across multiple runs and ensured that all other tests remained unaffected.

Thanks for taking the time to dig into the underlying issue and provide a production-aligned fix. 🚀

Looks ready for merge.

scacaca and others added 8 commits September 25, 2025 16:07

Update identify_push.py

9a3813c

add checking peer_id in indentify

Merge branch 'main' into scacaca-check_peer_id_fix

7c74d20

chore: fix lint issues

14bc060

Merge branch 'main' into scacaca-check_peer_id_fix

fd4ac56

Merge branch 'main' into scacaca-check_peer_id_fix

77f20b7

Merge branch 'main' into scacaca-check_peer_id_fix

1d2f04c

seetadev added 2 commits November 11, 2025 03:25

Merge branch 'main' into fix/kad-dht-kbucket-split-test

efb6a9c

Merge branch 'main' into fix/kad-dht-kbucket-split-test

8e9b3ac

seetadev merged commit e9d0959 into main Nov 11, 2025
38 checks passed

seetadev deleted the fix/kad-dht-kbucket-split-test branch November 11, 2025 21:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Make test_kbucket_split_behavior more reliable #1033

fix: Make test_kbucket_split_behavior more reliable #1033

Uh oh!

acul71 commented Nov 10, 2025

Uh oh!

acul71 commented Nov 10, 2025

Uh oh!

seetadev commented Nov 11, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix: Make test_kbucket_split_behavior more reliable #1033

fix: Make test_kbucket_split_behavior more reliable #1033

Uh oh!

Conversation

acul71 commented Nov 10, 2025

Problem

Root Cause

Solution

Changes

Testing

Related

Uh oh!

acul71 commented Nov 10, 2025

Root Cause

Solution

PR Created

Uh oh!

seetadev commented Nov 11, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants