@john-z-yang john-z-yang commented Mar 11, 2025

main

Gnuplot not found, using plotters backend
Benchmarking bench_InflightActivationStore/get_pending_activation: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 171.0s, or reduce sample count to 10.
bench_InflightActivationStore/get_pending_activation
                        time:   [661.18 ms 662.53 ms 664.10 ms]
                        thrpt:  [6.1678 Kelem/s 6.1823 Kelem/s 6.1950 Kelem/s]
Found 4 outliers among 256 measurements (1.56%)
  2 (0.78%) low mild
  2 (0.78%) high severe
Benchmarking bench_InflightActivationStore/set_status: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 116.2s, or reduce sample count to 10.
bench_InflightActivationStore/set_status
                        time:   [445.91 ms 446.69 ms 447.47 ms]
                        thrpt:  [9.1538 Kelem/s 9.1697 Kelem/s 9.1857 Kelem/s]
Found 2 outliers among 256 measurements (0.78%)
  2 (0.78%) high mild

john/shard-store

     Running benches/store_bench.rs (target/release/deps/store_bench-9149732531649d04)
Gnuplot not found, using plotters backend
Benchmarking bench_InflightActivationStore_2_shards/get_pending_activation: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 135.1s, or reduce sample count to 10.
bench_InflightActivationStore_2_shards/get_pending_activation
                        time:   [523.55 ms 525.56 ms 528.18 ms]
                        thrpt:  [7.7549 Kelem/s 7.7935 Kelem/s 7.8235 Kelem/s]
Found 12 outliers among 256 measurements (4.69%)
  6 (2.34%) high mild
  6 (2.34%) high severe
Benchmarking bench_InflightActivationStore_2_shards/set_status: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 90.1s, or reduce sample count to 10.
bench_InflightActivationStore_2_shards/set_status
                        time:   [355.38 ms 355.98 ms 356.59 ms]
                        thrpt:  [11.487 Kelem/s 11.506 Kelem/s 11.526 Kelem/s]
Found 19 outliers among 256 measurements (7.42%)
  8 (3.12%) low severe
  2 (0.78%) low mild
  1 (0.39%) high mild
  8 (3.12%) high severe

Benchmarking bench_InflightActivationStore_4_shards/get_pending_activation: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 114.2s, or reduce sample count to 10.
bench_InflightActivationStore_4_shards/get_pending_activation
                        time:   [444.42 ms 444.90 ms 445.38 ms]
                        thrpt:  [9.1966 Kelem/s 9.2065 Kelem/s 9.2166 Kelem/s]
Found 2 outliers among 256 measurements (0.78%)
  2 (0.78%) high mild
Benchmarking bench_InflightActivationStore_4_shards/set_status: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 92.3s, or reduce sample count to 10.
bench_InflightActivationStore_4_shards/set_status
                        time:   [363.92 ms 364.67 ms 365.40 ms]
                        thrpt:  [11.210 Kelem/s 11.232 Kelem/s 11.255 Kelem/s]
Found 23 outliers among 256 measurements (8.98%)
  12 (4.69%) low severe
  1 (0.39%) low mild
  3 (1.17%) high mild
  7 (2.73%) high severe

Benchmarking bench_InflightActivationStore_8_shards/get_pending_activation: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 112.9s, or reduce sample count to 10.
bench_InflightActivationStore_8_shards/get_pending_activation
                        time:   [444.70 ms 445.38 ms 446.17 ms]
                        thrpt:  [9.1803 Kelem/s 9.1966 Kelem/s 9.2108 Kelem/s]
Found 5 outliers among 256 measurements (1.95%)
  1 (0.39%) low mild
  3 (1.17%) high mild
  1 (0.39%) high severe
Benchmarking bench_InflightActivationStore_8_shards/set_status: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 91.4s, or reduce sample count to 10.
bench_InflightActivationStore_8_shards/set_status
                        time:   [371.25 ms 373.51 ms 376.51 ms]
                        thrpt:  [10.879 Kelem/s 10.966 Kelem/s 11.033 Kelem/s]
Found 23 outliers among 256 measurements (8.98%)
  3 (1.17%) low severe
  7 (2.73%) low mild
  7 (2.73%) high mild
  6 (2.34%) high severe

Benchmarking bench_InflightActivationStore_16_shards/get_pending_activation: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 114.1s, or reduce sample count to 10.
bench_InflightActivationStore_16_shards/get_pending_activation
                        time:   [431.45 ms 438.97 ms 446.09 ms]
                        thrpt:  [9.1821 Kelem/s 9.3308 Kelem/s 9.4935 Kelem/s]
Found 45 outliers among 256 measurements (17.58%)
  37 (14.45%) low severe
  2 (0.78%) low mild
  3 (1.17%) high mild
  3 (1.17%) high severe
Benchmarking bench_InflightActivationStore_16_shards/set_status: Warming up for 3.0000 s
Warning: Unable to complete 256 samples in 5.0s. You may wish to increase target time to 93.4s, or reduce sample count to 10.
bench_InflightActivationStore_16_shards/set_status
                        time:   [383.85 ms 386.71 ms 390.08 ms]
                        thrpt:  [10.500 Kelem/s 10.592 Kelem/s 10.671 Kelem/s]
Found 27 outliers among 256 measurements (10.55%)
  2 (0.78%) low severe
  12 (4.69%) low mild
  5 (1.95%) high mild
  8 (3.12%) high severe
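
To compare the runs above at a glance, the median times can be turned into speedups relative to `main`. The following is a minimal sketch, not part of the PR: the constant and function names are made up for illustration, and treating `main` as a single-shard baseline is an assumption.

```rust
/// Median times in milliseconds, copied from the Criterion output above.
/// Each entry is (shard count, get_pending_activation, set_status).
/// The first entry is `main`, treated here as a 1-shard baseline (an assumption).
const MEDIANS_MS: [(u32, f64, f64); 5] = [
    (1, 662.53, 446.69),
    (2, 525.56, 355.98),
    (4, 444.90, 364.67),
    (8, 445.38, 373.51),
    (16, 438.97, 386.71),
];

/// Speedup relative to the baseline; values above 1.0 mean the candidate is faster.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

/// Returns (shards, get_pending_activation speedup, set_status speedup)
/// for every sharded run, measured against `main`.
fn speedups_vs_main() -> Vec<(u32, f64, f64)> {
    let (_, base_get, base_set) = MEDIANS_MS[0];
    MEDIANS_MS[1..]
        .iter()
        .map(|&(shards, get, set)| (shards, speedup(base_get, get), speedup(base_set, set)))
        .collect()
}
```

Read this way, `get_pending_activation` improves by roughly 1.26x at 2 shards and about 1.49x at 4 shards, then stays flat through 16. `set_status` is fastest at 2 shards (about 1.25x) and gives back some of that gain as the shard count grows.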


codecov bot commented Mar 11, 2025

Codecov Report

Attention: Patch coverage is 94.73684% with 17 lines in your changes missing coverage. Please review.

Project coverage is 83.89%. Comparing base (79f6e42) to head (961fd57).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/store/inflight_activation.rs | 94.01% | 17 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #242      +/-   ##
==========================================
+ Coverage   83.35%   83.89%   +0.53%     
==========================================
  Files          19       19              
  Lines        3568     3824     +256     
==========================================
+ Hits         2974     3208     +234     
- Misses        594      616      +22     

@john-z-yang john-z-yang force-pushed the john/shard-store branch 8 times, most recently from 630794a to e2a2ca4 on March 12, 2025 02:42
Member

@markstory markstory left a comment

Looking good 👏

.into_iter()
.collect::<Result<Vec<_>, _>>()?
.into_iter()
.sum())
Member

These combinator chains 🤩

Contributor Author

I feel like we might not actually need these `join_all` operations, since most of them are for upkeep and we could just run a single upkeep per shard.

Member

> we could just run a single upkeep per shard.

That could help simplify these combinator chains, but it could make metrics harder to understand as all the upkeep metrics will be fractional. Having each shard run upkeep independently feels worth exploring at the very least.
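
For readers following the thread, the chain quoted in the review collects one `Result` per shard, bails out on the first error, and sums the successful counts. Below is a minimal std-only sketch of that pattern. The `ShardResult` type and both function names are hypothetical, and the async `join_all` step that produces the per-shard results in the PR is left out.

```rust
/// Hypothetical per-shard outcome: a count of affected rows, or an error message.
type ShardResult = Result<u64, String>;

/// Mirrors the chain from the diff: return the first error if any shard failed,
/// otherwise sum the per-shard counts.
fn total_collect_then_sum(results: Vec<ShardResult>) -> Result<u64, String> {
    Ok(results
        .into_iter()
        .collect::<Result<Vec<_>, _>>()?
        .into_iter()
        .sum())
}

/// Same behavior without the intermediate `Vec`: the standard library implements
/// `Sum` for `Result`, so summing stops at the first `Err`.
fn total_direct_sum(results: Vec<ShardResult>) -> Result<u64, String> {
    results.into_iter().sum()
}
```

Both functions return the same value for the same input. The second form is one option for shortening the chains if they stay in the code. It does not address the separate question of whether each shard should run upkeep independently.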

@john-z-yang john-z-yang force-pushed the john/shard-store branch 8 times, most recently from f3e38f1 to 83979f2 Compare March 14, 2025 23:17

manishrawat1992 commented Jun 7, 2025

I am wondering: does resharding require cleaning up the current queue by stopping ingest workers, or can it be done at runtime by just redeploying taskworker?
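
The thread does not show how the PR assigns activations to shards, so the question above cannot be answered from it. As an illustration of why it matters, here is a sketch that assumes a purely hypothetical hash-modulo routing rule (`shard_for` and `moved_after_reshard` are invented names, not functions from the PR). Under that assumption, changing the shard count changes where existing ids are looked up.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical routing rule: hash the activation id and take it modulo the shard count.
/// This is an assumption for illustration, not the scheme used in the PR.
fn shard_for(activation_id: &str, num_shards: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    activation_id.hash(&mut hasher);
    hasher.finish() % num_shards
}

/// Counts how many ids would be routed to a different shard after the shard count changes.
fn moved_after_reshard(ids: &[String], old_shards: u64, new_shards: u64) -> usize {
    ids.iter()
        .filter(|id| shard_for(id.as_str(), old_shards) != shard_for(id.as_str(), new_shards))
        .count()
}
```

If routing worked like this, rows written under the old shard count would need to be drained or migrated before lookups by id could be trusted again. A scheme that does not route by key would not have this constraint.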
