Skip to content

Conversation

@cjqzhao
Copy link
Collaborator

@cjqzhao cjqzhao commented Jul 1, 2025

Add child manager aggregation logic

@dfawley @easwars

Comment on lines 51 to 52
// work_requests: Arc<Mutex<HashSet<Arc<T>>>>,
// work_scheduler: Arc<dyn WorkScheduler>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please delete all commented-out code.

fn resolve_child_controller(
&mut self,
channel_controller: WrappedController,
channel_controller: &mut WrappedController,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please configure your editor to remove trailing whitespaces on save. And/or make sure you are running rustfmt (which I would expect would do that, too).

Comment on lines 121 to 123



Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting-wise, please make sure to follow some standard practices, e.g. just one blank line between functions, no double-blank lines in the middle of a function, no blank lines between closing braces, etc.

}

pub trait ChildIdentifier: PartialEq + Hash + Eq + Send + Sync + 'static {}
use std::{sync::{atomic::{AtomicUsize, Ordering}}};
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go above where the other use statements are.

pub trait ChildIdentifier: PartialEq + Hash + Eq + Send + Sync + 'static {}
use std::{sync::{atomic::{AtomicUsize, Ordering}}};

pub trait ChildIdentifier: PartialEq + Hash + Eq + Send + Sync + Display + 'static {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know that we want to enforce child identifiers to implement Display... Debug should be sufficient.

// }
}

fn compare_prev_to_new_pickers(& self, old_pickers: &[Arc<dyn Picker>], new_pickers: &[Arc<dyn Picker>]) -> bool {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to do this... We can just always produce an update if any child updated. That should be good enough.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will look into it! I remember that I was running into an issue with updated not being reliable which is why I added this but I definitely could have missed something

}

fn exit_idle(&mut self, channel_controller: &mut dyn ChannelController) {
todo!()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call all children and exit_idle them?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh huh I wrote this but somehow didn't make it into the commit...

}
}

struct ReadyPickers {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have just one RoundRobinPicker that holds these same fields?

Comment on lines 532 to 545
pub fn new() -> Self {
Self {
pickers: vec![],
next: AtomicUsize::new(0),
}
}

fn add_picker(&mut self, picker: Arc<dyn Picker>) {
self.pickers.push(picker);
}

pub fn has_any(&self) -> bool {
!self.pickers.is_empty()
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If possible, it would simplify if you only have a new(v: Vec<Arc<dyn Picker>>).


impl Picker for SchedulingIdlePicker {
fn pick(&self, _request: &Request) -> PickResult {
self.work_scheduler.schedule_work();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this special case is needed. When you call into the idle picker of the child, it should be responsible for requesting work if it needs it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes will delete.

sent_connecting_state: bool,
aggregated_state: ConnectivityState,
last_ready_pickers: Vec<Arc<dyn Picker>>,

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove extraneous blank lines. Blank lines should be intentionally visually separating things.

Comment on lines 92 to 98
pub fn new(
update_sharder: Box<dyn ResolverUpdateSharder<T>>
) -> Self {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't need to get reformatted. How are you running rustfmt?

pub fn new(
update_sharder: Box<dyn ResolverUpdateSharder<T>>
) -> Self {
ChildManager {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ChildManager {
Self {

I think is preferred, as it was before.

Comment on lines 151 to 152
// let mut transient_failure_picker = TransientFailurePickers::new("error string I guess".to_string());
// let mut connecting_pickers = ConnectingPickers::new();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented-out code should always be removed.

@cjqzhao cjqzhao force-pushed the childmanagerstates branch from a76ed88 to 1bdeab1 Compare July 2, 2025 22:34
}

/// Called to aggregate states from children policies then returns a update.
pub fn aggregate_states(&mut self) -> Option<LbState> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this an Option? I would think it should always return a valid LbState, even if it's "TRANSIENT_FAILURE, "

Comment on lines 186 to 187
let should_update =
!self.compare_prev_to_new_pickers(&self.last_ready_pickers, &pickers_vec);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this exist? I thought this function would just produce the aggregate state / aggregate picker, always. Why wouldn't it want to do that?

Comment on lines +99 to +100
prev_state: ConnectivityState::Idle,
last_ready_pickers: Vec::new(),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of state, I think, should generally be kept in a policy that uses the child manager nad not in the child manager itself.

}

fn exit_idle(&mut self, channel_controller: &mut dyn ChannelController) {
let child_idxes = mem::take(&mut *self.pending_work.lock().unwrap());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't right. We want to call exit_idle on all children (or all idle children?) if this happens. Not just the ones that have requested work (which should be none of the idle ones).


fn update_picker(&mut self, update: LbState) {
self.picker_update = Some(update);
self.picker_update = Some(update.clone());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Comment on lines 477 to 479
fn add_picker(&mut self, picker: Arc<dyn Picker>) {
self.pickers.push(picker);
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it so all the pickers are known when new is called instead?

Ideally we set next initially to a random value in the range of the number of entries.

Comment on lines 485 to 487
if len == 0 {
return PickResult::Queue;
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this special case be handled by having a queuing picker instead, and not using this with empty lists?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants