-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat(grpc): add gracefulswitch load balancing policy #2399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
551c605 to
ecd0f18
Compare
ecd0f18 to
28869a7
Compare
rename mock picker and remove spaces make test picker private
28869a7 to
962fb29
Compare
| /** | ||
| Struct for Graceful Switch. | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please write a good docstring for this.
| Struct for Graceful Switch. | ||
| */ | ||
| pub struct GracefulSwitchPolicy { | ||
| subchannel_to_policy: HashMap<WeakSubchannel, ChildKind>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we not use the ChildManager for this policy? I was really hoping we could.
The goal was to make it so that no policy would need to manage its own tracking of children.
Ideally graceful switch could do something with the incoming resolver update so that the update sharder would produce 1 or 2 children with the proper configs.
If this is too much to change at the last minute for you, then we can figure out what TODOs to add.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found it a lot easier to manage children directly when I looked into using ChildManager but I can look into redesigning. The challenge about this is that the ResolverUpdate from shard_update would not be used and there would have to be some other design. I think that changes will need to be made for the Child Manager due to the Child manager forwarding resolver updates to children even when unnecessary. I will draft an appropriate design and get approval from you and Easwar.
| match config.unwrap().convert_to::<Arc<GracefulSwitchLbConfig>>() { | ||
| Ok(cfg) => (*cfg).clone(), | ||
| Err(e) => panic!("convert_to failed: {e}"), | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the match on the Result instead of .unwrap()?
| if update.service_config.as_ref().is_ok_and(|sc| sc.is_some()) { | ||
| return Err("can't do service configs yet".into()); | ||
| } | ||
| let cfg: Arc<GracefulSwitchLbConfig> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should be able to elide this type?
| if update.service_config.as_ref().is_ok_and(|sc| sc.is_some()) { | ||
| return Err("can't do service configs yet".into()); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong. Graceful switch shouldn't need to do anything special with service configs. It should just forward it to all children.
| // Determine if we can switch the new policy in. If there is no children | ||
| // yet or the new policy isn't the same as the latest policy, then | ||
| // we can swap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // Determine if we can switch the new policy in. If there is no children | |
| // yet or the new policy isn't the same as the latest policy, then | |
| // we can swap. | |
| // Determine if we can switch to the new policy. If there is no child | |
| // yet or the new policy isn't the same as the previous policy, then | |
| // we can swap. |
| if needs_switch { | ||
| target_child_kind = self.switch_to(config); | ||
| } | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add a scope here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to locking. I wanted to lock the managing_policy just to call resolve_update and then release it for resolve_child_controller.
General: - Add Debug to many traits and derive/impl in structs. - Pass LB config to LB policies via `Option<LbConfig>` instead of `Option<&LbConfig>`. It should be rare that policies want to store a config except for the leaf policy. Child manager: The original assumption was that all children would be the same type/configuration, but several policies (including gracefulswitch) will not have that property. So, several changes are made: - Children are considered unique by both their identifier and their LbPolicyBuilder's name(). - Make it so the sharder also can shard LbConfig and provide it via the ChildUpdate.child_update field in addition to the ResolverUpdate. - Make ResolverUpdateSharder a generic instead of Box<dyn>. - Add booleans so users of child manager can easily easily tell whether any child policies updated themselves, and which ones did. - Pass &mut self for sharder so that it can maintain and update its state if needed. - Change the sharder's output ChildUpdate.child_update field to an Option; if None then the child will not be called during the resolver update, but will remain in the child manager. - Change child_states into children and provide the whole Child struct, exposing the fields it contains. - Provide mutable access to the sharder. - Minor test cleanups Graceful switch: The previous implementation in hyperium#2399 contained a lot of logic to manage child policy delegation. It was intended that only ChildManager should need to have this kind of logic. - Create a new implementation of this policy that delegates to ChildManager. - Uses a Sharder that simply emits the active policy with no update alongside any new policy in the new LbConfig. - maybe_swap is called after every call into the ChildManager to determine if child updates necessitate a swap. - This logic is simple: if the active policy is not Ready, or if there is a new policy and it is not Connecting, then set the new policy as the active policy and call resolver_update on the ChildManager. The sharder will see that no LbConfig is provided and just emit the active policy with no config, causing the ChildManager to drop the previously active policy. If no swap is needed, update the picker of the active policy if it had an update. - Minor test cleanups/fixes vs. hyperium#2399.
General: - Add Debug to many traits and derive/impl in structs. - Pass LB config to LB policies via `Option<LbConfig>` instead of `Option<&LbConfig>`. It should be rare that policies want to store a config except for the leaf policy. Child manager: The original assumption was that all children would be the same type/configuration, but several policies (including gracefulswitch) will not have that property. So, several changes are made: - Children are considered unique by both their identifier and their LbPolicyBuilder's name(). - Make it so the sharder also can shard LbConfig and provide it via the ChildUpdate.child_update field in addition to the ResolverUpdate. - Make ResolverUpdateSharder a generic instead of Box<dyn>. - Add booleans so users of child manager can easily easily tell whether any child policies updated themselves, and which ones did. - Pass &mut self for sharder so that it can maintain and update its state if needed. - Change the sharder's output ChildUpdate.child_update field to an Option; if None then the child will not be called during the resolver update, but will remain in the child manager. - Change child_states into children and provide the whole Child struct, exposing the fields it contains. - Provide mutable access to the sharder. - Minor test cleanups Graceful switch: The previous implementation in hyperium#2399 contained a lot of logic to manage child policy delegation. It was intended that only ChildManager should need to have this kind of logic. - Create a new implementation of this policy that delegates to ChildManager. - Uses a Sharder that simply emits the active policy with no update alongside any new policy in the new LbConfig. - maybe_swap is called after every call into the ChildManager to determine if child updates necessitate a swap. - This logic is simple: if the active policy is not Ready, or if there is a new policy and it is not Connecting, then set the new policy as the active policy and call resolver_update on the ChildManager. The sharder will see that no LbConfig is provided and just emit the active policy with no config, causing the ChildManager to drop the previously active policy. If no swap is needed, update the picker of the active policy if it had an update. - Minor test cleanups/fixes vs. hyperium#2399.
General: - Add Debug to many traits and derive/impl in structs. - Pass LB config to LB policies via `Option<LbConfig>` instead of `Option<&LbConfig>`. It should be rare that policies want to store a config except for the leaf policy. Child manager: The original assumption was that all children would be the same type/configuration, but several policies (including gracefulswitch) will not have that property. So, several changes are made: - Children are considered unique by both their identifier and their LbPolicyBuilder's name(). - Make it so the sharder also can shard LbConfig and provide it via the ChildUpdate.child_update field in addition to the ResolverUpdate. - Make ResolverUpdateSharder a generic instead of Box<dyn>. - Add booleans so users of child manager can easily easily tell whether any child policies updated themselves, and which ones did. - Pass &mut self for sharder so that it can maintain and update its state if needed. - Change the sharder's output ChildUpdate.child_update field to an Option; if None then the child will not be called during the resolver update, but will remain in the child manager. - Change child_states into children and provide the whole Child struct, exposing the fields it contains. - Provide mutable access to the sharder. - Change the LB config to be a flat JSON array to facilitate use within another LB policy that should not need a struct to contain on the children. - Minor test cleanups Graceful switch: The previous implementation in hyperium#2399 contained a lot of logic to manage child policy delegation. It was intended that only ChildManager should need to have this kind of logic. - Create a new implementation of this policy that delegates to ChildManager. - Uses a Sharder that simply emits the active policy with no update alongside any new policy in the new LbConfig. - maybe_swap is called after every call into the ChildManager to determine if child updates necessitate a swap. - This logic is simple: if the active policy is not Ready, or if there is a new policy and it is not Connecting, then set the new policy as the active policy and call resolver_update on the ChildManager. The sharder will see that no LbConfig is provided and just emit the active policy with no config, causing the ChildManager to drop the previously active policy. If no swap is needed, update the picker of the active policy if it had an update. - Minor test cleanups/fixes vs. hyperium#2399.
Add graceful switch load balancing policy.
@dfawley @easwars