Description
There should be an option to configure the RUSTFS operator to set a pod deletion policy for when a node is down. The option could be set to FORCE_DELETE, DELETE, or DO_NOTHING depending on the configuration. This feature is inspired by Longhorn's "Pod Deletion Policy When Node is Down" setting.
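As a rough sketch, the per-instance setting might look like the manifest below; the API group, kind, and field name are illustrative assumptions, not an existing RUSTFS CRD:

```yaml
# Hypothetical per-instance setting; apiVersion, kind, and field name
# are placeholders, not an existing RUSTFS API.
apiVersion: rustfs.example.com/v1alpha1
kind: RustFS
metadata:
  name: rustfs-demo
spec:
  podDeletionPolicyWhenNodeIsDown: ForceDelete  # ForceDelete | Delete | DoNothing
```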
Value Proposition
This would automate the deletion and recreation of pods when a node is not ready. The feature is useful in situations where a node goes down due to a hard drive disconnect or unreliable network conditions and the pod gets stuck in the Terminating state. The RUSTFS operator could force-terminate the stuck pod and create a new one, achieving high availability without human intervention.
Currently, system admins must manually force delete the pod when it is stuck in the Terminating state before a new pod is recreated. This adds workload for the system admin when the process could be automated through configuration. However, we need to implement this in a way that guarantees the consistency of the objects in the RUSTFS S3 server; otherwise, the process can damage the data.
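For reference, the manual workaround today is a forced deletion of the stuck pod (the pod name and namespace below are illustrative):

```sh
# Force-remove the stuck pod so the StatefulSet controller can recreate it
kubectl delete pod rustfs-0 -n rustfs --grace-period=0 --force
```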
Goals
The feature should provide a way for system admins to automate the deletion of pods on nodes that go down. This will make the RUSTFS cluster more highly available by reducing the need for system admins to delete pods that are stuck in the Terminating state before new StatefulSet pods are created to replace the old ones.
Non-Goals
This feature should be configurable from the RUSTFS operator for global configuration, or from each individual RUSTFS instance's manifest YAML file for local configuration, as sketched below.
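One way the global default could be wired in is an environment variable on the operator Deployment, with the per-instance manifest field overriding it; the variable name here is an assumption:

```yaml
# Hypothetical operator-level default; a per-instance manifest field
# would override this value.
env:
  - name: RUSTFS_POD_DELETION_POLICY_WHEN_NODE_DOWN
    value: "ForceDelete"
```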
Discussion
I've been trying to get this implemented in Keycloak IAM as well, since Keycloak uses a StatefulSet instead of a Deployment as its default deployment option.
Currently, this is not implemented by MinIO, and this feature would distinguish this project among scalable self-hosted object storage solutions while bringing its features closer to managed services like AWS S3 and Alibaba OSS. I've tried using taints and tolerations, and they don't seem to work on MinIO to force-evict terminating pods (the sketch below shows what that attempt looks like).
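For context, the taint/toleration attempt amounts to shortening the default 300-second eviction tolerations on the pod spec. This only triggers a graceful delete, so StatefulSet pods on a node whose kubelet is unreachable still hang in Terminating, because the API server waits for the kubelet to confirm the deletion:

```yaml
# Shortens how long the pod tolerates a down node before eviction;
# eviction is still graceful, so the pod stays Terminating if the
# kubelet never acknowledges the delete.
tolerations:
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 30
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 30
```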
Notes
The idea is based on a similar feature offered by Longhorn Block Storage that automates the force termination of pods when a node goes down [1].
The downside is the consistency guarantee, since we are terminating a StatefulSet pod before it is ready to be fully removed. The system admin should be made aware of this downside, or an internal mechanism that guarantees object consistency needs to be implemented within the operator.
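A minimal sketch of what the ForceDelete path could look like inside the operator, using client-go; the function names, namespace handling, and policy wiring are assumptions for illustration, not the RUSTFS operator's actual design:

```go
package operator

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// nodeIsDown reports whether the node's Ready condition is False or Unknown.
func nodeIsDown(node *corev1.Node) bool {
	for _, c := range node.Status.Conditions {
		if c.Type == corev1.NodeReady {
			return c.Status != corev1.ConditionTrue
		}
	}
	return true
}

// forceDeleteStuckPods force-removes terminating pods on a down node so the
// StatefulSet controller can schedule replacements elsewhere.
func forceDeleteStuckPods(ctx context.Context, cs kubernetes.Interface, node *corev1.Node, ns string) error {
	if !nodeIsDown(node) {
		return nil
	}
	pods, err := cs.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{
		FieldSelector: "spec.nodeName=" + node.Name,
	})
	if err != nil {
		return err
	}
	zero := int64(0)
	for _, p := range pods.Items {
		// Only touch pods that are already stuck terminating.
		if p.DeletionTimestamp == nil {
			continue
		}
		// GracePeriodSeconds: 0 is the API equivalent of
		// `kubectl delete --grace-period=0 --force`.
		//
		// NOTE: forcing deletion while the old pod may still be running
		// risks two instances writing to the same data; this is exactly
		// the consistency trade-off discussed above.
		if err := cs.CoreV1().Pods(ns).Delete(ctx, p.Name, metav1.DeleteOptions{
			GracePeriodSeconds: &zero,
		}); err != nil {
			return err
		}
	}
	return nil
}
```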