
Conversation

@ForVic (Contributor) commented Nov 4, 2025

What changes were proposed in this pull request?

Adds support for K8s Deployment API to allocate pods.

Why are the changes needed?

Allocating individual pods is not ideal when higher-level APIs can manage them for us. #33508 helped by adding an interface for arbitrary allocators along with a StatefulSet allocator, but dynamic allocation with that allocator only works if you also create a PodDisruptionBudget associated with the decommission label. A Deployment manages its pods through a ReplicaSet, which honors the controller.kubernetes.io/pod-deletion-cost annotation, so we can avoid creating a separate PDB resource and still support dynamic allocation (with shuffle tracking) by assigning a low deletion cost to the executors we are scaling down. When we scale the Deployment down, the ReplicaSet preferentially removes the pods with the low deletion cost (see the sketch below).
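To make the mechanism concrete, here is a minimal sketch of the annotate-then-scale flow, assuming the fabric8 Kubernetes client (which Spark's K8s backend uses). The object and method names are illustrative, not the code in this PR:

import java.util.function.UnaryOperator

import io.fabric8.kubernetes.api.model.{Pod, PodBuilder}
import io.fabric8.kubernetes.client.KubernetesClientBuilder

object DeploymentScaleDownSketch {
  private val DELETION_COST_ANNOTATION = "controller.kubernetes.io/pod-deletion-cost"

  // Mark the executors chosen for removal with a low deletion cost, then
  // shrink the Deployment; the ReplicaSet deletes the lowest-cost pods first.
  def scaleDown(
      namespace: String,
      deploymentName: String,
      executorPodsToRemove: Seq[String],
      targetReplicas: Int): Unit = {
    val client = new KubernetesClientBuilder().build()
    try {
      executorPodsToRemove.foreach { podName =>
        val addLowCost: UnaryOperator[Pod] = (p: Pod) => new PodBuilder(p)
          .editMetadata()
            .addToAnnotations(DELETION_COST_ANNOTATION, "-1") // "-1" is illustrative
          .endMetadata()
          .build()
        client.pods().inNamespace(namespace).withName(podName).edit(addLowCost)
      }
      // Scaling the Deployment after annotating steers the ReplicaSet toward
      // deleting exactly the executors marked above.
      client.apps().deployments().inNamespace(namespace).withName(deploymentName)
        .scale(targetReplicas)
    } finally {
      client.close()
    }
  }
}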

Does this PR introduce any user-facing change?

Yes, it adds a new user-facing config:

spark.kubernetes.executor.podDeletionCost
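For illustration, enabling this path end to end could look roughly like the following; spark.kubernetes.allocation.pods.allocator already exists, but the "deployment" value and the cost shown here are assumptions, and only spark.kubernetes.executor.podDeletionCost is introduced by this PR:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.shuffleTracking.enabled", "true")
  // Assumed allocator name for this PR's Deployment-based allocator.
  .set("spark.kubernetes.allocation.pods.allocator", "deployment")
  // New conf from this PR; the value here is illustrative.
  .set("spark.kubernetes.executor.podDeletionCost", "-100")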

How was this patch tested?

New unit tests, passing existing unit tests, and a manual test in a cluster with shuffle tracking and dynamic allocation enabled.

Was this patch authored or co-authored using generative AI tooling?

No

@ForVic ForVic changed the title [todo] Add support for Deployment API on K8s [todo][K8S] Add support for Deployment API on K8s Nov 4, 2025
@ForVic ForVic changed the title [todo][K8S] Add support for Deployment API on K8s [SPARK-54173][K8S] Add support for Deployment API on K8s Nov 4, 2025
@ForVic ForVic marked this pull request as ready for review November 5, 2025 01:03
@ForVic (Contributor, Author) commented Nov 5, 2025

cc @sunchao

also @holdenk @dongjoon-hyun

@sunchao (Member) commented Nov 5, 2025

Thanks @dongjoon-hyun for the review!

@sunchao (Member) left a comment

Thanks @ForVic! Looks mostly good to me (already reviewed internally). Just one question around the new conf.


val KUBERNETES_EXECUTOR_POD_DELETION_COST =
  ConfigBuilder("spark.kubernetes.executor.podDeletionCost")
    .doc("Value to set for the controller.kubernetes.io/pod-deletion-cost" +
Inline review comment on the lines above:
nit: we can emphasize that this must be set when dynamic allocation is enabled and the deployment-based allocator is enabled.

Also, do we need this configuration at all? Can we just set this to some large number such as 100 when the deployment-based allocator and dynamic allocation are both enabled?
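For illustration, the suggestion boils down to something like the sketch below; all names are hypothetical, and since the PR description applies a low cost to the executors being removed, the sign and magnitude of any hard-coded value would still need to be settled:

object FixedDeletionCostSketch {
  private val DELETION_COST_ANNOTATION = "controller.kubernetes.io/pod-deletion-cost"

  // Derive the annotation internally instead of exposing a conf: only set it
  // when both the deployment-based allocator and dynamic allocation are on.
  def deletionCostAnnotation(
      deploymentAllocator: Boolean,
      dynamicAllocation: Boolean): Map[String, String] = {
    if (deploymentAllocator && dynamicAllocation) {
      Map(DELETION_COST_ANNOTATION -> "100") // magnitude suggested in the comment above
    } else {
      Map.empty
    }
  }
}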
