Adding autoKueue to helm chart #7368
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: Edwinhr716. The full list of commands accepted by this bot can be found here.
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
@Edwinhr716: The following test failed, say
What is the expectation from the Kueue team for this "feature"?
Ideally we would have tests on this. Otherwise I worry this will regress over time as TAS features get worked on.
    "helm.sh/hook-delete-policy": before-hook-creation
data:
  resources.yaml: |-
    apiVersion: kueue.x-k8s.io/v1beta1
Suggested change:
- apiVersion: kueue.x-k8s.io/v1beta1
+ apiVersion: kueue.x-k8s.io/v1beta2
nit: I think v1beta2 should already work, and it seems better to use the newer version going forward.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Release.Name }}-kueue-hook-crb
Suggested change:
- name: {{ .Release.Name }}-kueue-hook-crb
+ name: {{ .Release.Name }}-autokueue-hook-crb
And for the role name, I would suggest the suffix -autokueue-hook-clusterrole.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ .Release.Name }}-kueue-hook-clusterrole
Suggested change:
- name: {{ .Release.Name }}-kueue-hook-clusterrole
+ name: {{ .Release.Name }}-autokueue-hook-clusterrole
wdyt?
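For illustration, the two renames suggested in this thread could look roughly like this when put together. This is only a sketch: the object names and the service account name come from the diff, while the `rules` block and the `namespace` reference are assumptions and may not match what the PR actually grants.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: {{ .Release.Name }}-autokueue-hook-clusterrole
rules:
  # Assumption: the hook only needs to apply Kueue objects;
  # the actual rules in the PR may differ.
  - apiGroups: ["kueue.x-k8s.io"]
    resources: ["resourceflavors", "clusterqueues", "topologies"]
    verbs: ["get", "create", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Release.Name }}-autokueue-hook-crb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: {{ .Release.Name }}-autokueue-hook-clusterrole
subjects:
  - kind: ServiceAccount
    name: {{ .Release.Name }}-kueue-hook-sa
    # Assumption: the service account lives in the release namespace.
    namespace: {{ .Release.Namespace }}
```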
serviceAccountName: {{ .Release.Name }}-kueue-hook-sa
containers:
  - name: kubectl-apply
    image: bitnami/kubectl:latest
Can we use an image that is hosted on the k8s registry?
Please explore https://explore.ggcr.dev/?repo=registry.k8s.io
There is a registry.k8s.io/kubectl image hosted there.
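As a rough sketch of what that swap might look like in the hook's pod spec: the container name and service account are taken from the diff, while the `autoKueue.kubectlTag` value is a hypothetical chart value introduced here only so the tag is not hard-coded. Pinning a tag rather than using `latest` is an additional suggestion, not something requested in the review.

```yaml
serviceAccountName: {{ .Release.Name }}-kueue-hook-sa
containers:
  - name: kubectl-apply
    # Assumption: autoKueue.kubectlTag is a hypothetical value holding a
    # pinned kubectl release tag; it does not exist in the PR.
    image: "registry.k8s.io/kubectl:{{ .Values.autoKueue.kubectlTag }}"
```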
|-----|------|---------|-------------|
| autoKueue.tasLevels | list | `[{name: cloud.provider.com/topology-block}]` | Defines the TAS levels |
| autoKueue.nodeLabel | object | `{cloud.provider.com/node-group: "tas-group"}` | Sets the Resource flavor node label |
| autoKueue.clusterQueueName | string | `cq` | The name of the cluster queue that will be created |
Suggested change:
- | autoKueue.clusterQueueName | string | `cq` | The name of the cluster queue that will be created |
+ | autoKueue.clusterQueueName | string | `default` | The name of the cluster queue that will be created |
wdyt?
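To make the documented defaults easier to picture, here is how they would presumably appear in `values.yaml`, using the `default` queue name suggested in this comment. The keys and defaults come from the README table in the diff; the exact nesting is inferred from the dotted key names and is an assumption.

```yaml
autoKueue:
  # Defines the TAS levels.
  tasLevels:
    - name: cloud.provider.com/topology-block
  # Sets the ResourceFlavor node label.
  nodeLabel:
    cloud.provider.com/node-group: "tas-group"
  # Name of the ClusterQueue that will be created (reviewer-suggested default).
  clusterQueueName: default
```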
kind: ResourceFlavor
apiVersion: kueue.x-k8s.io/v1beta1
metadata:
  name: "tas-flavor"
Suggested change:
- name: "tas-flavor"
+ name: "tas-gpu-default"
Since we may need other flavors for other accelerators.
wdyt?
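A sketch of the renamed flavor, for context. The `kind`, `apiVersion`, and name come from the diff and the suggestion; the `spec` shown here is an assumption built from the chart's documented `autoKueue.nodeLabel` default, and the `topologyName` value is a hypothetical Topology object name, not one taken from the PR.

```yaml
kind: ResourceFlavor
apiVersion: kueue.x-k8s.io/v1beta1
metadata:
  name: "tas-gpu-default"
spec:
  # Assumption: rendered from autoKueue.nodeLabel.
  nodeLabels:
    cloud.provider.com/node-group: "tas-group"
  # Assumption: hypothetical name of the Topology created by the chart.
  topologyName: "default"
```

With an accelerator-specific name like this, further flavors for other accelerators could be added later without renaming the existing one.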
I think this is a useful feature, but a demo and discussion at wg-batch in two weeks would be useful, wdyt?
Indeed, we need some tests. Unit tests are the bare minimum, as we already have for Helm.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds autoKueue, for automated deployment of Topology Aware Scheduling
Which issue(s) this PR fixes:
Fixes #7347
Special notes for your reviewer:
Does this PR introduce a user-facing change?