Fix deployment on arm64 architecture #109
base: dev
Fix service build and deployment on arm64 architecture.
Pull Request Overview
This PR enables deployment on arm64 architecture by updating Docker configurations and Kubernetes deployment templates to support multi-architecture builds and private container registries.
Key changes:
- Adds imagePullSecrets configuration to deployment templates for private registry authentication
- Creates new Dockerfiles for device plugins and kube-scheduler to enable custom image builds
- Updates deployment scripts to use configurable registry prefixes and image tags
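The first and third changes can be illustrated with a minimal sketch. It assumes the manifest has already been parsed into a plain dictionary; the helper names `add_image_pull_secrets` and `qualify_image`, the registry `registry.example.com/openpai`, and the secret name `registry-credentials` are hypothetical and are not taken from this PR. Only the field path `spec.template.spec.imagePullSecrets` is standard Kubernetes.

```python
import copy


def add_image_pull_secrets(manifest, secret_name):
    """Return a copy of a workload manifest whose pod spec lists the pull secret.

    Hypothetical helper: the PR itself edits YAML templates directly.
    """
    patched = copy.deepcopy(manifest)
    pod_spec = patched["spec"]["template"]["spec"]
    secrets = pod_spec.setdefault("imagePullSecrets", [])
    # Avoid duplicate entries when the function runs more than once.
    if not any(entry.get("name") == secret_name for entry in secrets):
        secrets.append({"name": secret_name})
    return patched


def qualify_image(registry_prefix, name, tag):
    """Build an image reference from a configurable registry prefix and tag."""
    return f"{registry_prefix.rstrip('/')}/{name}:{tag}"


# Hypothetical DaemonSet fragment, reduced to the fields that matter here.
daemonset = {
    "kind": "DaemonSet",
    "spec": {"template": {"spec": {"containers": [
        {"name": "device-plugin", "image": "placeholder"},
    ]}}},
}

patched = add_image_pull_secrets(daemonset, "registry-credentials")
patched["spec"]["template"]["spec"]["containers"][0]["image"] = qualify_image(
    "registry.example.com/openpai/", "k8s-rdma-shared-dev-plugin", "v1.5.3"
)
```

Working on a deep copy keeps the original manifest untouched, and the duplicate check makes the injection safe to repeat on re-runs.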
Reviewed Changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.
| File | Description | 
|---|---|
| src/hivedscheduler/deploy/hivedscheduler.yaml.template | Adds imagePullSecrets for private registry access | 
| src/hivedscheduler/build/kube-scheduler.k8s.dockerfile | New Dockerfile for kube-scheduler image | 
| src/hivedscheduler/build/hivedscheduler.k8s.dockerfile | New Dockerfile replacing the common dockerfile with updated license header | 
| src/hivedscheduler/build/hivedscheduler.common.dockerfile | Removed in favor of k8s-specific dockerfile | 
| src/frameworkcontroller/deploy/frameworkcontroller.yaml.template | Adds imagePullSecrets for private registry access | 
| src/device-plugin/deploy/start.sh.template | Updates to use configurable registry and adds imagePullSecrets injection | 
| src/device-plugin/deploy/service.yaml | Adds device-plugin.yaml to template list | 
| src/device-plugin/deploy/device-plugin.yaml.template | Replaces hardcoded image with configurable registry and adds imagePullSecrets | 
| src/device-plugin/build/k8s-rocm-device-plugin.k8s.dockerfile | New Dockerfile for AMD ROCm device plugin | 
| src/device-plugin/build/k8s-rdma-shared-dev-plugin.k8s.dockerfile | New Dockerfile for RDMA device plugin | 
| src/device-plugin/build/k8s-nvidia-device-plugin.k8s.dockerfile | New Dockerfile for NVIDIA device plugin | 
| src/device-plugin/build/k8s-host-device-plugin.k8s.dockerfile | New Dockerfile for host device plugin | 
| contrib/kubespray/script/environment.sh | Adds force flag to symlink commands to prevent errors on re-runs | 
    # Copyright (c) Microsoft Corporation.
    # Licensed under the MIT License.

    FROM ghcr.io/mellanox/k8s-rdma-shared-dev-plugin:v1.5.3
    
      
    
Copilot AI commented on Oct 26, 2025:
The deployment script references v1.4.0 in start.sh.template (line 86), but this Dockerfile uses v1.5.3. This version mismatch could cause deployment issues. Consider aligning both to use the same version.
Suggested change:

    - FROM ghcr.io/mellanox/k8s-rdma-shared-dev-plugin:v1.5.3
    + FROM ghcr.io/mellanox/k8s-rdma-shared-dev-plugin:v1.4.0
The v1.4.0 DaemonSet YAML uses the latest tag. @hippogr, please confirm the current version.
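A version mismatch of this kind can be caught mechanically. The sketch below is one possible check, not part of this PR: it scans each text for lines that mention a component and collects the `vX.Y.Z` strings on those lines. The function names are hypothetical, and the `kubectl` line is an invented stand-in, since the actual content of line 86 in start.sh.template is not shown in this thread.

```python
import re

# Matches semantic-version-like strings such as v1.5.3.
VERSION = re.compile(r"v\d+\.\d+\.\d+")


def versions_for(text, component):
    """Collect version strings from lines that mention the component."""
    found = set()
    for line in text.splitlines():
        if component in line:
            found.update(VERSION.findall(line))
    return found


def find_mismatch(sources, component):
    """Return {source name: versions} if the sources disagree, else {}."""
    per_source = {name: versions_for(text, component)
                  for name, text in sources.items()}
    all_versions = set().union(*per_source.values())
    return per_source if len(all_versions) > 1 else {}


COMPONENT = "k8s-rdma-shared-dev-plugin"
sources = {
    # Taken from the Dockerfile under review.
    "dockerfile": "FROM ghcr.io/mellanox/k8s-rdma-shared-dev-plugin:v1.5.3\n",
    # Hypothetical stand-in for the deployment script line.
    "start.sh.template": (
        "kubectl apply -f "
        "https://example.invalid/k8s-rdma-shared-dev-plugin/v1.4.0/daemonset.yaml\n"
    ),
}

mismatch = find_mismatch(sources, COMPONENT)
if mismatch:
    for name, versions in sorted(mismatch.items()):
        print(f"{name}: {sorted(versions)}")
```

A check like this only compares version strings that appear in the text. It would not detect a manifest that pins a version in its URL but pulls an image tagged latest, which is the situation described in the reply above.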
Use buildkit for Docker build.
Upgrade NVIDIA k8s device plugin version.
Fix service build and deployment on arm64 architecture.