[Legate] Only build CUDA 13 #12715
Since we already have a v25.10 version of this JLL, I just want to be sure this build is the only one the registry can pull if someone asks for that version.
The source of all my pain:
    if CUDA.is_supported(platform) && !haskey(platform, "cuda")
        platform["cuda"] = "none"
    else # only other build is 13.0 right now
        platform["cuda"] = "13" #! THIS IS SUPERRRR SKETCHY BUT THE .0 BREAKS THINGS
Where else is the platform tag set?
Not sure what you mean; this seems to have triggered the builds I want (i.e. no .0).
it also looks like CUDA_SDK is downloaded now
not sure what you mean
Something must be setting the platform tags; they aren't assigned automagically.
This is the CUDA_SDK_jll line that sets it to just 13
I think the crux of the issue is that CUDA_SDK_jll has its platform tag as 13, but things like CUDA_Runtime_jll and NCCL_jll have it as 13.0. As far as I'm aware, there's no way to get BinaryBuilder to install both, since my platform cannot simultaneously be 13 and 13.0.
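To illustrate the mismatch described above, here is a minimal sketch. It assumes the default tag comparison in `Base.BinaryPlatforms` is what applies during dependency resolution (the real JLLs may register a custom comparison strategy for the `cuda` tag). The architecture and OS are arbitrary placeholders, not taken from this PR.

```julia
using Base.BinaryPlatforms: Platform, platforms_match

# Hypothetical stand-ins for the two tagging conventions in this thread.
sdk_platform     = Platform("x86_64", "linux"; cuda = "13")    # CUDA_SDK_jll-style tag
runtime_platform = Platform("x86_64", "linux"; cuda = "13.0")  # CUDA_Runtime_jll / NCCL_jll-style tag

# With the default comparison strategy, tag values are compared as plain strings,
# so "13" and "13.0" are treated as different platforms.
tags_match = platforms_match(sdk_platform, runtime_platform)

# Parsed as version numbers, the two tags denote the same version.
same_version = VersionNumber("13") == VersionNumber("13.0")

println("platforms match: ", tags_match)
println("same version:    ", same_version)
```

So a single host platform can only satisfy one of the two spellings unless the comparison is done on parsed versions.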
This is the other I believe:
Line 190 in 47bd0da
    platform["cuda"] = "$(version.major).$(version.minor)"
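For comparison, a small sketch of how the two tag spellings fall out of the same version number. `cuda_tag` and its `major_only` keyword are hypothetical names for illustration only; they are not part of Yggdrasil or BinaryBuilder.

```julia
# Hypothetical helper: build a "cuda" platform tag from a version number.
function cuda_tag(version::VersionNumber; major_only::Bool = false)
    # "major.minor" mirrors the line quoted above; major-only mirrors the "13" tag.
    return major_only ? string(version.major) : "$(version.major).$(version.minor)"
end

full_tag  = cuda_tag(v"13.0")                     # spelling used by the quoted line
short_tag = cuda_tag(v"13.0"; major_only = true)  # spelling used for CUDA_SDK_jll 13

println(full_tag, " vs ", short_tag)
```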
I'm fine doing the work to fix the issue, but I don't really get what needs to be changed. I know making CUDA_SDK_jll with 13.0 would fix things, but I don't think I'm allowed to do that.
NCCL for example has 13.0
@maleadt did it on purpose I believe, although CUDA 12.0 actually has the .0
This is to avoid some rather painful compat issues we've been having. The `host_platform` inside JLLWrappers always seems to resolve to cuda = 13.0 unless there is already a CUDA.jl install in that environment or the user explicitly sets something in their `LocalPreferences.toml`. This means CUDA 13.0 artifacts get installed no matter what, and everything then crashes because we did not support CUDA 13 until now.

We could just depend on CUDA.jl and manually set the `LocalPreferences.toml` to restrict the CUDA version, but for that we would also have to pick a single version of CUDA. If we only support CUDA 13, which should work on any modern device, we avoid all that.
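For reference, a rough sketch of what the `LocalPreferences.toml` alternative would involve. It assumes CUDA_Runtime_jll reads a `version` preference under its own table, as I understand the CUDA.jl documentation to describe; the pinned value is a hypothetical choice, which is exactly the decision we would like to avoid making.

```julia
using TOML

# Assumption: CUDA_Runtime_jll honors a `version` preference in LocalPreferences.toml.
pinned_version = "13.0"  # hypothetical pin, picked only for illustration
prefs = Dict("CUDA_Runtime_jll" => Dict("version" => pinned_version))

# Render the text that would go into the environment's LocalPreferences.toml.
prefs_text = sprint(TOML.print, prefs)
print(prefs_text)
```

Every user environment would need this file, with one version chosen up front, whereas building only for CUDA 13 needs no per-environment setup.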