
Commit d019ef7

Include the default PyPI for missing libucx-cu12 package version. (#1495)
The CPU/GPU container image builds are currently broken due to dependency resolution failures (screenshot: https://github.com/user-attachments/assets/b30c6024-0391-4c54-a441-502847673d7c).

It appears that the required `libucx-cu12` version was removed from the Nvidia index since the last build (screenshot: https://github.com/user-attachments/assets/c4756d56-4f21-43fc-8591-3eccfcb74dc0).

We ensure that a compatible package version, `libucx-cu12==1.18.1`, is available by including the default PyPI and passing an appropriate [index strategy](https://docs.astral.sh/uv/concepts/indexes/#searching-across-multiple-indexes) to `uv`.

In the near future, we may want to consider upgrading `cuml-cu12` and its ecosystem to 25.6 or later.
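As a rough illustration of why the index strategy matters, the sketch below models two of the strategies described in the linked `uv` documentation: `first-index` (the default, which stops at the first index that knows the package name) and `unsafe-first-match` (which moves on to the next index when the first one has no matching version). The index names, their contents, and the `resolve` helper are hypothetical and exist only for this example. The sketch does not reproduce `uv`'s actual resolver or its rules for index priority.

```python
from typing import Dict, List, Optional, Tuple

Index = Tuple[str, Dict[str, List[str]]]

# Hypothetical index contents in priority order; names and versions are made up.
INDEXES: List[Index] = [
    ("primary", {"libucx-cu12": ["1.19.0"]}),
    ("fallback", {"libucx-cu12": ["1.18.1", "1.19.0"]}),
]


def resolve(package: str, version: str, strategy: str,
            indexes: List[Index] = INDEXES) -> Optional[str]:
    """Return the name of the index that supplies package==version, or None."""
    for name, contents in indexes:
        versions = contents.get(package)
        if versions is None:
            # This index does not know the package name at all: try the next one.
            continue
        if version in versions:
            return name
        if strategy == "first-index":
            # The package name exists here but the version does not:
            # this strategy never consults later indexes.
            return None
        # "unsafe-first-match": fall through to the next index.
    return None


if __name__ == "__main__":
    for strategy in ("first-index", "unsafe-first-match"):
        print(strategy, "->", resolve("libucx-cu12", "1.18.1", strategy))
```

In this toy model the pinned version is only reachable with `unsafe-first-match`, which mirrors the failure described above: an index that lists the package but no longer carries the needed version blocks resolution under the default strategy.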
1 parent da65765 commit d019ef7

3 files changed: 41 additions, 33 deletions

Dockerfile.tmpl

Lines changed: 4 additions & 2 deletions
@@ -34,7 +34,9 @@ RUN uv pip install --no-build-isolation --system "git+https://github.com/Kaggle/

 # b/408281617: Torch is adamant that it can not install cudnn 9.3.x, only 9.1.x, but Tensorflow can only support 9.3.x.
 # This conflict causes a number of package downgrades, which are handled in this command
-RUN uv pip install --system --force-reinstall --extra-index-url https://pypi.nvidia.com "cuml-cu12==25.2.1" \
+RUN uv pip install \
+--index-url https://pypi.nvidia.com --extra-index-url https://pypi.org/simple/ --index-strategy unsafe-first-match \
+--system --force-reinstall "cuml-cu12==25.2.1" \
 "nvidia-cudnn-cu12==9.3.0.75" "nvidia-cublas-cu12==12.5.3.2" "nvidia-cusolver-cu12==11.6.3.83" \
 "nvidia-cuda-cupti-cu12==12.5.82" "nvidia-cuda-nvrtc-cu12==12.5.82" "nvidia-cuda-runtime-cu12==12.5.82" \
 "nvidia-cufft-cu12==11.2.3.61" "nvidia-curand-cu12==10.3.6.82" "nvidia-cusparse-cu12==12.5.1.3" \
@@ -171,7 +173,7 @@ ENV PYTHONUSERBASE="/root/.local"
 ADD patches/kaggle_gcp.py \
 patches/kaggle_secrets.py \
 patches/kaggle_session.py \
-patches/kaggle_web_client.py \
+patches/kaggle_web_client.py \
 patches/kaggle_datasets.py \
 patches/log.py \
 $PACKAGE_PATH/

kaggle_requirements.txt

Lines changed: 7 additions & 2 deletions
@@ -35,7 +35,10 @@ easyocr
 # b/302136621: Fix eli5 import for learntools
 eli5
 emoji
-fastcore>=1.7.20
+fastcore
+# b/445960030: Requires a newer version of fastai than the currently used base image.
+# Remove when relying on a newer base image.
+fastai>=2.8.4
 fasttext
 featuretools
 fiona
@@ -89,7 +92,9 @@ nbconvert==6.4.5
 nbdev
 nilearn
 olefile
-onnx
+# b/445960030: Broken in 1.19.0. See https://github.com/onnx/onnx/issues/7249.
+# Fixed with https://github.com/onnx/onnx/pull/7254. Upgrade when version with fix is published.
+onnx==1.18.0
 openslide-bin
 openslide-python
 optuna

tests/test_fastai.py

Lines changed: 30 additions & 29 deletions
@@ -1,35 +1,36 @@
 import unittest

 import fastai
-
 from fastai.tabular.all import *

+
 class TestFastAI(unittest.TestCase):
-    # Basic import
-    def test_basic(self):
-        import fastai
-        import fastcore
-        import fastprogress
-        import fastdownload
-
-    def test_has_version(self):
-        self.assertGreater(len(fastai.__version__), 2)
-
-    # based on https://github.com/fastai/fastai/blob/master/tests/test_torch_core.py#L17
-    def test_torch_tensor(self):
-        a = tensor([1, 2, 3])
-        b = torch.tensor([1, 2, 3])
-
-        self.assertTrue(torch.all(a == b))
-
-    def test_tabular(self):
-        dls = TabularDataLoaders.from_csv(
-            "/input/tests/data/train.csv",
-            cont_names=["pixel"+str(i) for i in range(784)],
-            y_names='label',
-            procs=[FillMissing, Categorify, Normalize])
-        learn = tabular_learner(dls, layers=[200, 100])
-        with learn.no_bar():
-            learn.fit_one_cycle(n_epoch=1)
-
-        self.assertGreater(learn.smooth_loss, 0)
+    # Basic import
+    def test_basic(self):
+        import fastai
+        import fastcore
+        import fastprogress
+        import fastdownload
+
+    def test_has_version(self):
+        self.assertGreater(len(fastai.__version__), 2)
+
+    # based on https://github.com/fastai/fastai/blob/master/tests/test_torch_core.py#L17
+    def test_torch_tensor(self):
+        a = tensor([1, 2, 3])
+        b = torch.tensor([1, 2, 3])
+
+        self.assertTrue(torch.all(a == b))
+
+    def test_tabular(self):
+        dls = TabularDataLoaders.from_csv(
+            "/input/tests/data/train.csv",
+            cont_names=["pixel" + str(i) for i in range(784)],
+            y_names="label",
+            procs=[FillMissing, Categorify, Normalize],
+        )
+        learn = tabular_learner(dls, layers=[200, 100])
+        with learn.no_bar():
+            learn.fit_one_cycle(n_epoch=1)
+
+        self.assertGreater(learn.smooth_loss, 0)
