* Add a compile flag `MSCCLPP_USE_IB` that explicitly turns InfiniBand (IB) support on or off
* Fix the `nvidia-peermem` check, which is not needed on systems that support DMABUF
* Fix `mp_unit_tests` to skip all IB tests when built with `-DMSCCLPP_USE_IB=OFF`
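
For context on the second bullet: the quickstart previously asked users to confirm the module with `lsmod | grep nvidia_peermem`. A minimal shell sketch of that check is given here. The function name `peermem_loaded` is hypothetical and not part of MSCCL++. It reads `lsmod`-style text from standard input so it can be exercised without the driver, and it does not try to detect DMABUF support, which this change treats as a valid alternative.

```shell
#!/bin/sh
# Hypothetical helper (not part of MSCCL++): succeeds if lsmod-style text on
# stdin lists the nvidia_peermem kernel module.
peermem_loaded() {
  # lsmod prints one module per line with the module name in the first column.
  awk '$1 == "nvidia_peermem" { found = 1 } END { exit found ? 0 : 1 }'
}

# Typical use on a real node (not run here):
#   lsmod | peermem_loaded && echo "nvidia_peermem is loaded"
```

Matching on the first column rather than grepping the whole line avoids false positives from other modules that merely list `nvidia_peermem` as a dependency.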
set(MSCCLPP_GPU_ARCHS "" CACHE STRING "Specify GPU architectures with delimiters (comma, space, or semicolon).")
@@ -141,23 +142,18 @@ else()
 endif()

 if(CMAKE_BUILD_TYPE STREQUAL "Debug")
-add_compile_definitions(DEBUG_BUILD)
+add_compile_definitions(DEBUG_BUILD)
 endif()

-find_package(IBVerbs)
+if(MSCCLPP_USE_IB)
+find_package(IBVerbs)
+if(NOT IBVERBS_FOUND)
+message(FATAL_ERROR "IBVerbs not found. Install libibverbs-dev or rdma-core-devel. If you want to disable InfiniBand, add `-DMSCCLPP_USE_IB=OFF` in your cmake command.")
+endif()
+endif()
 find_package(NUMA REQUIRED)
 find_package(Threads REQUIRED)

-set(CMAKE_COLOR_DIAGNOSTICS ON)
-function(msg_red text)
-string(ASCII 27 ESC)
-message("${ESC}[31m${text}${ESC}[0m")
-endfunction()
-
-if(NOT IBVERBS_FOUND)
-msg_red("libibverbs not found. Install libibverbs-dev or rdma-core-devel.")
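
With this hunk, configuration stops with a fatal error when `MSCCLPP_USE_IB` is on and IBVerbs is not found, instead of printing a red warning. A pre-flight script could choose the flag value before invoking `cmake`. The sketch below assumes that the presence of `infiniband/verbs.h` under a given include directory is a reasonable proxy for what `find_package(IBVerbs)` looks for. The project's actual find module may search other locations, and `pick_ib_flag` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the -DMSCCLPP_USE_IB value to pass to cmake,
# based on whether the libibverbs header exists under the given include dir.
# Assumption: infiniband/verbs.h approximates what find_package(IBVerbs) needs.
pick_ib_flag() {
  include_dir="${1:-/usr/include}"
  if [ -f "$include_dir/infiniband/verbs.h" ]; then
    echo "-DMSCCLPP_USE_IB=ON"
  else
    echo "-DMSCCLPP_USE_IB=OFF"
  fi
}

# Typical use from a build directory (not run here):
#   cmake "$(pick_ib_flag)" ..
```

Passing the include directory as an argument keeps the check usable with non-default install prefixes.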
docs/quickstart.md (3 additions & 5 deletions)
@@ -19,7 +19,7 @@
 * AMD MI250X GPUs + ROCm >= 5.7
 * AMD MI300X GPUs + ROCm >= 6.0
 * OS
-* Tested on Ubuntu 18.04 and later
+* Tested on Ubuntu 20.04 and later
 * Libraries
 * [libnuma](https://github.com/numactl/numactl)
 ```bash
@@ -32,10 +32,7 @@
 If you don't want to build Python module, you need to set `-DMSCCLPP_BUILD_PYTHON_BINDINGS=OFF` in your `cmake` command (see details in [Install from Source](#install-from-source)).
 * (Optional, for benchmarks) MPI
 * Others
-* For NVIDIA platforms, `nvidia_peermem` driver should be loaded on all nodes. Check it via:
-```bash
-lsmod | grep nvidia_peermem
-```
+* For RDMA (InfiniBand or RoCE) support on NVIDIA platforms, [GPUDirect RDMA](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-operator-rdma.html#gpudirect-rdma-and-gpudirect-storage) should be supported by the system. See the detailed prerequisites from [this NVIDIA documentation](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/gpu-operator-rdma.html#common-prerequisites).
 * For NVLink SHARP (NVLS) support on NVIDIA platforms, the Linux kernel version should be 5.6 or above.

 (docker-images)=
@@ -91,6 +88,7 @@ There are a few optional CMake options you can set:
 - `-DMSCCLPP_GPU_ARCHS=<arch-list>`: Specify the GPU architectures to build for. For example, `-DMSCCLPP_GPU_ARCHS="80,90"` for NVIDIA A100 and H100 GPUs, `-DMSCCLPP_GPU_ARCHS=gfx942` for AMD MI300x GPU.
 - `-DMSCCLPP_BYPASS_GPU_CHECK=ON -DMSCCLPP_USE_CUDA=ON`: If the build environment doesn't have GPUs and only has CUDA installed, you can set these options to bypass GPU checks and use CUDA APIs. This is useful for building on CI systems or environments without GPUs.
 - `-DMSCCLPP_BYPASS_GPU_CHECK=ON -DMSCCLPP_USE_ROCM=ON`: If the build environment doesn't have GPUs and only has ROCm installed, you can set these options to bypass GPU checks and use ROCm APIs.