[P2P] Engines created in all GPUs #568

@praveingk

I noticed that UCCL engines are created on all available GPUs, even when memory is allocated on a single GPU.
I think we should create an engine only on the GPU where the UCCL engine is initialized for P2P.
In the run below, a single process (PID 3542) holds about 524 MiB on each of the 8 GPUs.

UCCL_RCMODE=1 NCCL_IB_GID_INDEX=3 python benchmark_nixl.py --role server --device cpu --local-gpu-idx 0 --iters 1 --op-type read --sizes 104857600 --backend uccl_p2p

| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            3542      C   python                                  526MiB |
|    1   N/A  N/A            3542      C   python                                  524MiB |
|    2   N/A  N/A            3542      C   python                                  524MiB |
|    3   N/A  N/A            3542      C   python                                  524MiB |
|    4   N/A  N/A            3542      C   python                                  524MiB |
|    5   N/A  N/A            3542      C   python                                  524MiB |
|    6   N/A  N/A            3542      C   python                                  524MiB |
|    7   N/A  N/A            3542      C   python                                  524MiB |
+-----------------------------------------------------------------------------------------+
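
To make the observation above easy to check after a fix, here is a minimal sketch that parses process rows like the ones in the table and reports which GPUs each PID occupies. The helper name `gpus_per_pid` and the regular expression are illustrative assumptions, not part of UCCL or NIXL, and the pattern assumes process names without spaces.

```python
import re
from collections import defaultdict

# Matches one process row of the table above, e.g.
# |    0   N/A  N/A            3542      C   python          526MiB |
# Assumption: the process name contains no whitespace.
ROW = re.compile(
    r"^\|\s+(\d+)\s+\S+\s+\S+\s+(\d+)\s+\S+\s+(\S+)\s+(\d+)MiB\s+\|$"
)

def gpus_per_pid(table_text):
    """Map each PID to {gpu_index: memory_in_MiB} (hypothetical helper)."""
    usage = defaultdict(dict)
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            gpu, pid, _name, mib = match.groups()
            usage[int(pid)][int(gpu)] = int(mib)
    return dict(usage)

# Rows copied from the table in this issue.
SAMPLE = """\
|    0   N/A  N/A            3542      C   python                                  526MiB |
|    1   N/A  N/A            3542      C   python                                  524MiB |
|    2   N/A  N/A            3542      C   python                                  524MiB |
|    3   N/A  N/A            3542      C   python                                  524MiB |
|    4   N/A  N/A            3542      C   python                                  524MiB |
|    5   N/A  N/A            3542      C   python                                  524MiB |
|    6   N/A  N/A            3542      C   python                                  524MiB |
|    7   N/A  N/A            3542      C   python                                  524MiB |
"""

if __name__ == "__main__":
    for pid, gpus in gpus_per_pid(SAMPLE).items():
        print(f"PID {pid} holds memory on GPUs {sorted(gpus)}")
```

With the expected behavior, the same check would list only GPU 0 for the benchmark process when it is started with `--local-gpu-idx 0`.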
