How to achieve batch_isend_irecv using CustomBackend implementation? #2

MC952-arch · 2025-01-17T03:41:47Z

Hello,

Could you help explain how to implement batch_isend_irecv with a custom backend?

I have tried implementing a simplified NCCL-based send operation for the backend as follows:

c10::intrusive_ptr<Work> BackendFlagcx::send(
        std::vector<at::Tensor>& tensors,
        int dstRank,
        int tag)
{
    ...

    // Perform the send operation
    ncclSend(
        tensor.data_ptr(),
        tensor.numel(),
        ncclDataType,
        dstRank,
        comm,
        stream);

    auto future = c10::make_intrusive<c10::ivalue::Future>(
        c10::ListType::create(c10::TensorType::get()));
    future->markCompleted(c10::IValue(tensors));
    return c10::make_intrusive<WorkNCCL>(OpType::SEND, std::move(future));
}

However, I cannot find a way to add group semantics for the batch_isend_irecv op. I notice that PyTorch's ProcessGroupNCCL defines several methods for group operations, including:

groupStart()
groupEnd()
startCoalescing()
endCoalescing()

Could you give me some advice on this?
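
To make the question concrete, here is a rough sketch of the control flow I have in mind. It assumes (I have not verified this for every PyTorch version) that batch_isend_irecv brackets the individual isend/irecv calls with the backend's startCoalescing() and endCoalescing() hooks, and that those hooks would map to the vendor library's group calls (ncclGroupStart / ncclGroupEnd for NCCL). FakeComm, SketchBackend, and runBatchExample are hypothetical names used only for illustration; they are not part of c10d or NCCL, and the fake communicator just records calls so the ordering can be checked without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the vendor communicator. With NCCL these methods
// would wrap ncclGroupStart / ncclGroupEnd / ncclSend / ncclRecv. Here they
// only record what was called.
struct FakeComm {
    std::vector<std::string> log;
    void groupStart() { log.push_back("groupStart"); }
    void groupEnd() { log.push_back("groupEnd"); }
    void send(int peer) { log.push_back("send->" + std::to_string(peer)); }
    void recv(int peer) { log.push_back("recv<-" + std::to_string(peer)); }
};

// Hypothetical backend skeleton. It is NOT the real c10d::Backend interface;
// tensors, streams, and Work objects are left out to keep the sketch short.
class SketchBackend {
public:
    explicit SketchBackend(FakeComm& comm) : comm_(comm) {}

    // Open a group: ops issued from now on are queued by the library.
    void startCoalescing() {
        assert(!coalescing_ && "nested coalescing is not handled in this sketch");
        coalescing_ = true;
        pendingOps_ = 0;
        comm_.groupStart();
    }

    // Close the group so the queued ops are launched together. Returns the
    // number of batched ops; a real backend would return a Work handle here.
    std::size_t endCoalescing() {
        assert(coalescing_ && "endCoalescing called without startCoalescing");
        comm_.groupEnd();
        coalescing_ = false;
        return pendingOps_;
    }

    void send(int dstRank) { issue([&] { comm_.send(dstRank); }); }
    void recv(int srcRank) { issue([&] { comm_.recv(srcRank); }); }

private:
    // Issue one point-to-point op and count it if a group is open.
    template <typename Fn>
    void issue(Fn&& op) {
        op();
        if (coalescing_) {
            ++pendingOps_;
        }
    }

    FakeComm& comm_;
    bool coalescing_ = false;
    std::size_t pendingOps_ = 0;
};

// Mimics what a batched caller would do: open the group, issue the
// point-to-point ops, then close the group.
inline std::vector<std::string> runBatchExample() {
    FakeComm comm;
    SketchBackend backend(comm);
    backend.startCoalescing();
    backend.send(1);
    backend.recv(1);
    backend.endCoalescing();
    return comm.log;
}
```

In this sketch the send and recv to the same peer both land between groupStart and groupEnd, which is the property I am trying to get from my backend. What I am unsure about is how the Work objects returned by the individual ops should relate to the one returned when the group closes.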
