How do multiple GPUs communicate with each other #105

Open
Liujiaqi-jlu opened this issue Oct 23, 2024 · 2 comments

Comments

@Liujiaqi-jlu

Hello, I would like to know how to explicitly observe the communication process between multiple GPUs and how they exchange memory information. I noticed that the Distribution function can map physical memory to different GPUs. Currently, my research focuses on GPU interconnect communication, so I would like to seek your advice on this. Thank you!

@syifan
Contributor

syifan commented Oct 24, 2024

I would not say that distribution is about GPU-GPU communication; it is about how memory is allocated to GPUs.

May I know what you want to understand about GPU-GPU communication? There are two components you can examine. The first is the RDMA engine, which performs cache-line-level memory accesses across GPUs (https://github.com/sarchlab/mgpusim/tree/v3/timing/rdma). The second is the Endpoint, a network component that gathers all of a device's outgoing and incoming communication (https://github.com/sarchlab/akita/blob/v3/noc/networking/switching/endpoint.go).
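As a rough illustration of the idea of cache-line-level access across GPUs, here is a toy Go sketch. It is not MGPUSim or Akita code: the names `toyRDMA`, `memRange`, `ownerOf`, and `route` are hypothetical, and it assumes that each GPU owns a contiguous address range and that a 64-byte cache line is the unit of transfer. It only shows how a component sitting at the GPU boundary might decide whether an access stays local or must be forwarded to another GPU.

```go
package main

import "fmt"

// Hypothetical toy model; none of these names come from MGPUSim or Akita.

// cacheLineSize is an assumed line size used to align addresses.
const cacheLineSize = 64

// memRange records that addresses in [lo, hi) live in one GPU's memory.
type memRange struct {
	lo, hi uint64
	gpuID  int
}

// toyRDMA stands in for the RDMA engine attached to one GPU.
type toyRDMA struct {
	localGPU int
	table    []memRange
}

// ownerOf returns the GPU whose memory holds addr, or -1 if unmapped.
func (r *toyRDMA) ownerOf(addr uint64) int {
	for _, m := range r.table {
		if addr >= m.lo && addr < m.hi {
			return m.gpuID
		}
	}
	return -1
}

// route aligns addr down to its cache line and reports where the
// access would be served.
func (r *toyRDMA) route(addr uint64) string {
	line := addr &^ (cacheLineSize - 1)
	owner := r.ownerOf(line)
	switch {
	case owner < 0:
		return fmt.Sprintf("line 0x%x: unmapped", line)
	case owner == r.localGPU:
		return fmt.Sprintf("line 0x%x: local access on GPU %d", line, owner)
	default:
		return fmt.Sprintf("line 0x%x: forward from GPU %d to GPU %d",
			line, r.localGPU, owner)
	}
}

func main() {
	// Assumed layout: two GPUs, 4 KiB of memory each.
	rdma := &toyRDMA{
		localGPU: 0,
		table: []memRange{
			{lo: 0, hi: 4096, gpuID: 0},
			{lo: 4096, hi: 8192, gpuID: 1},
		},
	}
	for _, addr := range []uint64{100, 4100} {
		fmt.Println(rdma.route(addr))
	}
}
```

In the real simulator, the forwarded request would travel over the network through the device's Endpoint; the sketch stops at the routing decision and does not model that part.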

@Liujiaqi-jlu
Author

Thank you for your reply! I now know where my problem is.
