dynamically RDMA device allocation #4485

Open
weizhoublue opened this issue Dec 31, 2024 · 0 comments
weizhoublue commented Dec 31, 2024

What would you like to be added?

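Thoughts on use cases
(1) Simplify configuration.
(2) The master NIC name is not consistent across hosts.
(3) With fewer than 8 GPUs, allocate devices dynamically based on GPU affinity.
(4) In a network with multiple RDMA domains, subnet planning differs between domains, so the IP subnet should be allocated dynamically according to the node the Pod is scheduled on.
This should be solved by IPAM and subnet wildcard matching, not by choosing one of several Multus instance names.
IPAM check: if the master NIC has an IP address, that IP must belong to the subnet for it to be usable.
https://github.com/spidernet-io/spiderpool/blob/main/docs/usage/network-topology-zh_CN.md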
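
To make the IPAM check concrete, here is a rough sketch in Python. It is only an illustration of the rule, not Spiderpool's implementation: the function name `pool_usable_for_master` and the sample addresses are hypothetical, and treating a master NIC without an IP address as "no constraint" is an assumption, since the issue does not say what should happen in that case.

```python
import ipaddress
from typing import Optional


def pool_usable_for_master(master_ip: Optional[str], pool_subnet: str) -> bool:
    """Decide whether a pool with `pool_subnet` may be used on this node.

    Hypothetical helper. Assumption: when the master NIC has no IP address,
    no subnet constraint is applied. Otherwise the NIC's IP must lie inside
    the pool's subnet.
    """
    if master_ip is None:
        return True
    # A version mismatch (IPv4 address vs IPv6 network) simply yields False.
    return ipaddress.ip_address(master_ip) in ipaddress.ip_network(pool_subnet)


# Sample values, made up for illustration.
print(pool_usable_for_master("172.16.1.10", "172.16.1.0/24"))
print(pool_usable_for_master("172.16.2.10", "172.16.1.0/24"))
```

With a check like this, a node whose master NIC sits in one RDMA domain would only see the pools whose subnet covers that NIC, so no per-domain Multus instance would be needed.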

apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderMultusConfig
metadata:
  name: gpu1-sriov
  namespace: spiderpool
spec:
  cniType: ib-sriov
  ibsriov:
    resourceName: spidernet.io/gpu1sriov
    rdmaIsolation: true
    ippools:
      ipv4: ["gpu1-*"]  # or: ipv4: ["gpu1-block1", "gpu1-block2"]
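
How a wildcard entry such as `gpu1-*` might expand into concrete pool names can be sketched as follows. This assumes shell-style glob semantics via Python's `fnmatch`; the wildcard rules Spiderpool actually applies may differ. The helper `expand_pool_patterns` and the pool names other than those in the YAML above are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List


def expand_pool_patterns(patterns: List[str], pool_names: List[str]) -> List[str]:
    """Return the pool names matching any pattern, in the order of `pool_names`.

    Hypothetical helper. A pattern without wildcard characters matches only
    the pool with exactly that name, so explicit lists keep working.
    """
    return [
        name
        for name in pool_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Made-up set of existing pools; "gpu2-block1" is there to show a non-match.
pools = ["gpu1-block1", "gpu1-block2", "gpu2-block1"]

print(expand_pool_patterns(["gpu1-*"], pools))
# → ['gpu1-block1', 'gpu1-block2']
```

Both forms in the YAML above would select the same two pools here. The wildcard form has the advantage that a newly added `gpu1-block3` would be picked up without editing the SpiderMultusConfig.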

Why is this needed?

No response

How to implement it (if possible)?

No response

Additional context

No response
