LTS OFED 23.10 doesn't install rdma-core-devel from MOFED repos #461

Open
sjpb opened this issue Oct 24, 2024 · 2 comments

sjpb commented Oct 24, 2024

Older builds used OFED 24.04. This included the rdma-core-devel package from Mellanox.

#427 changed OFED to the LTS version 23.10, now that it is supported for RL9. However, this install uses rdma-core-devel from appstream while rdma-core itself is the Mellanox build, which doesn't look right:

[rocky@rl9-login-0 ~]$ cat /var/lib/image/image.json 
{
    "branch": "fix/packer-sentinel-file",
    "build": "openhpc-rl9-241022-0038-a5affa58",
    "cuda": "-",
    "kernel": "5.14.0-427.40.1.el9_4.x86_64",
    "ofed": "23.10",
    "os": "Rocky 9.4",
    "slurm-ohpc": "23.11.6"
}
[root@rl9-login-0 rocky]# dnf list --installed rdma*
Installed Packages
rdma-core.x86_64          2307mlnx47-1.2310322    @System
rdma-core-devel.x86_64    48.0-1.el9              @appstream

Furthermore, adding the undocumented OFED repos for 23.10 shows there is a Mellanox rdma-core-devel package :-(
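
To make the mismatch above easier to spot automatically, here is a minimal Python sketch that parses `dnf list --installed` output and flags rdma packages that do not look like Mellanox builds. It assumes the substring `mlnx` in the version identifies a MOFED package, which is only a heuristic inferred from the output above, and that each package row fits on one line (dnf can wrap rows in narrow terminals). The function names are hypothetical.

```python
def parse_dnf_list(text):
    """Parse `dnf list --installed` output into {name: (version, repo)}."""
    packages = {}
    for line in text.splitlines():
        fields = line.split()
        # Package rows have exactly three columns: name.arch, version, repo.
        # Header lines such as "Installed Packages" are skipped.
        if len(fields) != 3 or "." not in fields[0]:
            continue
        name, _arch = fields[0].rsplit(".", 1)
        packages[name] = (fields[1], fields[2])
    return packages


def non_mellanox(packages, marker="mlnx"):
    """Return names of packages whose version lacks the vendor marker.

    Assumption: MOFED builds carry "mlnx" in their version string.
    """
    return sorted(
        name for name, (version, _repo) in packages.items()
        if marker not in version
    )


# Sample input copied from the output in this issue.
sample = """\
Installed Packages
rdma-core.x86_64          2307mlnx47-1.2310322    @System
rdma-core-devel.x86_64    48.0-1.el9              @appstream
"""

pkgs = parse_dnf_list(sample)
print(non_mellanox(pkgs))
```

On a real image the `sample` string would be replaced by the captured output of the `dnf list --installed rdma*` command shown above.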

@sjpb sjpb changed the title LTS OFED 23.10 doesn't install rdma-core-devel LTS OFED 23.10 doesn't install rdma-core-devel from MOFED repos Oct 24, 2024
sjpb commented Oct 24, 2024

Note that on the client build, installing Lustre via something similar to #447 removed the rdma-core-devel package entirely.

sjpb commented Oct 24, 2024

It turns out that our "nightly" build, which installs OFED, does install the Mellanox rdma-core-devel package, but during the fatimage build, installing the OHPC packages replaces it with the @appstream one.
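
One way to pin down which build step swaps the package would be to snapshot the installed rdma packages after each step and diff the snapshots. The following is a rough Python sketch of that comparison. The snapshot format (`{name: (version, repo)}`) and the function name are hypothetical, and the "after OFED" version of rdma-core-devel in the example is a placeholder assumed to match rdma-core, not a value taken from a real build.

```python
def replaced_packages(before, after):
    """Compare two {name: (version, repo)} snapshots.

    Returns {name: (old, new)} for every package whose version or repo
    changed; `new` is None if the package was removed entirely.
    """
    changes = {}
    for name, old in before.items():
        new = after.get(name)
        if new != old:
            changes[name] = (old, new)
    return changes


# Placeholder snapshot: the rdma-core-devel version here is an assumption.
after_ofed = {
    "rdma-core": ("2307mlnx47-1.2310322", "@System"),
    "rdma-core-devel": ("2307mlnx47-1.2310322", "@System"),
}
# Snapshot matching the output reported in this issue.
after_ohpc = {
    "rdma-core": ("2307mlnx47-1.2310322", "@System"),
    "rdma-core-devel": ("48.0-1.el9", "@appstream"),
}

changes = replaced_packages(after_ofed, after_ohpc)
for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}")
```

The same comparison would also report the Lustre case from the earlier comment, where the package disappears and `new` is `None`.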
