SELinux not disabled by default, causes Prometheus install to fail #432

Open
wtripp180901 opened this issue Sep 6, 2024 · 7 comments · May be fixed by #449

Comments

@wtripp180901
Contributor

Produced using the Rocky-9-GenericCloud-Base-9.4-20240523.0.x86_64.qcow2 image with a custom (non-.stackhpc) environment:

TASK [cloudalchemy.prometheus : Install SELinux dependencies]

failed: [testcluster-control] (item=libselinux-python) => {
    "ansible_loop_var": "item",
    "attempts": 5,
    "changed": false,
    "failures": [
        "No package libselinux-python available."
    ],
    "item": "libselinux-python",
    "rc": 1,
    "results": []
}

MSG:

Failed to install some of the specified packages


failed: [testcluster-control] (item=policycoreutils-python) => {
    "ansible_loop_var": "item",
    "attempts": 5,
    "changed": false,
    "failures": [
        "No package policycoreutils-python available."
    ],
    "item": "policycoreutils-python",
    "rc": 1,
    "results": []
}
@wtripp180901 added the bug label on Sep 6, 2024
@wtripp180901
Contributor Author

Fixed by adding selinux_state: disabled to group_vars. However, the default in environments/common/inventory/group_vars/all/selinux.yml sets

selinux_state: permissive
selinux_policy: targeted

Is there any reason for this?

@verdurin

verdurin commented Nov 7, 2024

Hmm, I saw this, even though I'm using one of the StackHPC images.

@sjpb
Collaborator

sjpb commented Nov 8, 2024

SELinux is not disabled by default, so this occurs with any unmodified cookiecutter environment, regardless of image. See https://github.com/stackhpc/ansible-slurm-appliance/blob/main/environments/common/inventory/group_vars/all/selinux.yml. We do disable it in CI: https://github.com/stackhpc/ansible-slurm-appliance/blob/main/environments/.stackhpc/inventory/group_vars/selinux/overrides.yml
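
For anyone hitting this in a custom environment, an override along the lines of the CI one might look like the sketch below. The environment name and file name are placeholders, and it assumes the environment's inventory is layered over the common one so that its group_vars win. Only the selinux_state: disabled setting comes from this thread.

```yaml
# environments/<your-env>/inventory/group_vars/all/selinux.yml
# <your-env> and the file name are placeholders, not paths taken from the repo.
---
selinux_state: disabled
```

Note that changing the SELinux state on an image built with a different setting may require a reboot.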

@wtripp180901
Contributor Author

Prometheus should work with SELinux enabled once #449 merges.

@sjpb
Collaborator

sjpb commented Jan 3, 2025

It's also a bit horrible that, because the stackhpc env has SELinux disabled, running site.yml with a default env (where it is enabled) on our image triggers a reboot to change the SELinux status ...

@sjpb
Collaborator

sjpb commented Jan 3, 2025

OK, so the (deprecated) cloudalchemy.prometheus role is looking for the wrong packages:

TASK [cloudalchemy.prometheus : Gather variables for each operating system] ***
Friday 03 January 2025  11:11:57 +0000 (0:00:00.042)       0:09:51.507 ***
ok: [rl9-control] => (item=/home/rocky/slurm-app-rl9/ansible/roles/cloudalchemy.prometheus/vars/redhat.yml)

#ansible/roles/cloudalchemy.prometheus/vars/redhat.yml:
---
prometheus_selinux_packages:
  - libselinux-python
  - policycoreutils-python

but:

#ansible/roles/cloudalchemy.prometheus/vars/redhat-8.yml:
---
prometheus_selinux_packages:
  - python3-libselinux
  - python3-policycoreutils

and

[root@rl9-control rocky]# dnf list python3-libselinux
Last metadata expiration check: 0:10:20 ago on Fri 03 Jan 2025 11:03:24 AM UTC.
Installed Packages
python3-libselinux.x86_64                                                                                     3.6-1.el9                                                                                     @appstream
[root@rl9-control rocky]# dnf list python3-policycoreutils
Last metadata expiration check: 0:10:25 ago on Fri 03 Jan 2025 11:03:24 AM UTC.
Installed Packages
python3-policycoreutils.noarch   

There is no redhat-9 vars file, so on EL9 the role loads the generic redhat.yml with the old package names. Basically, it doesn't support EL9 properly.
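
For illustration, an EL9 vars file might look like the sketch below. It is hypothetical: no such file exists in the role, and it assumes the role would pick up redhat-9.yml the same way it picks up redhat-8.yml. The package names are the EL8 ones, which the dnf output above shows are also available on EL9.

```yaml
# ansible/roles/cloudalchemy.prometheus/vars/redhat-9.yml
# Hypothetical file, not present in the role.
---
prometheus_selinux_packages:
  - python3-libselinux
  - python3-policycoreutils
```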

However, because these come from an include_vars task, which has higher precedence than any inventory variable, we can't override them from the inventory :-(. And we aren't using a forked role.
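
One possible workaround, untested here: extra vars (-e) rank above include_vars in Ansible's variable precedence, so the package list could in principle be supplied that way. The file name below is hypothetical, and the playbook path ansible/site.yml is an assumption.

```yaml
# prometheus-el9-packages.yml (hypothetical file name)
# Pass with: ansible-playbook ansible/site.yml -e @prometheus-el9-packages.yml
# Extra vars take precedence over values loaded by include_vars.
---
prometheus_selinux_packages:
  - python3-libselinux
  - python3-policycoreutils
```

Extra vars apply to every host in the run, so this only makes sense if all hosts running the role need the same package names.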

@sjpb
Collaborator

sjpb commented Jan 3, 2025

This also explains why it works for a client with SELinux enabled (on the control node): they are on Rocky Linux 8.
