Merge branch 'main' into feat/hostkey-secrets
wtripp180901 committed Jan 9, 2025
2 parents f021167 + 6929272 commit 895f302
Showing 23 changed files with 63 additions and 48 deletions.
10 changes: 5 additions & 5 deletions .github/workflows/stackhpc.yml
@@ -99,9 +99,9 @@ jobs:
. venv/bin/activate
. environments/.stackhpc/activate
ansible-playbook ansible/adhoc/generate-passwords.yml
echo vault_testuser_password: "$TESTUSER_PASSWORD" > $APPLIANCES_ENVIRONMENT_ROOT/inventory/group_vars/all/test_user.yml
echo vault_demo_user_password: "$DEMO_USER_PASSWORD" > $APPLIANCES_ENVIRONMENT_ROOT/inventory/group_vars/all/test_user.yml
env:
TESTUSER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
DEMO_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}

- name: Provision nodes using fat image
id: provision_servers
@@ -163,12 +163,12 @@ jobs:
--spider \
--server-response \
--no-check-certificate \
--http-user=testuser \
--http-password=${TESTUSER_PASSWORD} https://${openondemand_servername} \
--http-user=demo_user \
--http-password=${DEMO_USER_PASSWORD} https://${openondemand_servername} \
2>&1)
(echo $statuscode | grep "200 OK") || (echo $statuscode && exit 1)
env:
TESTUSER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
DEMO_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}

# - name: Build environment-specific compute image
# id: packer_build
1 change: 1 addition & 0 deletions README.md
@@ -104,6 +104,7 @@ To deploy this infrastructure, ensure the venv and the environment are [activate

export OS_CLOUD=openstack
cd environments/$ENV/terraform/
tofu init
tofu apply

and follow the prompts. Note the OS_CLOUD environment variable assumes that OpenStack credentials are defined using a [clouds.yaml](https://docs.openstack.org/python-openstackclient/latest/configuration/index.html#clouds-yaml) file in a default location with the default cloud name of `openstack`.
1 change: 1 addition & 0 deletions ansible/roles/passwords/defaults/main.yml
@@ -10,6 +10,7 @@ slurm_appliance_secrets:
vault_freeipa_admin_password: "{{ vault_freeipa_admin_password | default(lookup('password', '/dev/null')) }}"
vault_k3s_token: "{{ vault_k3s_token | default(lookup('ansible.builtin.password', '/dev/null', length=64)) }}"
vault_pulp_admin_password: "{{ vault_pulp_admin_password | default(lookup('password', '/dev/null', chars=['ascii_letters', 'digits'])) }}"
vault_demo_user_password: "{{ vault_demo_user_password | default(lookup('password', '/dev/null')) }}"

secrets_openhpc_mungekey_default:
content: "{{ lookup('pipe', 'dd if=/dev/urandom bs=1 count=1024 2>/dev/null | base64') }}"
2 changes: 1 addition & 1 deletion ansible/roles/passwords/tasks/validate.yml
@@ -1,4 +1,4 @@
- name: Assert secrets created
assert:
that: (hostvars[inventory_hostname].keys() | select('contains', 'vault_') | length) > 1 # 1 as may have vault_testuser_password defined in dev
that: (hostvars[inventory_hostname].keys() | select('contains', 'vault_') | length) > 1 # 1 as may have vault_demo_user_password defined in dev
fail_msg: "No inventory variables 'vault_*' found: Has ansible/adhoc/generate-passwords.yml been run?"
12 changes: 7 additions & 5 deletions docs/openondemand.README.md → docs/openondemand.md
@@ -30,11 +30,10 @@ The above functionality is configured by running the `ansible/portal.yml` playbo

See the [ansible/roles/openondemand/README.md](../ansible/roles/openondemand/README.md) for more details on the variables described below.

At minimum the following must be defined:
- `openondemand_servername` - this must be defined for both `openondemand` and `grafana` hosts (when Grafana is enabled). It is suggested to place it groupvars for `all`.
- `openondemand_auth` and any corresponding options.
- `openondemand_desktop_partition` and `openondemand_jupyter_partition` if the corresponding inventory groups are defined.
- `openondemand_host_regex` if `openondemand_desktop` or `openondemand_jupyter` inventory groups are defined and/or proxying Grafana via Open Ondemand is required.
The following variables have default values so that Open Ondemand works in a newly created environment without additional configuration, but they should generally be overridden in `environments/site/inventory/group_vars/all/` with site-specific values:
- `openondemand_servername` - this must be defined for both `openondemand` and `grafana` hosts (when Grafana is enabled). Default is `ansible_host` (i.e. the IP address) of the first host in the `openondemand` group.
- `openondemand_auth` and any corresponding options. Defaults to `basic_pam`.
- `openondemand_desktop_partition` and `openondemand_jupyter_partition` if the corresponding inventory groups are defined. Defaults to the first compute group defined in the `compute` Terraform variable in `environments/$ENV/terraform`.
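
As an illustration of overriding these defaults, a site-specific variables file might look like the sketch below. The file name, server name and partition name are assumptions chosen for the example and are not taken from the appliance:

```yaml
# Hypothetical file: environments/site/inventory/group_vars/all/openondemand.yml
# All values below are illustrative placeholders, not appliance defaults.
openondemand_servername: ondemand.example.org  # assumed DNS name instead of the default IP address
openondemand_auth: basic_pam                   # same value as the documented default
openondemand_desktop_partition: standard       # assumed Slurm partition name
openondemand_jupyter_partition: standard       # assumed Slurm partition name
```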

It is also recommended to set:
- `openondemand_dashboard_support_url`
@@ -45,3 +44,6 @@ If shared filesystems other than `$HOME` are available, add paths to `openondema
The appliance automatically configures Open Ondemand to proxy Grafana and adds a link to it on the Open Ondemand dashboard. This means no external IP (or SSH proxying etc) is required to access Grafana (which by default is deployed on the control node). To allow users to authenticate to Grafana, the simplest option is to enable anonymous (View-only) login by setting `grafana_auth_anonymous` (see [environments/common/inventory/group_vars/all/grafana.yml](../environments/common/inventory/group_vars/all/grafana.yml)[^1]).

[^1]: Note that if `openondemand_auth` is `basic_pam` and anonymous Grafana login is enabled, the appliance will (by default) configure Open Ondemand's Apache server to remove the Authorisation header from proxying of all `node/` addresses. This is done as otherwise Grafana tries to use this header to authenticate, which fails with the default configuration where only the admin Grafana user `grafana` is created. Note that the removal of this header in this configuration means it cannot be used to authenticate proxied interactive applications - however the appliance-deployed remote desktop and Jupyter Notebook server applications use other authentication methods. An alternative if using `basic_pam` is not to enable anonymous Grafana login and to create Grafana users matching the local users (e.g. in `environments/<env>/hooks/post.yml`).

# Access
By default the appliance authenticates users to Open Ondemand (OOD) using basic auth through PAM. When a new environment is created, a user named `demo_user` is created. Its password is stored as `vault_demo_user_password` in the appliance secrets store at `environments/{ENV}/inventory/group_vars/all/secrets.yml`. Other users can be defined by overriding the `basic_users_users` variable in your environment (templated into `environments/{ENV}/inventory/group_vars/all/basic_users.yml` by default).
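
A minimal sketch of such an override is shown below, following the structure of the templated `basic_users.yml` in this commit. The user name `alice`, the secret `vault_alice_password` and the `uid` value are hypothetical and would need to be defined for your site:

```yaml
# environments/{ENV}/inventory/group_vars/all/basic_users.yml
basic_users_users:
  - name: alice  # hypothetical user name
    # vault_alice_password is a hypothetical secret, not one generated by the appliance.
    # Seeding the salt with inventory_hostname keeps the resulting hash stable between runs.
    password: "{{ vault_alice_password | password_hash('sha512', 65534 | random(seed=inventory_hostname) | string) }}"
    uid: 1006  # hypothetical uid
```

Because this redefines the whole list, `demo_user` would presumably no longer be managed unless it is also listed here.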
4 changes: 4 additions & 0 deletions docs/production.md
@@ -98,6 +98,10 @@ and referenced from the `site` and `production` environments, e.g.:

- Configure Open Ondemand - see [specific documentation](openondemand.README.md).

- Remove the `demo_user` user from `environments/$ENV/inventory/group_vars/all/basic_users.yml`

- Consider whether having (read-only) access to Grafana without login is OK. If not, remove `grafana_auth_anonymous` in `environments/$ENV/inventory/group_vars/all/grafana.yml`

- Modify `environments/site/terraform/nodes.tf` to provide fixed IPs for at least
the control node, and (if not using FIPs) the login node(s):

1 change: 0 additions & 1 deletion environments/.caas/inventory/group_vars/all/selinux.yml

This file was deleted.

@@ -1,6 +1,6 @@
test_user_password: "{{ lookup('env', 'TESTUSER_PASSWORD') | default(vault_testuser_password, true) }}" # CI uses env, debug can set vault_testuser_password
test_demo_user_password: "{{ lookup('env', 'DEMO_USER_PASSWORD') | default(vault_demo_user_password, true) }}" # CI uses env, debug can set vault_demo_user_password

basic_users_users:
- name: testuser # can't use rocky as $HOME isn't shared!
password: "{{ test_user_password | password_hash('sha512', 65534 | random(seed=inventory_hostname) | string) }}" # idempotent
- name: demo_user # can't use rocky as $HOME isn't shared!
password: "{{ test_demo_user_password | password_hash('sha512', 65534 | random(seed=inventory_hostname) | string) }}" # idempotent
uid: 1005
4 changes: 2 additions & 2 deletions environments/.stackhpc/inventory/group_vars/all/freeipa.yml
@@ -2,8 +2,8 @@

# NB: Users defined this way have expired passwords
freeipa_users:
- name: testuser # can't use rocky as $HOME isn't shared!
password: "{{ test_user_password }}"
- name: demo_user # can't use rocky as $HOME isn't shared!
password: "{{ test_demo_user_password }}"
givenname: test
sn: test

3 changes: 3 additions & 0 deletions environments/.stackhpc/inventory/group_vars/all/openhpc.yml
@@ -0,0 +1,3 @@
openhpc_config_extra:
SlurmctldDebug: debug
SlurmdDebug: debug
@@ -1 +1,8 @@
openondemand_servername: "{{ hostvars[ groups['openondemand'] | first].ansible_host }}" # Use a SOCKS proxy to acccess
openondemand_auth: basic_pam
openondemand_jupyter_partition: standard
openondemand_desktop_partition: standard
#openondemand_dashboard_support_url:
#openondemand_dashboard_docs_url:
#openondemand_filesapp_paths:
ondemand_package: ondemand-"{{ ondemand_package_version }}"
ondemand_package_version: '3.1.10'
13 changes: 0 additions & 13 deletions environments/.stackhpc/inventory/group_vars/openhpc/overrides.yml

This file was deleted.

This file was deleted.

This file was deleted.

@@ -1,6 +1,6 @@
{
"cluster_image": {
"RL8": "openhpc-RL8-250107-1534-b03caaf3",
"RL9": "openhpc-RL9-250107-1535-b03caaf3"
"RL8": "openhpc-RL8-250108-1703-e515b902",
"RL9": "openhpc-RL9-250108-1703-e515b902"
}
}
@@ -5,7 +5,12 @@

# NB: Variables prefixed ood_ are all from https://github.com/OSC/ood-ansible

# openondemand_servername: '' # Must be defined when using openondemand
openondemand_servername: "{{ hostvars[groups['openondemand'].0].ansible_host if groups['openondemand'] else '' }}"

openondemand_auth: basic_pam

openondemand_jupyter_partition: "{{ openhpc_slurm_partitions[0]['name'] }}"
openondemand_desktop_partition: "{{ openhpc_slurm_partitions[0]['name'] }}"

# Regex defining hosts which openondemand can proxy; the default regex is compute nodes (for apps) and grafana host,
# e.g. if the group `compute` has hosts `compute-{0,1,2,..}` this will be '(compute-\d+)|(control)'.
2 changes: 1 addition & 1 deletion environments/common/inventory/group_vars/all/selinux.yml
@@ -1,4 +1,4 @@
---

selinux_state: permissive
selinux_state: disabled
selinux_policy: targeted
6 changes: 4 additions & 2 deletions environments/common/layouts/everything
@@ -36,8 +36,9 @@ login
[block_devices:children]
# Environment-specific so not defined here

[basic_users]
[basic_users:children]
# Add `openhpc` group to add Slurm users via creation of users on each node.
openhpc

[openondemand:children]
# Host to run Open Ondemand server on - subset of login
@@ -51,8 +52,9 @@ compute
# Subset of compute to run a Jupyter Notebook servers on via Open Ondemand
compute

[etc_hosts]
[etc_hosts:children]
# Hosts to manage /etc/hosts e.g. if no internal DNS. See ansible/roles/etc_hosts/README.md
cluster

[cuda]
# Hosts to install NVIDIA CUDA on - see ansible/roles/cuda/README.md
@@ -0,0 +1,4 @@
basic_users_users:
- name: demo_user
password: "{% raw %}{{ vault_demo_user_password | password_hash('sha512', 65534 | random(seed=inventory_hostname) | string) }}{% endraw %}" # idempotent
uid: 1005
@@ -0,0 +1 @@
grafana_auth_anonymous: true
@@ -3,16 +3,22 @@ module "compute" {

for_each = var.compute

# must be set for group:
nodes = each.value.nodes
flavor = each.value.flavor

cluster_name = var.cluster_name
cluster_domain_suffix = var.cluster_domain_suffix
cluster_net_id = data.openstack_networking_network_v2.cluster_net.id
cluster_subnet_id = data.openstack_networking_subnet_v2.cluster_subnet.id

flavor = each.value.flavor
# can be set for group, defaults to top-level value:
image_id = lookup(each.value, "image_id", var.cluster_image_id)
vnic_type = lookup(each.value, "vnic_type", var.vnic_type)
vnic_profile = lookup(each.value, "vnic_profile", var.vnic_profile)
volume_backed_instances = lookup(each.value, "volume_backed_instances", var.volume_backed_instances)
root_volume_size = lookup(each.value, "root_volume_size", var.root_volume_size)

key_pair = var.key_pair
environment_root = var.environment_root
k3s_token = var.k3s_token
@@ -6,7 +6,7 @@ variable "cluster_name" {
variable "cluster_domain_suffix" {
type = string
description = "Domain suffix for cluster"
default = "invalid"
default = "internal"
}

variable "cluster_net" {
@@ -52,6 +52,8 @@ variable "compute" {
image_id: Overrides variable cluster_image_id
vnic_type: Overrides variable vnic_type
vnic_profile: Overrides variable vnic_profile
volume_backed_instances: Overrides variable volume_backed_instances
root_volume_size: Overrides variable root_volume_size
EOF
}
