Change docs/ references from Terraform to OpenTofu #544

Merged 5 commits on Jan 23, 2025
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
@@ -82,7 +82,7 @@ And generate secrets for it:

Create an OpenTofu variables file to define the required infrastructure, e.g.:

-# environments/$ENV/terraform/terraform.tfvars:
+# environments/$ENV/tofu/tofu.tfvars:

cluster_name = "mycluster"
cluster_net = "some_network" # *
@@ -105,12 +105,12 @@ Create an OpenTofu variables file to define the required infrastructure, e.g.:
}
}

-Variables marked `*` refer to OpenStack resources which must already exist. The above is a minimal configuration - for all variables and descriptions see `environments/$ENV/terraform/terraform.tfvars`.
+Variables marked `*` refer to OpenStack resources which must already exist. The above is a minimal configuration - for all variables and descriptions see `environments/$ENV/tofu/tofu.tfvars`.

To deploy this infrastructure, ensure the venv and the environment are [activated](#create-a-new-environment) and run:

export OS_CLOUD=openstack
-cd environments/$ENV/terraform/
+cd environments/$ENV/tofu/
tofu init
tofu apply

2 changes: 1 addition & 1 deletion ansible/ci/retrieve_inventory.yml
@@ -7,7 +7,7 @@
gather_facts: no
vars:
cluster_prefix: "{{ undef(hint='cluster_prefix must be defined') }}" # e.g. ci4005969475
-ci_vars_file: "{{ appliances_environment_root + '/terraform/' + lookup('env', 'CI_CLOUD') }}.tfvars"
+ci_vars_file: "{{ appliances_environment_root + '/tofu/' + lookup('env', 'CI_CLOUD') }}.tfvars"
cluster_network: "{{ lookup('ansible.builtin.ini', 'cluster_net', file=ci_vars_file, type='properties') | trim('\"') }}"
tasks:
- name: Get control host IP
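
Reviewer's sketch, not part of this change: the `ci_vars_file` and `cluster_network` variables in this hunk build a path under the renamed `tofu/` directory and read `cluster_net` from it, trimming the surrounding double quotes. The Python below is a rough approximation of that behaviour under stated assumptions. The function names `ci_vars_file` and `read_tfvars_string` and the line-based parsing are hypothetical, chosen for illustration; the `ansible.builtin.ini` lookup itself is implemented differently.

```python
from pathlib import Path


def ci_vars_file(environment_root: str, ci_cloud: str) -> Path:
    """Build <environment_root>/tofu/<CI_CLOUD>.tfvars, mirroring the playbook var."""
    return Path(environment_root) / "tofu" / f"{ci_cloud}.tfvars"


def read_tfvars_string(path: Path, key: str) -> str:
    """Return the value of a simple `key = "value"` line without its quotes.

    Assumes one scalar assignment per line. A trailing `# comment` is dropped,
    so values that themselves contain `#` are not supported by this sketch.
    """
    for line in path.read_text().splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.split("#", 1)[0].strip().strip('"')
    raise KeyError(f"{key} not found in {path}")
```

If an environment still keeps its CI variables under `terraform/`, the path built this way will not exist, which is the failure this hunk avoids by following the directory rename.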
2 changes: 1 addition & 1 deletion ansible/roles/block_devices/README.md
@@ -11,7 +11,7 @@ This is a convenience wrapper around the ansible modules:

To avoid issues with device names changing after e.g. reboots, devices are identified by serial number and mounted by filesystem UUID.

-**NB:** This role is ignored[^1] during Packer builds as block devices will not be attached to the Packer build VMs. This role is therefore deprecated and it is suggested that `cloud-init` is used instead. See e.g. `environments/skeleton/{{cookiecutter.environment}}/terraform/control.userdata.tpl`.
+**NB:** This role is ignored[^1] during Packer builds as block devices will not be attached to the Packer build VMs. This role is therefore deprecated and it is suggested that `cloud-init` is used instead. See e.g. `environments/skeleton/{{cookiecutter.environment}}/tofu/control.userdata.tpl`.

[^1]: See `environments/common/inventory/group_vars/builder/defaults.yml`

2 changes: 1 addition & 1 deletion ansible/roles/compute_init/README.md
@@ -43,7 +43,7 @@ The following roles/groups are currently fully functional:
- `openhpc`: all functionality

The above may be enabled by setting the compute_init_enable property on the
-terraform compute variable.
+tofu compute variable.

# Development/debugging

4 changes: 2 additions & 2 deletions ansible/roles/freeipa/README.md
@@ -7,7 +7,7 @@ Support FreeIPA in the appliance. In production use it is expected the FreeIPA s

## Usage
- Add hosts to the `freeipa_client` group and run (at a minimum) the `ansible/iam.yml` playbook.
-- Host names must match the domain name. By default (using the skeleton Terraform) hostnames are of the form `nodename.cluster_name.cluster_domain_suffix` where `cluster_name` and `cluster_domain_suffix` are Terraform variables.
+- Host names must match the domain name. By default (using the skeleton OpenTofu) hostnames are of the form `nodename.cluster_name.cluster_domain_suffix` where `cluster_name` and `cluster_domain_suffix` are OpenTofu variables.
- Hosts discover the FreeIPA server FQDN (and their own domain) from DNS records. If DNS servers are not set this is not set from DHCP, then use the `resolv_conf` role to configure this. For example when using the in-appliance FreeIPA development server:

```ini
@@ -28,7 +28,7 @@ Support FreeIPA in the appliance. In production use it is expected the FreeIPA s
- For production use with an external FreeIPA server, a random one-time password (OTP) must be generated when adding hosts to FreeIPA (e.g. using `ipa host-add --random ...`). This password should be set as a hostvar `freeipa_host_password`. Initial host enrolment will use this OTP to enrol the host. After this it becomes irrelevant so it does not need to be committed to git. This approach means the appliance does not require the FreeIPA administrator password.
- For development use with the in-appliance FreeIPA server, `freeipa_host_password` will be automatically generated in memory.
- The `control` host must define `appliances_state_dir` (on persistent storage). This is used to back-up keytabs to allow FreeIPA clients to automatically re-enrol after e.g. reimaging. Note that:
-- This is implemented when using the skeleton Terraform; on the control node `appliances_state_dir` defaults to `/var/lib/state` which is mounted from a volume.
+- This is implemented when using the skeleton OpenTofu; on the control node `appliances_state_dir` defaults to `/var/lib/state` which is mounted from a volume.
- Nodes are not re-enroled by a [Slurm-driven reimage](../../collections/ansible_collections/stackhpc/slurm_openstack_tools/roles/rebuild/README.md) (as that does not run this role).
- If both a backed-up keytab and `freeipa_host_password` exist, the former is used.

2 changes: 1 addition & 1 deletion docs/openondemand.md
@@ -33,7 +33,7 @@ See the [ansible/roles/openondemand/README.md](../ansible/roles/openondemand/REA
The following variables have been given default values to allow Open OnDemand to work in a newly created environment without additional configuration, but generally should be overridden in `environment/site/inventory/group_vars/all/` with site-specific values:
- `openondemand_servername` - this must be defined for both `openondemand` and `grafana` hosts (when Grafana is enabled). Default is `ansible_host` (i.e. the IP address) of the first host in the `openondemand` group.
- `openondemand_auth` and any corresponding options. Defaults to `basic_pam`.
-- `openondemand_desktop_partition` and `openondemand_jupyter_partition` if the corresponding inventory groups are defined. Defaults to the first compute group defined in the `compute` Terraform variable in `environments/$ENV/terraform`.
+- `openondemand_desktop_partition` and `openondemand_jupyter_partition` if the corresponding inventory groups are defined. Defaults to the first compute group defined in the `compute` OpenTofu variable in `environments/$ENV/tofu`.

It is also recommended to set:
- `openondemand_dashboard_support_url`
4 changes: 2 additions & 2 deletions docs/operations.md
@@ -57,10 +57,10 @@ This is a usually a two-step process:

- If new nodes are required, define a new node group by adding an entry to the `compute` mapping in `environments/$ENV/tofu/main.tf` assuming the default OpenTofu configuration:
- The key is the partition name.
-- The value should be a mapping, with the parameters defined in `environments/$SITE_ENV/terraform/compute/variables.tf`, but in brief will need at least `flavor` (name) and `nodes` (a list of node name suffixes).
+- The value should be a mapping, with the parameters defined in `environments/$SITE_ENV/tofu/compute/variables.tf`, but in brief will need at least `flavor` (name) and `nodes` (a list of node name suffixes).
- Add a new partition to the partition configuration as described under [Modifying Slurm Partition-specific Configuration](#Modifying-Slurm-Partition-specific-Configuration).

-Deploying the additional nodes and applying these changes requires rerunning both Terraform and the Ansible site.yml playbook - follow [Deploying a Cluster](#Deploying-a-Cluster).
+Deploying the additional nodes and applying these changes requires rerunning both OpenTofu and the Ansible site.yml playbook - follow [Deploying a Cluster](#Deploying-a-Cluster).

# Adding Additional Packages
By default, the following utility packages are installed during the StackHPC image build:
8 changes: 4 additions & 4 deletions docs/persistent-state.md
@@ -13,14 +13,14 @@ If using the `environments/common/layout/everything` Ansible groups template (wh

Note that if `appliances_state_dir` is defined, the path it gives must exist and should be owned by root. Directories will be created within this with appropriate permissions for each item of state defined above. Additionally, the systemd units for the services listed above will be modified to require `appliances_state_dir` to be mounted before service start (via the `systemd` role).

-A new cookiecutter-produced environment supports persistent state in the default Terraform (see `environments/skeleton/{{cookiecutter.environment}}/terraform/`) by:
+A new cookiecutter-produced environment supports persistent state in the default OpenTofu (see `environments/skeleton/{{cookiecutter.environment}}/tofu/`) by:

-- Defining a volume with a default size of 150GB - this can be controlled by the Terraform variable `state_volume_size`.
+- Defining a volume with a default size of 150GB - this can be controlled by the OpenTofu variable `state_volume_size`.
- Attaching it to the control node.
- Defining cloud-init userdata for the control node which formats and mounts this volume at `/var/lib/state`.
-- Defining `appliances_state_dir: /var/lib/state` for the control node in the (Terraform-templated) `inventory/hosts` file.
+- Defining `appliances_state_dir: /var/lib/state` for the control node in the (OpenTofu-templated) `inventory/hosts` file.

-**NB: The default Terraform is provided as a working example and for internal CI use - therefore this volume is deleted when running `terraform destroy` - this may not be appropriate for a production environment.**
+**NB: The default OpenTofu is provided as a working example and for internal CI use - therefore this volume is deleted when running `tofu destroy` - this may not be appropriate for a production environment.**

In general, the Prometheus data is likely to be the only sizeable state stored. The size of this can be influenced through [Prometheus role variables](https://github.com/cloudalchemy/ansible-prometheus#role-variables), e.g.:
- `prometheus_storage_retention` - [default](../environments/common/inventory/group_vars/all/prometheus.yml) 31d
10 changes: 5 additions & 5 deletions docs/production.md
@@ -41,15 +41,15 @@ and referenced from the `site` and `production` environments, e.g.:
- OpenTofu configurations should be defined in the `site` environment and used
as a module from the other environments. This can be done with the
cookie-cutter generated configurations:
-- Delete the *contents* of the cookie-cutter generated `terraform/` directories
+- Delete the *contents* of the cookie-cutter generated `tofu/` directories
from the `production` and `staging` environments.
-- Create a `main.tf` in those directories which uses `site/terraform/` as a
+- Create a `main.tf` in those directories which uses `site/tofu/` as a
[module](https://opentofu.org/docs/language/modules/), e.g. :

```
...
module "cluster" {
-source = "../../site/terraform/"
+source = "../../site/tofu/"

cluster_name = "foo"
...
@@ -61,7 +61,7 @@ and referenced from the `site` and `production` environments, e.g.:
into the module block.
- Environment-independent variables (e.g. maybe `cluster_net` if the
same is used for staging and production) should be set as *defaults*
-in `environments/site/terraform/variables.tf`, and then don't need to
+in `environments/site/tofu/variables.tf`, and then don't need to
be passed in to the module.

- Vault-encrypt secrets. Running the `generate-passwords.yml` playbook creates
@@ -102,7 +102,7 @@ and referenced from the `site` and `production` environments, e.g.:

- Consider whether having (read-only) access to Grafana without login is OK. If not, remove `grafana_auth_anonymous` in `environments/$ENV/inventory/group_vars/all/grafana.yml`

-- Modify `environments/site/terraform/nodes.tf` to provide fixed IPs for at least
+- Modify `environments/site/tofu/nodes.tf` to provide fixed IPs for at least
the control node, and (if not using FIPs) the login node(s):

```
2 changes: 1 addition & 1 deletion docs/upgrades.md
@@ -62,7 +62,7 @@ All other commands should be run on the Ansible deploy host.

1. If required, build an "extra" image with local modifications, see [docs/image-build.md](./image-build.md).

-1. Modify your site-specific environment to use this image, e.g. via `cluster_image_id` in `environments/$SITE_ENV/terraform/variables.tf`.
+1. Modify your site-specific environment to use this image, e.g. via `cluster_image_id` in `environments/$SITE_ENV/tofu/variables.tf`.

1. Test this in your staging cluster.

2 changes: 1 addition & 1 deletion environments/README.md
@@ -7,7 +7,7 @@ typically contains all the environment specific config. It must output an ansibl
that conforms to the structure we expect. Providing that the inventory conforms to this
structure, the ansible code will still be able to interface with that inventory.
This allows the ansible code to be decoupled from the code that deployed the infrastructure
-and can therefore be tool and cloud agnostic i.e we don't care if you use terraform or ansible.
+and can therefore be tool and cloud agnostic.

A pattern we use is to chain multiple ansible inventories to provide a crude form of inheritance. e.g

2 changes: 1 addition & 1 deletion environments/common/README.md
@@ -2,7 +2,7 @@

This contains an inventory that defines variables which are common between the
`production` and `development` environments. It is not intended to be used in
-a standalone fashion to deploy infrastructure (i.e no terraform), but is instead
+a standalone fashion to deploy infrastructure, but is instead
referenced in `ansible.cfg` from the `production` and `development` configurations.

The pattern we use is that all resources referenced in the inventory
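
Reviewer's sketch, not part of this change: since the diff replaces `terraform/` paths and `terraform.tfvars` file names by hand across many files, a small scan can help spot references that were missed. The Python below is one possible check under stated assumptions. The function name `find_stale_references`, the regular expression, and the choice of file suffixes are all hypothetical, and the pattern may flag legitimate text such as URLs that contain `terraform/`.

```python
import re
from pathlib import Path

# Assumed pattern: an old `terraform/` path component or a `terraform.tfvars` file name.
STALE = re.compile(r"terraform/|terraform\.tfvars")


def find_stale_references(root: Path, suffixes=(".md", ".yml")):
    """Yield (path, line_number, line) for lines that still mention the old paths."""
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if STALE.search(line):
                    yield path, number, line
```

Iterating over `find_stale_references(Path("docs"))` and printing each tuple would list candidate lines for manual review. Prose mentions of "Terraform" are deliberately not matched, because only path-like references are targeted here.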