The Multi Platform Controller is a Kubernetes Controller that can provision virtual machines and make them available to build `TaskRun`s over SSH.

The Multi Platform Controller has no understanding of the build pipeline logic. It only understands how to provision hosts and allocate SSH keys. For the Multi Platform Controller to act on a `TaskRun` it must meet the following conditions:

- It must have the `build.appstudio.redhat.com/multi-platform-required` label
- It must mount a secret with the name `multi-platform-ssh-$(context.taskRun.name)`
- It must have a parameter called `PLATFORM` which specifies the required platform
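
For illustration, a minimal `TaskRun` satisfying this contract might look like the sketch below. The names, the container image, and the label value are assumptions for the example, not taken from the actual build pipeline definitions:

```yaml
apiVersion: tekton.dev/v1
kind: TaskRun
metadata:
  name: example-build-arm64                                        # hypothetical name
  labels:
    build.appstudio.redhat.com/multi-platform-required: "true"     # label value is an assumption; the doc only requires the label
spec:
  params:
    - name: PLATFORM
      value: linux/arm64
  taskSpec:
    params:
      - name: PLATFORM
        type: string
    volumes:
      - name: ssh
        secret:
          secretName: multi-platform-ssh-$(context.taskRun.name)
    stepTemplate:
      volumeMounts:
        - name: ssh
          mountPath: /ssh
    steps:
      - name: show-host
        image: registry.access.redhat.com/ubi9/ubi-minimal          # hypothetical image
        script: |
          # connection details are provided by the controller via the mounted secret
          cat /ssh/host
```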

If these conditions are met, the controller will reconcile on the `TaskRun` and attempt to provision a virtual machine for it. If it is successful, it will create a secret (`multi-platform-ssh-$(context.taskRun.name)`) with the following entries:

- `otp` a one-time-password that can be used to obtain the VM's SSH key from the OTP server.
- `otp-ca` a certificate-authority certificate used to validate the HTTPS connection to the OTP server.
- `otp-server` the URL of the OTP server.
- `host` the host name or IP to connect to, including the username, in the form `user@host`.
- `build-dir` the working directory in which to perform the build.

Note that these SSH credentials are only valid while the `TaskRun` is executing.
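
As a hedged illustration, the created secret might look roughly like this; all values are placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: multi-platform-ssh-devfile-sample-t7rd7-build-container-arm64
type: Opaque
stringData:
  otp: "placeholder-one-time-password"
  otp-ca: |
    -----BEGIN CERTIFICATE-----
    (placeholder CA certificate)
    -----END CERTIFICATE-----
  otp-server: "https://otp.example.com"      # placeholder URL
  host: "builder@10.0.0.15"                  # placeholder user@host
  build-dir: "/home/builder/build"           # placeholder path
```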


If the controller is unsuccessful it will still create the secret; however, it will only have a single item called `error` containing the error message. The task is expected to echo this to the logs and then fail if this entry is present.

The reason why this works is that the `TaskRun` creates a pod that references a non-existent secret (e.g. if the `TaskRun` was called `devfile-sample-t7rd7-build-container-arm64` then the resulting pod will reference a secret called `multi-platform-ssh-devfile-sample-t7rd7-build-container-arm64`). As this secret does not exist yet, the pod will not start; instead Kubernetes will wait until the secret is ready before running the pod. This gives the controller time to provision an SSH-based host and then create the secret with the required details. The secret is created even on failure, to prevent the task hanging until timeout and to allow any error messages to be propagated to the user.

## The Controller

The controller is responsible for allocating SSH credentials to TaskRuns that meet the contract specified above. It currently supports local execution, fixed host pools, fully dynamic host provisioning, and dynamically provisioned pools.

### Configuration

At present the controller requires no CRDs. Configuration is provided through a `ConfigMap` that defines the available hosts. This `ConfigMap` is called `host-config` and lives in the `multi-platform-controller` namespace. This `ConfigMap` must have the `build.appstudio.redhat.com/multi-platform-config` label.

An example is shown below:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: host-config
  namespace: multi-platform-controller
data:
  local-platforms: linux/amd64
  dynamic-platforms: linux/arm64
  dynamic-pool-platforms: linux-m4xlarge/amd64
  instance-tag: unique-identifier

  dynamic.linux-arm64.type: aws
  # ... (further dynamic.linux-arm64 settings not shown)
  dynamic.linux-arm64.aws-secret: awsiam
  dynamic.linux-arm64.ssh-secret: awskeys
  dynamic.linux-arm64.security-group: "launch-wizard-1"
  dynamic.linux-arm64.max-instances: "10"

  dynamic.linux-m4xlarge-amd64.type: aws
  dynamic.linux-m4xlarge-amd64.region: us-east-1
  dynamic.linux-m4xlarge-amd64.ami: ami-026ebd4cfe2c043b2
  dynamic.linux-m4xlarge-amd64.instance-type: m6a.4xlarge
  dynamic.linux-m4xlarge-amd64.key-name: konflux-stage-ext-mab01
  dynamic.linux-m4xlarge-amd64.aws-secret: aws-account
  dynamic.linux-m4xlarge-amd64.ssh-secret: aws-ssh-key
  dynamic.linux-m4xlarge-amd64.security-group-id: sg-05bc8dd0b52158567
  dynamic.linux-m4xlarge-amd64.max-instances: "4"
  dynamic.linux-m4xlarge-amd64.max-concurrency: "4"
  dynamic.linux-m4xlarge-amd64.subnet-id: subnet-030738beb81d3863a
  dynamic.linux-m4xlarge-amd64.instance-tag: "some-pool-identifier"

  host.ppc1.address: "150.240.147.198"
  host.ppc1.platform: "linux/ppc64le"
  # ... (remaining host.ppc1 and host.ibmz1 settings not shown)
  host.ibmz1.concurrency: "4"
```
The `local-platforms` entry tells the system which platforms actually run locally in the cluster rather than on remote VMs. The `dynamic-platforms` entry tells the system which platforms use dynamic provisioning. The `dynamic-pool-platforms` entry tells the system which platforms use dynamically provisioned pools of hosts. The `instance-tag` is a unique value used by the system to determine which virtual machines it owns in the remote environment.

Dynamic configuration properties depend on the provider. At the moment `aws`, `ibmz` and `ibmp` are supported.

For static host pools each host has a unique name, and provides different parameters:
- `user` The username to use to connect to the host
- `platform` The host's platform
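
For example, a complete entry for a static host in the `host-config` ConfigMap might look like the following sketch. The `address`, `platform`, `user`, and `concurrency` keys appear elsewhere in this document; the `secret` key (naming the Kubernetes secret holding the host's SSH credentials) is an assumption, and the values are placeholders:

```yaml
  host.ppc2.address: "10.0.0.20"        # placeholder IP
  host.ppc2.platform: "linux/ppc64le"
  host.ppc2.user: "builder"             # placeholder username
  host.ppc2.secret: "ppc-ssh-key"       # assumed key: secret holding the host's SSH credentials
  host.ppc2.concurrency: "2"
```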

Dynamic pool entries use a combination of dynamic and static host settings.


### Allocation Process

When a `TaskRun` is identified that meets the contract above, the controller will attempt to allocate a virtual machine for it.

The host selection process depends on whether the platform is configured as local, for dynamic provisioning, for a static host pool, or for a dynamic pool:

*"Local" provisioning*

For platforms that are configured as "local", no provisioning is done. Instead, the secret required by the user task is generated so that it only contains a `host` value set to `localhost`. Tasks need to detect this configuration and switch to a code path that performs the task's function locally within the running pod, as opposed to shelling out to a remote VM.

The purpose behind this configuration is to hide from pipeline users whether a task is executing locally or remotely, and instead have a single task that does both.
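
In that case the generated secret reduces to something like this sketch (the secret name follows the contract above, and the placeholder stands in for the resolved `TaskRun` name):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: multi-platform-ssh-<taskrun-name>   # placeholder for the resolved TaskRun name
stringData:
  host: localhost
```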

*Host Pool*

The controller looks at the current configuration, and also looks at all running tasks with a host assigned. Using this information it picks, at random, one of the hosts with the most free slots. This algorithm is very simple, but it works fine when all hosts are the same size. If we mix hosts of different sizes then we may need a better algorithm.

*Dynamic*

The controller checks that the maximum number of virtual machines has not been exceeded and, if not, asks the cloud provider to spin up a new virtual machine. It then schedules regular requeue requests to check when the machine is ready (so the controller loop is not blocked waiting for the VM).

Once the host is selected the controller runs a 'provisioning' task to prepare the host. This task currently does the following:

- Create a new user specifically for this build
- Generate a new pair of SSH keys for this user and install them
- Send the private key for this user to the OTP service
- Create the required secret and use it to store the OTP returned from the service, as well as the other details listed above

Once the secret has been created, the original `TaskRun` will start and can use the new user on the remote host to perform the build. This approach means that every build runs under a different user.
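
As a hedged sketch of how a build step might consume this secret, the example below checks for the `error` entry, exchanges the OTP for the private key, and runs a command on the remote host. The container image, the `/ssh` mount path, and the exact shape of the OTP exchange request are assumptions, not the actual task implementation:

```yaml
steps:
  - name: remote-build
    image: quay.io/example/ssh-client:latest    # hypothetical image with curl and ssh installed
    script: |
      #!/bin/sh
      set -e
      # on failure the controller writes an `error` entry instead of connection details
      if [ -e /ssh/error ]; then
        cat /ssh/error
        exit 1
      fi
      HOST="$(cat /ssh/host)"
      BUILD_DIR="$(cat /ssh/build-dir)"
      mkdir -p ~/.ssh
      # exchange the one-time password for the SSH private key (request shape is an assumption)
      curl --cacert /ssh/otp-ca -XPOST -d @/ssh/otp "$(cat /ssh/otp-server)" > ~/.ssh/id_rsa
      chmod 0400 ~/.ssh/id_rsa
      # run the build in the working directory on the remote VM
      ssh -i ~/.ssh/id_rsa -o StrictHostKeyChecking=no "$HOST" "mkdir -p $BUILD_DIR && cd $BUILD_DIR && echo 'build commands run here'"
```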

Which hosts are allocated to which tasks is tracked by adding additional labels to the `TaskRun` object.

If the provisioning task fails, a label is added to the `TaskRun` indicating the failure, and allocation is retried, excluding all hosts in the failed list. If all of them fail, the error secret is created.

Note that this failed list is per `TaskRun`; there is no global tracking of host health. This means that if a host is down it may be retried many times, but when it comes back up it will be back in service immediately.

*Dynamic Pool*

The controller uses a pool of VMs in a similar way to a static host pool. When it runs out of execution slots within the pool, it allocates new VMs up to the configured `max-instances` limit.

VMs are kept in the pool for up to the number of minutes defined by the `max-age` configuration option. Beyond that, no new tasks are scheduled to those VMs, and once they become idle they are de-allocated.

The controller does not maintain any state about the VM pool; instead it reconstructs it in memory as needed. The controller tells different pools apart within the cloud provider's API by using the `instance-tag` option, so it is important to define a different value for each dynamic pool.

### One Time Passwords

Putting an SSH key directly into a secret opens up a potential security issue, where a user could steal the key, connect to a virtual machine, and compromise a build. Our existing tamper protections cannot detect this case.

To avoid this, the Multi Platform Controller uses a One Time Password approach to managing SSH keys. Instead of creating a secret with an SSH key, the controller creates a one-time password and adds it to the secret. This password can then be exchanged exactly once for an SSH key.

If an attacker reads the secret and gains access to the VM, the task will immediately fail, as it will not be able to access the VM.

### Clean Up

When the `TaskRun` is completed, the user is removed from the remote host via a cleanup task.

## One Time Password Service

The one time password service is an HTTP-based service that provides two endpoints: one to accept a private key and return an OTP, and another to serve the private key when provided with the OTP.

There is a single OTP service per cluster, which is secured via HTTPS.

## Host Requirements

In order to work with the multi platform controller remote hosts need to meet the following requirements:
There are still unanswered questions about these hosts and additional requirements that may be needed:
- Should the 'main' user be locked down so that it can only add and remove the build users? At present the main user used by the controller has root access (not the users used by the build, but the user that creates these users).
- Do we need any kind of intrusion detection or additional monitoring software on these VMs?



## IBM Cloud Configuration

This section explains what is needed to configure an IBM Cloud Account. Power PC and System Z are both very different and have different configuration, so these are divided into different sections.
