From ab4692cd54bd0838f501311eca1ff506625b4d1d Mon Sep 17 00:00:00 2001 From: Stefan Prodan Date: Sat, 27 Apr 2024 12:27:15 +0300 Subject: [PATCH 1/5] [RFC] Flux Bootstrap for OCI-compliant Container Registries Signed-off-by: Stefan Prodan --- rfcs/000X-flux-bootstrap-oci/README.md | 319 +++++++++++++++++++++++++ 1 file changed, 319 insertions(+) create mode 100644 rfcs/000X-flux-bootstrap-oci/README.md diff --git a/rfcs/000X-flux-bootstrap-oci/README.md b/rfcs/000X-flux-bootstrap-oci/README.md new file mode 100644 index 0000000000..5223835f31 --- /dev/null +++ b/rfcs/000X-flux-bootstrap-oci/README.md @@ -0,0 +1,319 @@ +# [RFC] Flux Bootstrap for OCI-compliant Container Registries + +**Status:** provisional + +**Creation date:** 2024-04-27 + +**Last update:** 2024-04-27 + +## Summary + +Flux should allow a Git-less bootstrap procedure where the cluster desired state is stored in OCI artifacts. + +On the client-side, the Flux CLI should offer a command for packaging its own Kubernetes manifests into +an OCI artifact and pushing the artifact to a container registry. + +On the server-side, the Flux controllers should be configured to self-update from the registry +and reconcile the cluster state from OCI artifacts stored in the same or a different registry. + +## Motivation + +Given that OCI registries are evolving into a generic artifact storage solution, +we should allow Flux users who don't want to run a Git server as part of their +production infrastructure to bootstrap and manage their Kubernetes clusters using OCI artifacts. + +To decouple the clusters reconciliation from the Git repositories, Flux allows packaging and publishing +the Kubernetes manifests stored in Git to an OCI registry by running the `flux push artifact` +command in CI pipelines. + +### Goals + +- Add support to the Flux CLI for bootstrapping with a container registry as the source of truth. +- Make it easy for users to switch from Git repositories to OCI repositories. + +### Non-Goals + +- Automate the migration of Flux manifests from a Git bootstrap repository to OCI. + +## Proposal + +Implement the `flux bootstrap oci` command with the following specifications: + +```shell +flux bootstrap oci \ +--url=/: \ +--username= \ +--password= \ +--kustomization= \ +--cluster-url=/: \ +--cluster-path= +``` + +The Terraform/OpenTofu counterpart is the `flux_bootstrap_oci` provider that exposes +the same configuration options as the CLI. + +The bootstrap operations are split into two phases: + +- Install and self-update configuration for the Flux components. +- Cluster state reconciliation configuration. + +### Install and self-update configuration + +The command performs the following steps based on the `url`, `username`, +`password` and `kustomization` arguments: + +1. Logs in to the OCI registry using the provided credentials. +2. Generates an OCI artifact from the Flux components manifests and the `kustomization.yaml` file. +3. Applies the Flux components manifests along with their customisations to the cluster. +4. Pushes the OCI artifact to the container registry using the specified tag. +5. Generates an image pull secret, an OCIRepository that points to the OCI artifact and + a Flux Kustomization object that reconciles the OCI artifact contents. +6. Applies the image pull secret, OCIRepository and Flux Kustomization to the cluster. + +Artifacts pushed to the registry: +- `/:` (immutable artifact) +- `/:` (tag pointing to the immutable artifact) + +Objects created by the command in the `flux-system` namespace: +- `flux-components` Secret +- `flux-components` OCIRepository +- `flux-components` Kustomization + +### Cluster state reconciliation configuration + +After the OCIRepository and Flux Kustomization called `flux` become ready, the command +continues with the following steps: + +1. Logs in to the OCI registry where the cluster artifacts are stored using the provided credentials. +2. If the cluster OCI artifact is not found, an empty artifact is created + and pushed to the registry using the provided tag. +3. Generates an image pull secret, an OCIRepository and a Flux Kustomization object + that reconciles the cluster OCI artifact contents. +4. Applies the image pull secret, OCIRepository and Flux Kustomization to the cluster. + +Objects created by the command in the `flux-system` namespace: +- `flux-system` Secret +- `flux-system` OCIRepository +- `flux-system` Kustomization + +If the cluster registry is the same as the Flux components registry, the command could reuse the +`flux-components` image pull secret. + +### Registry authentication + +The `flux bootstrap oci` command supports the following authentication methods: + +- Basic authentication with `--username` and `--password`. The credentials are stored in a Kubernetes Secret. +- OIDC authentication with `--provider=`. No credentials are stored in the cluster, source-controller + will use Kubernetes Workload Identity to authenticate to the registry. + +To avoid passing the credentials as CLI flags, the password can be read from the standard input, e.g.: +`echo | flux bootstrap oci` or using an environment variable `OCI_PASSWORD`. + +If the registry is self-hosted and uses a self-signed TLS certificate, +the root CA certificate can be provided with the `--ca-file` flag. + +If the registry is exposed on HTTP and not HTTPS, the `--allow-insecure-http` +flag can be used to force non-TLS connections. + +### Signing and verification + +The `flux bootstrap oci` command supports the following signing and verification methods: + +- Cosign +- Notation + +TODO: Add more details about the signing and verification methods, flags and options. + +### User Stories + +#### Story 1 + +> As a platform operator I want to bootstrap a Kubernetes cluster with Flux +> using OCI artifacts stored in a container registry. + +The following example demonstrates how to bootstrap a Flux instance using GitHub Container Registry +as the OCI registry for Flux components and the cluster state. + +```shell +flux bootstrap oci \ +--url=ghcr.io/stefanprodan/flux-manifests:production \ +--username= \ +--password= \ +--kustomization=flux-manifests/kustomization.yaml \ +--cluster-url=ghcr.io/stefanprodan/fleet-manifests:production \ +--cluster-username= \ +--cluster-password= \ +--cluster-path=clusters/production +``` + +Generated OCI artifacts: + +- `ghcr.io/stefanprodan/flux-manifests:88b028f` +- `ghcr.io/stefanprodan/flux-manifests:production` +- `ghcr.io/stefanprodan/fleet-manifests:6f7a258` +- `ghcr.io/stefanprodan/fleet-manifests:production` + +Objects created in the `flux-system` namespace: + +Flux components reconciliation: + +```yaml +apiVersion: source.toolkit.fluxcd.io/v1beta2 +kind: OCIRepository +metadata: + name: flux-components + namespace: flux-system +spec: + interval: 1m + url: oci://ghcr.io/stefanprodan/flux-manifests + ref: + tag: production + secretRef: + name: flux-components +--- +apiVersion: kustomize.toolkit.fluxcd.io/v1 +kind: Kustomization +metadata: + name: flux-components + namespace: flux-system +spec: + interval: 1h + retryInterval: 5m + sourceRef: + kind: OCIRepository + name: flux-components + path: ./ + prune: true +``` + +Cluster state reconciliation: + +```yaml +apiVersion: source.toolkit.fluxcd.io/v1beta2 +kind: OCIRepository +metadata: + name: flux-system + namespace: flux-system +spec: + interval: 1m + url: oci://ghcr.io/stefanprodan/fleet-manifests + ref: + tag: production + secretRef: + name: flux-system +--- +apiVersion: kustomize.toolkit.fluxcd.io/v1 +kind: Kustomization +metadata: + name: flux-system + namespace: flux-system +spec: + interval: 1h + retryInterval: 5m + sourceRef: + kind: OCIRepository + name: flux-system + path: clusters/production + prune: true +``` + +#### Story 2 + +> As a platform operator I want to sync the cluster state with the fleet Git repository. + +Push changes from the fleet Git repository to the container registry: + +```shell +# clone the fleet Git repository +git clone https://github.com/stefanprodan/fleet.git +cd fleet +git switch main + +# push the contents the fleet OCI repository and tag it with the commit short SHA +flux push artifact oci://ghcr.io/stefanprodan/fleet-manifests:$(git rev-parse --short HEAD) \ +--path="./" \ +--source="$(git config --get remote.origin.url)" \ +--revision="$(git branch --show-current)@sha1:$(git rev-parse HEAD)" + +# tag the new version for production +flux tag artifact oci://ghcr.io/stefanprodan/fleet-manifests:$(git rev-parse --short HEAD) \ +--tag=production +``` + +This operation can be automated using the Flux GitHub Action. + +The Git repository structure would be similar to the +[flux2-kustomize-helm-example](https://github.com/fluxcd/flux2-kustomize-helm-example) with the following changes: + +- The `clusters/production/flux-system` directory is no more. +- The Flux Kustomization objects defined in the `clusters/production` directory, such as + `infrastructure.yaml` and `apps.yaml`, have the `.spec.sourceRef` set to + `kind: OCIRepository` and `name: flux-system`. + +#### Story 3 + +> As a platform operator I want to update the Flux controllers on my production cluster +> from CI without access to the Kubernetes API. + +Download the latest CLI version and update Flux directly in the registry, without rerunning bootstrap: + +```shell +# pull the latest manifests from the registry +flux pull artifact oci://ghcr.io/stefanprodan/flux-manifests:production \ +--output=./flux-manifests + +# update the Flux components manifests +flux install --export > ./flux-manifests/flux-system/gotk-components.yaml + +# calculate the checksum of the manifests +checksum=$(grep -ar -e . ./flux-manifests/ | shasum | cut -c-16) + +# extract the Flux version and commit +flux_version=$(flux version --client | awk '{print $2}') +flux_commit=$(go version -m $(which flux) | grep vcs.revisio | awk -F= '{print $NF}') + +# push the updated manifests to the registry using the checksum as tag +flux push artifact oci://ghcr.io/stefanprodan/flux-manifests:${checksum} \ +--path="./flux-manifests" \ +--source="https://github.com/fluxcd/flux2" \ +--revision="${flux_version}@sha1:${flux_commit}" + +# tag the new version for production +flux tag artifact oci://ghcr.io/stefanprodan/flux-manifests:${checksum} \ +--tag=production +``` + +This operation could be simplified by implementing a dedicated CLI command and/or GitHub Action. + +#### Story 4 + +> As a platform operator I want to update the registry credentials on my clusters. + +To rotate the registry credentials, generate a new GitHub token and overwrite the image pull secret: + +```shell +flux create secret oci flux-system \ +--url=ghcr.io \ +--username= \ +--password= +``` + +Another option is to rerun the bootstrap command with the new credentials. + +## Design Details + +The bootstrap feature will be implemented as a Go package under `fluxcd/flux2/pkg/bootstrap/oci` +using the [fluxcd/pkg/oci](https://github.com/fluxcd/pkg/tree/main/oci) +library for OCI operations such as auth, push, pull, tag, etc. + +Both the Flux CLI and the Terraform/OpenTofu provider will use the `fluxcd/flux2/pkg/bootstrap/oci` package +and expose the same configuration options. + +### Enabling the feature + +The feature is enabled by default. + +## Implementation History + +* NONE \ No newline at end of file From d6113982c7ca9e75513d12767a35127ca4eecd5f Mon Sep 17 00:00:00 2001 From: Stefan Prodan Date: Sun, 28 Apr 2024 10:30:14 +0300 Subject: [PATCH 2/5] Add workload identity user story Signed-off-by: Stefan Prodan --- rfcs/000X-flux-bootstrap-oci/README.md | 49 +++++++++++++++++++++----- 1 file changed, 41 insertions(+), 8 deletions(-) diff --git a/rfcs/000X-flux-bootstrap-oci/README.md b/rfcs/000X-flux-bootstrap-oci/README.md index 5223835f31..f04cc3ef44 100644 --- a/rfcs/000X-flux-bootstrap-oci/README.md +++ b/rfcs/000X-flux-bootstrap-oci/README.md @@ -18,9 +18,14 @@ and reconcile the cluster state from OCI artifacts stored in the same or a diffe ## Motivation +After the implementation of [RFC-0003](../0003-kubernetes-oci/README.md) in 2022 and the introduction +of the `OCIRepository` source, we had a recurring ask from users about improving the UX of running +Flux fully decoupled from Git. + Given that OCI registries are evolving into a generic artifact storage solution, -we should allow Flux users who don't want to run a Git server as part of their -production infrastructure to bootstrap and manage their Kubernetes clusters using OCI artifacts. +we should assist Flux users who don't want to depend on a Git (for any reason, +including auth and SSH key management) in their production infrastructure to +bootstrap and manage their Kubernetes clusters using OCI artifacts. To decouple the clusters reconciliation from the Git repositories, Flux allows packaging and publishing the Kubernetes manifests stored in Git to an OCI registry by running the `flux push artifact` @@ -41,11 +46,11 @@ Implement the `flux bootstrap oci` command with the following specifications: ```shell flux bootstrap oci \ ---url=/: \ +--url=oci:///: \ --username= \ --password= \ --kustomization= \ ---cluster-url=/: \ +--cluster-url=oci:///: \ --cluster-path= ``` @@ -70,6 +75,9 @@ The command performs the following steps based on the `url`, `username`, a Flux Kustomization object that reconciles the OCI artifact contents. 6. Applies the image pull secret, OCIRepository and Flux Kustomization to the cluster. +Note that the creation of the image pull secret is skipped when +[Kubernetes Workload Identity](#story-2) is used for authentication to the container registry. + Artifacts pushed to the registry: - `/:` (immutable artifact) - `/:` (tag pointing to the immutable artifact) @@ -91,6 +99,9 @@ continues with the following steps: that reconciles the cluster OCI artifact contents. 4. Applies the image pull secret, OCIRepository and Flux Kustomization to the cluster. +Note that the creation of the image pull secret is skipped when +[Kubernetes Workload Identity](#story-2) is used for authentication to the container registry. + Objects created by the command in the `flux-system` namespace: - `flux-system` Secret - `flux-system` OCIRepository @@ -137,11 +148,11 @@ as the OCI registry for Flux components and the cluster state. ```shell flux bootstrap oci \ ---url=ghcr.io/stefanprodan/flux-manifests:production \ +--url=oci://ghcr.io/stefanprodan/flux-manifests:production \ --username= \ --password= \ --kustomization=flux-manifests/kustomization.yaml \ ---cluster-url=ghcr.io/stefanprodan/fleet-manifests:production \ +--cluster-url=oci://ghcr.io/stefanprodan/fleet-manifests:production \ --cluster-username= \ --cluster-password= \ --cluster-path=clusters/production @@ -220,6 +231,28 @@ spec: #### Story 2 +> As a platform operator I want to bootstrap an EKS cluster with Flux +> using OCI artifacts stored in ECR. + +The following example demonstrates how to bootstrap a Flux instance using ECR using IAM auth. +Assuming the EKS nodes have read-only access to ECR and the bastion host where +the Flux CLI is running has read and write access to ECR: + +```shell +flux bootstrap oci \ +--provider=aws \ +--url=oci://aws_account_id.dkr.ecr.us-west-2.amazonaws.com/flux-manifests:production \ +--kustomization=flux-manifests/kustomization.yaml \ +--cluster-url=oci://aws_account_id.dkr.ecr.us-west-2.amazonaws.com/fleet-manifests:production \ +--cluster-path=clusters/production +``` + +Note that when using Kubernetes Workload Identity instead of the worker node IAM role, +the `kustomization.yaml` must contain patches for the source-controller Service Account +as described [here](https://fluxcd.io/flux/installation/configuration/workload-identity/). + +#### Story 3 + > As a platform operator I want to sync the cluster state with the fleet Git repository. Push changes from the fleet Git repository to the container registry: @@ -251,7 +284,7 @@ The Git repository structure would be similar to the `infrastructure.yaml` and `apps.yaml`, have the `.spec.sourceRef` set to `kind: OCIRepository` and `name: flux-system`. -#### Story 3 +#### Story 4 > As a platform operator I want to update the Flux controllers on my production cluster > from CI without access to the Kubernetes API. @@ -286,7 +319,7 @@ flux tag artifact oci://ghcr.io/stefanprodan/flux-manifests:${checksum} \ This operation could be simplified by implementing a dedicated CLI command and/or GitHub Action. -#### Story 4 +#### Story 5 > As a platform operator I want to update the registry credentials on my clusters. From ae9f3128e88d71964c20716b6bebf988672b6ac8 Mon Sep 17 00:00:00 2001 From: Stefan Prodan Date: Sun, 28 Apr 2024 10:49:25 +0300 Subject: [PATCH 3/5] Add artifact contents to spec Signed-off-by: Stefan Prodan --- rfcs/000X-flux-bootstrap-oci/README.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/rfcs/000X-flux-bootstrap-oci/README.md b/rfcs/000X-flux-bootstrap-oci/README.md index f04cc3ef44..7cefbd6a9d 100644 --- a/rfcs/000X-flux-bootstrap-oci/README.md +++ b/rfcs/000X-flux-bootstrap-oci/README.md @@ -82,6 +82,14 @@ Artifacts pushed to the registry: - `/:` (immutable artifact) - `/:` (tag pointing to the immutable artifact) +The OCI artifact has the following content: + +```shell +./flux-system/ +├── gotk_components.yaml +└── kustomization.yaml +``` + Objects created by the command in the `flux-system` namespace: - `flux-components` Secret - `flux-components` OCIRepository @@ -110,6 +118,13 @@ Objects created by the command in the `flux-system` namespace: If the cluster registry is the same as the Flux components registry, the command could reuse the `flux-components` image pull secret. +If the cluster OCI artifact is not found, the generated one contains the following: + +```shell +./clusters/my-cluster/ # taken from --cluster-path +└── kustomization.yaml # empty overlay with no resources +``` + ### Registry authentication The `flux bootstrap oci` command supports the following authentication methods: From 59d1c7e62f98827695ee24cdd8130b0ebe7856b9 Mon Sep 17 00:00:00 2001 From: Stefan Prodan Date: Sun, 28 Apr 2024 10:59:40 +0300 Subject: [PATCH 4/5] Add trace back to Git story Signed-off-by: Stefan Prodan --- rfcs/000X-flux-bootstrap-oci/README.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/rfcs/000X-flux-bootstrap-oci/README.md b/rfcs/000X-flux-bootstrap-oci/README.md index 7cefbd6a9d..19e4f1941b 100644 --- a/rfcs/000X-flux-bootstrap-oci/README.md +++ b/rfcs/000X-flux-bootstrap-oci/README.md @@ -349,6 +349,20 @@ flux create secret oci flux-system \ Another option is to rerun the bootstrap command with the new credentials. +#### Story 6 + +> As a platform operator I want to know the Git repository used to generate the OCI artifact +> where a Kubernetes Deployment running on the cluster belongs to. + +To determine the source of a Kubernetes object in-cluster, run: + +```shell +flux -n default trace deploy/podinfo +``` + +The trace command will display the OCI artifact URL, tag and digest along +with the Git repository URL, branch and commit. + ## Design Details The bootstrap feature will be implemented as a Go package under `fluxcd/flux2/pkg/bootstrap/oci` From 9c881826ff5cc0aa636e34eb6a8ff9e4525ccd22 Mon Sep 17 00:00:00 2001 From: Stefan Prodan Date: Sun, 28 Apr 2024 22:50:13 +0300 Subject: [PATCH 5/5] Add common flags Signed-off-by: Stefan Prodan --- rfcs/000X-flux-bootstrap-oci/README.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/rfcs/000X-flux-bootstrap-oci/README.md b/rfcs/000X-flux-bootstrap-oci/README.md index 19e4f1941b..336a81370f 100644 --- a/rfcs/000X-flux-bootstrap-oci/README.md +++ b/rfcs/000X-flux-bootstrap-oci/README.md @@ -54,6 +54,21 @@ flux bootstrap oci \ --cluster-path= ``` +The bootstrap shares the following flags with the `flux install` command: + +```text +--cluster-domain string internal cluster domain (default "cluster.local") +--components strings list of components, accepts comma-separated values (default [source-controller,kustomize-controller,helm-controller,notification-controller]) +--components-extra strings list of components in addition to those supplied or defaulted, accepts values such as 'image-reflector-controller,image-automation-controller' +--image-pull-secret string Kubernetes secret name used for pulling the toolkit images from a private registry +--log-level logLevel log level, available options are: (debug, info, error) (default info) +--network-policy deny ingress access to the toolkit controllers from other namespaces using network policies (default true) +--registry string container registry where the toolkit images are published (default "ghcr.io/fluxcd") +--toleration-keys strings list of toleration keys used to schedule the components pods onto nodes with matching taints +--version string toolkit version, when specified the manifests are downloaded from https://github.com/fluxcd/flux2/releases +--watch-all-namespaces watch for custom resources in all namespaces, if set to false it will only watch the namespace where the toolkit is installed (default true) +``` + The Terraform/OpenTofu counterpart is the `flux_bootstrap_oci` provider that exposes the same configuration options as the CLI.