tpctl is used to create AWS EKS clusters that run the Tidepool services in a HIPAA-compliant way.
tpctl is a bash script that runs tpctl.sh in a Docker container. tpctl.sh requires a number of tools to be installed in its environment. The Docker container contains those tools.
However, we need to communicate using ssh. Using your local ssh-agent with Docker is challenging if you are running Docker for Mac. We have not attempted to do so.
Instead, we run another ssh-agent inside a separate Docker container, which shares credentials with the container that runs tpctl.sh. You will be asked to enter a passphrase if your SSH credentials are protected by one.
You also need an AWS account with an identity that has permission:
- to create a Kubernetes cluster in EKS;
- to create secrets in AWS Secrets Manager; and
- to create stacks in AWS CloudFormation.
You may also run tpctl.sh natively if you first install the required tools. Most of these can be installed on macOS by running brew bundle with the following Brewfile:
tap "weaveworks/tap"
brew "awscli"
brew "kubernetes-helm"
brew "eksctl"
brew "kubernetes-cli"
brew "aws-iam-authenticator"
brew "jq"
brew "yq"
brew "derailed/k9s/k9s"
brew "fluxctl"
brew "coreutils"
brew "python3"
brew "hub"
brew "kubecfg"
brew "cfssl"
brew "weaveworks/tap/eksctl"
brew "fzf"
In addition, you will need to install python3 with three packages:
pip3 install --upgrade --user awscli boto3 environs
You will also need:
go get github.com/google/go-jsonnet/cmd/jsonnet
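Before running tpctl.sh natively, it can help to confirm that the tools are on your PATH. The following is a minimal sketch and is not part of tpctl. The helper name missing_tools is hypothetical, and the list of command names is an assumption based on the Brewfile and notes above (for example, that the kubernetes-cli formula provides kubectl and awscli provides aws).

```bash
# Print every command named in the arguments that is not found on PATH.
# missing_tools is a hypothetical helper, not part of tpctl.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names assumed from the Brewfile and the notes above.
missing_tools aws helm eksctl kubectl aws-iam-authenticator jq yq \
  fluxctl python3 hub kubecfg cfssl fzf jsonnet
```

If nothing is printed, every listed command was found.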
You may pull the latest Docker image of tpctl from Docker Hub. It is tagged tidepool/tpctl:latest.
docker pull tidepool/tpctl
Retrieve the tpctl driver file and make it executable:
wget https://raw.githubusercontent.com/tidepool-org/tpctl/master/cmd/tpctl
chmod +x tpctl
Alternatively, you may build your own local Docker image from source by cloning the Tidepool tpctl repo and running the cmd/build.sh script:
git clone git@github.com:tidepool-org/tpctl
cd tpctl/cmd
./build.sh
Thereafter, you may use the tpctl script provided.
tpctl interacts with several external services on your behalf, so it must authenticate itself.
To do so, tpctl must access credentials stored on your local machine. This is why numerous directories are mounted into the Docker container.
We explain these in detail below. If the assumptions we make are incorrect for your environment, you may set the environment variables used by the tpctl script to match your environment:
HELM_HOME=${HELM_HOME:-~/.helm}
KUBE_CONFIG=${KUBECONFIG:-~/.kube/config}
AWS_CONFIG=${AWS_CONFIG:-~/.aws}
GIT_CONFIG=${GIT_CONFIG:-~/.gitconfig}
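The lines above rely on the shell's ${VAR:-default} expansion, which yields the value of VAR when it is set and non-empty and the default otherwise. Here is a small sketch of that behavior. The helper name aws_config_dir and the path /opt/creds/aws are made up for illustration.

```bash
# Resolve the AWS config directory the same way the defaults above do:
# use AWS_CONFIG when it is set and non-empty, otherwise fall back to ~/.aws.
# aws_config_dir is a hypothetical helper for illustration only.
aws_config_dir() {
  printf '%s\n' "${AWS_CONFIG:-$HOME/.aws}"
}

# Default location (a subshell keeps the change local to this line).
(unset AWS_CONFIG; aws_config_dir)
# Overridden location; /opt/creds/aws is a made-up path.
(AWS_CONFIG=/opt/creds/aws; aws_config_dir)
```

So exporting one of these variables before running tpctl should be enough to point it at a non-standard location.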
In order to update your Git configuration repo with the tags of new versions of Docker images that you use, you must provide a GitHub personal access token with repo scope to access private repositories.
export GITHUB_TOKEN=....
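A missing token tends to surface late, so you may want to check for it up front. This is a sketch only. The helper name require_env is hypothetical, and tpctl may perform its own checks.

```bash
# Return non-zero and print a message if the named variable is unset or empty.
# require_env is a hypothetical helper, not part of tpctl.
require_env() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    printf 'error: %s is not set\n' "$1" >&2
    return 1
  fi
}

require_env GITHUB_TOKEN || echo "export GITHUB_TOKEN before running tpctl"
```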
In order to create and query AWS resources, you must provide access to your AWS credentials. We assume that you store those credentials in the standard place:
~/.aws/credentials
tpctl mounts ~/.aws inside the Docker container to access the credentials.
In order to access your Kubernetes cluster, you must provide access to the file that stores your Kubernetes configurations. We assume that you store that file in:
~/.kube/config
tpctl mounts ~/.kube inside the Docker container to access that file.
In order to provide you access to the Kubernetes cluster via the helm client, you must provide access to the directory that stores your helm client credentials. That directory is typically:
~/.helm
tpctl populates that directory with a TLS certificate and keys that are needed to communicate with the helm installer.
In order to make Git commits, tpctl needs your Git username and email. These are typically stored in:
~/.gitconfig
tpctl mounts that file.
Check your ~/.gitconfig. It must have entries for email and name, such as:
[user]
email = [email protected]
name = Derrick Burns
If it does not, then add them by running this locally:
git config --global user.email "[email protected]"
git config --global user.name "Your Name"
In order to clone the flux tool repo, tpctl needs access to the SSH key that you use with GitHub. This is typically stored in:
~/.ssh/id_rsa
Most of the operations of tpctl either use or manipulate a GitHub repository. You may use tpctl to configure an existing GitHub repository. To do so, provide the name of the repository as the full name (including git@):
export [email protected]:tidepool-org/cluster-test1
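Because the full name must include the git@ prefix, a quick shape check can catch a short org/repo name before it is used. The following is a sketch under that assumption. The helper name is_ssh_remote is hypothetical, and the pattern only checks the general git@HOST:ORG/REPO shape.

```bash
# Succeed only if the argument looks like an SSH-style remote,
# for example git@HOST:ORG/REPO. is_ssh_remote is a hypothetical helper.
is_ssh_remote() {
  case "$1" in
    git@*:*/*) return 0 ;;
    *)         return 1 ;;
  esac
}

is_ssh_remote "git@github.com:tidepool-org/cluster-test1" && echo "looks like a full name"
is_ssh_remote "tidepool-org/cluster-test1" || echo "missing the git@ prefix"
```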
Alternatively, if you have not already created a GitHub repository, you may create one using tpctl:
tpctl repo ${ORG}/${REPO_NAME}
To create an EKS cluster running the Tidepool services with GitOps and a service mesh that provides HIPAA compliance, you perform a series of steps:
This creates an empty private GitHub repository for storing the desired state of your EKS cluster. We call this the config repo.
tpctl repo
This creates a file in your GitHub config repo called values.yaml that contains all the data needed to construct the other Kubernetes configuration files. Under normal circumstances, this is the only file that you will manually edit.
tpctl values
In this file, you find parameters that you may change to customize the installation.
By default, the cluster name is derived from the GitHub repository name. You may override it.
In addition, the default values.yaml file defines a single Tidepool environment named qa2. You must modify this environment or add others.
Importantly, be sure to set the DNS names for your Tidepool services. Assuming that you have the authority to do so, TLS certificates are automatically generated for the names that you provide, and DNS aliases to those names are also created.
From the values.yaml file, tpctl can generate all the Kubernetes manifest files, the AWS IAM roles and policies, and the eksctl ClusterConfig file that is used to build a cluster. Do this after you have created and edited your values.yaml file. If you edit your values.yaml file again, rerun this step:
tpctl config
Once you have generated the manifest files, you may create your EKS cluster.
tpctl cluster
This step takes 15-20 minutes, during which time AWS provisions a new EKS cluster. It creates a number of AWS CloudFormation stacks. These stacks have the prefix eksctl-${ClusterName}-.
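To see which stacks belong to a given cluster, you can filter stack names by that prefix. The sketch below does only the filtering. The helper name cluster_stacks and the sample stack names are made up for illustration.

```bash
# Read stack names on stdin (one per line) and print those that carry the
# eksctl prefix for the given cluster name. cluster_stacks is hypothetical.
cluster_stacks() {
  prefix="eksctl-$1-"
  while IFS= read -r name; do
    case "$name" in
      "$prefix"*) printf '%s\n' "$name" ;;
    esac
  done
}

# Example with made-up stack names; in practice the input could come from
#   aws cloudformation list-stacks --query 'StackSummaries[].StackName' --output text | tr '\t' '\n'
printf '%s\n' eksctl-qa1-cluster eksctl-qa1-nodegroup-ng other-stack | cluster_stacks qa1
```

Matching on the trailing hyphen keeps a cluster named qa1 from also matching stacks for a cluster named qa10.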
A service mesh encrypts inter-service traffic to ensure that personal health information (PHI) is protected in transit from exposure to unauthorized parties.
You may install a service mesh as follows.
tpctl mesh
This must be done before the next step because the mesh intercepts future requests to install resources into your cluster. In some cases, it will add a sidecar to your pods. This is called automatic sidecar injection. So, if your mesh is not running, those pods will not have a sidecar to encrypt their traffic.
If that happens, install the mesh, then manually delete the pods that were added while the mesh was non-operational.
The Flux GitOps controller keeps your Kubernetes cluster up to date with the contents of the GitHub configuration repo. It also keeps your GitHub configuration repo up to date with the latest versions of Docker images of your services that are published in Docker Hub.
To install the GitOps operator:
tpctl flux
In addition, this command installs the tiller server (the counterpart to the Helm client) and creates and installs TLS certificates that the Helm client needs to communicate with the tiller server.
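The order of the steps matters, so you may find it convenient to script them. The following sketch assumes that you have already run tpctl repo and tpctl values and have edited values.yaml. The function names and the DRY_RUN variable are invented for this example and are not tpctl features.

```bash
# Run one tpctl step, or only print it when DRY_RUN=1.
# DRY_RUN is a variable invented for this sketch.
run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'would run: tpctl %s\n' "$1"
  else
    tpctl "$1"
  fi
}

# Run the remaining steps in the documented order, stopping at the first failure.
bootstrap_cluster() {
  for step in config cluster mesh flux; do
    run_step "$step" || return 1
  done
}

# Print the plan without touching any AWS or GitHub resources.
DRY_RUN=1
bootstrap_cluster
```

Unset DRY_RUN to run the steps for real.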
Sometimes, one of the steps will fail. Most of the time, you can simply retry that step. However, in the case of tpctl cluster and tpctl mesh, certain side effects persist that may impede your progress.
To reverse the side effects of tpctl cluster, you may delete your cluster and await the completion of the deletion:
tpctl delete_cluster await_deletion
Deleting a cluster will take roughly 10 minutes.
To reverse the side effects of tpctl mesh, you may delete your mesh with:
tpctl remove_mesh
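For the steps that can simply be retried, a small wrapper saves retyping. This is a generic shell sketch, not something tpctl provides. The helper name retry, the RETRY_DELAY variable and the 5-second default are arbitrary choices.

```bash
# Run a command up to N times, stopping at the first success.
# retry is a hypothetical helper; RETRY_DELAY (seconds) is invented here.
retry() {
  attempts=$1
  shift
  n=1
  while :; do
    "$@" && return 0
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Harmless demonstration; a real use might be: retry 3 tpctl flux
retry 3 true && echo "succeeded"
```

Do not wrap tpctl cluster or tpctl mesh this way, since their side effects should be reversed before trying again.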
In addition to the basic commands above, you may:
We do not recommend that you make manual changes to the files in your config repo, except the values.yaml file.
However, you may access the GitHub configuration repo using standard Git commands.
If you need to modify the configuration parameters in the values.yaml file, you may do so with standard Git commands to operate on your Git repo.
If you are launching a new cluster, you must provide S3 assets for email verification. You may copy the standard assets by using this command:
tpctl buckets
If you are creating a new environment, you can generate a new set of secrets, persist those secrets in AWS Secrets Manager, and modify your configuration repo to access those secrets:
tpctl secrets
If you have additional system:masters users to add to your cluster, you may add them to your values.yaml file and run this command to install them in your cluster:
tpctl users
This operation is idempotent.
You may inspect the existing set of users with:
kubectl describe configmap -n kube-system aws-auth
Here is example output:
$ kubectl describe configmap -n kube-system aws-auth
Name: aws-auth
Namespace: kube-system
Labels: <none>
Annotations: <none>
Data
====
mapRoles:
----
- groups:
- system:bootstrappers
- system:nodes
rolearn: arn:aws:iam::118346523422:role/eksctl-qatest-nodegroup-ng-1-NodeInstanceRole-1L2G21MV64ISS
username: system:node:{{EC2PrivateDNSName}}
- groups:
- system:bootstrappers
- system:nodes
rolearn: arn:aws:iam::118346523422:role/eksctl-qatest-nodegroup-ng-kiam-NodeInstanceRole-1TKZB1U4OVJDW
username: system:node:{{EC2PrivateDNSName}}
- groups:
- system:masters
rolearn: arn:aws:iam::118346523422:user/lennartgoedhart-cli
username: lennartgoedhart-cli
- groups:
- system:masters
rolearn: arn:aws:iam::118346523422:user/benderr-cli
username: benderr-cli
- groups:
- system:masters
rolearn: arn:aws:iam::118346523422:user/derrick-cli
username: derrick-cli
- groups:
- system:masters
rolearn: arn:aws:iam::118346523422:user/mikeallgeier-cli
username: mikeallgeier-cli
In order to manipulate your GitHub config repo, Flux needs to be authorized to do so. This authorization step is normally performed when flux is installed with tpctl flux.
Should you delete and reinstall Flux manually, it will create a new public key that you must provide to your GitHub repo in order to authenticate Flux and authorize it to modify the repo. You do that with:
tpctl fluxkey
You may inspect your GitHub config repo to see that the key was deployed by going to the Settings tab of the config repo and looking under Deploy Keys.
If you wish to delete an AWS EKS cluster that you created with tpctl, you may do so with:
tpctl delete_cluster
Note that this only starts the process. The command returns before the process has completed. The entire process may take up to 20 minutes.
To await the completion of the deletion of an AWS EKS cluster, you may do this:
tpctl await_deletion
You may change which cluster kubectl accesses by changing the file that it uses to access your cluster or by changing the contents of that file. That file is identified by the environment variable KUBECONFIG.
If you are only managing a single cluster, then you can simply set that environment variable to point to that file.
However, in the common case that you are manipulating several clusters, it may be inconvenient to change that environment variable every time you want to switch clusters.
To address this common case, a single KUBECONFIG file may contain the information needed to access multiple clusters. It also contains an indication of which of those clusters to access.
The latter indicator may be easily modified with the kubectx command.
We store a KUBECONFIG file in your config repo that only contains the info needed for the associated cluster.
You may merge the KUBECONFIG file from your config repo into a local KUBECONFIG file called ~/.kube/config using:
tpctl merge_kubeconfig
Then, you may use kubectx to select which cluster to modify.
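For background, kubectl itself can present several kubeconfig files as one merged view when KUBECONFIG holds a colon-separated list of paths. The sketch below illustrates that general technique, which is not necessarily how tpctl merge_kubeconfig is implemented. The helper name join_kubeconfigs and the file name ./kubeconfig.yaml are assumptions.

```bash
# Join kubeconfig paths with ':' so that kubectl reads them as one merged view.
# join_kubeconfigs is a hypothetical helper, not part of tpctl.
join_kubeconfigs() {
  joined=""
  for file in "$@"; do
    joined="${joined:+$joined:}$file"
  done
  printf '%s\n' "$joined"
}

# ./kubeconfig.yaml is an assumed name standing in for the file from the config repo.
merged=$(join_kubeconfigs "$HOME/.kube/config" ./kubeconfig.yaml)
# Print the command rather than run it, so that nothing is modified here.
echo "KUBECONFIG=$merged kubectl config view --flatten"
```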
Your primary configuration file, values.yaml, contains all the information needed to create your Kubernetes cluster and its services.
The first section of the file contains configuration values that are shared across the cluster or describe the properties of the configuration repo.
This section establishes where the GitHub repo is located.
general:
email: [email protected]
github:
git: git@github.com:tidepool-org/cluster-dev
https: https://github.com/tidepool-org/cluster-dev
kubeconfig: $HOME/.kube/config
logLevel: debug
sops:
keys:
arn: arn:aws:kms:us-west-2:118346523422:key/02d4583e-a7be-41c0-b5c0-2a9c569f3c87
pgp: CDE5317D7CCA7B80294FB32721A60B1450343446
sso:
allowed_groups:
- [email protected]
This section provides the AWS account number and the IAM users who are to be granted system:masters privileges on the cluster:
aws:
accountNumber: 118346523422 # AWS account number
iamUsers: # AWS IAM users who will be granted system:masters privileges on the cluster
- derrickburns-cli
- lennartgoedhard-cli
- benderr-cli
- jamesraby-cli
- haroldbernard-cli
This section provides a description of the AWS cluster itself, including its name, region, size, networking config, and IAM policies.
cluster:
cloudWatch:
clusterLogging:
enableTypes:
- authenticator
- api
- controllerManager
- scheduler
managedNodeGroups:
- desiredCapacity: 3
instanceType: c5.xlarge
labels:
role: worker
maxSize: 7
minSize: 3
name: ngm
tags:
nodegroup-role: worker
nodeGroups:
- desiredCapacity: 3
instanceType: c5.xlarge
labels:
role: worker
maxSize: 7
minSize: 3
name: ng
tags:
nodegroup-role: worker
metadata:
rootDomain: tidepool.org
domain: dev.tidepool.org
name: qa1
region: us-west-2
version: auto
vpc:
cidr: 10.47.0.0/16
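When you edit the node group sizes, it is worth checking that they are mutually consistent. The sketch below assumes that desiredCapacity is expected to lie between minSize and maxSize. The helper name valid_capacity is hypothetical, and the numbers are the ones from the example above.

```bash
# Succeed only if minSize <= desiredCapacity <= maxSize.
# Arguments: minSize desiredCapacity maxSize.
# valid_capacity is a hypothetical helper for illustration.
valid_capacity() {
  [ "$1" -le "$2" ] && [ "$2" -le "$3" ]
}

# Values taken from the node groups above: minSize 3, desiredCapacity 3, maxSize 7.
valid_capacity 3 3 7 && echo "node group sizes are consistent"
```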
Kubernetes services run in namespaces. Within each namespace, you may configure a set of packages to run:
namespaces:
amazon-cloudwatch:
namespace:
enabled: true
cloudwatch-agent:
enabled: true
fluentd:
enabled: true
cadvisor:
namespace:
enabled: true
cadvisor:
enabled: true
cert-manager:
namespace:
enabled: true
config:
create: false
certmanager:
enabled: true
global: true
elastic-system:
namespace:
enabled: true
elastic-operator:
enabled: true
storage: 20Gi
external-dns:
namespace:
enabled: true
external-dns:
enabled: true
flux:
namespace:
enabled: true
flux:
enabled: true
fluxcloud:
enabled: true
username: derrickburns
fluxrecv:
enabled: true
export: true
sidecar: false
gloo-system:
namespace:
goldilocks: true
meshed: true
enabled: true
gloo:
enabled: true
global: true
proxies:
gatewayProxy:
replicas: 2
internalGatewayProxy:
replicas: 2
pomeriumGatewayProxy:
replicas: 1
version: 1.3.15
glooe-monitoring:
enabled: true
sso:
port: 80
serviceName: glooe-grafana
glooe-prometheus-server:
enabled: false
sso:
externalName: glooe-metrics
port: 80
storage: 64Gi
goldilocks:
namespace:
enabled: true
goldilocks:
enabled: true
sso:
externalName: goldilocks
serviceName: goldilocks-dashboard
port: 80
jaeger-operator:
namespace:
enabled: true
jaeger-operator:
enabled: true
kube-system:
namespace:
enabled: true
logging: true
labels:
config.linkerd.io/admission-webhooks: disabled
cluster-autoscaler:
enabled: false
metrics-server:
enabled: true
kubernetes-dashboard:
namespace:
enabled: true
kubernetes-dashboard:
enabled: true
linkerd:
namespace:
enabled: true
labels:
config.linkerd.io/admission-webhooks: disabled
linkerd.io/is-control-plane: "true"
linkerd:
enabled: true
global: true
linkerd-web:
enabled: true
sso:
port: 8084
monitoring:
namespace:
enabled: true
goldilocks: true
grafana:
enabled: true
sso:
port: 80
serviceName: monitoring-kube-prometheus-stack-grafana
prometheus:
enabled: true
global: true
sso:
externalName: metrics
kube-prometheus-stack:
alertmanager:
enabled: false
enabled: true
global: true
grafana:
enabled: true
thanos:
bucket: tidepool-thanos
enabled: false
none:
common:
enabled: true
pomerium:
namespace:
meshed: true
enabled: true
pomerium:
enabled: true
reloader:
namespace:
enabled: true
reloader:
enabled: true
sumologic:
namespace:
enabled: true
sumologic:
enabled: true
tracing:
namespace:
enabled: true
jaeger:
enabled: true
sso:
port: 16686
serviceName: jaeger-query
externalName: tracing
oc-collector:
enabled: true
elasticsearch:
storage: 15Gi
enabled: true
velero:
velero:
enabled: false
You also provide the configuration of your Tidepool environments in the namespaces section:
namespaces:
dev1:
namespace:
enabled: true
logging: true
meshed: true
tidepool:
buckets:
asset: tidepool-dev1-asset
data: tidepool-dev1-data
chart:
version: 0.4.0
dnsNames:
- dev1.dev.tidepool.org
enabled: true
gateway:
domain: dev.tidepool.org
host: dev1.dev.tidepool.org
gitops:
default: glob:master-*
hpa:
enabled: false