Merge branch 'gesis-at-github' into gesis
rgaiacs committed Jan 2, 2025
2 parents e269b48 + e240f58 commit 6a8f113
Showing 17 changed files with 246 additions and 15 deletions.
2 changes: 1 addition & 1 deletion .editorconfig
Original file line number Diff line number Diff line change
@@ -5,4 +5,4 @@ charset = utf-8
end_of_line = lf
indent_size = 2
indent_style = space
insert_final_newline = true
insert_final_newline = true
34 changes: 33 additions & 1 deletion .gitlab-ci.yml
@@ -8,7 +8,29 @@ stages:
- deploy-acceptance-helm
- test-acceptance
- deploy-production-nginx
- deploy-production-helm

.gesis-manual-web:
rules:
- if: $CI_SERVER_HOST == 'git.gesis.org' && $CI_PIPELINE_SOURCE == 'web'
when: manual
allow_failure: true

.gesis-merge-request:
rules:
- if: $CI_SERVER_HOST == 'git.gesis.org' && $CI_PIPELINE_SOURCE == "merge_request_event"
changes:
- .gitlab-ci.yml
when: manual
- if: $CI_SERVER_HOST == 'git.gesis.org' && $CI_PIPELINE_SOURCE == "merge_request_event"
changes:
- ansible/**/*
- mybinder/**/*
- config/**/*
- secrets/**/*

.gesis-push-main:
rules:
- if: $CI_SERVER_HOST == 'git.gesis.org' && $CI_PIPELINE_SOURCE == "push" && $CI_COMMIT_BRANCH == 'main'

include:
- component: $CI_SERVER_FQDN/rse/docker/images/ansible/[email protected]
@@ -59,6 +81,7 @@ include:
--values ./config/gesis-${HELM_ENVIRONMENT}.yaml \
--values ./secrets/config/common/common.yaml \
--values ./secrets/config/common/cryptnono.yaml \
--values ./secrets/config/common/gesis.yaml \
--values ./secrets/config/gesis-${HELM_ENVIRONMENT}.yaml
- |
helm upgrade \
@@ -72,17 +95,26 @@ include:
--values ./config/gesis-${HELM_ENVIRONMENT}.yaml \
--values ./secrets/config/common/common.yaml \
--values ./secrets/config/common/cryptnono.yaml \
--values ./secrets/config/common/gesis.yaml \
--values ./secrets/config/gesis-${HELM_ENVIRONMENT}.yaml
gesis helm acceptance deploy:
resource_group: acceptance
stage: deploy-acceptance-helm
rules:
- !reference [.gesis-manual-web, rules]
- !reference [.gesis-merge-request, rules]
- !reference [.gesis-push-main, rules]
variables:
HELM_ENVIRONMENT: acceptance
extends:
- .gesis helm deploy

smoke test to acceptance cluster:
stage: test-acceptance
rules:
- !reference [.gesis-manual-web, rules]
- !reference [.gesis-merge-request, rules]
- !reference [.gesis-push-main, rules]
script:
- curl https://notebooks-test.gesis.org/binder/
2 changes: 1 addition & 1 deletion .gitlab/agents/stage/config.yaml
@@ -1,3 +1,3 @@
ci_access:
projects:
- id: methods-hub/interactive-environment
- id: methods-hub/interactive-environment
54 changes: 54 additions & 0 deletions ansible/inventories/gesis-stage
@@ -0,0 +1,54 @@
[all]
#svko-ilcm04 ansible_host=194.95.75.14 ansible_ssh_user=ansible ansible_become_pass='{{ become_pass_194_95_75_14 }}'
; svko-css-backup-node ansible_host=194.95.75.20 ansible_ssh_user=ansible ansible_become_pass='{{ become_pass_194_95_75_20 }}'
svko-k8s-test01 ansible_host=194.95.75.21 ansible_ssh_user=ansible ansible_become_pass='{{ become_pass_194_95_75_21 }}'
svko-k8s-test02 ansible_host=194.95.75.22 ansible_ssh_user=ansible ansible_become_pass='{{ become_pass_194_95_75_22 }}'
svko-k8s-test03 ansible_host=194.95.75.23 ansible_ssh_user=ansible ansible_become_pass='{{ become_pass_194_95_75_23 }}'

[all:vars]
INVENTORY_NAME=stage
K8S_CONTROL_PLANE_ENDPOINT=194.95.75.21
K8S_CONTROL_PLANE_ALIAS=svko-k8s-test01
; Replace this variable with a filter
; This must match the group ingress
K8S_INGRESS=194.95.75.22

[notebooks_gesis_org]
; svko-css-backup-node
svko-k8s-test02

[kubernetes_control_panel]
svko-k8s-test01

[kubernetes_control_panel:vars]
GRAFANA_CAPACITY_STORAGE=2Gi
JUPYTERHUB_CAPACITY_STORAGE=2Gi
PROMETHEUS_CAPACITY_STORAGE=15Gi

[kubernetes_workers]
#svko-ilcm04
; svko-css-backup-node
svko-k8s-test02
svko-k8s-test03

[ingress]
; svko-css-backup-node
svko-k8s-test02

[harbor]
; svko-css-backup-node

[binderhub]
svko-k8s-test02

[jupyterhub]
svko-k8s-test02

[jupyterhub_single_user]
svko-k8s-test03

[prometheus]
; svko-css-backup-node

[grafana]
; svko-css-backup-node
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
"""Kill pods in Kubernetes cluster after timeout"""

import argparse
import logging
import datetime
import logging

from kubernetes import client, config

@@ -48,9 +48,7 @@ def kill_pod(pod):
api_response = v1.delete_namespaced_pod(pod.metadata.name, NAMESPACE)
logger.info("Pod %s deleted.", api_response.metadata.name)
except client.exceptions.ApiException as exception:
logger.info(
"Fail to delete pod %s due %s", pod.metadata.name, exception
)
logger.info("Fail to delete pod %s due %s", pod.metadata.name, exception)


def kill_timed_out_pods():
Original file line number Diff line number Diff line change
@@ -34,9 +34,7 @@ def kill_pod(pod):
api_response = v1.delete_namespaced_pod(pod.metadata.name, NAMESPACE)
logger.info("Pod %s deleted.", api_response.metadata.name)
except client.exceptions.ApiException as exception:
logger.info(
"Fail to delete pod %s due %s", pod.metadata.name, exception
)
logger.info("Fail to delete pod %s due %s", pod.metadata.name, exception)


def kill_succeeded_pods():
Original file line number Diff line number Diff line change
@@ -5,10 +5,9 @@
import logging
import os

from kubernetes import client, config, watch

from invoke import Responder
from fabric import Connection
from invoke import Responder
from kubernetes import client, config, watch

logging.basicConfig(
format="%(asctime)s %(levelname)-8s | %(message)s", datefmt="%Y-%m-%d %H:%M:%S"
@@ -99,7 +98,6 @@ def monitor_cluster():
logger.info(
"Fail to delete pod %s due %s", pod_name, exception
)

elif event["object"].type == "Normal":
logger.debug(
"Found Normal event in %s ... skipping!",
3 changes: 3 additions & 0 deletions ansible/roles/k8s-control-panel/tasks/main.yml
@@ -164,3 +164,6 @@
name: "remove timeout"
job: "python3 /home/ansible/bin/kill-after-timeout-pods.py --verbose >> /home/ansible/kill-after-timeout-pods.log 2>&1"
minute: "*/5"
- name: Add MetalLB to Kubernetes cluster
ansible.builtin.import_tasks:
file: metallb.yml
50 changes: 50 additions & 0 deletions ansible/roles/k8s-control-panel/tasks/metallb.yml
@@ -0,0 +1,50 @@
- name: Add a MetalLB Helm repository
kubernetes.core.helm_repository:
repo_name: metallb
repo_url: https://metallb.github.io/metallb
- name: Create MetalLB Kubernetes namespace
kubernetes.core.k8s:
state: present
definition:
apiVersion: v1
kind: Namespace
metadata:
name: metallb
labels:
# Required labels
# https://metallb.universe.tf/installation/#installation-with-helm
pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/audit: privileged
pod-security.kubernetes.io/warn: privileged
- name: Deploy MetalLB
kubernetes.core.helm:
release_name: metallb
release_namespace: metallb
chart_ref: metallb/metallb
create_namespace: false
history_max: 3
- name: Create MetalLB Kubernetes IP Address Pool
kubernetes.core.k8s:
state: present
definition:
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
name: "{{ k8s_control_panel_metallb_ip_address_pool_name }}"
namespace: metallb
spec:
addresses:
# TODO Use Jinja filter to automate this.
- "{{ K8S_INGRESS }}-{{ K8S_INGRESS }}"
- name: Configure L2 Advertisement for MetalLB
kubernetes.core.k8s:
state: present
definition:
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
name: "{{ k8s_control_panel_metallb_ip_address_pool_name }}-l2-advertisement"
namespace: metallb
spec:
ipAddressPools:
- "{{ k8s_control_panel_metallb_ip_address_pool_name }}"
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# This section includes base Calico installation configuration.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
name: default
spec:
# Configures Calico networking.
calicoNetwork:
ipPools:
- name: default-ipv4-ippool
blockSize: 26
cidr: '{{ k8s_control_panel_cidr }}'
encapsulation: VXLANCrossSubnet
natOutgoing: Enabled
nodeSelector: all()

---

# This section configures the Calico API server.
# For more information, see: https://docs.tigera.io/calico/latest/reference/installation/api#operator.tigera.io/v1.APIServer
apiVersion: operator.tigera.io/v1
kind: APIServer
metadata:
name: default
spec: {}
2 changes: 2 additions & 0 deletions ansible/roles/k8s-control-panel/vars/main.yml
@@ -1 +1,3 @@
k8s_control_panel_calico_version: "3.28.2"
k8s_control_panel_cidr: "10.244.0.0/16"
k8s_control_panel_metallb_ip_address_pool_name: "gesis"
22 changes: 22 additions & 0 deletions ansible/vault/gesis-stage.yml
@@ -0,0 +1,22 @@
$ANSIBLE_VAULT;1.1;AES256
65666231316164316637653330376337383937373938613334343066376139326661643962376237
3739366536353237356539656138383164326139333139390a333134313565323232646639313162
61656433306461343266393566626465316239353933303136633034343231666337363838623563
6633633234626132390a333632353730353066326438623663383634343532333539366363333334
34646163313065393732306363353231633239313637646339623032626366626436346234376130
66636432383138383838616434303931316334386665303563376336623930356638666366333561
66353830353361343335623737653130383862353638393336303866323738303865623934303830
66663164353837626636653766646233666164393564396233656665646636643862643035383733
65376535346438623032316666333265643135653035373139626232646430623733383134656533
34323737613565663536643430613832636666653030383066316632336363323734326339376162
39343665393661623530303236353165656130396137373634363265346362623832653563613338
31313261646333656362636134306162666133373334653933366531643063643537353663353932
39386538626664393536363035646265643832303961323636653037356433346266353963666164
32653334653936633130316463303061343938363630376663613639636338343331353732363837
37616137373834333836393137333131643432653239313432623462616537353337303432393736
34333463636566373330346437653037313366633762623161616564376639376561333561366530
37356235373336303563373137393263626532356333666166396435346565333964316263393665
32636239396563326635363636396435623731613364376632336261643064336530616235386631
37336230323331323838326331303831616337363833616563306131393733666663303836636366
38656336373763353836643536376239316463353862323332626661346366636236613530366464
36363832656263633161303335613332396237353865643964626462653565386562
2 changes: 1 addition & 1 deletion config/gesis-stage.yaml
@@ -142,7 +142,7 @@ ingress-nginx:
enabled: true
service:
externalTrafficPolicy: null
type: ClusterIP
type: LoadBalancer
prometheus:
enabled: true
server:
3 changes: 3 additions & 0 deletions docs/source/deployment/gesis-diagram.svg
4 changes: 4 additions & 0 deletions docs/source/deployment/gesis-load-balancer.drawio.svg
40 changes: 40 additions & 0 deletions docs/source/deployment/gesis.md
@@ -0,0 +1,40 @@
# How to deploy a change to notebooks.gesis.org?

[GESIS Leibniz Institute for the Social Sciences](https://www.gesis.org) is a member of the [mybinder.org federation](https://mybinder.readthedocs.io/en/latest/about/status.html). GESIS runs its mybinder.org deployment on on-premises servers. These servers need a separate deployment process because SSH access to them must be tunnelled through a VPN.

<!--
sequenceDiagram
actor developer as Developer
participant git as GitHub
participant github-actions as GitHub Actions
participant gesis-gitlab as GESIS GitLab
participant gcp as Google Cloud
participant gesis-notebooks as notebooks.gesis.org
developer->>developer: git commit
developer->>git: git push
git->>github-actions: trigger
github-actions->>github-actions: validation
github-actions->>gcp: helm upgrade
git->>gesis-gitlab: trigger
gesis-gitlab->>gesis-gitlab: validation
gesis-gitlab->>gesis-notebooks: helm upgrade
-->

![Sequence diagram illustrating the deployment.](./gesis-diagram.svg)

## GESIS GitLab CI/CD Server

The GESIS GitLab server runs [GitLab Community Edition v16.11.6](https://gitlab.com/gitlab-org/gitlab-foss/-/tags/v16.11.6) with [continuous integration (CI) and continuous delivery (CD)](https://about.gitlab.com/topics/ci-cd/) enabled.

The CI/CD jobs are defined in [`.gitlab-ci.yml`](https://github.com/jupyterhub/mybinder.org-deploy/tree/main/.gitlab-ci.yml).
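
The deploy jobs share three rule sets (`.gesis-manual-web`, `.gesis-merge-request` and `.gesis-push-main`). The sketch below is a toy model of how those rules might decide whether a job runs; it is not GitLab's rule evaluator. The function name `job_mode`, the return values and the prefix-based path matching are assumptions made for illustration only.

```python
"""Toy model of the shared GESIS `rules:` (illustrative, not GitLab's evaluator)."""

# Assumption: directories watched by the second merge-request rule,
# matched here by simple prefix instead of GitLab's glob semantics.
WATCHED_PREFIXES = ("ansible/", "mybinder/", "config/", "secrets/")
# Assumption: the CI definition file named in this document.
CI_FILE = ".gitlab-ci.yml"


def job_mode(host, source, branch="", changed=()):
    """Return "manual", "auto" or "skip" for one pipeline."""
    # Every rule requires the pipeline to run on the GESIS GitLab server.
    if host != "git.gesis.org":
        return "skip"
    # .gesis-manual-web: pipelines started from the web UI wait for a click.
    if source == "web":
        return "manual"
    if source == "merge_request_event":
        # .gesis-merge-request, first rule: a change to the CI file is manual.
        if CI_FILE in changed:
            return "manual"
        # .gesis-merge-request, second rule: changes in watched directories
        # run without manual confirmation.
        if any(path.startswith(WATCHED_PREFIXES) for path in changed):
            return "auto"
        return "skip"
    # .gesis-push-main: pushes to the main branch run automatically.
    if source == "push" and branch == "main":
        return "auto"
    return "skip"


print(job_mode("git.gesis.org", "push", branch="main"))  # auto
```

The order of the checks mirrors the order in which the jobs reference the rule sets, on the understanding that the first matching rule decides the outcome.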

## Kubernetes on bare metal

Cloud environments provide a load balancer for their Kubernetes clusters. Kubernetes itself does not include a load balancer implementation for clusters running on bare metal, so the deployment of mybinder.org to the GESIS servers must also configure one. We use [MetalLB](https://metallb.universe.tf/) together with the [Ingress NGINX Controller](https://kubernetes.github.io/ingress-nginx/).

![Sequence diagram illustrating the load balancer.](./gesis-load-balancer.drawio.svg)
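
The MetalLB address pool for this deployment contains a single address, written as a range that starts and ends at the ingress IP (`K8S_INGRESS` in the Ansible inventory). A minimal sketch of building and validating such a range follows; the helper name `single_address_pool` is hypothetical and is not part of the Ansible role.

```python
import ipaddress


def single_address_pool(ingress_ip: str) -> str:
    """Build a "<ip>-<ip>" range that contains exactly one address."""
    # ip_address() raises ValueError for malformed input, so a typo in the
    # inventory is caught before the range reaches the cluster.
    address = ipaddress.ip_address(ingress_ip)
    return f"{address}-{address}"


print(single_address_pool("194.95.75.22"))  # 194.95.75.22-194.95.75.22
```

Validating the address first is a design choice of this sketch: a malformed value fails early instead of producing an address pool that MetalLB would reject.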

## Virtual Private Server configuration with Ansible

We use [Ansible](https://www.ansible.com/) to automate the configuration of the virtual private server (VPS) provided by GESIS. After a successful configuration, we have an operational Kubernetes cluster on which to deploy mybinder.org.
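
The Ansible inventory assigns each server to groups such as `kubernetes_control_panel` and `kubernetes_workers`, and those groups determine which roles a server receives. The sketch below lists hosts per group from an INI-style inventory. It is a simplified reader written for illustration, not Ansible's own parser: the function name `hosts_by_group` and the `SAMPLE` excerpt are assumptions, and sections whose name contains a colon (for example `:vars`) are skipped.

```python
# Assumption: a shortened excerpt in the style of the stage inventory.
SAMPLE = """\
[kubernetes_control_panel]
svko-k8s-test01

[kubernetes_workers]
; svko-css-backup-node
svko-k8s-test02
svko-k8s-test03

[kubernetes_control_panel:vars]
GRAFANA_CAPACITY_STORAGE=2Gi
"""


def hosts_by_group(text):
    """Map each inventory group to the host names listed under it."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and lines commented out with ";" or "#".
        if not line or line[0] in ";#":
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            # Sections such as "group:vars" hold variables, not hosts.
            current = None if ":" in name else groups.setdefault(name, [])
            continue
        if current is not None:
            # The host name is the first token; host variables follow it.
            current.append(line.split()[0])
    return groups


print(hosts_by_group(SAMPLE)["kubernetes_workers"])
```

For real use, `ansible-inventory --list` gives the authoritative view of an inventory.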
1 change: 1 addition & 0 deletions docs/source/deployment/index.rst
@@ -8,3 +8,4 @@ Deployment and Operation
prereqs
how
what
gesis
