---
title: Using Workload Rebalancer to achieve a fresh Rescheduling
---

In general, once the replicas of a workload are scheduled, the scheduling result stays inert and the replica
distribution will not change. Even if rescheduling is triggered by modifying replicas or placement, the scheduler
maintains the existing replica distribution as closely as possible, only making minimal adjustments when necessary,
which minimizes disruptions and preserves the balance across clusters.

However, in some scenarios, users want a way to actively trigger a fresh rescheduling, which disregards the previous
assignment entirely and establishes an entirely new replica distribution across clusters.

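Karmada provides the `WorkloadRebalancer` API for this purpose. As a quick preview, a minimal manifest (using the
same resource names as the walkthrough later in this guide) looks roughly like this:

```yaml
# minimal preview of the WorkloadRebalancer API used in the Example section below
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: demo
spec:
  workloads:                  # resources whose replicas should be freshly rescheduled
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
```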
## Applicable Scenarios

### Scenario 1

In a cluster failover scenario, replicas are distributed across the member1 and member2 clusters; however, they would
all migrate to the member2 cluster if the member1 cluster failed.

As a cluster administrator, I hope the replicas are redistributed across both clusters when the member1 cluster
recovers, so that the resources of the member1 cluster are re-utilized and high availability is preserved.

### Scenario 2

In application-level failover, low-priority applications may be preempted, shrinking from multiple clusters to a
single cluster because cluster resources are in short supply
(refer to [Application-level Failover](https://karmada.io/docs/next/userguide/failover/application-failover#why-application-level-failover-is-required)).

As a user, I hope the replicas of low-priority applications can be redistributed to multiple clusters when cluster
resources become sufficient again, to ensure the high availability of the application.

### Scenario 3

With the `Aggregated` schedule type, replicas may still be distributed across multiple clusters due to resource
constraints.

As a user, I hope the replicas can be redistributed in an aggregated manner once a single cluster has sufficient
resources to accommodate all of them, so that the application better meets actual business requirements.

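For reference, the `Aggregated` strategy corresponds to the `replicaDivisionPreference` field in the policy's
placement; a sketch of the relevant fields (only the placement snippet is shown, the rest of the policy is omitted):

```yaml
# placement snippet: pack replicas into as few clusters as possible
placement:
  replicaScheduling:
    replicaSchedulingType: Divided
    replicaDivisionPreference: Aggregated
```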
### Scenario 4

In a disaster-recovery scenario, replicas are migrated from the primary cluster to the backup cluster when the
primary cluster fails.

As a cluster administrator, I hope the replicas can migrate back when the primary cluster is restored, so that:

1. the disaster-recovery mode is restored, ensuring the reliability and stability of the cluster federation.
2. the cost of the backup cluster is saved.

## Prerequisites

### Karmada has been installed

We can install Karmada by referring to the [quick-start](https://github.com/karmada-io/karmada#quick-start), or
directly run the `hack/local-up-karmada.sh` script, which is also used to run our E2E cases.

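For a local test environment, a typical sequence looks like the following sketch (it assumes Docker and kind are
available on your machine, as required by the script):

```bash
# clone the Karmada repo and bring up a local control plane with member clusters
git clone https://github.com/karmada-io/karmada.git
cd karmada
hack/local-up-karmada.sh
```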
## Example

### Step 1: create a Deployment and a ClusterRole

You should first prepare a Deployment named `demo-deploy-1` and a ClusterRole named `demo-role`, and propagate them
to the member clusters with a ClusterPropagationPolicy.

To achieve this, you can create a new file `/tmp/deployments-and-services.yaml` and copy the text below into it:

<details>
<summary>/tmp/deployments-and-services.yaml</summary>

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-deploy-1
  labels:
    app: test
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-deploy-1
  template:
    metadata:
      labels:
        app: demo-deploy-1
    spec:
      terminationGracePeriodSeconds: 0
      containers:
        - image: nginx
          name: demo-deploy-1
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: demo-role
rules:
  - apiGroups:
      - '*'
    resources:
      - '*'
    verbs:
      - '*'
---
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  name: default-pp
spec:
  placement:
    clusterTolerations:
      # tolerate the taint added in Step 2 for 0 seconds, so replicas are evicted promptly once it appears
      - effect: NoExecute
        key: workload-rebalancer-test
        operator: Exists
        tolerationSeconds: 0
    clusterAffinity:
      clusterNames:
        - member1
        - member2
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        dynamicWeight: AvailableReplicas
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      name: demo-role
```
</details>

Then run the following command to create these resources:

```bash
kubectl --context karmada-apiserver apply -f /tmp/deployments-and-services.yaml
```

You can check whether this step succeeded as follows:

```bash
$ kubectl --context karmada-apiserver get deploy demo-deploy-1
NAME            READY   UP-TO-DATE   AVAILABLE   AGE
demo-deploy-1   3/3     3            3           3m18s

$ kubectl --context member1 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-dv6xw   1/1     Running   0          3m18s
demo-deploy-1-784cd456bf-fgjn7   1/1     Running   0          3m18s

$ kubectl --context member2 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-856rf   1/1     Running   0          3m18s

$ kubectl --context karmada-apiserver get clusterrole demo-role
NAME        CREATED AT
demo-role   2024-05-22T11:10:29Z
```

Taking `deployment/demo-deploy-1` as an example, 2 replicas are propagated to the member1 cluster and 1 replica to
the member2 cluster.

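If you would rather check the scheduler's view of this distribution than count pods, you can also inspect the
corresponding ResourceBinding (a sketch; the binding name `demo-deploy-1-deployment` follows the naming visible in
the events shown in Step 5):

```bash
# print the per-cluster replica assignment recorded in the ResourceBinding
kubectl --context karmada-apiserver get resourcebinding demo-deploy-1-deployment -o jsonpath='{.spec.clusters}'
```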
### Step 2: add a `NoExecute` taint to the member1 cluster to mock cluster failover

* Run the following command to add a `NoExecute` taint to the member1 cluster:

  ```bash
  kubectl --context karmada-apiserver patch cluster member1 --type='json' -p '[{"op": "replace", "path": "/spec/taints", "value": [{"key": "workload-rebalancer-test", "effect": "NoExecute"}]}]'
  ```

  Rescheduling will then be triggered by the cluster failover, and all replicas will be propagated to the member2
  cluster, as you can see:

  ```bash
  $ kubectl --context member1 get po
  No resources found in default namespace.
  $ kubectl --context member2 get po
  NAME                             READY   STATUS    RESTARTS   AGE
  demo-deploy-1-784cd456bf-856rf   1/1     Running   0          5m27s
  demo-deploy-1-784cd456bf-b5977   1/1     Running   0          35s
  demo-deploy-1-784cd456bf-pqthv   1/1     Running   0          35s
  ```

* Run the following command to remove the above `NoExecute` taint from the member1 cluster:

  ```bash
  kubectl --context karmada-apiserver patch cluster member1 --type='json' -p '[{"op": "replace", "path": "/spec/taints", "value": []}]'
  ```

  Removing the taint will not change the replica distribution, because the scheduling result is inert: all replicas
  stay in the member2 cluster, as the check below confirms.

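For example, listing the pods in the member1 cluster again should still return nothing:

```bash
$ kubectl --context member1 get po
No resources found in default namespace.
```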
### Step 3: apply a WorkloadRebalancer to trigger rescheduling

Assuming you want to trigger a fresh rescheduling of the above resources, you can create a new file
`/tmp/workload-rebalancer.yaml` and copy the text below into it:

```yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: demo
spec:
  workloads:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      name: demo-role
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-2
      namespace: default
```

> Tip: `Deployment/demo-deploy-2` refers to a non-existent resource; it is included to demonstrate how failures are reported.

Then run the following command to apply it:

```bash
kubectl --context karmada-apiserver apply -f /tmp/workload-rebalancer.yaml
```

You will get a `workloadrebalancer.apps.karmada.io/demo created` result, which means the resource was created
successfully.

### Step 4: check the status of WorkloadRebalancer

Run the following command:

```bash
$ kubectl --context karmada-apiserver get workloadrebalancer demo -o yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  ...
  creationTimestamp: "2024-05-22T11:16:10Z"
  name: demo
  ...
spec:
  ...
status:
  finishTime: "2024-05-22T11:16:10Z"
  observedGeneration: 1
  observedWorkloads:
    - result: Successful
      workload:
        apiVersion: apps/v1
        kind: Deployment
        name: demo-deploy-1
        namespace: default
    - reason: ReferencedBindingNotFound
      result: Failed
      workload:
        apiVersion: apps/v1
        kind: Deployment
        name: demo-deploy-2
        namespace: default
    - result: Successful
      workload:
        apiVersion: rbac.authorization.k8s.io/v1
        kind: ClusterRole
        name: demo-role
```

Thus, you can observe the rescheduling result in the `status.observedWorkloads` field of `workloadrebalancer/demo`.
As you can see, `Deployment/demo-deploy-1` and `ClusterRole/demo-role` were rescheduled successfully, while the
non-existent resource `Deployment/demo-deploy-2` failed with the reason `ReferencedBindingNotFound`.

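If you only want a quick summary of which workloads failed and why, a jsonpath query such as the following sketch
narrows the output:

```bash
# list only the failed entries of status.observedWorkloads together with their reasons
kubectl --context karmada-apiserver get workloadrebalancer demo \
  -o jsonpath='{range .status.observedWorkloads[?(@.result=="Failed")]}{.workload.kind}/{.workload.name}: {.reason}{"\n"}{end}'
```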
### Step 5: observe the real effect of WorkloadRebalancer

Taking `deployment/demo-deploy-1` as an example, you can observe the actual replica propagation status:

```bash
$ kubectl --context member1 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-82kt6   1/1     Running   0          89s
demo-deploy-1-784cd456bf-k9fhl   1/1     Running   0          89s
$ kubectl --context member2 get po
NAME                             READY   STATUS    RESTARTS   AGE
demo-deploy-1-784cd456bf-856rf   1/1     Running   0          9m23s
```

As you can see, rescheduling happened: 2 replicas migrated back to the member1 cluster, while the 1 replica in the
member2 cluster stayed unchanged.

Besides, you can observe a scheduling event emitted by the `default-scheduler`, such as:

```bash
$ kubectl --context karmada-apiserver describe deployment demo-deploy-1
...
Events:
  Type    Reason                  Age                From                                Message
  ----    ------                  ----               ----                                -------
  ...
  Normal  ScheduleBindingSucceed  31s                default-scheduler                   Binding has been scheduled successfully. Result: {member2:2, member1:1}
  Normal  GetDependenciesSucceed  31s                dependencies-distributor            Get dependencies([]) succeed.
  Normal  SyncSucceed             31s                execution-controller                Successfully applied resource(default/demo-deploy-1) to cluster member1
  Normal  AggregateStatusSucceed  31s (x4 over 31s)  resource-binding-status-controller  Update resourceBinding(default/demo-deploy-1-deployment) with AggregatedStatus successfully.
  Normal  SyncSucceed             31s                execution-controller                Successfully applied resource(default/demo-deploy-1) to cluster member2
```
### Step 6: update and auto-clean WorkloadRebalancer

Assuming you want the WorkloadRebalancer resource to be automatically cleaned up in the future, you can simply edit
it and set the `spec.ttlSecondsAfterFinished` field to `300`, like this:

```yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: demo
spec:
  ttlSecondsAfterFinished: 300
  workloads:
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-1
      namespace: default
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      name: demo-role
    - apiVersion: apps/v1
      kind: Deployment
      name: demo-deploy-2
      namespace: default
```

After you apply this modification, the WorkloadRebalancer resource will be automatically deleted 300 seconds after
it finishes.

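Alternatively, if you prefer not to edit the full manifest, a merge patch like the following sketch achieves the
same result:

```bash
# set the TTL on the existing WorkloadRebalancer instead of re-applying the whole manifest
kubectl --context karmada-apiserver patch workloadrebalancer demo \
  --type merge -p '{"spec": {"ttlSecondsAfterFinished": 300}}'
```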