Are there any plans for WorkloadRebalancer to support resourceSelectors, similar to what is supported in PropagationPolicy? #5527
Hi @weidalin, thank you for your interest in this feature. Actually, this feature was originally designed to support batch selection of resources. Simply put, there was a lack of user demand at that time. If more real users call for it, we can reconsider it.
I'm thinking of moving forward with the plan you mentioned; could you provide me with some more information?
Hi @chaosi-zju, thank you very much for your reply. Additionally, when you mentioned "we were worried that batch rescheduling was a dangerous operation," could you clarify what kind of scenarios you were referring to?
Sorry, not yet.
The scene you provided is very interesting. I think we will decide as soon as possible whether to support this ability or not.
I mean the replicas distribution changes dramatically. We previously mainly supported Deployments, which are a bit different from inference/training tasks. In that case, the user does not want pods that are running fine to undergo major changes; however, ...
Hi @weidalin @so2bin, thanks for the feedback and input. I think we can iterate on the WorkloadRebalancer based on your scenario. We might need to ask a few more questions to better understand your use case. As @so2bin mentioned above, you decided to use CPP with the DynamicWeight strategy; can you share a copy of the CPP you are using? Is there a load balancer across clusters in your case? Since you are re-scheduling inference workloads, how do you handle the traffic across the replicas in the two clusters?
@chaosi-zju I hope the following may help you understand our scenario, thanks.
Background
Current online design
```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: starcode-triton-server-v3
  namespace: ai-app
spec:
  host: starcode-triton-server-v3.ai-app.svc.cluster.local
  trafficPolicy:
    connectionPool:
      ...
    loadBalancer:
      localityLbSetting:
        distribute:
          - from: <Cluster-Name-A>/*
            to:
              <Cluster-Name-A>/*: 80
              <Cluster-Name-E>/*: 20
              ...
          - from: <Cluster-Name-E>/*
            to:
              <Cluster-Name-A>/*: 80
              <Cluster-Name-E>/*: 20
              ...
          - ... # more clusters can be added here
    outlierDetection:
      ...
```
```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  labels:
    ...
  name: team-1005-app-0-cpp-default-day-time
spec:
  conflictResolution: Abort
  placement:
    #...
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        staticWeightList:
          - targetCluster:
              clusterNames:
                - <Cluster-Name-A>
            weight: 80
          - targetCluster:
              clusterNames:
                - <Cluster-Name-E>
            weight: 20
          - ... # more clusters
  preemption: Never
  priority: 30
  resourceSelectors:
    - apiVersion: v1
      kind: ConfigMap # we use cm but CRD
      labelSelector:
        matchLabels:
          atms-app/appid: "10471"
          atms-app/teamid: "1005"
          ...
```
Next version design
```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: ClusterPropagationPolicy
metadata:
  labels:
    ...
  name: team-1005-app-0-cpp-default-day-time
spec:
  conflictResolution: Abort
  placement:
    #...
    replicaScheduling:
      replicaDivisionPreference: Weighted
      replicaSchedulingType: Divided
      weightPreference:
        # Here we do not use `AvailableReplicas` but extend a new DynamicWeightFactor for the custom-estimator use case.
        # In the future, we may push a PR to explain it.
        dynamicWeight: EstimatorCalculatedReplicas
  preemption: Never
  priority: 30
  resourceSelectors:
    - apiVersion: v1
      kind: ConfigMap # we use cm but CRD
      labelSelector:
        matchLabels:
          atms-app/appid: "10471"
          atms-app/teamid: "1005"
          ...
```
That is due to the
Changing the CPP also involves re-scheduling, so you don't need ...
Hi @so2bin @weidalin, thank you very much for the valuable practice you shared above. I am amazed at the depth of your exploration and the clear and detailed reply 👍. I think I understand why you have this appeal. But I'm interested in one question: as you said above, one sentence is "each team use a CPP to distribute the pods to worker clusters", and another is "the CPPs are managed by the platform manager", so I wonder who exactly is in charge of this CPP. Do you mean the CPP is declared by the user, but the operation of ...
It seems they not only want to take available resources into account, but also want to use the custom estimator to get more accurate results for their scenarios.
@RainbowMango karmada/pkg/scheduler/helper.go Line 51 in ba360c9
@chaosi-zju |
Yes, we need to take into account a fine-grained, team-level resource distribution from the custom-estimator server.
I guess each team probably has multiple applications on your platform. Just out of curiosity, do you manage one CPP per team or one CPP per application?
@RainbowMango |
Hi @so2bin, thank you for your explanation above~
A new question arises: as the platform manager, is there a concrete way to "take care of it and avoid this problem"? And if you don't know much about the specific team apps, how do you judge that the team's apps are not affected? If your method is common or inspiring for most other people, I think we don't need to be so concerned and can just start pushing forward the feature you mentioned.
After carefully reviewing [#4805 (comment)](https://github.com//issues/4805#issuecomment-2283607217), I believe the two main issues and optimizations regarding static weight are as described below. Is my understanding correct?
We also encounter these problems when using a static weighting strategy. I agree that making static weight take available resources into account (if any cluster lacks sufficient resources, the scheduling fails) is a good approach to avoid Pods Pending, but simply letting scheduling fail may not be a good approach. I would like to explain the impact of this change based on the actual usage scenarios of our AI applications. I hope it can help you evaluate this static weight change.
In this scenario, static weight optimization can help us avoid entering the Pending state, which is indeed an optimization method. But it cannot satisfy our need to switch the weight of the tidal cluster. In the tidal cluster weight switching scenario, it is more suitable for us to use dynamic weight and WorkloadRebalancer in combination. So we hope that WorkloadRebalancer can support LabelSelector.
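For comparison, with today's WorkloadRebalancer we would have to enumerate every workload of the team explicitly. Below is a rough sketch of what that looks like, as far as we understand the current API; the resource names are only illustrative:

```yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: team-1005-rebalancer
spec:
  # Today every workload must be listed one by one;
  # with many applications per team this list grows quickly.
  workloads:
    - apiVersion: apps/v1
      kind: Deployment
      name: starcode-triton-server-v3   # illustrative, taken from the example above
      namespace: ai-app
    - apiVersion: apps/v1
      kind: Deployment
      name: another-team-1005-app       # illustrative
      namespace: ai-app
```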
Therefore, I think it is simpler and more efficient to maintain the current static weight behavior (which does not take the available replicas in the cluster into account). I hope this feedback is helpful to you.
Thank you very much for the two scenarios you provided!
I don't understand this sentence well. Hi @whitewindmills, is this consistent with what you proposed?
I can't say for sure yet because it's pending now, but I'd like to share my opinion. Usually we are mainly concerned about whether the selected cluster can accommodate the replicas to be allocated. As for whether the final result is exactly in line with the proportions of the static weight setting, that is not so important; we are likely to make an approximate assignment.
I just put this issue into the Karmada backlog. I think we can discuss it at one of the community meetings to see how to move this forward. @weidalin @so2bin, I'm not sure if the time slot suits you well. Please find a time that works for you and add an agenda item to the Meeting Notes. (Note: by joining the Google group you will be able to edit the meeting notes.)
Hello, we have added an agenda to the Meeting Notes of the 2024-09-24 meeting. |
Please provide an in-depth description of the question you have:
The current WorkloadRebalancer (#4698) provides a great entry point for rescheduling workloads, allowing the use of .spec.workloads to specify the resources for rescheduling and supporting array-based scheduling. I would like to ask if there are any plans for WorkloadRebalancer to support resourceSelectors, similar to what is supported in PropagationPolicy?
For example:
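A minimal sketch of what this could look like; the resourceSelectors field below is hypothetical, borrowed from PropagationPolicy, and does not exist in the current WorkloadRebalancer API:

```yaml
apiVersion: apps.karmada.io/v1alpha1
kind: WorkloadRebalancer
metadata:
  name: team-1005-rebalancer
spec:
  # Hypothetical field mirroring PropagationPolicy's resourceSelectors:
  # select all Deployments of one team by label instead of listing each workload.
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      labelSelector:
        matchLabels:
          atms-app/teamid: "1005"
```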
This would allow me to reschedule resources for the same team based on labels, making workload rescheduling more efficient.
What do you think about this question?:
I believe this feature would make the WorkloadRebalancer even more flexible, allowing dynamic resource selection through label selectors, similar to what PropagationPolicy offers.
Environment:
Karmada version: v1.10.4
Kubernetes version: v1.25.6
@chaosi-zju