Update cluster-resources.md #713

Merged
merged 1 commit into karmada-io:main on Oct 17, 2024

Conversation

LavredisG
Contributor

/kind documentation

Some parts were translated from the original Chinese text to be more accurate and
easier to understand for the reader.

@karmada-bot karmada-bot added the kind/documentation Categorizes issue or PR as related to documentation. label Oct 14, 2024
@karmada-bot karmada-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 14, 2024
Member

@RainbowMango RainbowMango left a comment


/assign

@LavredisG
Contributor Author

LavredisG commented Oct 15, 2024

/assign

  1. Also, on the page's last example, the Pod requires 3C, 20Gi and it is stated that every node from Grade 2 and above meets this requirement. However, if a node has 2C and 16Gi available, it will be classified as Grade 2 without being able to provide the resources the Pod needs. Is that right, or am I misunderstanding something? Similarly, for the request of 3C, 60Gi, Grade 3 nodes with less than 60Gi available can't satisfy the request, as per my understanding.

  2. In addition, the docs state that "when a Pod needs to be scheduled to a specific cluster, they (scheduler) will compare the number of nodes in the model that satisfy the resource request of the Pod in different clusters, and schedule the Pod to the cluster with more nodes". But when we look at the numerical example, the choice is made by picking the cluster that can accommodate the most instances of that Pod (hence member3 is chosen in the first example); if the choice were based on the number of nodes able to accommodate the Pod, it would have to be member2, which has 8 nodes, instead of member3 with 1 node (assuming that all Grade 2 nodes in member2 can meet the Pod's requirements, as noted above).


Therefore, we introduce a `CustomizedClusterResourceModeling` for each cluster that records the resource portrait of each node. Karmada will collect node and pod information for each cluster. After calculation, this node will be divided into the appropriate resource model configured by the users.
Therefore, we introduce a `CustomizedClusterResourceModeling` for each cluster that records the resource profile of each node.
Karmada collects node and pod information from each cluster and computes the appropriate -user configured- resource model to categorize the node into.
Member


Suggested change
Karmada collects node and pod information from each cluster and computes the appropriate -user configured- resource model to categorize the node into.
Karmada collects node and pod information from each cluster and computes the appropriate `user configured` resource model to categorize the node info.

Is it a typo here?

Contributor Author


It was meant to say that a node is being categorized into a resource model.

Member


Oh. I get it. Thanks.

@RainbowMango
Member

However, if a node has 2C and 16Gi available, it will be classified as Grade 2 without being able to provide the resources needed to the pod. Is that right or I am misunderstanding something?

Yeah, you are right! Nice finding!
It seems grade 2 doesn't fulfill the request and shouldn't be counted. And I checked the code: a node is counted only when the minimum value > requested value. See the code here.

I guess the documentation might be incorrect. @chaosi-zju can you help to confirm that?
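To make that rule concrete, here is a minimal, hypothetical Python sketch of the counting logic described above (the grade bounds, resource figures, and function names are illustrative, not Karmada's actual code):

```python
def grade_satisfies(grade_min: dict, request: dict) -> bool:
    """A grade is counted only if its minimum value exceeds the request
    for every requested resource (minimum value > requested value)."""
    return all(grade_min.get(res, 0) > req for res, req in request.items())

# Illustrative grade lower bounds (min CPU cores, min memory in Gi).
grades = {
    1: {"cpu": 1, "memory_gi": 4},
    2: {"cpu": 2, "memory_gi": 16},
    3: {"cpu": 4, "memory_gi": 32},
}

request = {"cpu": 3, "memory_gi": 20}

# Grade 2 is excluded even though some of its nodes might fit the Pod,
# because its minimum (2C, 16Gi) does not guarantee 3C/20Gi.
countable = [g for g, mins in grades.items() if grade_satisfies(mins, request)]
```

Under these made-up numbers only Grade 3 is counted, which matches the "minimum value > requested value" check discussed here.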

@RainbowMango
Member

In addition, the docs state that "when a Pod needs to be scheduled to a specific cluster, they (scheduler) will compare the number of nodes in the model that satisfy the resource request of the Pod in different clusters, and schedule the Pod to the cluster with more nodes". But when we look at the numerical example, the choice is made by picking the cluster that can accommodate the most instances of that Pod (hence member3 is chosen in the first example); if the choice were based on the number of nodes able to accommodate the Pod, it would have to be member2, which has 8 nodes, instead of member3 with 1 node (assuming that all Grade 2 nodes in member2 can meet the Pod's requirements, as noted above).

Yes, you are right. This is not just about checking the number of nodes, but calculating which cluster can actually accommodate the most Pods.
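As a rough illustration of that distinction, the following sketch (with made-up node sizes; this is not Karmada's estimator code) shows how a cluster with fewer but larger nodes can win on replica capacity:

```python
def max_replicas(nodes, req_cpu, req_mem):
    """Total replicas a cluster can host: per-node capacity, summed."""
    return sum(min(cpu // req_cpu, mem // req_mem) for cpu, mem in nodes)

# Hypothetical figures: member2 has 8 small nodes, member3 one large node.
member2 = [(4, 24)] * 8   # each node fits one 3C/20Gi replica -> 8 total
member3 = [(32, 200)]     # a single node fits ten 3C/20Gi replicas

req_cpu, req_mem = 3, 20
clusters = {"member2": member2, "member3": member3}
best = max(clusters, key=lambda c: max_replicas(clusters[c], req_cpu, req_mem))
# member3 wins on replica capacity (10 vs 8) despite having fewer nodes.
```

Counting qualifying nodes would pick member2 (8 vs 1); counting accommodatable replicas picks member3, matching the numerical example.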

@RainbowMango
Member

@LavredisG I think we can fix the incorrect docs after @chaosi-zju confirms it by another PR; let this PR focus on grammar issues.

@LavredisG
Contributor Author

In addition, the docs state that "when a Pod needs to be scheduled to a specific cluster, they (scheduler) will compare the number of nodes in the model that satisfy the resource request of the Pod in different clusters, and schedule the Pod to the cluster with more nodes". But when we look at the numerical example, the choice is made by picking the cluster that can accommodate the most instances of that Pod (hence member3 is chosen in the first example); if the choice were based on the number of nodes able to accommodate the Pod, it would have to be member2, which has 8 nodes, instead of member3 with 1 node (assuming that all Grade 2 nodes in member2 can meet the Pod's requirements, as noted above).

Yes, you are right. This is not just about checking the number of nodes, but calculating which cluster can actually accommodate the most Pods.

Thank you for the clarifications!

@LavredisG
Contributor Author

However, if a node has 2C and 16Gi available, it will be classified as Grade 2 without being able to provide the resources needed to the pod. Is that right or I am misunderstanding something?

Yeah, you are right! Nice finding! It seems grade 2 doesn't fulfill the request and shouldn't be counted. And I checked the code: a node is counted only when the minimum value > requested value. See the code here.

I guess the documentation might be incorrect. @chaosi-zju can you help to confirm that?

Oh ok, so even if some nodes from Grade 2 can accommodate the pod (those with more than 3C and 20Gi), we don't count them in this case because they have to meet the demands with their Min values, is that correct?

@RainbowMango
Member

Oh ok, so even if some nodes from Grade 2 can accommodate the pod (those with more than 3C and 20Gi), we don't count them in this case because they have to meet the demands with their Min values, is that correct?

Yes, nodes from Grade 2 would be ignored as we are not sure all nodes in this grade fulfill the request.

@chaosi-zju
Member

Yeah, you are right! Nice finding! It seems grade 2 doesn't fulfill the request and shouldn't be counted. And I checked the code: a node is counted only when the minimum value > requested value. See the code here.

I guess the documentation might be incorrect. @chaosi-zju can you help to confirm that?

Yes, you are right: the minimum value should be greater than the requested value.

I did the corresponding test; the scheduler log is below (PS: in this test, a resource requests 3 replicas and is propagated to member1/member2/member3 with the dynamic Divided strategy):

I1017 03:46:02.866387       1 general.go:77] cluster member1 has max available replicas: 0 according to cluster resource models
I1017 03:46:02.866420       1 general.go:77] cluster member2 has max available replicas: 0 according to cluster resource models
I1017 03:46:02.866432       1 general.go:77] cluster member3 has max available replicas: 0 according to cluster resource models
I1017 03:46:02.866441       1 util.go:82] Invoked MaxAvailableReplicas of estimator general-estimator for workload(apps/v1, kind=Deployment, default/nginx): [{member1 0} {member2 0} {member3 0}]
I1017 03:46:02.866457       1 util.go:102] Target cluster calculated by estimators (available cluster && maxAvailableReplicas): [{member1 0} {member2 0} {member3 0}]
I1017 03:46:02.866469       1 select_clusters.go:35] Select all clusters
I1017 03:46:02.869810       1 generic_scheduler.go:101] Selected clusters: [member1 member2 member3]
I1017 03:46:02.869845       1 generic_scheduler.go:107] Assigned Replicas: [{member1 1} {member2 1} {member3 1}]
I1017 03:46:02.869860       1 scheduler.go:524] ResourceBinding(default/nginx-deployment) scheduled to clusters [{member1 1} {member2 1} {member3 1}]
I1017 03:46:02.870145       1 scheduler.go:530] "End scheduling resource binding with ClusterAffinity" resourceBinding="default/nginx-deployment"
I1017 03:46:02.870166       1 scheduler.go:821] Begin to patch status condition to ResourceBinding(default/nginx-deployment)
I1017 03:46:02.870453       1 event.go:389] "Event occurred" object="default/nginx-deployment" fieldPath="" kind="ResourceBinding" apiVersion="work.karmada.io/v1alpha2" type="Normal" reason="ScheduleBindingSucceed" message="Binding has been scheduled successfully. Result: {member1:1, member2:1, member3:1}"
I1017 03:46:02.873545       1 event.go:389] "Event occurred" object="default/nginx" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScheduleBindingSucceed" message="Binding has been scheduled successfully. Result: {member1:1, member2:1, member3:1}"

Pay attention to "cluster member1 has max available replicas: 0 according to cluster resource models": the final 3 replicas are divided equally among the 3 clusters.
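The even split seen in that log can be illustrated with a small, simplified sketch (this mimics the observed outcome, not the scheduler's actual weighting code):

```python
def divide_evenly(replicas: int, clusters: list) -> dict:
    """Split replicas as evenly as possible across clusters, giving the
    first `replicas % len(clusters)` clusters one extra replica each."""
    base, extra = divmod(replicas, len(clusters))
    return {c: base + (1 if i < extra else 0) for i, c in enumerate(clusters)}

result = divide_evenly(3, ["member1", "member2", "member3"])
# Matches the log's outcome: {member1: 1, member2: 1, member3: 1}
```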

@chaosi-zju
Member

I think we can fix the incorrect docs after @chaosi-zju confirms it by another PR

So, are there any lines in the previous doc that should be modified?

@LavredisG
Contributor Author

I think we can fix the incorrect docs after @chaosi-zju confirms it by another PR

So, are there any lines in the previous doc that should be modified?

I'd say both the numerical examples and the part about choosing the cluster with the most nodes satisfying the request should be modified ("the cluster that can accommodate the most pod replicas" would be the correct wording).

Member

@RainbowMango RainbowMango left a comment


/lgtm
/approve

As discussed above, we can have the following PR to correct the mistakes.
@LavredisG Let me know if you'd like to help with that.

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2024
@karmada-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango


@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 17, 2024
@karmada-bot karmada-bot merged commit 7d50d0a into karmada-io:main Oct 17, 2024
7 checks passed
@RainbowMango
Member

Opened #715 to track this. The CN doc will be updated after the EN doc.

@RainbowMango
Member

Just a short update:
After further investigation, I think it is a bug and sent a PR karmada-io/karmada#5706.
