fix: pod crashed when creating init pod, new pod always fail because the init pod already exists #226
base: develop
Conversation
Could you please DCO sign the commits?
```diff
-	iPod, err := p.launchPod(ctx, config)
-	if err != nil {
+	_, err := p.launchPod(ctx, config)
+	if err != nil && !k8serror.IsAlreadyExists(err) {
```
Do we need to check if the config is compatible with the existing pod?
I'm not sure we need to check the config.
But if we do, I have two questions:
1. When do we need to check? Only when "k8serror.IsAlreadyExists" happens?
2. What kinds of config should be checked? Compare the pod commands?
I would probably say we need to compare the entire pod spec? I wonder if the error code here already has such information?
CC @niladrih
After some research, I find that it's difficult to compare the entire pod spec between the helperPod and the existingPod, because Kubernetes adds lots of default fields to the existing pod.
For example, the two pods' volumes would be as follows; the Type field is modified:

```
helperPod.Spec.Volumes[i]: {data {&HostPathVolumeSource{Path:/opt/storage,Type:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}
existingPod.Spec.Volumes[i]: {data {&HostPathVolumeSource{Path:/opt/storage,Type:*,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}
```

By the way, do we really need to check the compatibility? Or, do we need to compare the entire pod spec? I didn't find a good way to check it. Are there any ideas?
@tiagolobocastro
CC @niladrih
This could work, I think:
Add labels containing the PVC name and namespace to the init pod and the cleanup pod, e.g.

```
openebs.io/pvc-name: <pvc-name>
openebs.io/pvc-namespace: <pvc-namespace>
```

During `launchPod()`, check to see if there exists a pod with the LabelSelector `openebs.io/pvc-name=<pvc-name>,openebs.io/pvc-namespace=<pvc-namespace>`.
WDYT @styshoo @tiagolobocastro?
I'm fine with that.
And I've modified the code, please help me review it.
In order to prevent confusion between the init and cleanup pods, do we need to add an extra label? Such as:

`openebs.io/helper-pod-name: <pvc-name>`

or

`openebs.io/helper-pod-type: "init"`
Sounds like a good idea to me. How about calling it `openebs.io/helper-type: hostpath-init`, in case the helper shows up in other openebs projects later.
```go
if podList.Items != nil && len(podList.Items) != 0 {
	return &podList.Items[0], nil
```
Might be a good idea to `klog.Info` the slice of Pods.
hmm what should the log say?
I've modified it to this:
```go
if podList.Items != nil && len(podList.Items) != 0 {
	klog.Infof("existing helper podList length: %d", len(podList.Items))
	klog.Infof("existing helper pod: %v", podList.Items[0])
	return &podList.Items[0], nil
}
```
However, it would print a lot.
Yeah, I don't think we need to do it.
So, what should we do next? Remove the log code, or change it from "Info" to "debug"?
We could put it behind a higher level of verbosity, e.g. `klog.V(2).Info()`. Also, maybe all that is required is the name and namespace of the pod (or maybe not even that); we need not `%v`-print the whole Pod object, I feel.
The reason for this log is to be able to debug contention introduced by more than one init container, by reviewing the logs of the provisioner. Maybe even just printing the number of Pods found at level 2 verbosity is good enough.
I've logged the pod count and the pod namespace/name; I think level 2 verbosity won't print too much.

```go
if podList.Items != nil && len(podList.Items) != 0 {
	klog.V(2).Infof("existing helper podList length: %d", len(podList.Items))
	klog.V(2).Infof("existing helper pod: %s/%s", podList.Items[0].Namespace, podList.Items[0].Name)
	return &podList.Items[0], nil
}
```
```diff
@@ -2,6 +2,7 @@ package app

 import (
 	"context"
+	"k8s.io/apimachinery/pkg/labels"
```
This import order seems off. Please run `go fmt cmd/provisioner-localpv/app/helper_hostpath.go`.
I have modified the code, please review it for me again. Thanks.
Codecov Report: all modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##           develop     #226      +/-  ##
===========================================
- Coverage    37.91%   37.35%    -0.57%
===========================================
  Files           36        1       -35
  Lines         3373      779     -2594
===========================================
- Hits          1279      291      -988
+ Misses        2012      479     -1533
+ Partials        82        9       -73
===========================================
```
Signed-off-by: styshoo <[email protected]>
```go
matchLabels := map[string]string{
	"openebs.io/pvc-name":      config.pOpts.name,
	"openebs.io/pvc-namespace": p.namespace,
	"openebs.io/helper-type":   "hostpath-" + config.podName,
}
```
Thinking about this further, this may be problematic for upgrades, if we crash between createPod and exitPod.
Since we haven't changed any of the pod definition in this commit, I think it's probably fine to keep it as you had it initially; we're not making changes to the Pod spec in this PR itself, so we can take this up when needed again.
Otherwise I'd suggest using Apply instead of Create, which should succeed when the pod already exists and is compatible: https://github.com/kubernetes/client-go/blob/v0.27.2/dynamic/interface.go#L43 ?
could you help me to understand, is it because
Pull Request template
Why is this PR required? What issue does it fix?:
The dynamic-localpv-provisioner pod crashed (for example, because renewing its lease failed) while creating the init pod. A new provisioner pod starts and continues creating the volume, but it always fails because the init pod already exists, so the PVC stays in Pending status.
What this PR does?:
When creating init pods, it ignores the "pod already exists" error.

Does this PR require any upgrade changes?:
No.
If the changes in this PR are manually verified, list down the scenarios covered::
Any additional information for your reviewer? :
Mention if this PR is part of any design or a continuation of previous PRs
Checklist:
<type>(<scope>): <subject>