Similar type of events not getting generated when configmap is changed #1001

Open
saurabhwani5 opened this issue Aug 21, 2023 · 4 comments
Labels
Customer Impact: Minor (1) misleading msgs, operational oddities not affecting workload. Failure of non critical services Customer Probability: Medium (3) Issue occurs in normal path but specific limited timing window, or other mitigating factor Found In: 2.10.0 Severity: 3 Indicates the issue is on the priority list for next milestone. Type: Bug Indicates issue is an undesired behavior, usually caused by code error.

Comments

@saurabhwani5
Member

Describe the bug

Create and apply a configmap with wrong values; this generates a ValidationWarning event on the CSO. Then apply the correct configmap, which generates no ValidationWarning event because everything is valid. When the wrong configmap is applied again, no ValidationWarning event is generated, which is not expected.

How to Reproduce?

For this issue, I used the images from PR #1000.

  1. Install CSI 2.10.0 with the following images:
[root@saurabh6-master pr1000]# oc get pods
NAME                                                  READY   STATUS    RESTARTS       AGE
ibm-spectrum-scale-csi-79pp5                          3/3     Running   0              2m59s
ibm-spectrum-scale-csi-attacher-b6b6d4948-l8kmw       1/1     Running   2 (3m2s ago)   3m42s
ibm-spectrum-scale-csi-attacher-b6b6d4948-prgwj       1/1     Running   2 (3m2s ago)   3m42s
ibm-spectrum-scale-csi-operator-6877d5465c-szr95      1/1     Running   0              3m46s
ibm-spectrum-scale-csi-provisioner-b456fbb49-xxxkt    1/1     Running   2 (3m1s ago)   3m42s
ibm-spectrum-scale-csi-resizer-84d84bfdf6-8zlm2       1/1     Running   2 (3m1s ago)   3m42s
ibm-spectrum-scale-csi-snapshotter-656d4bd64f-s9zzq   1/1     Running   2 (3m1s ago)   3m42s
ibm-spectrum-scale-csi-vhdmf                          3/3     Running   0              2m59s
[root@saurabh6-master pr1000]# oc get cso
NAME                     VERSION   SUCCESS
ibm-spectrum-scale-csi   2.10.0    True
[root@saurabh6-master pr1000]# oc describe pod | grep quay
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:31172dc13f5cc514cf2474cb440697c1da20d035f3fd12ec761f12b06cc2e0a7
  Normal   Pulled     3m9s  kubelet            Container image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494" already present on machine
    Image:         quay.io/badri_pathak/ibm-spectrum-scale-csi-operator:events_gen_v11
    Image ID:      quay.io/badri_pathak/ibm-spectrum-scale-csi-operator@sha256:cb12d4adec4321bc9f4f6091e698d91b70df3d043e31592b39c3686084a1a836
      CSI_DRIVER_IMAGE:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494
  Normal  Pulled     3m55s  kubelet            Container image "quay.io/badri_pathak/ibm-spectrum-scale-csi-operator:events_gen_v11" already present on machine
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:31172dc13f5cc514cf2474cb440697c1da20d035f3fd12ec761f12b06cc2e0a7
  Normal   Pulled     3m9s  kubelet            Container image "quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:ee9bd3e431cf0d3fb1e407a6e2ed51d6be957dd1445c2cd88f329cbb5b1ea494" already present on machine
[root@saurabh6-master pr1000]#
  2. Apply the wrong configmap as shown below:
[root@saurabh6-master pr1000]# cat wrong_cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi-driver
data:
  VAR_DRIVER_LOGLEVEL: debug0s
  VAR_DRIVER_PERSISTENT_LO_G: ENABLED
  VAR_DRIVER_VOLUME_STATS_C_APABILITY: DISABLED
  VAR_DRIVER_NODEPUBLISH_METHOD: symlink
  DRIVER_UPGRADE_MAXUNAVAILABLE: 90%
[root@saurabh6-master pr1000]# oc apply -f wrong_cm.yaml
configmap/ibm-spectrum-scale-csi-config created
  3. Check the CSO events:
Events:
  Type     Reason             Age                  From              Message
  ----     ------             ----                 ----              -------
  Warning  CreateDirFailed    5m13s (x11 over 8m)  CSIScaleOperator  Failed to create a symlink directory with relative path spectrum-scale-csi-volume-store/.volumes on filesystem fs1
  Warning  ValidationWarning  20s                  CSIScaleOperator  There are few entries [VAR_DRIVER_PERSISTENT_LO_G VAR_DRIVER_VOLUME_STATS_C_APABILITY] with wrong key which will not be processed and few entries having wrong values map[VAR_DRIVER_LOGLEVEL:debug0s] in the configmap ibm-spectrum-scale-csi-config, default values will be used
  Warning  UpdateFailed       16s (x2 over 17s)    CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
  Normal   CSIConfigured      14s (x12 over 4m3s)  CSIScaleOperator  The CSI driver resources have been created/updated successfully

The ValidationWarning event above is generated, which is expected.
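To make the check behind this event easier to discuss, here is a minimal sketch of a validation that would produce the same two lists as the message above. It is not the operator's code. The key names are taken from correct_cm.yaml in this issue; the allowed values, the case-insensitive comparison, and the function name `validate_configmap` are assumptions made for illustration.

```python
# Hypothetical sketch, not the operator's implementation.
# Keys come from correct_cm.yaml in this issue; the allowed values are assumed.
ALLOWED_VALUES = {
    "VAR_DRIVER_LOGLEVEL": {"DEBUG"},  # assumed; the real list is likely longer
    "VAR_DRIVER_PERSISTENT_LOG": {"ENABLED", "DISABLED"},
    "VAR_DRIVER_VOLUME_STATS_CAPABILITY": {"ENABLED", "DISABLED"},
    "VAR_DRIVER_NODEPUBLISH_METHOD": {"SYMLINK"},
    "DRIVER_UPGRADE_MAXUNAVAILABLE": None,  # None means the value is not checked here
}


def validate_configmap(data, allowed=ALLOWED_VALUES):
    """Return (wrong_keys, wrong_values) for a ConfigMap data dict."""
    wrong_keys = []
    wrong_values = {}
    for key, value in data.items():
        if key not in allowed:
            wrong_keys.append(key)
            continue
        choices = allowed[key]
        # Assumed case-insensitive: in the issue, "symlink" was not flagged.
        if choices is not None and value.upper() not in choices:
            wrong_values[key] = value
    return wrong_keys, wrong_values


wrong_cm = {
    "VAR_DRIVER_LOGLEVEL": "debug0s",
    "VAR_DRIVER_PERSISTENT_LO_G": "ENABLED",
    "VAR_DRIVER_VOLUME_STATS_C_APABILITY": "DISABLED",
    "VAR_DRIVER_NODEPUBLISH_METHOD": "symlink",
    "DRIVER_UPGRADE_MAXUNAVAILABLE": "90%",
}
wrong_keys, wrong_values = validate_configmap(wrong_cm)
```

With the wrong_cm.yaml data, this sketch reports the two misspelled keys and the `debug0s` value, matching the lists in the event message.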

  4. Apply the correct configmap:
[root@saurabh6-master pr1000]# cat correct_cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi-driver
data:
  VAR_DRIVER_LOGLEVEL: DEBUG
  VAR_DRIVER_PERSISTENT_LOG: ENABLED
  VAR_DRIVER_VOLUME_STATS_CAPABILITY: DISABLED
  VAR_DRIVER_NODEPUBLISH_METHOD: SYMLINK
  DRIVER_UPGRADE_MAXUNAVAILABLE: 90%
[root@saurabh6-master pr1000]# oc apply -f correct_cm.yaml
configmap/ibm-spectrum-scale-csi-config configured
  5. Check the CSO events (no new event is shown, as everything is correct):
Events:
  Type     Reason             Age                     From              Message
  ----     ------             ----                    ----              -------
  Warning  CreateDirFailed    6m41s (x11 over 9m28s)  CSIScaleOperator  Failed to create a symlink directory with relative path spectrum-scale-csi-volume-store/.volumes on filesystem fs1
  Warning  ValidationWarning  108s                    CSIScaleOperator  There are few entries [VAR_DRIVER_PERSISTENT_LO_G VAR_DRIVER_VOLUME_STATS_C_APABILITY] with wrong key which will not be processed and few entries having wrong values map[VAR_DRIVER_LOGLEVEL:debug0s] in the configmap ibm-spectrum-scale-csi-config, default values will be used
  Warning  UpdateFailed       21s                     CSIScaleOperator  Synchronization of node/driver ibm-spectrum-scale-csi DaemonSet failed for the CSISCaleOperator instance ibm-spectrum-scale-csi
  Warning  UpdateFailed       19s (x4 over 105s)      CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
  Normal   CSIConfigured      15s (x20 over 5m31s)    CSIScaleOperator  The CSI driver resources have been created/updated successfully

As the cm was correct, no ValidationWarning event is generated after applying it, which is expected.

  6. Reapply the wrong configmap and check whether a CSO event is shown:
[root@saurabh6-master pr1000]# cat wrong_cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi-driver
data:
  VAR_DRIVER_LOGLEVEL: debug0s
  VAR_DRIVER_PERSISTENT_LO_G: ENABLED
  VAR_DRIVER_VOLUME_STATS_C_APABILITY: DISABLED
  VAR_DRIVER_NODEPUBLISH_METHOD: symlink
  DRIVER_UPGRADE_MAXUNAVAILABLE: 90%
[root@saurabh6-master pr1000]# oc apply -f wrong_cm.yaml
configmap/ibm-spectrum-scale-csi-config configured
  7. Check the CSO status:
Events:
  Type     Reason             Age                   From              Message
  ----     ------             ----                  ----              -------
  Warning  CreateDirFailed    8m18s (x11 over 11m)  CSIScaleOperator  Failed to create a symlink directory with relative path spectrum-scale-csi-volume-store/.volumes on filesystem fs1
  Warning  ValidationWarning  3m25s                 CSIScaleOperator  There are few entries [VAR_DRIVER_PERSISTENT_LO_G VAR_DRIVER_VOLUME_STATS_C_APABILITY] with wrong key which will not be processed and few entries having wrong values map[VAR_DRIVER_LOGLEVEL:debug0s] in the configmap ibm-spectrum-scale-csi-config, default values will be used
  Warning  UpdateFailed       118s                  CSIScaleOperator  Synchronization of node/driver ibm-spectrum-scale-csi DaemonSet failed for the CSISCaleOperator instance ibm-spectrum-scale-csi
  Warning  UpdateFailed       116s (x4 over 3m22s)  CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
  Normal   CSIConfigured      112s (x20 over 7m8s)  CSIScaleOperator  The CSI driver resources have been created/updated successfully

No ValidationWarning event is shown for the wrong cm. The last ValidationWarning event in the list is 3m25s old, so nothing new was recorded when the wrong cm was reapplied.
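The comparison above (108s in step 5 versus 3m25s in step 7) can be automated. The following is a rough sketch that parses the Age column of two event snapshots and decides whether the event was recorded again in between. The helper names `parse_duration`, `parse_age`, and `recorded_again` are made up for this example, and the parser only handles the d/h/m/s forms that appear in output like the one in this issue.

```python
import re

_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a duration such as '3m25s', '108s' or '4h55m' to seconds."""
    parts = re.findall(r"(\d+)([dhms])", text)
    # Reject anything that is not made up entirely of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def parse_age(column):
    """Split an Age column into (seconds_since_last_seen, count).

    A plain age such as '20s' means the event was seen once; the form
    '19s (x4 over 105s)' carries an explicit count.
    """
    match = re.fullmatch(r"(\S+)(?:\s+\(x(\d+) over \S+\))?", column.strip())
    if match is None:
        raise ValueError(f"unrecognised Age column: {column!r}")
    count = int(match.group(2)) if match.group(2) else 1
    return parse_duration(match.group(1)), count


def recorded_again(age_before, age_after):
    """True if the count grew or the age was reset between two snapshots."""
    seconds_before, count_before = parse_age(age_before)
    seconds_after, count_after = parse_age(age_after)
    return count_after > count_before or seconds_after < seconds_before


# Step 5 versus step 7 of this issue: the event only got older.
validation_warning_repeated = recorded_again("108s", "3m25s")
```

For the ValidationWarning rows in this issue the result is false, which is the symptom being reported.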

Note: This issue is seen when the wrong cm is deleted and applied again.

Expected behavior

A ValidationWarning event should be generated whenever a wrong cm is applied. The same applies to other failure cases if they are affected as well.

@saurabhwani5 saurabhwani5 added Type: Bug Indicates issue is an undesired behavior, usually caused by code error. Severity: 3 Indicates the issue is on the priority list for next milestone. Customer Impact: Minor (1) misleading msgs, operational oddities not affecting workload. Failure of non critical services Customer Probability: Medium (3) Issue occurs in normal path but specific limited timing window, or other mitigating factor Found In: 2.10.0 labels Aug 21, 2023
@Jainbrt Jainbrt added this to the v2.10.0 milestone Aug 21, 2023
@saurabhwani5
Member Author

Another scenario is seen as follows:

  1. Apply a cm with a wrong key; it generates a CSO event.
  2. Apply a cm with a wrong value; it generates a CSO event.
  3. Now apply a cm with a wrong key in one parameter and a wrong value for a parameter that has a correct key; it does not generate a CSO event, which is not expected.

@badri-pathak
Member

@saurabhwani5 I have validated the events with various scenarios, including the ones mentioned above. My observation is that Kubernetes suppresses similar events whenever the CSO tries to add events too frequently. The event details stay visible for a certain time period, and a suppressed event shows up again after some time when new events are generated. The failure count then also includes the hidden occurrences.
e.g. Warning ValidationWarning 9s (x12 over 4h55m) CSIScaleOperator There are few entries having wrong key which will not be processed or few entries having wrong values in the configmap ibm-spectrum-scale-csi-config, check operator logs for details

In the above example, the events were no longer visible after 7-8 occurrences. When I tried again after some time, a new failure event appeared with a total count of x12, which is the total number of failures since the first occurrence.
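As a rough illustration of the counting described above, the sketch below models how repeated identical events collapse into a single row with a count and first/last timestamps, which is what `x12 over 4h55m` expresses. It is a toy model, not Kubernetes code: the class names are made up, and the key used here (type, reason, message) is a simplification of the fields the real event recorder compares.

```python
from dataclasses import dataclass


@dataclass
class RecordedEvent:
    type: str
    reason: str
    message: str
    first_seen: float
    last_seen: float
    count: int = 1


class EventLog:
    """Toy model: identical events update one row instead of adding a new one."""

    def __init__(self):
        self._entries = {}

    def record(self, type_, reason, message, now):
        # Simplified key; the real recorder also considers source and object.
        key = (type_, reason, message)
        entry = self._entries.get(key)
        if entry is None:
            entry = RecordedEvent(type_, reason, message, now, now)
            self._entries[key] = entry
        else:
            entry.count += 1
            entry.last_seen = now
        return entry

    def rows(self):
        return list(self._entries.values())


log = EventLog()
generic = "wrong key or wrong values in the configmap, check operator logs"
# Twelve identical warnings spread over 4h55m (17700 seconds).
for i in range(12):
    log.record("Warning", "ValidationWarning", generic, now=i * 17700 / 11)
row = log.rows()[0]
```

In this model the twelve warnings end up as one row with a count of 12, so a reapplied wrong cm updates an existing row rather than adding a new line to the list.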

@badri-pathak
Member

@amdabhad The same behaviour can be noticed with a generic message as well as with a unique message when events keep being generated too frequently. I think no changes are required as of now.
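One possible explanation for why the message text makes no difference is a rate limit that is keyed on the event source and the involved object rather than on the message. The sketch below is a simplified token-bucket model of such a filter, not the client-go implementation. The class name `SpamFilter` and its parameters are hypothetical, and the defaults of 25 tokens with one token refilled every five minutes are an assumption that should be checked against the client-go version in use.

```python
class SpamFilter:
    """Toy token bucket per (source, involved object); the message is ignored."""

    def __init__(self, burst=25, refill_seconds=300.0):
        # Assumed defaults; verify against the client-go version you run.
        self.burst = burst
        self.refill_seconds = refill_seconds
        self._buckets = {}  # key -> (tokens, time of last update)

    def allow(self, source, involved_object, message, now):
        # The message is deliberately not part of the key, so generic and
        # unique messages drain the same bucket.
        key = (source, involved_object)
        tokens, last = self._buckets.get(key, (float(self.burst), now))
        tokens = min(self.burst, tokens + (now - last) / self.refill_seconds)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self._buckets[key] = (tokens, now)
        return allowed


# Small burst so the effect is visible after a few events.
spam_filter = SpamFilter(burst=3, refill_seconds=300.0)
source, obj = "CSIScaleOperator", "ibm-spectrum-scale-csi"
decisions = [
    spam_filter.allow(source, obj, f"unique message {t}", now=t)
    for t in (0, 1, 2, 3)
]
after_wait = spam_filter.allow(source, obj, "generic message", now=303)
```

In this model the fourth event in quick succession is dropped even though every message is different, and an event is accepted again once enough time has passed for a token to refill.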

@amdabhad
Member

amdabhad commented Sep 5, 2023

@badri-pathak, please check two things:

  1. This event is generated repeatedly; see if it can be reduced when it is already the last event:
Normal   CSIConfigured      112s (x20 over 7m8s)  CSIScaleOperator  The CSI driver resources have been created/updated successfully
  2. See if there is any official k8s doc that mentions the suppression of frequent events; we may have to document that.
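For the first point, one way to cut down the repeated CSIConfigured event would be to skip emitting an event that is identical to the last one emitted. This is only a sketch of the idea, not a proposal for the operator's actual code: `ChangeOnlyEmitter` and the `emit` callback are hypothetical names, and a real implementation would have to track state per custom resource.

```python
class ChangeOnlyEmitter:
    """Forward an event only if it differs from the last one that was emitted."""

    def __init__(self, emit):
        self._emit = emit  # callback that actually records the event
        self._last = None

    def event(self, type_, reason, message):
        current = (type_, reason, message)
        if current == self._last:
            return False  # same as the last event, skip it
        self._emit(type_, reason, message)
        self._last = current
        return True


sent = []
emitter = ChangeOnlyEmitter(lambda *args: sent.append(args))
configured = "The CSI driver resources have been created/updated successfully"
for _ in range(3):
    emitter.event("Normal", "CSIConfigured", configured)
emitter.event("Warning", "ValidationWarning", "check operator logs for details")
emitter.event("Normal", "CSIConfigured", configured)
```

Here the three consecutive CSIConfigured calls result in one emitted event, while the warning and the CSIConfigured event that follows it still go through.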

@Jainbrt Jainbrt removed this from the v2.10.0 milestone Sep 18, 2023