Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG (FIXED)] ot-container-kit/redis-operator (v0.9.0) throws an error when updating volumeClaimTemplate #101

Closed
Essoz opened this issue May 29, 2022 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@Essoz
Copy link
Contributor

Essoz commented May 29, 2022

NOTE: This bug only relies in v0.9.0 and has been fixed in v0.10.0. For follow-up discussion about the fix of this bug, please refer to #95.

Expected

When adding key:value under spec.storage.volumeClaimTemplate, I was expecting to see the key:value showing up in both statefulset.apps/test-cluster-leader and statefulset.apps/test-cluster-follower resources.

Actual behavior

Modifying (adding, deleting, changing) any fields under spec.storage.volumeClaimTemplate will lead to the following error message from the K8S statefulset controller

2022-05-27T21:47:16.397Z	ERROR	controller_redis	Cannot create statefulset for Redis	{"Request.StateFulSet.Namespace": "default", "Request.StateFulSet.Name": "test-cluster-leader", "Setup.Type": "leader", "error": "StatefulSet.apps \"test-cluster-leader\" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden"}

Steps to reproduce the behavior

  1. Deploy the redis-operator: quay.io/opstree/redis-operator:v0.9.0
  2. Deploy a simple redis cluster using the following yaml file
apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  clusterSize: 3
  kubernetesConfig:
    image: quay.io/opstree/redis:v6.2.5
    imagePullPolicy: IfNotPresent
    imagePullSecrets:
      - name: regcred
  storage:
    volumeClaimTemplate:
      spec:
        # storageClassName: standard
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  1. Add additional fields under the field spec.storage.volumeClaimTemplate by applying the following yaml file
apiVersion: redis.redis.opstreelabs.in/v1beta1
kind: RedisCluster
metadata:
  name: test-cluster
spec:
  clusterSize: 3
  kubernetesConfig:
    image: quay.io/opstree/redis:v6.2.5
    imagePullPolicy: IfNotPresent
    imagePullSecrets:
    - name: regcred
  storage:
    volumeClaimTemplate:
      spec:
        accessModes:
        - ReadWriteOnce
        dataSource:
          apiGroup: snapshot.storage.k8s.io
          kind: VolumeSnapshot
          name: new-snapshot-test
        resources:
          requests:
            storage: 1Gi
  1. Use kubectl logs deployment.apps/redis-operator -n redis-operator and observe that multiple errror logs has been printed out by the operator.

Environment

  • Redis Operator Version: quay.io/opstree/redis-operator:v0.9.0
  • Kubernetes Version: v1.22.9

Root cause and Fix

The error message is due to the fact that the K8S statefulset controller (v1.22.9) does not support updating the volumeClaimTemplate field as revealed in validation.go/#L163-L165
According to this issue, it should be a K8S's side problem for not allowing updates to volumeClaimTemplate, and it requires fixing. The redis operator developers did not realize this issue until release v0.10.0 where they fixed the issue by making the field immutable statefulset.go/#L95-L96.

Additional Comments:

As in #95, we claim that this fix is not good enough because now the operator silently rejects change without throwing an error log. This makes it hard for the users to realize that their changes are actually ignored.

Update:

An related bug: https://github.com/banzaicloud/k8s-objectmatcher/issues/51. This reveals that in v0.9.0, the developers already intended to make the field immutable by ignoring changes in that field when calculating difference.

@tianyin
Copy link
Member

tianyin commented May 29, 2022

As in #95, we claim that this fix is not good enough because now the operator silently rejects change without throwing an error log. This makes it hard for the users to realize that their changes are actually ignored.

Why not send a PR to add the error log?

@Essoz
Copy link
Contributor Author

Essoz commented May 29, 2022

Yes, Tyler and I agreed that the developers can do better than simply ignore the change and planned to send an issue to them. I can later discuss with Tyler and use the acto team github account to submit a PR.

@Essoz
Copy link
Contributor Author

Essoz commented May 31, 2022

An PR has been posted OT-CONTAINER-KIT/redis-operator#280

@tianyin
Copy link
Member

tianyin commented May 31, 2022

neat

@tylergu tylergu added the bug Something isn't working label Jun 3, 2022
@tianyin
Copy link
Member

tianyin commented Jun 30, 2022

Let's close the issue if the PR is merged or the bug is fixed.

It's awesome that the bug is fixed!

@tianyin tianyin closed this as completed Jun 30, 2022
@Essoz
Copy link
Contributor Author

Essoz commented Jun 30, 2022

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants