[BUG] Rollback OAM Application in the Rollout scenario does not work as expected #1792

Open
lujiajing1126 opened this issue Oct 18, 2024 · 5 comments
Assignees
Labels
kind/bug Something isn't working

Comments

@lujiajing1126

What happened:

We are using Kruise with Kruise Rollouts to do canary releases. However, because the workload resources (e.g. the Deployment) are controlled by the Kruise operator, it is not possible to roll back a canary release.

What you expected to happen:

Rollback should work.

How to reproduce it (as minimally and precisely as possible):

  1. Create a core.oam.dev/v1beta1.Application with Kruise; for example, a Deployment (containing a container and an init container) will be generated.
  2. Create a Rollout and declare that the Deployment generated above is controlled by the Rollout operator.
  3. Make a change to the "Component Definition", for example, bump the version of the init container.
  4. Make a change to the core.oam.dev/v1beta1.Application, for example, to the image of the container, in order to trigger a canary release. During this step, a new canary Deployment is created (together with the new init container).
  5. Roll back the container image, while the new init container is still in place.

So the rollback fails because the init container has already been updated.
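
For reference, step 2 could look roughly like the following Rollout. This is a sketch, not the manifest actually used in this report: the names `rollout-demo` and `app`, the namespace, and the canary steps are assumptions chosen for illustration.

```yaml
# Hypothetical Rollout that takes over the Deployment generated by the Application.
apiVersion: rollouts.kruise.io/v1alpha1
kind: Rollout
metadata:
  name: rollout-demo        # assumed name
  namespace: default        # assumed namespace
spec:
  objectRef:
    workloadRef:
      apiVersion: apps/v1
      kind: Deployment
      name: app             # assumed to match the generated Deployment
  strategy:
    canary:
      steps:
        # Release one canary replica, then wait for manual confirmation.
        - replicas: 1
          pause: {}
        # Continue to half of the replicas, then wait again.
        - weight: 50
          pause: {}
```

Once such a Rollout is in place, any change to the Deployment's pod template, including a changed init container from step 3, becomes part of the canary revision.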

Anything else we need to know?:

Environment:

  • Kruise version:
  • Kubernetes version (use kubectl version):
  • Install details (e.g. helm install args):
  • Others:
@lujiajing1126 lujiajing1126 added the kind/bug Something isn't working label Oct 18, 2024
@lujiajing1126 lujiajing1126 changed the title [BUG] Kruise with Rollout [BUG] Rollback OAM Application in the Rollout scenario does not work as expected Oct 18, 2024
@furykerry
Member

Please clarify how step 5 is performed and what result you expected.

@furykerry furykerry assigned AiRanthem and unassigned FillZpp Oct 18, 2024
@AiRanthem
Member

@lujiajing1126 I've sent you an email to confirm the scenario of this case. Please provide some feedback after confirming.

@lujiajing1126
Author

> @lujiajing1126 I've sent you an email to confirm the scenario of this case. Please provide some feedback after confirming.

I've confirmed the case

@AiRanthem
Member

@lujiajing1126 The issue seems to be caused by improper usage: Components should be completely decoupled from business logic. It's recommended to modify the Component to parameterize the init container’s image like the business container’s image, managing them uniformly in the Application. Here is a demo:

# component.yaml
apiVersion: core.oam.dev/v1beta1
kind: ComponentDefinition
metadata:
  name: rollout-test
spec:
  workload:
    definition:
      apiVersion: apps/v1
      kind: Deployment
  schematic:
    cue:
      template: |
        parameter: {
          mainImage: string
          initImage: string
        }
        output: {
          apiVersion: "apps/v1"
          kind:       "Deployment"
          metadata: {
            name: context.name
          }
          spec: {
            selector: matchLabels: {
              app: context.name
            }
            template: {
              metadata: labels: {
                app: context.name
              }
              spec: {
                initContainers: [{
                  name:  "init-container"
                  image: parameter.initImage
                  command: ["sh", "-c", "echo Init Container Running"]
                }]
                containers: [{
                  name:  "main-container"
                  image: parameter.mainImage
                }]
              }
            }
          }
        }
# app.yaml
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: vela-app
spec:
  components:
    - name: app
      type: rollout-test
      properties:
        initImage: busybox:1
        mainImage: hello-world:v1
      traits:
        - type: scaler
          properties:
            replicas: 4
  policies:
    - name: target-default
      type: topology
      properties:
        clusters: ["local"]
        namespace: "default"
  workflow:
    steps:
      - name: deploy2default
        type: deploy
        properties:
          policies: ["target-default"]
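
With both images exposed as parameters, a canary release and its rollback reduce to edits of the Application's properties. A minimal sketch of the two revisions, assuming a hypothetical newer tag `hello-world:v2` that is not part of the demo above:

```yaml
# app.yaml (fragment), revision 2: triggers the canary release.
# Only mainImage changes; initImage stays pinned in the Application.
# "hello-world:v2" is a hypothetical tag used for illustration.
spec:
  components:
    - name: app
      type: rollout-test
      properties:
        initImage: busybox:1
        mainImage: hello-world:v2
---
# app.yaml (fragment), rollback: restore the previous properties.
spec:
  components:
    - name: app
      type: rollout-test
      properties:
        initImage: busybox:1
        mainImage: hello-world:v1
```

Because the init container image is a property rather than a value hard-coded in the ComponentDefinition, the rolled-back pod template should match the original one, provided the ComponentDefinition itself is not changed in between.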

@lujiajing1126
Author

> @lujiajing1126 The issue seems to be caused by improper usage: Components should be completely decoupled from business logic. It's recommended to modify the Component to parameterize the init container’s image like the business container’s image, managing them uniformly in the Application. Here is a demo:

That doesn't make sense. The OAM template can be updated at any time.

Is it possible to detect whether the workload is controlled by the Rollout operator? If so, we may be able to keep the component revision during the canary stage.
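
One signal that might be usable for this is the annotation that Kruise Rollout is understood to set on a workload while a rollout is in progress. The fragment below is only a sketch under that assumption: the annotation key, its value format, and the names `app` and `rollout-demo` are not confirmed in this thread and should be checked against the installed Kruise Rollout version.

```yaml
# Deployment metadata while a Rollout is in progress (assumed shape).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app                  # hypothetical workload name
  annotations:
    # Assumed annotation; verify the key and value format in your version.
    rollouts.kruise.io/in-progressing: '{"rolloutName":"rollout-demo"}'
```

If such a marker is present, the Application controller could in principle skip picking up a new component revision until the canary stage has finished.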
