Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EKS: CDK cluster update event keeps seeing LoggingUpdate, making it impossible to update the EKS version #33063

Open
1 task
HannesBBR opened this issue Jan 22, 2025 · 2 comments
Labels
@aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service bug This issue is a bug. effort/medium Medium work item – several days of effort p3

Comments

@HannesBBR
Copy link

Describe the bug

We're trying to update our EKS cluster to a new version (e.g. 1.29 => 1.30), and are running into an issue where the EKS cluster custom resource throws an error Only one type of update - VpcConfigUpdate, LoggingUpdate, EndpointAccessUpdate, or AuthModeUpdate can be allowed. The issue is that the custom resource thinks that there is both a LoggingUpdate and an EKS version update, while we're not doing logging updates.

Regression Issue

  • Select this option if this issue appears to be a regression.

Last Known Working CDK Version

No response

Expected Behavior

When doing an EKS version update, without any other changes, the custom resource should not detect a LoggingUpdate.

Current Behavior

Upon inspection of the custom resource logs, we can see that it's getting the following event props:

  • oldProps:
{  
  ...
  "logging": {
      "clusterLogging": [
          {
              "types": [],
              "enabled": true
          }
      ]
  },
  ...
}
  • newProps:
{
  ...
  "logging": {},
  ...
}

So that causes the custom resource to see a difference in the logging config.

However, both props are incorrect as far as I can see:

  • The oldProps clusterLogging property should have enabled: false, as that's also what is configured on the actual cluster (which can be seen when running aws eks describe-cluster).
  • The newProps should be the same as oldProps (apart from the cluster version), since we're not doing any changes except for the version

So it looks like there's something going wrong when disabling logs, causing it to have a mismatch between what's in the oldProps event properties, and what's actually configured in the cluster.

Reproduction Steps

For an existing EKS cluster created using the EKS Cluster construct:

  • Enable one or more of the ClusterLogging types
  • Disable all ClusterLogging types (i.e. pass an empty array)
  • Bump the EKS version of the cluster => this should cause the error due to the loggingConfig being incorrect in the oldProps and newProps

The workaround is to temporarily enable one of the logging types, then do the EKS update, and afterwards remove the logging types again.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.176.0

Framework Version

No response

Node.js Version

v20.18.1

OS

Mac/Debian

Language

TypeScript

Language Version

5.5.4

Other information

No response

@HannesBBR HannesBBR added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jan 22, 2025
@github-actions github-actions bot added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label Jan 22, 2025
@pahud pahud self-assigned this Jan 22, 2025
@pahud
Copy link
Contributor

pahud commented Jan 22, 2025

Enable one or more of the ClusterLogging types
Disable all ClusterLogging types (i.e. pass an empty array)
Bump the EKS version of the cluster => this should cause the error due to the loggingConfig being incorrect in the oldProps and newProps

Hi, can you share your code snippet about this for better visibility? From what I've read above looks like:

  1. cluster deployed with logging undefined
  2. now, enable one or more ClusterLogging types(did you redeploy at this step?)
  3. Disable all ClusterLogging types (i.e. pass an empty array) (did you redeploy at this step?)
  4. bump eks cluster version and redeploy and failure occurred

If you didn't redeploy at step 2 and 3 then the major difference is the logging props would change from undefined to a defined empty array, is it correct? Under the hood, undefined and empty array would be considered a property change. Is my assumption correct?

@pahud pahud added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. p3 labels Jan 22, 2025
@pahud pahud removed their assignment Jan 22, 2025
@pahud pahud added effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Jan 22, 2025
@HannesBBR
Copy link
Author

Hi, thanks for checking!

The scenario was as follows in our case:

  1. Deployed a cluster without any logging types enabled
this.cluster = new Cluster(this, 'EksCluster', {
    ..
    clusterLogging: []
    ..
});
  1. New deploy where we enabled some logging types
this.cluster = new Cluster(this, 'EksCluster', {
    ..
    clusterLogging: [ClusterLoggingTypes.API]
    ..
});
  1. New deploy where we disabled all logging types again (since we only need them enabled for debugging purposes)
this.cluster = new Cluster(this, 'EksCluster', {
    ..
    clusterLogging: []
    ..
});
  1. New deploy where we bump the version of EKS
this.cluster = new Cluster(this, 'EksCluster', {
    ..
    version: KubernetesVersion.V1_31
    clusterLogging: []
    ..
});

Step 4 fails because of the error mentioned above. If I first enable at least one Logging Type again, then the update works as expected (because the logging property in oldProps and newProps is the same).

It seems that somewhere in step 3, there is something going wrong, which causes the custom resource to receive incorrect values of the logging config. Looking at the cloudformation templates, what I can see is:

Before step 3:

  "Type": "Custom::AWSCDK-EKS-Cluster",
  "Properties":{
          "logging": {
            "clusterLogging": [
              {
                "enabled": true,
                "types": ["api"]
              }
            ]
          },
    }

After step 3:

  "Type": "Custom::AWSCDK-EKS-Cluster",
  "Properties":{
          "logging": {
            "clusterLogging": [
              {
                "enabled": true,
                "types": []
              }
            ]
          },
    }

=> I would expect the template to have enabled: false instead after step 3. Especially since that's also what the aws eks describe-cluster command returns. So my guess is that the translation from CDK construct to CFN goes 'wrong' here, but that the custom resource still actually disables the logging in the cluster. And that somehow then causes issues in e.g. step 4 where it ends up with a newProps that contains logging: {} instead of the correct logging config from the template.

Hope that helps!

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service bug This issue is a bug. effort/medium Medium work item – several days of effort p3
Projects
None yet
Development

No branches or pull requests

2 participants