Pod stuck in ContainerCreating status while waiting for an IP address to get assigned #2892

Closed
davidgp1701 opened this issue Apr 23, 2024 · 15 comments

@davidgp1701

davidgp1701 commented Apr 23, 2024

What happened:

We migrated our clusters from EKS 1.27 to EKS 1.28. That upgrade also included migrating from CNI version 1.16 to CNI 1.18. After a while, on nodes with a relatively high number of containers, we started seeing Pods blocked in ContainerCreating state:

$ kubectl get pods test-app3-7cd64f566f-6rc5r
NAME                         READY   STATUS              RESTARTS   AGE
test-app3-7cd64f566f-6rc5r   0/1     ContainerCreating   0          39s

Describing the Pod shows the following error:

Warning  FailedCreatePodSandBox  1s    kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f5e4d01fec4726c2f19cd0a07bb6d229f9a651b46f78be9d92159eff19e8ed13": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container

Our subnets have more than enough IP addresses.

The instance type is m6a.2xlarge, and right now the node is running only 42 Pods, with more than enough free CPU and memory:

$ kubectl describe node ip-10-24-135-73.eu-central-1.compute.internal
Name:               ip-10-24-135-73.eu-central-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m6a.2xlarge
                    beta.kubernetes.io/os=linux
                    eks.amazonaws.com/capacityType=ON_DEMAND
                    eks.amazonaws.com/nodegroup=test
                    eks.amazonaws.com/nodegroup-image=ami-06a57b0cfbd962246
                    eks.amazonaws.com/sourceLaunchTemplateId=XXXXXXXXXXXXXXXXx
                    eks.amazonaws.com/sourceLaunchTemplateVersion=1
                    failure-domain.beta.kubernetes.io/region=eu-central-1
                    failure-domain.beta.kubernetes.io/zone=eu-central-1c
                    k8s.io/cloud-provider-aws=XXXXXXXXXXXXXXXxx
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-24-135-73.eu-central-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m6a.2xlarge
                    nodegroup=test
                    topology.ebs.csi.aws.com/zone=eu-central-1c
                    topology.kubernetes.io/region=eu-central-1
                    topology.kubernetes.io/zone=eu-central-1c
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.24.135.73
                    csi.volume.kubernetes.io/nodeid:
                      {"ebs.csi.aws.com":"XXXXXXX","efs.csi.aws.com":"XXXXXXX","smb.csi.k8s.io":"ip-10-24-135-73.eu-central-1.compute.in...
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 22 Apr 2024 10:28:57 +0200
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-24-135-73.eu-central-1.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Mon, 22 Apr 2024 14:57:13 +0200
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 22 Apr 2024 14:52:40 +0200   Mon, 22 Apr 2024 10:28:55 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 22 Apr 2024 14:52:40 +0200   Mon, 22 Apr 2024 10:28:55 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 22 Apr 2024 14:52:40 +0200   Mon, 22 Apr 2024 10:28:55 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Mon, 22 Apr 2024 14:52:40 +0200   Mon, 22 Apr 2024 10:29:24 +0200   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.24.135.73
  InternalDNS:  ip-10-24-135-73.eu-central-1.compute.internal
  Hostname:     ip-10-24-135-73.eu-central-1.compute.internal
Capacity:
  cpu:                        8
  ephemeral-storage:          102334Mi
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     32114404Ki
  pods:                       58
  vpc.amazonaws.com/pod-eni:  38
Allocatable:
  cpu:                        7910m
  ephemeral-storage:          95500736762
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     31097572Ki
  pods:                       58
  vpc.amazonaws.com/pod-eni:  38
System Info:
  Machine ID:                 ec2794e12d100370d12f53be2b39eeaa
  System UUID:                ec2794e1-2d10-0370-d12f-53be2b39eeaa
  Boot ID:                    70adb53a-6ece-4cab-8048-db809831a867
  Kernel Version:             6.1.82
  OS Image:                   Bottlerocket OS 1.19.4 (aws-k8s-1.28)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.31+bottlerocket
  Kubelet Version:            v1.28.7-eks-c5c5da4
  Kube-Proxy Version:         v1.28.7-eks-c5c5da4
ProviderID:                   aws:///eu-central-1c/i-02f2d0bbdd98ada59
Non-terminated Pods:          (49 in total)
  Namespace                   Name                                         CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                         ------------  ----------  ---------------  -------------  ---
  kube-system                 aws-node-c9vp5                               50m (0%)      0 (0%)      0 (0%)           0 (0%)         137m
  kube-system                 csi-smb-node-fl5mn                           30m (0%)      0 (0%)      60Mi (0%)        400Mi (1%)     4h28m
  kube-system                 ebs-csi-node-dbc77                           30m (0%)      0 (0%)      120Mi (0%)       768Mi (2%)     4h28m
  kube-system                 efs-csi-node-7qh4r                           0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h28m
  kube-system                 kube-proxy-xkh6v                             100m (1%)     0 (0%)      0 (0%)           0 (0%)         4h28m
  kube-system                 prometheus-prometheus-node-exporter-c2mfg    0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h28m
  logging                     filebeat-filebeat-jtms7                      100m (1%)     1 (12%)     100Mi (0%)       400Mi (1%)     4h28m
  test                        test-app3-7cd64f566f-289vx                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-2jzct                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-2phmj                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m43s
  test                        test-app3-7cd64f566f-2sgvm                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-44t9n                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-4mrv5                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-65qhs                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m43s
  test                        test-app3-7cd64f566f-69xgk                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-6h6wt                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-6rc5r                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         4m53s
  test                        test-app3-7cd64f566f-7d8tl                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         6m8s
  test                        test-app3-7cd64f566f-7r6n2                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-8tggt                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h28m
  test                        test-app3-7cd64f566f-b6thp                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-b6wd4                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m43s
  test                        test-app3-7cd64f566f-c4gm4                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-c5qvc                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-f2kln                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-fpjf2                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-gqc8q                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-gtqfr                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-h4ngv                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m43s
  test                        test-app3-7cd64f566f-hh4qf                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-jjq82                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-kwdqm                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-kxq2k                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-l426q                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-lj64q                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-lspws                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-mmlkx                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-n8llc                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-nkx7v                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-qjf46                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-rv4t7                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-s8bnn                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m43s
  test                        test-app3-7cd64f566f-s8wx7                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-sc4vx                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-vrqz4                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-wvbdp                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
  test                        test-app3-7cd64f566f-z5mzp                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         4h28m
  test                        test-app3-7cd64f566f-zl4pb                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         8m11s
  test                        test-app3-7cd64f566f-zrprz                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         7m44s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                   Requests    Limits
  --------                   --------    ------
  cpu                        310m (3%)   1 (12%)
  memory                     280Mi (0%)  1568Mi (5%)
  ephemeral-storage          0 (0%)      0 (0%)
  hugepages-1Gi              0 (0%)      0 (0%)
  hugepages-2Mi              0 (0%)      0 (0%)
  vpc.amazonaws.com/pod-eni  0           0
Events:                      <none>

Considering that an m6a.2xlarge can accommodate 58 Pods, we are still about 10 Pods short of that limit, so the container should be able to get an IP address.
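
For reference, the 58-Pod figure can be reproduced from the formula used to generate eni-max-pods.txt. The sketch below assumes the EC2 limits for m6a.2xlarge are 4 ENIs with 15 IPv4 addresses each; those two numbers are not stated in this issue and should be checked against the EC2 documentation. It also shows one possible (unconfirmed) explanation for the lower ceiling: if one ENI slot is taken by a trunk interface that serves no regular Pod IPs, only 42 Pod IPs remain, which matches the "total 42, assigned 42" pool size in the ipamd debug log posted later in this thread.

```python
# Assumed EC2 limits for m6a.2xlarge (not stated in this issue; verify in the EC2 docs).
MAX_ENIS = 4
IPV4_PER_ENI = 15


def pod_ips(enis: int, ips_per_eni: int) -> int:
    """Secondary IPs available to Pods: each ENI keeps one address as its primary IP."""
    return enis * (ips_per_eni - 1)


def max_pods(enis: int, ips_per_eni: int) -> int:
    """eni-max-pods.txt formula: Pod IPs plus 2 for host-network Pods."""
    return pod_ips(enis, ips_per_eni) + 2


# All four ENIs serve Pod IPs.
print(max_pods(MAX_ENIS, IPV4_PER_ENI))      # 58
# Hypothesis: one ENI slot is used by a trunk ENI that hands out no regular Pod IPs.
print(pod_ips(MAX_ENIS - 1, IPV4_PER_ENI))   # 42
```

Host-network Pods such as aws-node and kube-proxy do not consume a Pod IP, so the number of Pods on a node can be somewhat higher than the number of assignable IPs.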

Attach logs

I sent the Bottlerocket logs collected with logdog to the support e-mail address. I had issues running the collector script on Bottlerocket.

What you expected to happen:

As previously commented, I expected two things:

  • The maximum number of Pods should be 58 (for an m6a.2xlarge instance, according to https://github.com/aws/amazon-vpc-cni-k8s/blob/master/misc/eni-max-pods.txt). That is achievable just by switching to version 1.16 of the CNI: after downgrading the CNI, without restarting or recreating the node, the node successfully ran 58 Pods, instead of hitting a limit of 48 Pods with the rest stuck in ContainerCreating state.
  • Any Pod that the node cannot accommodate because no IP address can be allocated to it should be left in Pending state, so that the cluster autoscaler notices it and triggers a scale-up of the NodeGroup.
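
The difference between the two states can be made concrete with a small sketch. A Pod that kubectl displays as ContainerCreating still has phase Pending, but it has already been scheduled to a node, so it is not reported as unschedulable, which is the signal the cluster autoscaler reacts to. The helper below is hypothetical (the function name and return labels are illustrative only) and works on Pod objects parsed from `kubectl get pods -o json`.

```python
def classify_pod(pod: dict) -> str:
    """Classify a Pod dict (as parsed from `kubectl get pods -o json`).

    Returns "unschedulable" for Pods the scheduler could not place,
    "scheduled-not-started" for Pods bound to a node but still Pending
    (kubectl typically shows these as ContainerCreating), else "other".
    """
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return "other"
    for cond in status.get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return "unschedulable"
    if pod.get("spec", {}).get("nodeName"):
        return "scheduled-not-started"
    return "other"


# Illustrative input: a Pod bound to a node whose sandbox never came up.
stuck = {
    "spec": {"nodeName": "ip-10-24-135-73.eu-central-1.compute.internal"},
    "status": {
        "phase": "Pending",
        "conditions": [{"type": "PodScheduled", "status": "True"}],
    },
}
print(classify_pod(stuck))  # scheduled-not-started
```

Pods in the "scheduled-not-started" bucket are the ones described in this issue: the scheduler considers them placed, so nothing prompts a scale-up.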

How to reproduce it (as minimally and precisely as possible):

1- Create an EKS 1.28 cluster.
2- Install the following addon versions:

- CoreDNS - 1.10.1-eksbuild.7
- Kube-Proxy - 1.28.6-eksbuild.2
- CNI - 1.18.eksbuild.1

For the CNI addon, use these parameters aside from the default configuration:

{
  "env": {
    "ENABLE_POD_ENI": "true"
  },
  "init": {
    "env": {
      "DISABLE_TCP_EARLY_DEMUX": "true"
    }
  }
}

3- Create a test nodegroup based on the m6a.2xlarge instance type. It has a fixed size of 1 instance. According to the docs, it should be able to accommodate 58 Pods.

After everything is created, this should be the state of the only node:

❯ kubectl describe node ip-10-24-132-168.eu-central-1.compute.internal
Name:               ip-10-24-132-168.eu-central-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m6a.2xlarge
                    beta.kubernetes.io/os=linux
                    eks.amazonaws.com/capacityType=ON_DEMAND
                    eks.amazonaws.com/nodegroup=test
                    eks.amazonaws.com/nodegroup-image=ami-06a57b0cfbd962246
                    eks.amazonaws.com/sourceLaunchTemplateId=lt-03959d819d4fa9a8f
                    eks.amazonaws.com/sourceLaunchTemplateVersion=1
                    failure-domain.beta.kubernetes.io/region=eu-central-1
                    failure-domain.beta.kubernetes.io/zone=eu-central-1c
                    k8s.io/cloud-provider-aws=XXXXXX
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-24-132-168.eu-central-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m6a.2xlarge
                    nodegroup=test
                    topology.kubernetes.io/region=eu-central-1
                    topology.kubernetes.io/zone=eu-central-1c
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.24.132.168
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 23 Apr 2024 10:44:11 +0200
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-24-132-168.eu-central-1.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Tue, 23 Apr 2024 10:50:49 +0200
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Tue, 23 Apr 2024 10:49:48 +0200   Tue, 23 Apr 2024 10:44:10 +0200   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Tue, 23 Apr 2024 10:49:48 +0200   Tue, 23 Apr 2024 10:44:10 +0200   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Tue, 23 Apr 2024 10:49:48 +0200   Tue, 23 Apr 2024 10:44:10 +0200   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Tue, 23 Apr 2024 10:49:48 +0200   Tue, 23 Apr 2024 10:44:22 +0200   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.24.132.168
  InternalDNS:  ip-10-24-132-168.eu-central-1.compute.internal
  Hostname:     ip-10-24-132-168.eu-central-1.compute.internal
Capacity:
  cpu:                        8
  ephemeral-storage:          102334Mi
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     32114404Ki
  pods:                       58
  vpc.amazonaws.com/pod-eni:  38
Allocatable:
  cpu:                        7910m
  ephemeral-storage:          95500736762
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     31097572Ki
  pods:                       58
  vpc.amazonaws.com/pod-eni:  38
System Info:
  Machine ID:                 ec24b4df0710c1079c1ee77006260b10
  System UUID:                ec24b4df-0710-c107-9c1e-e77006260b10
  Boot ID:                    5f483c5d-94e7-4ff7-945b-8359013df727
  Kernel Version:             6.1.82
  OS Image:                   Bottlerocket OS 1.19.4 (aws-k8s-1.28)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.31+bottlerocket
  Kubelet Version:            v1.28.7-eks-c5c5da4
  Kube-Proxy Version:         v1.28.7-eks-c5c5da4
ProviderID:                   aws:///eu-central-1c/i-0d30622970867fb68
Non-terminated Pods:          (4 in total)
  Namespace                   Name                        CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                        ------------  ----------  ---------------  -------------  ---
  kube-system                 aws-node-js2dr              50m (0%)      0 (0%)      0 (0%)           0 (0%)         6m42s
  kube-system                 coredns-7666b54fc4-msc8h    100m (1%)     0 (0%)      70Mi (0%)        170Mi (0%)     7m19s
  kube-system                 coredns-7666b54fc4-w7nks    100m (1%)     0 (0%)      70Mi (0%)        170Mi (0%)     7m19s
  kube-system                 kube-proxy-8sp78            100m (1%)     0 (0%)      0 (0%)           0 (0%)         6m41s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                   Requests    Limits
  --------                   --------    ------
  cpu                        350m (4%)   0 (0%)
  memory                     140Mi (0%)  340Mi (1%)
  ephemeral-storage          0 (0%)      0 (0%)
  hugepages-1Gi              0 (0%)      0 (0%)
  hugepages-2Mi              0 (0%)      0 (0%)
  vpc.amazonaws.com/pod-eni  0           0
Events:
  Type     Reason                   Age                    From                     Message
  ----     ------                   ----                   ----                     -------
  Normal   Starting                 6m37s                  kube-proxy
  Normal   Starting                 6m43s                  kubelet                  Starting kubelet.
  Warning  InvalidDiskCapacity      6m43s                  kubelet                  invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  6m43s (x2 over 6m43s)  kubelet                  Node ip-10-24-132-168.eu-central-1.compute.internal status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    6m43s (x2 over 6m43s)  kubelet                  Node ip-10-24-132-168.eu-central-1.compute.internal status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     6m43s (x2 over 6m43s)  kubelet                  Node ip-10-24-132-168.eu-central-1.compute.internal status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  6m43s                  kubelet                  Updated Node Allocatable limit across pods
  Normal   Synced                   6m41s                  cloud-node-controller    Node synced successfully
  Normal   RegisteredNode           6m38s                  node-controller          Node ip-10-24-132-168.eu-central-1.compute.internal event: Registered Node ip-10-24-132-168.eu-central-1.compute.internal in Controller
  Normal   ControllerVersionNotice  6m33s                  vpc-resource-controller  The node is managed by VPC resource controller version v1.4.7
  Normal   NodeReady                6m31s                  kubelet                  Node ip-10-24-132-168.eu-central-1.compute.internal status is now: NodeReady
  Normal   NodeTrunkInitiated       6m29s                  vpc-resource-controller  The node has trunk interface initialized successfully

4 - Create a simple deployment:

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  labels:
    app: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - name: test
          image: hashicorp/http-echo
          args:
            - "-text=HelloWorld"
          ports:
            - containerPort: 5678 # Default port for image
              protocol: TCP

5 - Scale up the deployment. At around 41 Pods, new Pods get blocked in ContainerCreating status with the following error:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "659a090453b7fc6de1f3e8566404b081e42164b36b3e7566e72134084cf69dbc": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container

The node has 45 Pods and should still be able to run 58, with no CPU, memory, or disk limitations. And if no more IP addresses can be assigned on that node, the Pods should end up in PENDING status instead.

6- Downgrade to CNI version 1.16 and the containers in ContainerCreating status switch to RUNNING.

Environment:

Environment variables:

ADDITIONAL_ENI_TAGS:                    {}
ANNOTATE_POD_IP:                        false
AWS_VPC_CNI_NODE_PORT_SUPPORT:          true
AWS_VPC_ENI_MTU:                        9001
AWS_VPC_K8S_CNI_CONFIGURE_RPFILTER:     false
AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:     false
AWS_VPC_K8S_CNI_EXTERNALSNAT:           false
AWS_VPC_K8S_CNI_LOGLEVEL:               DEBUG
AWS_VPC_K8S_CNI_LOG_FILE:               /host/var/log/aws-routed-eni/ipamd.log
AWS_VPC_K8S_CNI_RANDOMIZESNAT:          prng
AWS_VPC_K8S_CNI_VETHPREFIX:             eni
AWS_VPC_K8S_PLUGIN_LOG_FILE:            /var/log/aws-routed-eni/plugin.log
AWS_VPC_K8S_PLUGIN_LOG_LEVEL:           DEBUG
DISABLE_NETWORK_RESOURCE_PROVISIONING:  false
ENABLE_IPv4:                            true
ENABLE_IPv6:                            false
ENABLE_POD_ENI:                         true
ENABLE_PREFIX_DELEGATION:               false
ENABLE_SUBNET_DISCOVERY:                true
NETWORK_POLICY_ENFORCING_MODE:          standard
VPC_CNI_VERSION:                        v1.18.0
WARM_ENI_TARGET:                        1
WARM_PREFIX_TARGET:                     1

Init container environment variables:

DISABLE_TCP_EARLY_DEMUX:  true
ENABLE_IPv6:              false
ENABLE_POD_ENI:           true

  • Kubernetes version (use kubectl version):
Client Version: v1.28.9
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.7-eks-b9c9ed7
  • CNI versions showing the issue:

    • v1.18.0-eksbuild.1
    • v1.17.1-eksbuild.1
  • CNI versions NOT showing the issue:

    • v1.16.3-eksbuild.2
  • OS (e.g: cat /etc/os-release):

AMI: bottlerocket-aws-k8s-1.28-x86_64-v1.19.4-4f0a078e

os-release:

NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"

  • Kernel (e.g. uname -a):
Linux ip-10-24-135-73.energy.local 6.1.82 #1 SMP PREEMPT_DYNAMIC Fri Apr  5 22:26:15 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
@GnatorX
Contributor

GnatorX commented Apr 29, 2024

Are you using pod security groups for these pods? It's interesting to see that there isn't the trunk-attached label on the node, and it feels similar to aws/karpenter-provider-aws#1252

@davidgp1701
Author

Are you using pod security groups for these pods? It's interesting to see that there isn't the trunk-attached label on the node, and it feels similar to aws/karpenter-provider-aws#1252

Yes, I am. In the settings I set DISABLE_TCP_EARLY_DEMUX and ENABLE_POD_ENI. I will take a detailed look at the issue you mention. Thanks.

@davidgp1701
Author

OK, I did more testing. This problem happens only if you enable Security Groups for Pods by adding the environment variables described here: https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html . SGs for Pods are needed for our clusters.

@GnatorX the trunk-attached label was not set. I tried setting it, with both true and false values, and the problem still persists.

@GnatorX
Contributor

GnatorX commented Apr 30, 2024

It's a bit more complicated than just adding the label. From what I could tell when we had similar issues, the label must already be on the node object at node creation or pod creation, and it should be added by the CNI. If the label isn't there and the pod has already started creation on the node, it is possible that the pod will get stuck in ContainerCreating, because VPC RC saw a pod land on an "unmanaged node" and its logic is to ignore it until a new event is emitted for that pod. This could mean an unknown amount of time stuck in ContainerCreating.

Having the label present on the node before the pod lands lets the VPC resource controller retry until the node is managed. I believe something is wrong with the CNI setup, and that's why the label isn't present. I am not familiar with the EKS setup and I don't work for AWS. It might be worthwhile to open a support ticket instead.
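
To check which nodes are missing the label, something like the following sketch could be run against the output of `kubectl get nodes -o json`. It assumes the label key is `vpc.amazonaws.com/has-trunk-attached`; the thread only refers to it as the "trunk-attached label", so treat the key as an assumption. The function name is hypothetical. As noted above, this is only a diagnostic: adding the label by hand did not resolve the problem.

```python
import json

# Assumed label key; the thread only calls it the "trunk-attached label".
TRUNK_LABEL = "vpc.amazonaws.com/has-trunk-attached"


def nodes_missing_trunk_label(node_list: dict) -> list:
    """Return names of nodes in a parsed NodeList that do not carry TRUNK_LABEL."""
    missing = []
    for node in node_list.get("items", []):
        metadata = node.get("metadata", {})
        labels = metadata.get("labels") or {}
        if TRUNK_LABEL not in labels:
            missing.append(metadata.get("name", "<unknown>"))
    return missing


# Illustrative input shaped like `kubectl get nodes -o json` output.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "node-a", "labels": {"vpc.amazonaws.com/has-trunk-attached": "true"}}},
    {"metadata": {"name": "node-b", "labels": {"kubernetes.io/os": "linux"}}}
  ]
}
""")
print(nodes_missing_trunk_label(sample))  # ['node-b']
```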

@orsenthil
Member

The node has 45 Pods and should still be able to run 58, with no CPU, memory, or disk limitations. And if no more IP addresses can be assigned on that node, the Pods should end up in PENDING status instead.

  • Do ipamd.log and plugin.log give any additional error messages besides the Pod being stuck in container creation?

CNI versions NOT showing the issue:
    v1.16.3-eksbuild.2

Are the rest of the environmental factors the same when this issue is observed?

We will try to reproduce this issue to know more details on the behavior with your description and share an update.

Could you gather the logs for Bottlerocket and share them with us, using the instructions given in the troubleshooting guide?

@orsenthil
Member

orsenthil commented May 1, 2024

I sent the Bottlerocket logs collected with logdog to the support e-mail address. I had issues running the collector script on Bottlerocket.

I see you have already sent them; we will review the logs.

@wy100101

wy100101 commented May 2, 2024

I'm seeing a similar issue here on v1.18.1-eksbuild.1 with ENABLE_POD_ENI set to "true". I'm not actually using pod security groups:

Name:               ip-10-201-165-12.ec2.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c6a.2xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1b
                    k8s.io/cloud-provider-aws=d233b588ef1fdb73bec8d62908da3a7f
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-201-165-12.ec2.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=c6a.2xlarge
                    olo.com/asg-name=eks-devenv-node-default
                    olo.com/node-group-name=default
                    olo.com/node-type=default
                    topology.ebs.csi.aws.com/zone=us-east-1b
                    topology.kubernetes.io/region=us-east-1
                    topology.kubernetes.io/zone=us-east-1b
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.201.165.12
                    csi.volume.kubernetes.io/nodeid:
                      {"ebs.csi.aws.com":"i-015d8453f004b176e","efs.csi.aws.com":"i-015d8453f004b176e","smb.csi.k8s.io":"ip-10-201-165-12.ec2.internal"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 30 Apr 2024 11:46:37 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-201-165-12.ec2.internal
  AcquireTime:     <unset>
  RenewTime:       Thu, 02 May 2024 02:03:05 -0400
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Thu, 02 May 2024 02:02:21 -0400   Tue, 30 Apr 2024 11:46:37 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 02 May 2024 02:02:21 -0400   Tue, 30 Apr 2024 11:46:37 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 02 May 2024 02:02:21 -0400   Tue, 30 Apr 2024 11:46:37 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Thu, 02 May 2024 02:02:21 -0400   Tue, 30 Apr 2024 11:46:58 -0400   KubeletReady                 kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP:   10.201.165.12
  InternalDNS:  ip-10-201-165-12.ec2.internal
  Hostname:     ip-10-201-165-12.ec2.internal
Capacity:
  cpu:                        8
  ephemeral-storage:          50620216Ki
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     16009796Ki
  pods:                       58
  vpc.amazonaws.com/pod-eni:  38
Allocatable:
  cpu:                        7910m
  ephemeral-storage:          45577849165
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     14992964Ki
  pods:                       58
  vpc.amazonaws.com/pod-eni:  38
System Info:
  Machine ID:                 ec2616dd3f2c306558bfd766260713d7
  System UUID:                ec24155c-3b11-d1f6-a252-f3633d61269e
  Boot ID:                    34afba82-6fad-49ab-bb09-ff3fbaa11576
  Kernel Version:             5.15.0-1055-aws
  OS Image:                   Ubuntu 20.04.6 LTS
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.2
  Kubelet Version:            v1.27.6
  Kube-Proxy Version:         v1.27.6
ProviderID:                   aws:///us-east-1b/i-015d8453f004b176e

Looking at the CNI driver logs, 1 of the 4 ENIs is a trunk ENI that has no IPs to allocate, which limits the node to 42 IPs:

{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"AssignPodIPv4Address: IP address pool stats: total 42, assigned 42"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.211/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.211/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.160.139/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.160.139/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.207/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.207/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.163.134/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.163.134/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.165.111/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.165.111/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.165.202/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.165.202/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.147/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.147/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.26/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.26/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.26/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.26/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.166.48/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.166.48/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.165.253/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.165.253/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.160.148/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.160.148/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.127/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.127/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.155/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.155/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"AssignPodIPv4Address: ENI eni-04a01feee45a761c6 does not have available addresses"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"AssignPodIPv4Address: ENI eni-01dbfeaf18cf82dd7 does not have available addresses"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.165.80/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.165.80/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.157/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.157/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.163.227/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.163.227/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.166.150/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.166.150/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.143/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.143/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.173/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.173/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.166.229/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.166.229/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.167.188/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.167.188/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.85/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.85/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.104/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.104/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.166.136/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.166.136/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.233/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.233/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.239/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.239/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.140/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.140/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"AssignPodIPv4Address: ENI eni-0aaff8bf49e7d3734 does not have available addresses"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.204/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.204/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.185/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.185/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.218/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.218/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.161.165/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.161.165/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.166.240/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.166.240/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.70/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.70/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.187/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.187/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.164.241/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.164.241/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.160.105/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.160.105/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.163.196/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.163.196/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.163.243/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.163.243/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.163.99/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.163.99/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.163.113/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.163.113/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:687","msg":"Get free IP from prefix failed no free IP available in the prefix - 10.201.162.243/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"Unable to get IP address from CIDR: no free IP available in the prefix - 10.201.162.243/ffffffff"}
{"level":"debug","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"AssignPodIPv4Address: ENI eni-0a211c5e727746570 does not have available addresses"}
{"level":"error","ts":"2024-05-02T01:54:51.331Z","caller":"datastore/data_store.go:607","msg":"DataStore has no available IP/Prefix addresses"}
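A quick way to confirm this state without reading every line is to pull the pool statistics out of the log. The following is only a minimal sketch: `pool_stats` and `pool_exhausted` are hypothetical helper names, and the log path mentioned in the comment is the commonly used default on the node, which may differ in your setup.

```shell
# Print "<total> <assigned>" for every pool-stats line read from stdin.
pool_stats() {
  sed -n 's/.*IP address pool stats: total \([0-9][0-9]*\), assigned \([0-9][0-9]*\).*/\1 \2/p'
}

# Succeed when the most recent pool-stats line shows every IP assigned.
pool_exhausted() {
  stats=$(pool_stats | tail -n 1)
  [ -n "$stats" ] || return 1
  total=${stats% *}
  assigned=${stats#* }
  [ "$assigned" -ge "$total" ]
}

# Sample line copied from the log excerpt above.
sample='{"level":"debug","ts":"2024-05-02T01:54:51.330Z","caller":"datastore/data_store.go:607","msg":"AssignPodIPv4Address: IP address pool stats: total 42, assigned 42"}'
printf '%s\n' "$sample" | pool_stats

# On a node this could be fed from the ipamd log instead (path is an assumption):
#   pool_exhausted < /var/log/aws-routed-eni/ipamd.log && echo "ipamd pool is full"
```

For the sample line this prints the total and assigned counts, both 42, which is the exhausted state described above.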

@Dpkbhatt052

Dpkbhatt052 commented May 3, 2024

This is expected behaviour with security groups for pods: pods that are not using security groups for pods get their IPs from the ENIs other than the trunk ENI.

When you enable security groups for pods on an instance that can have a maximum of 4 ENIs (56 IPs), the VPC resource controller turns one of those 4 ENIs into a trunk ENI. Whenever a pod that uses security groups is deployed, the VPC resource controller creates a branch ENI on the trunk ENI and assigns an IP to it. Normal pods do not get IPs from the trunk ENI, so pods that are not using security groups for pods get their IPs from the remaining 3 ENIs.

There are two separate allocation paths here: pods using security groups get their IPs from the VPC resource controller, while normal pods that are not using security groups get their IPs from ipamd. ipamd reports 42 total IPs because that is the capacity of the 3 ENIs other than the trunk ENI, which is expected. The max pods value, however, counts both pods that use security groups for pods and pods that do not. So it is expected that the 43rd pod stays in ContainerCreating, as ipamd can assign at most 42 IPs to pods that are not using security groups for pods. I believe reducing max pods is the way to mitigate this.

The reason you were seeing 58 pods getting IPs was a known bug in VPC CNI v1.16.x, where IPs from the trunk ENI were also allocated to normal pods that are not using security groups for pods:

#2801

In the latest version this is resolved, hence you are seeing the expected behaviour.
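The numbers in this explanation can be reproduced with a little arithmetic. This is a minimal sketch, assuming 4 ENIs with 15 IPv4 addresses each; the per-ENI figure is not stated in the thread, but it is consistent with the 56 IPs and 58 pods quoted here. The function names are hypothetical. The `+ 2` in the default max-pods formula is commonly explained as covering host-network pods such as aws-node and kube-proxy.

```shell
# IPs that can be handed to pods: the primary IP of each ENI is not used for pods.
# Arguments: <number of ENIs> <IPv4 addresses per ENI>
pod_ips() {
  echo $(( $1 * ($2 - 1) ))
}

# Default max-pods formula: ENIs * (IPs per ENI - 1) + 2
default_max_pods() {
  echo $(( $(pod_ips "$1" "$2") + 2 ))
}

echo "pod IPs with all 4 ENIs:          $(pod_ips 4 15)"
echo "default max pods with 4 ENIs:     $(default_max_pods 4 15)"
echo "pod IPs once 1 ENI is the trunk:  $(pod_ips 3 15)"
```

With these assumed limits the node advertises 58 pods while ipamd can only address 42 of them, which is the gap reported in this issue.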

@davidgp1701
Author

I will be closing this issue report. I had a wrong assumption here: I thought the CNI plugin set a per-instance-type limit of allocatable IP addresses that was then used to calculate max_pods. Clearly it is something more basic than that: Bottlerocket basically has it hardcoded in the AMI: https://github.com/bottlerocket-os/bottlerocket/blob/develop/packages/os/eni-max-pods

In the past this worked fine for us, and we never noticed the issue until CNI version 1.17 because of this bug: #2801. Maybe because of that I wrongly assumed that max pods in the case I reported should be 58 instead of 42.

I suppose I will look at a solution like this: #2801 to dynamically assign max pods, without needing to remember to set it up for each instance template when we change node group instance types or when we move to Karpenter, which allows a mix of instance types with different max pods values.


github-actions bot commented May 6, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.

@wy100101

wy100101 commented May 8, 2024

I can't tell from this conclusion. Do standard EKS nodes advertise the right number of available IPs?

@davidgp1701
Author

> I can't tell from this conclusion. Do standard EKS nodes advertise the right number of available IPs?

What do you mean by standard EKS? The Amazon EKS AMI? It has a script to calculate the number of pods based on the CNI configuration. We are not using that AMI, so I am not sure whether that script works out of the box or needs to be triggered from the user data.
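Whichever AMI is in use, one rough check is to compare what the node advertises with what ipamd can actually address. The following is only a sketch: `overcommitted` is a hypothetical helper, `NODE` is a placeholder, and the comparison is approximate because host-network pods do not consume pod IPs and pods using security groups get their IPs from branch ENIs.

```shell
# Succeed when the node advertises more pods than the ipamd pool can address.
# Arguments: <advertised pods> <ipamd pool total>
overcommitted() {
  [ "$1" -gt "$2" ]
}

# Values from this thread: 58 advertised pods, 42 IPs in the ipamd pool.
if overcommitted 58 42; then
  echo "max-pods (58) exceeds the ipamd pool (42)"
fi

# Against a live cluster the advertised value could be read with kubectl (not run here):
#   advertised=$(kubectl get node "$NODE" -o jsonpath='{.status.allocatable.pods}')
```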

@ajay15283

@davidgp1701, I am facing the same problem with EKS 1.30 with CNI 1.18.3. Have you managed to configure the cluster with CNI 1.18?

@davidgp1701
Author

davidgp1701 commented Aug 13, 2024

> @davidgp1701, I am facing the same problem with EKS 1.30 with CNI 1.18.3. Have you managed to configure the cluster with CNI 1.18?

I created the following container image and added it to our ECR repos:

FROM alpine
ARG MAX_PODS_CALC_SCRIPT_URL=https://raw.githubusercontent.com/awslabs/amazon-eks-ami/v20240424/templates/al2/runtime/max-pods-calculator.sh
ARG IMDS_HELPER_SCRIPT_URL=https://raw.githubusercontent.com/awslabs/amazon-eks-ami/v20240424/templates/al2/runtime/bin/imds

RUN apk --update --no-cache add aws-cli bash curl jq

# -f makes curl fail on HTTP errors instead of saving an error page as the script
RUN curl -fsSL -o /usr/local/bin/max-pods-calculator.sh $MAX_PODS_CALC_SCRIPT_URL \
  && chmod +x /usr/local/bin/max-pods-calculator.sh
RUN curl -fsSL -o /usr/local/bin/imds $IMDS_HELPER_SCRIPT_URL \
  && chmod +x /usr/local/bin/imds

COPY bootstrap.sh ./
RUN chmod +x ./bootstrap.sh
ENTRYPOINT ["./bootstrap.sh"]

The bootstrap script is the following:

#!/usr/bin/env bash

# Source environment variables defined in bootstrap container's user-data
USER_DATA_DIR=/.bottlerocket/bootstrap-containers/current
source "$USER_DATA_DIR/user-data"

# Collect the optional calculator flags in an array so that unset options
# are omitted rather than passed as empty-string arguments
EXTRA_ARGS=()
if [ "${CUSTOM_NETWORKING}" = "true" ]; then
  EXTRA_ARGS+=("--cni-custom-networking-enabled")
fi
if [ "${PREFIX_DELEGATION}" = "true" ]; then
  EXTRA_ARGS+=("--cni-prefix-delegation-enabled")
fi
if [ -n "${CNI_MAX_ENI}" ]; then
  EXTRA_ARGS+=("--cni-max-eni" "${CNI_MAX_ENI}")
fi

# Runs the max-pods-calculator script to calculate max pods
max_pods=$(max-pods-calculator.sh \
  --instance-type-from-imds \
  --cni-version "${CNI_VERSION}" \
  "${EXTRA_ARGS[@]}")
if [ "${?}" -ne 0 ]; then
  echo "ERROR: Failed to calculate max-pods value using the max-pods-calculator helper script: ${max_pods}" >&2
  exit 1
fi

# Set the max-pods setting via Bottlerocket's API
if ! apiclient set kubernetes.max-pods="${max_pods}"; then
  echo "ERROR: Failed to set kubernetes.max-pods setting via Bottlerocket API" >&2
  exit 1
fi

Then, in Terraform, when defining the node groups, we pass the following bootstrap configuration (we are using Bottlerocket):

locals {
  user_data    = <<-EOT
    export CNI_VERSION=${local.cni_version}
    export CUSTOM_NETWORKING=true
  EOT
  user_data_64 = base64encode(local.user_data)

  bootstrap_extra_args = <<-EOT
    [settings.bootstrap-containers.max-pods-calculator]
    source = "${var.aws_account_id}.dkr.ecr.${var.aws_region}.amazonaws.com/max_pods:latest"
    essential = false
    mode = "always"
    user-data = "${local.user_data_64}"
  EOT
}
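To check what the bootstrap script will end up sourcing, the encoded user-data can be round-tripped locally. This is a minimal sketch: the helper names are hypothetical, the CNI version is a placeholder, and it assumes the container is handed the decoded text, which is what the `source` call in the bootstrap script above relies on.

```shell
# Encode a user-data snippet as single-line standard base64,
# the same shape Terraform's base64encode() produces.
encode_user_data() {
  base64 | tr -d '\n'
}

# Decode it again to inspect the text the bootstrap script will source.
decode_user_data() {
  base64 -d
}

# Placeholder values for illustration only.
user_data='export CNI_VERSION=1.18.0
export CUSTOM_NETWORKING=true'

encoded=$(printf '%s\n' "$user_data" | encode_user_data)
printf '%s\n' "$encoded" | decode_user_data
```

The decoded output should be the same two `export` lines, so a typo in the Terraform heredoc shows up here before a node ever boots with it.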

That should help you create a similar configuration.

@deasydoesit

deasydoesit commented Jan 22, 2025

This is a critical issue with opaque visibility.

We either need (1) better logging around when this edge case is hit, (2) placement of pods into Error state rather than ContainerCreating so Cluster Autoscaler / Karpenter can adjust accordingly, or (3) native understanding of the real pod limits to prevent scheduling on at-capacity nodes.
