Pod stuck in ContainerCreating status while waiting for an IP address to get assigned #2892
Comments
Are you using pod security groups for these pods? It's interesting to see that there isn't the |
Yes, I am. In the settings I set up: |
OK, I did more testing. This problem happens only if you enable Security Groups for Pods by adding the environment variables described here: https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html . SGs for pods are needed for our clusters. @GnatorX the |
It's a bit more complicated than just adding the label. From what I can tell from when we had similar issues, the label must already be on the node object at node creation or pod creation, and it should be added by the CNI. If the label isn't there and the pod has already started creation on the node, it is possible that the pod will get stuck in ContainerCreating, because VPC RC saw a pod land on an "unmanaged node" and its logic is to ignore it until a new event is emitted for that pod. This could mean an unknown amount of time stuck in ContainerCreating. Having the label present on the node before the pod lands lets the VPC resource controller retry until the node is managed. I believe something is wrong with the CNI setup, and that's why the label isn't present. I am not familiar with EKS setup and I don't work for AWS. It might be worthwhile to cut a support ticket instead. |
Are the rest of the environmental factors the same when this issue is observed? We will try to reproduce this issue from your description to learn more about the behavior, and will share an update. Could you gather the Bottlerocket logs and share them with us, using the instructions given in the troubleshooting guide? |
I notice you have already sent it, we will review the logs. |
I'm seeing a similar issue here on v1.18.1-eksbuild.1 with ENABLE_POD_ENI set to "true". I'm not actually using pod security groups:
Looking at the cni-driver logs, 1 of the 4 ENIs is a trunk ENI that has no IPs to allocate, limiting the node to 42 IPs.
|
This is expected behaviour with security groups for pods: pods that are not using security groups for pods get their IPs from the ENIs other than the trunk ENI. When you enable security groups for pods and your instance can have a maximum of 4 ENIs (56 IPs), the VPC resource controller turns one of those 4 ENIs into a trunk ENI. Whenever a pod using a security group is deployed, the VPC resource controller creates a branch ENI on the trunk ENI and assigns an IP to it. Normal pods won't get an IP from the trunk ENI, so pods that are not using security groups for pods get IPs from the remaining 3 ENIs. There are two cases here: pods using security groups, whose IPs are assigned by the VPC resource controller, and normal pods not using security groups, whose IP allocation is done by ipamd. In ipamd you are seeing 42 total IPs because they come from the remaining 3 ENIs other than the trunk ENI, which is expected. But the max pods count covers both pods using security groups for pods and pods not using them. Hence it's expected that the 43rd pod will be stuck in ContainerCreating, as ipamd can assign at most 42 IPs to pods not using security groups for pods. I believe reducing max pods is the option to mitigate this. The reason you were seeing 58 pods getting IPs was a known bug in VPC CNI version v1.16.x, where IPs from the trunk ENI were also being allocated to normal pods not using security groups for pods. In the latest version this is resolved, and hence you are seeing the expected behaviour. |
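The arithmetic in the comment above can be sketched as a small shell calculation. This is only an illustration: the function names are made up for this example, and the figures of 4 ENIs with 15 IPv4 addresses each are an assumption for an m6a.2xlarge that should be checked against the EC2 documentation for your instance type. The "+ 2" term follows the default EKS max-pods formula, ENIs × (IPs per ENI − 1) + 2.

```bash
# Sketch of the IP arithmetic, assuming 4 ENIs with 15 IPv4 addresses each
# (assumed values for an m6a.2xlarge; verify for your instance type).

# Secondary IPs that ipamd can hand out to regular pods: the primary IP of
# each ENI is not assignable, and a trunk ENI contributes no IPs at all.
ipamd_ip_capacity() {
  cap_enis=$1
  cap_ips_per_eni=$2
  cap_trunk_enis=${3:-0}
  echo $(( (cap_enis - cap_trunk_enis) * (cap_ips_per_eni - 1) ))
}

# Default EKS max-pods formula: assignable IPs on all ENIs, plus 2.
default_max_pods() {
  echo $(( $(ipamd_ip_capacity "$1" "$2" 0) + 2 ))
}

echo "IPs with all 4 ENIs usable by ipamd: $(ipamd_ip_capacity 4 15 0)"
echo "IPs with 1 ENI used as trunk ENI:    $(ipamd_ip_capacity 4 15 1)"
echo "Default max-pods value:              $(default_max_pods 4 15)"
```

Under these assumptions the three values are 56, 42 and 58, which matches the numbers quoted in this thread: the node advertises 58 pods while ipamd can only serve 42 regular pods once one ENI becomes the trunk ENI.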
I will be closing this issue report. I was assuming something wrong here: I thought the CNI plugin was setting, for the different instance types, a limit of allocatable IP addresses that was then used to calculate the In the past this was working OK for us, and we never noticed the issue until CNI version 1.17 because of this bug: #2801 . Maybe because of that, I wrongly assumed that the max pods in the case I reported should be 58 instead of 42. I suppose I will look at a solution like this: #2801 to dynamically assign max pods, without needing to remember to set it up for each instance template when we change nodegroup instance types, or when we move to Karpenter, which allows a mix of instance types with different numbers of max pods. |
This issue is now closed. Comments on closed issues are hard for our team to see. |
I can't tell from this conclusion. Do standard EKS nodes advertise the right number of available IPs? |
What do you mean by |
@davidgp1701, I am facing the same problem with EKS 1.30 with CNI 1.18.3. Have you managed to configure the cluster with CNI 1.18? |
I created the following container image and added it to our ECR repos:

```dockerfile
FROM alpine
ARG MAX_PODS_CALC_SCRIPT_URL=https://raw.githubusercontent.com/awslabs/amazon-eks-ami/v20240424/templates/al2/runtime/max-pods-calculator.sh
ARG IMDS_HELPER_SCRIPT_URL=https://raw.githubusercontent.com/awslabs/amazon-eks-ami/v20240424/templates/al2/runtime/bin/imds
RUN apk --update --no-cache add aws-cli bash curl jq
RUN curl -o /usr/local/bin/max-pods-calculator.sh $MAX_PODS_CALC_SCRIPT_URL \
    && chmod +x /usr/local/bin/max-pods-calculator.sh
RUN curl -o /usr/local/bin/imds $IMDS_HELPER_SCRIPT_URL \
    && chmod +x /usr/local/bin/imds
ADD bootstrap.sh ./
RUN chmod +x ./bootstrap.sh
ENTRYPOINT ["./bootstrap.sh"]
```

The bootstrap script is the following:

```bash
#!/usr/bin/env bash
# Source environment variables defined in bootstrap container's user-data
USER_DATA_DIR=/.bottlerocket/bootstrap-containers/current
source "$USER_DATA_DIR/user-data"
if [ "${CUSTOM_NETWORKING}" = "true" ]; then
  CUSTOM_NETWORKING_ARG="--cni-custom-networking-enabled"
fi
if [ "${PREFIX_DELEGATION}" = "true" ]; then
  PREFIX_DELEGATION_ARG="--cni-prefix-delegation-enabled"
fi
# Runs the max-pods-calculator script to calculate max pods
max_pods=$(max-pods-calculator.sh \
  --instance-type-from-imds \
  --cni-version "${CNI_VERSION}" \
  "${CUSTOM_NETWORKING_ARG}" \
  "${PREFIX_DELEGATION_ARG}" \
  ${CNI_MAX_ENI:+--cni-max-eni "${CNI_MAX_ENI}"})
if [ "${?}" -ne 0 ]; then
  echo "ERROR: Failed to calculate max-pods value using the max-pods-calculator helper script: ${max_pods}" >&2
  exit 1
fi
# Set the max-pods setting via Bottlerocket's API
if ! apiclient set kubernetes.max-pods="${max_pods}"; then
  echo "ERROR: Failed to set kubernetes.max-pods setting via Bottlerocket API" >&2
  exit 1
fi
```

Then, in Terraform, when defining the nodegroups, we pass the following bootstrap configuration (we are using Bottlerocket):

```hcl
locals {
  user_data = <<-EOT
    export CNI_VERSION=${local.cni_version}
    export CUSTOM_NETWORKING=true
  EOT
  user_data_64 = base64encode(local.user_data)
  bootstrap_extra_args = <<-EOT
    [settings.bootstrap-containers.max-pods-calculator]
    source = "${var.aws_account_id}.dkr.ecr.${var.aws_region}.amazonaws.com/max_pods:latest"
    essential = false
    mode = "always"
    user-data = "${local.user_data_64}"
  EOT
}
```

That should help you create a similar configuration. |
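After a node boots with a setup like this, one way to confirm that the calculated value took effect is to read the pod capacity the node advertises. The following is a minimal sketch, not part of the original setup: the helper names `node_allocatable_pods` and `check_max_pods` are hypothetical, and the expected value is whatever you decide is right for your instance type and CNI settings.

```bash
# Prints the pod capacity the kubelet advertises for the given node.
node_allocatable_pods() {
  kubectl get node "$1" -o jsonpath='{.status.allocatable.pods}'
}

# Compares the advertised capacity with the value you expect.
# Arguments: <node-name> <expected-max-pods>
check_max_pods() {
  actual_pods=$(node_allocatable_pods "$1") || return 2
  if [ "$actual_pods" -eq "$2" ]; then
    echo "OK: $1 advertises $actual_pods pods"
  else
    echo "MISMATCH: $1 advertises $actual_pods pods, expected $2"
    return 1
  fi
}

# Usage (NODE_NAME and the expected value are placeholders):
#   check_max_pods "$NODE_NAME" 42
```

A mismatch suggests the bootstrap container did not run or failed before calling `apiclient`, so its logs on the node would be the next place to look.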
This is a critical issue with opaque visibility. We either need (1) better logging around when this edge case is hit, (2) placement of pods into |
What happened:
We migrated our clusters from EKS 1.27 to 1.28. That upgrade also included migrating from CNI version 1.16 to CNI 1.18. After a while, on nodes with a relatively high number of containers, we started seeing Pods blocked in ContainerCreating state:
Describing the container shows the following error:
Our subnets have more than enough IP addresses.
The instance type is m6a.2xlarge, and right now the node is only running 42 Pods, with more than enough free CPU and memory:
Considering that an m6a.2xlarge could accommodate 58 Pods, we are still at least 10 Pods away from that limit, so the container should be able to get an IP address.
Attach logs
I sent the Bottlerocket logs collected with logdog to the support e-mail address. Having issues running the collector script with Bottlerocket.
What you expected to happen:
As previously commented, I expected two things:
1.16 of the CNI. Just by downgrading the CNI version, without restarting or recreating the node, the node was able to run 58 Pods successfully, instead of the limit of 48 Pods with the rest in ContainerCreating state.
PENDING state so the cluster autoscaler notices it and triggers the scale-up of the NodeGroup.
How to reproduce it (as minimally and precisely as possible):
1- Create an EKS 1.28 cluster.
2- Install the following addon versions:
For the CNI addon, use these parameters in addition to the default configuration:
3- Create a test nodegroup based on the m6a.2xlarge instance type, with a fixed size of 1 instance. According to the docs, it should be able to accommodate 58 Pods.
After everything is created, this should be the state of the only node:
4- Create a simple deployment:
5- Scale up the deployment. At around 41 Pods, the Pods get blocked in ContainerCreating status, with the following error:
The node has 45 Pods; it should still be able to run 58 Pods, and there are no CPU, memory, or disk limitations. And if no more IP addresses can be assigned to that node, the Pods should end up in PENDING status instead.
6- Downgrade to CNI version 1.16, and the containers in ContainerCreating status switch to RUNNING.
Environment:
Environment variables:
Init container environment variables:
kubectl version):
CNI versions showing the issue:
CNI versions NOT showing the issue:
OS (e.g: cat /etc/os-release):
AMI: bottlerocket-aws-k8s-1.28-x86_64-v1.19.4-4f0a078e
os-release:
uname -a):
Linux ip-10-24-135-73.energy.local 6.1.82 #1 SMP PREEMPT_DYNAMIC Fri Apr 5 22:26:15 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
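When following the reproduction steps above, it can help to see how many Pods on the node are in each status as the deployment scales up. The sketch below is one possible way to do that; `count_pod_status` is a hypothetical helper name, and it assumes the default column layout of `kubectl get pods --all-namespaces --no-headers`, where STATUS is the fourth column.

```bash
# Reads `kubectl get pods --all-namespaces --no-headers` output on stdin
# and prints one "<status> <count>" line per status, sorted by status name.
count_pod_status() {
  awk '{ count[$4]++ } END { for (s in count) print s, count[s] }' | sort
}

# Usage (NODE_NAME is a placeholder for the name of the node under test):
#   kubectl get pods --all-namespaces --no-headers \
#     --field-selector "spec.nodeName=${NODE_NAME}" | count_pod_status
```

Running this before and after each scale-up step shows the point at which new Pods stop reaching Running and start accumulating in ContainerCreating.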