AWS CNI failed to add chain rule for each CIDR in VPC with nf_tables mode #2373
Comments
Hi @HenryXie1, thanks for raising this issue. Our support for RHEL is best-effort, and attaching 15 CIDRs to a VPC is definitely not a recommended design pattern, so this is a case we have not seen before. To support this, we can restructure this code: https://github.com/aws/amazon-vpc-cni-k8s/blob/master/pkg/networkutils/network.go#L427 to have a chain per CIDR and one jump per chain. Also, I redacted the account information from your post above. You definitely do not want to share that information on a public forum. I filed a support request to have it permanently removed.
Hey @jdn5126
Sorry, I can't provide any ETA on this as we will have to internally figure out how to prioritize this. In the meantime, we are happy to take PRs if anyone is interested in taking up the fix.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days
ETA plz
We are currently defining a solution for this. No ECD at this time.
Closing as fixed by #2697. This will ship in release
What happened:
After adopting the Red Hat Enterprise Linux 8.7 (Ootpa) 4.18.0-425.19.2.el8_7.x86_64 AMI (08d4770d37a0b97a1), which enables nftables, all AWS CNI pods are pending with:
time="2023-04-17T03:17:29Z" level=info msg="Updating iptables mode to nft"
Installed /host/opt/cni/bin/aws-cni
Installed /host/opt/cni/bin/egress-v4-cni
time="2023-04-17T03:17:29Z" level=info msg="Starting IPAM daemon... "
time="2023-04-17T03:17:29Z" level=info msg="Checking for IPAM connectivity... "
time="2023-04-17T03:17:30Z" level=info msg="Copying config file... "
time="2023-04-17T03:17:30Z" level=info msg="Successfully copied CNI plugin binary and config file."
time="2023-04-17T03:17:30Z" level=error msg="Failed to wait for IPAM daemon to complete" error="exit status 1"
The node's ipamd log has this error:
{"level":"error","ts":"2023-04-27T08:27:59.957Z","caller":"networkutils/network.go:368",
"msg":"host network setup: failed to add nat/AWS-SNAT-CHAIN-14 rule [14] AWS-SNAT-CHAIN shouldExist true rule [
! -d 100.64.0.0/16 -m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-15], running [/usr/sbin/iptables -t nat -A AWS-SNAT-CHAIN-14 ! -d 100.64.0.0/16 -m comment --comment AWS SNAT CHAIN -j AWS-SNAT-CHAIN-15 --wait]: exit status 4: iptables v1.8.4 (nf_tables): RULE_APPEND failed (Too many links): rule in chain AWS-SNAT-CHAIN-14\n"}
AWS CNI adds a chain rule for each CIDR in the VPC, and nftables has a hard limit, NFT_JUMP_STACK_SIZE = 16, defined in the kernel code (e.g. https://codebrowser.dev/linux/linux/include/net/netfilter/nf_tables.h.html#22). So for VPCs that have 15 or more CIDRs, the CNI cannot attach all of its rules in nftables mode, which causes the error above.
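The failure mode can be illustrated with a small model. This is a minimal sketch, assuming (from the log line above) that each VPC CIDR gets its own chain AWS-SNAT-CHAIN-<i> whose single rule jumps to the next chain, and that the first chain is entered by one jump from POSTROUTING. The function names, the example CIDRs, and the depth accounting are illustrative assumptions; this is neither the plugin's code nor the kernel's exact check, which may differ between kernel versions.

```python
NFT_JUMP_STACK_SIZE = 16  # limit cited in the report (nf_tables.h)


def build_nested_jumps(cidrs):
    """Return (chain, rule) pairs for the nested layout seen in the log.

    Assumption: chain i holds one rule that jumps to chain i+1 when the
    destination is outside cidrs[i].
    """
    rules = [("POSTROUTING", "-j AWS-SNAT-CHAIN-0")]
    for i, cidr in enumerate(cidrs):
        rules.append(
            (f"AWS-SNAT-CHAIN-{i}", f"! -d {cidr} -j AWS-SNAT-CHAIN-{i + 1}")
        )
    return rules


def first_rejected_rule(cidrs, limit=NFT_JUMP_STACK_SIZE):
    """Return the first (chain, rule) whose jump depth reaches the limit, or None.

    Assumption: depth is counted as the number of nested jumps starting
    from the base chain, and a jump is rejected once depth == limit.
    """
    for depth, entry in enumerate(build_nested_jumps(cidrs), start=1):
        if depth >= limit:
            return entry
    return None


if __name__ == "__main__":
    # Hypothetical CIDRs, used only to drive the model.
    cidrs = [f"10.{i}.0.0/16" for i in range(15)]
    print(first_rejected_rule(cidrs))
    print(first_rejected_rule(cidrs[:14]))
```

Under these assumptions, 15 CIDRs make the jump out of AWS-SNAT-CHAIN-14 the first one to hit the limit, which matches the chain named in the log, while 14 CIDRs stay below it.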
Attach logs
What you expected to happen:
AWS CNI in nf_tables mode should be aware of the hard limit NFT_JUMP_STACK_SIZE = 16 and avoid creating such a long chain of jumps.
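One way to keep the jump depth independent of the number of CIDRs, sketched here purely as an illustration: use a single chain in which each VPC CIDR gets its own early-return rule, followed by one SNAT rule. The chain name AWS-SNAT-FLAT, the helper names, and the example addresses are hypothetical, and this is not presented as the layout the project adopted.

```python
# Targets used in this sketch that do not transfer control to a user chain.
BUILTIN_TARGETS = {"RETURN", "SNAT"}


def build_flat_rules(cidrs, snat_ip, chain="AWS-SNAT-FLAT"):
    """Return iptables argument lists for a hypothetical single-chain layout."""
    rules = [
        ["-t", "nat", "-N", chain],                       # create the chain
        ["-t", "nat", "-A", "POSTROUTING", "-j", chain],  # the only jump
    ]
    for cidr in cidrs:
        # Traffic destined for a VPC CIDR leaves the chain without SNAT.
        rules.append(["-t", "nat", "-A", chain, "-d", cidr, "-j", "RETURN"])
    # Anything that matched no VPC CIDR is source-NATed.
    rules.append(["-t", "nat", "-A", chain, "-j", "SNAT", "--to-source", snat_ip])
    return rules


def count_chain_jumps(rules):
    """Count rules whose target is a user-defined chain."""
    count = 0
    for args in rules:
        if "-j" in args and args[args.index("-j") + 1] not in BUILTIN_TARGETS:
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical CIDRs and SNAT address.
    rules = build_flat_rules([f"10.{i}.0.0/16" for i in range(20)], "192.0.2.10")
    for args in rules:
        print("iptables", " ".join(args))
    print("jumps to user chains:", count_chain_jumps(rules))
```

In this layout the number of rules grows with the number of CIDRs, but the only jump into a user chain is the one from POSTROUTING, so the nesting depth stays constant.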
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
[Redacted]
Environment:
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.9", GitCommit:"a1a87a0a2bcd605820920c6b0e618a8ab7d117d4", GitTreeState:"clean", BuildDate:"2023-04-12T12:16:51Z", GoVersion:"go1.19.8", Compiler:"gc", Platform:"darwin/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.17-eks-ec5523e", GitCommit:"d9e9b09276855a532739ef8cb728194aa145430b", GitTreeState:"clean", BuildDate:"2023-03-20T18:46:36Z", GoVersion:"go1.19.6", Compiler:"gc", Platform:"linux/amd64"}
CNI Version
v1.12.2
OS (e.g. cat /etc/os-release):
[root@ANL10230842 ~]# cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="8.7 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.7"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.7 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/red_hat_enterprise_linux/8/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.7
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.7"
Kernel (e.g. uname -a):