Connectivity to Kubernetes API service timeout #1797
Comments
@ErikLundJensen - Please let us know if it is possible to open a support ticket. With the ticket, if you can share the cluster ARN, we can verify iptables and also check whether it is a known issue like this one - kubernetes/client-go#374
@ErikLundJensen The CNI doesn't set up any iptables rules to facilitate API server access. Also, I see that the hostNetwork and Pod IP ranges are different. Are you using Custom networking with CNI? Also, any reason you carved out both the Pod (10.64.x.x) and Service (10.100.x.x) IP ranges from …
Yes, we are using Custom networking with CNI. The subnet in each availability zone gets a block from 10.64.x.x (for example 10.64.64.0/18); however, the service IP range is not tied to any particular availability zone and therefore gets another CIDR block. As @achevuru wrote, it is the responsibility of kube-proxy to set up the iptables rules mapping 10.100.0.1 to 192.168.0.12 and 192.168.0.91, and therefore this issue is most likely not related to the AWS CNI. However, this could either be due to re-connection issues as @jayanthvn wrote, or even related to …
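One way to narrow this down (not from the original thread, just a sketch): since kube-proxy only rewrites the destination address, probing both the ClusterIP and the endpoint IPs directly from an affected pod tells you whether the DNAT step or the underlying node-to-node path is failing. The addresses below are the ones mentioned in this issue; adjust them for your cluster.

```go
// Minimal connectivity probe (hypothetical diagnostic, not part of this issue).
// Run it inside an affected pod to compare reachability of the service ClusterIP
// (translated by kube-proxy) with the API server endpoint IPs it should map to.
package main

import (
	"fmt"
	"net"
	"time"
)

func probe(addr string) {
	conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
	if err != nil {
		fmt.Printf("%-20s FAILED: %v\n", addr, err)
		return
	}
	conn.Close()
	fmt.Printf("%-20s OK\n", addr)
}

func main() {
	// ClusterIP of the default/kubernetes service (DNAT'ed by kube-proxy on the node)
	probe("10.100.0.1:443")
	// API server endpoint IPs from the default/kubernetes Endpoints object
	probe("192.168.0.12:443")
	probe("192.168.0.91:443")
}
```

If the ClusterIP probe fails but the endpoint probes succeed, the kube-proxy rules are suspect; if all three fail from pods on the 10.64.x.x range while hostNetwork pods succeed, the path from the custom-networking subnets to the API server is being blocked, for example by security groups.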
I recreated the cluster from scratch using Terraform and realised that a couple of security groups were not destroyed. The result was a mix of old and new security groups having the same value for the Name tag, but with different generated security group names.
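A quick way to spot such leftovers (a sketch, assuming aws-sdk-go v1 and a hypothetical tag value my-cluster-node-sg) is to list every security group carrying the same Name tag; more than one match means an old group survived the rebuild.

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := ec2.New(sess)

	// List all security groups whose Name tag matches; duplicates with different
	// generated group names indicate leftovers from a previous cluster.
	out, err := svc.DescribeSecurityGroups(&ec2.DescribeSecurityGroupsInput{
		Filters: []*ec2.Filter{
			{Name: aws.String("tag:Name"), Values: []*string{aws.String("my-cluster-node-sg")}},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, sg := range out.SecurityGroups {
		fmt.Printf("%s\t%s\n", aws.StringValue(sg.GroupId), aws.StringValue(sg.GroupName))
	}
}
```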
What happened:
We have been running the AWS CNI for months, but today, when scaling the cluster from 0 to a few nodes, we ran into the following problem.
CoreDNS gets a timeout when trying to connect to the Kubernetes API at 10.100.0.1.
The CNI uses 10.64.x.x networks for the pod networks.
The service CIDR is 10.100.0.0.
The host network is 192.168.x.x.
When testing the connectivity from other pods we see the same problem, except for pods that use hostNetwork (192.168.x.x).
The Kubernetes Endpoints and Service in the "default" namespace look fine - I have compared them with another environment running the same setup.
Where is the translation from Kubernetes ClusterIP 10.100.0.1 to Endpoint 192.168.0.12 resolved?
kube-proxy adds kubernetes service port at each node as seen from the kube-proxy log:
Adding new service port "default/kubernetes:https" at 10.100.0.1:443/TCP
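For reference, that translation is performed by kube-proxy on each node: it programs iptables DNAT rules that rewrite the ClusterIP 10.100.0.1 to the addresses stored in the default/kubernetes Endpoints object. A small client-go sketch (an illustration, not part of the original report) that dumps both sides of the mapping:

```go
package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Assumes it runs inside the cluster with RBAC that allows reading
	// services and endpoints in the "default" namespace.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	svc, err := client.CoreV1().Services("default").Get(context.TODO(), "kubernetes", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}
	ep, err := client.CoreV1().Endpoints("default").Get(context.TODO(), "kubernetes", metav1.GetOptions{})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println("ClusterIP:", svc.Spec.ClusterIP)
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			fmt.Println("Endpoint:", addr.IP) // kube-proxy DNATs the ClusterIP to these
		}
	}
}
```

Note that with rest.InClusterConfig the probe itself reaches the API server through the ClusterIP, so run it from a hostNetwork pod (which this issue reports as working) or swap in a kubeconfig-based config.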
The aws-node pod (running the AWS CNI with custom networking) does not report any errors in the logs.
Environment:
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-0389ca3", GitCommit:"8a4e27b9d88142bbdd21b997b532eb6d493df6d2", GitTreeState:"clean", BuildDate:"2021-07-31T01:34:46Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}