The keepalived.conf provided by the CPLB creates an infinite loop. #5178
Comments
With this setting, LVS is enabled for all control planes. Since this occurs by chance, you may not encounter the problem right away.
As a temporary workaround, I changed lbAlgo to “sh” as described above. The CPLB lbAlgo defaults to “rr” (round robin), which I don't think is appropriate because it can cause a cycle. Has anyone else noticed this happening?
For example, in this configuration the active VIP is held by 10.1.0.3, and a connection travels as follows:
-> [10.1.0.3 (lvs)] -> [10.1.0.2 (lvs)] -> [10.1.0.2 (kube-apiserver)]
Because of “lb_algo sh”, the selection made on 10.1.0.3 (lvs) and the selection made on 10.1.0.2 (lvs) produce the same result, so no infinite loop occurs.
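The difference between the two schedulers can be illustrated with a small toy model. This is only a sketch, not how IPVS is implemented: the node list, the client address, the CRC-based hash, and the "pinned" persistence table are all assumptions made for illustration. It treats each control plane as an independent director and follows a SYN until some node selects itself.

```python
import zlib

# Hypothetical control plane addresses, in the style of the example above.
NODES = ["10.1.0.1", "10.1.0.2", "10.1.0.3"]


def source_hash_pick(client_ip, nodes):
    """Toy stand-in for `lb_algo sh`: depends only on the client address,
    so every director computes the same target for the same client."""
    return nodes[zlib.crc32(client_ip.encode()) % len(nodes)]


def trace(entry, pick_for, max_hops=8):
    """Follow a SYN from director to director.

    pick_for(director) returns the node that director selects.
    Returns (path, delivered); delivered is True once a node selects itself.
    """
    path = [entry]
    current = entry
    for _ in range(max_hops):
        target = pick_for(current)
        if target == current:
            return path, True
        path.append(target)
        current = target
    return path, False


if __name__ == "__main__":
    client = "192.0.2.10"  # assumed client address

    # Source hashing: the choice ignores which director is asking.
    print(trace("10.1.0.3", lambda director: source_hash_pick(client, NODES)))

    # Round robin plus persistence, worst case: each director has pinned
    # this client to the other one, so the SYN bounces until max_hops.
    pinned = {"10.1.0.3": "10.1.0.2", "10.1.0.2": "10.1.0.3"}
    print(trace("10.1.0.3", lambda director: pinned[director]))
```

With source hashing the path has at most one forwarding hop, because the second director arrives at the same answer as the first. With independent per-director state, nothing prevents two directors from pointing at each other.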
I ran into the same issue, and the workaround by @chattytak works for now.
I have confirmed that, in an environment installed with CPLB (workaround already applied) and NLLB enabled, one of the three control planes crashes and cannot be restarted when configuration changes are applied with k0sctl apply. I don't think CPLB is well tested, and it should not be used in production, even with the workaround.
Thanks for the report. Indeed, it's not tested enough, which is why it's a beta feature.
Platform
Linux 5.14.0-362.24.2.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Mar 30 14:11:54 EDT 2024 x86_64 GNU/Linux
NAME="AlmaLinux"
VERSION="9.3 (Shamrock Pampas Cat)"
ID="almalinux"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="AlmaLinux 9.3 (Shamrock Pampas Cat)"
ANSI_COLOR="0;34"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:almalinux:almalinux:9::baseos"
HOME_URL="https://almalinux.org/"
DOCUMENTATION_URL="https://wiki.almalinux.org/"
BUG_REPORT_URL="https://bugs.almalinux.org/"
ALMALINUX_MANTISBT_PROJECT="AlmaLinux-9"
ALMALINUX_MANTISBT_PROJECT_VERSION="9.3"
REDHAT_SUPPORT_PRODUCT="AlmaLinux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.3"
Version
v1.31.1+k0s.1
Sysinfo
`k0s sysinfo`
What happened?
In the keepalived.conf provided by the CPLB, the `virtual_server` section is enabled for all control planes, which is an incorrect setting.

If a SYN is received on the control plane that is the MASTER as defined by the `vrrp_instance` section, it is load balanced according to the `virtual_server` section of the MASTER. If a BACKUP node is selected at this point, the `virtual_server` section is also active on that BACKUP control plane, so load balancing occurs there as well. If the BACKUP then selects the MASTER, `persistence_timeout` is in effect on the MASTER, so the connection goes to the BACKUP again, which in turn sends it back to the MASTER, and so on in a loop.

To solve this problem, move the `virtual_server` section into a separate file and load it with the `include` parameter. Use the `notify_master` and `notify_backup` parameters in the `vrrp_instance` section to run a script that enables the `include` parameter only on the MASTER, comments it out on the BACKUP, and reloads keepalived. (Note that reloading keepalived causes the `notify_backup` script to run again, so a check is needed to prevent a reload loop.)

The following site is a useful reference, although it is in Japanese.
https://weseek.co.jp/tech/2989/#keepalived-2
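As a rough sketch of the notify script described above: the file paths, the include file name, and the reload command below are assumptions for illustration and are not taken from k0s or from the linked article. The reload-loop guard is simply "do nothing if the file already matches the requested state", so the `notify_backup` call triggered by a reload exits without reloading again.

```python
import subprocess
import sys
from pathlib import Path

# All three values are assumptions; adjust them to the actual deployment.
MAIN_CONF = Path("/etc/keepalived/keepalived.conf")
INCLUDE_LINE = "include /etc/keepalived/virtual_servers.conf"  # hypothetical file
RELOAD_CMD = ["systemctl", "reload", "keepalived"]  # assumes a systemd-managed keepalived


def set_include(conf_text, enabled):
    """Return conf_text with the include line active (MASTER) or commented out (BACKUP)."""
    out = []
    for line in conf_text.splitlines(keepends=True):
        # Match the include line whether or not it is currently commented out.
        if line.strip().lstrip("#").strip() == INCLUDE_LINE:
            newline = "\n" if line.endswith("\n") else ""
            line = ("" if enabled else "# ") + INCLUDE_LINE + newline
        out.append(line)
    return "".join(out)


def main(state):
    old = MAIN_CONF.read_text()
    new = set_include(old, enabled=(state == "MASTER"))
    if new == old:
        # Already in the requested state: skip the reload so that the
        # notify script re-run after a reload cannot start a reload loop.
        return
    MAIN_CONF.write_text(new)
    subprocess.run(RELOAD_CMD, check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])  # expected argument: MASTER or BACKUP
```

Under these assumptions the script would be referenced from the `vrrp_instance` section with the state as an argument, for example `notify_master "/usr/local/bin/toggle_vs.py MASTER"` and `notify_backup "/usr/local/bin/toggle_vs.py BACKUP"`, where the script path is hypothetical.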
Steps to reproduce
Expected behavior
No response
Actual behavior
No response
Screenshots and logs
No response
Additional context
No response