Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to load correct certificates if 1 invalid one is in consul k/v #941

Open
finwo opened this issue Feb 21, 2024 · 2 comments
Open

Unable to load correct certificates if 1 invalid one is in consul k/v #941

finwo opened this issue Feb 21, 2024 · 2 comments

Comments

@finwo
Copy link

finwo commented Feb 21, 2024

Summary

When there's a single invalid certificate in the consul certificate store in use by fabio, all the other certificates in the store will not be refreshed.

Upon a fresh start of fabio (like a server reboot), this causes no single ssl endpoint to function anymore.

Steps to reproduce

We're running fabio using the fabiolb/fabio:latest image from dockerhub on a nomad cluster as system job (running on every worker node).

First off, here's the used config loaded into /etc/fabio/fabio.properties within the container from a nomad template:

proxy.addr = :{{env "NOMAD_PORT_http"}};proto=http,:{{env "NOMAD_PORT_https"}}:proto=https;cs=cssl
ui.addr = :{{env "NOMAD_PORT_ui"}}
registry.consul.register.enabled = false
{{with nomadVar "nomad/jobs/fabio"}}
registry.consul.addr = {{.consul_addr}}
proxy.cs = cs=cssl;type=consul;cert=http://{{.consul_addr}}/v1/kv/certs
{{end}}

The certificates are loaded into /certs within consul as bundles, so /certs/example.com.pem contains the full chain including the private key.

To replicate the issue, insert a single certificate where the private key does not match the certificate. This will result in fabio not loading any certificates.

Expected result

Fabio to load all valid certificates, and ignore/skips the invalid certificates, allowing domains where valid certificates are available for to continue to function.

Actual result

Fabio refuses to load or update any certificate, causing all services it's loadbalancing to fail upon ssl initialization because it fails to find a valid certificate for the servername indicated by SNI.

@tristanmorgan
Copy link
Member

func loadCertificates(...) tries to load all certs and if one fails, an error is returned.
The logs do reflect this and this is expected behaviour for the code as it stands now. should we change it is another question.

@finwo
Copy link
Author

finwo commented Jan 11, 2025

For proof-of-concept setups having a single invalid one fail the entire batch makes sense, but for production setups having one invalid certificate not take down the entire updating or ingesting of all certificates "feels" unstable.

Note that it does not take down already-running instances, but it does remove the ability to spin up new instances of fabio. When running a cluster on hardware, this is not a major issue, but when running a cluster on spot-instances in AWS where there's significant churn this will take down the entire ingress path of the cluster.

So even if both types of behaviors would be desirable depending on the context, having that be configurable would be a major improvement in my opinion.

Regardless of the approach taken, having a clearly defined log that states which certificate was invalid would be a great help already, allowing me to add a pattern matcher to our logs that notifies me whenever something goes wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants