replace netcat with socat for liveness probe/metrics #22
We've had an issue on Google Kubernetes Engine, on a node with
kernel version 4.14.138+, where liveness probes would regularly fail
some percentage of the time.
We've traced the problem down to the `poll()` system call sometimes failing in the `nc` command used in the liveness probe, whereupon `nc` returns an empty response, despite Zookeeper clearly sending back an `imok` over the TCP connection.
Netcat uses `select()`, `poll()`, `read()`, where `poll()` sometimes fails with an error because Zookeeper has closed the TCP connection.
Socat uses `select()`, `read()`, which works here.
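For illustration only, the `select()` then `read()` sequence described above can be sketched in Python. This is not socat's implementation and not the probe script from this change. The function name `zk_ruok`, the default host, port 2181, and the timeout are assumptions for the sketch, and it assumes the probe sends ZooKeeper's `ruok` four-letter command and expects `imok` back. The point it shows is that a close by the server after the reply is treated as a normal end of data rather than as an error.

```python
import select
import socket


def zk_ruok(host: str = "127.0.0.1", port: int = 2181, timeout: float = 2.0) -> bool:
    """Send "ruok" and report whether the server answered "imok".

    Hypothetical helper: name, defaults, and timeout are assumptions.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"ruok")
        chunks = []
        while True:
            # Wait until the socket is readable. Both pending data and a
            # close by the peer make it readable.
            readable, _, _ = select.select([sock], [], [], timeout)
            if not readable:
                break  # timed out waiting for the server
            data = sock.recv(64)
            if not data:
                break  # peer closed the connection: normal end of the reply
            chunks.append(data)
    return b"".join(chunks) == b"imok"
```

A probe built on this sketch would exit non-zero when `zk_ruok()` returns `False`, so that an empty response is reported as a failure only when the server really sent nothing.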