Replica lag throttler not respecting lag time #155

Open
colin-strong opened this issue Jan 18, 2024 · 1 comment

Comments

@colin-strong

Ad platform has configured LHMs to use the replica lag throttler.

Running an LHM shows the following logs:

INFO -- : Max current replica lag: 0

(Splunk)

However, Observe shows that replication lag for the database undergoing the LHM is over 40 minutes:

[Screenshot: 2024-01-18 at 7:24:20 AM]

@tgwizard

When I've seen this before, it was because the LHM replica lag throttler resolves the addresses of the replicas by querying the writer:

https://github.com/Shopify/lhm/blob/5f748d2833883a9ccb74732e746a8bb2867ee83a/lib/lhm/throttler/replica_lag.rb#L89C7-L89C31

It then tries to connect to the replicas through those (IP) addresses and fails because it doesn't have access (network, permissions, etc.). Replica access usually goes through ProxySQL rather than directly to the DB, but the replica addresses that the writer sees are the IP addresses of the actual replica DBs.
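
To illustrate how a failed replica connection could surface as a lag of 0 in the logs, here is a minimal sketch. It is not the LHM code: `UnreachableReplica`, `max_replica_lag`, `max_replica_lag_strict`, and the `lag_reader` callable are hypothetical names made up for this example, and it assumes unreachable replicas are skipped and the maximum is floored at 0.

```ruby
# Hypothetical sketch, NOT the LHM implementation.
# Assumption: replicas that cannot be reached are silently skipped.
class UnreachableReplica < StandardError; end

# hosts:      replica addresses as reported by the writer
# lag_reader: callable that returns lag in seconds for a host,
#             or raises UnreachableReplica if it cannot connect
def max_replica_lag(hosts, lag_reader)
  lags = hosts.filter_map do |host|
    begin
      lag_reader.call(host)
    rescue UnreachableReplica
      nil # skipped: this replica contributes nothing to the maximum
    end
  end
  # With no reachable replicas, the only candidate left is the 0 floor.
  lags.push(0).max
end

# Stricter variant: a connection failure propagates instead of
# being reported as "no lag".
def max_replica_lag_strict(hosts, lag_reader)
  lags = hosts.map { |host| lag_reader.call(host) }
  raise UnreachableReplica, "no replicas to measure" if lags.empty?
  lags.max
end

# A reader that can never connect, as when the throttler is handed
# direct replica IPs it has no network access to.
blocked = ->(host) { raise UnreachableReplica, host }
puts max_replica_lag(["10.0.0.5"], blocked) # prints 0
```

Under these assumptions a throttler that cannot reach any replica looks the same as one whose replicas are fully caught up, which would match the `Max current replica lag: 0` log line while the real lag is 40 minutes. The strict variant raises instead, so the failure is visible.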
