Recently, we encountered a problem: the region epoch on PD fell back because of a PD synchronization issue during a PD leader transfer. As a result, the new PD leader did not have the latest information for some regions, including the peer list and the leader.
The client kept sending requests to the first peer in the old peer list, which did not contain the new leader peer, and got a not leader error from TiKV. After receiving the error, the client refreshed the region cache from PD, but the information on PD was still stale. This loop lasted for 2 minutes and affected the user's business. However, the correct information is already returned in TiKV's response. We can use that information to update the region cache, which avoids depending on PD in cases like this and makes the system more robust.
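The proposed behavior can be sketched roughly as follows. This is only an illustration of the idea, not the client's actual code: `Peer`, `NotLeaderError`, `CachedRegion`, `RegionCache`, and `on_not_leader` are hypothetical names, and the sketch assumes the not leader error may carry the new leader peer as a hint.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Peer:
    peer_id: int
    store_id: int


@dataclass
class NotLeaderError:
    """Hypothetical stand-in for the not leader error returned by a store."""
    region_id: int
    leader: Optional[Peer] = None  # assumed: absent when the store has no hint


@dataclass
class CachedRegion:
    region_id: int
    peers: List[Peer]
    leader: Peer


class RegionCache:
    """Toy region cache keyed by region ID (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._regions: Dict[int, CachedRegion] = {}

    def put(self, region: CachedRegion) -> None:
        self._regions[region.region_id] = region

    def get(self, region_id: int) -> Optional[CachedRegion]:
        return self._regions.get(region_id)

    def on_not_leader(self, err: NotLeaderError) -> bool:
        """Apply the leader hint from the error; return True if it was used."""
        region = self._regions.get(err.region_id)
        if region is None:
            return False
        if err.leader is None:
            # No hint available: drop the entry so the next lookup
            # falls back to reloading the region from PD.
            del self._regions[err.region_id]
            return False
        # The cached peer list may be stale and miss the new leader,
        # so add the peer before switching the leader.
        if err.leader not in region.peers:
            region.peers.append(err.leader)
        region.leader = err.leader
        return True
```

With this shape, a retry after a not leader error goes straight to the peer named in the response, and PD is consulted only when the store gives no hint.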