-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race between ResendRequest and SequenceReset when using async SessionProxy #503
Comments
The test also tries to reproduce an inversion of SendSequenceReset and ResendRequest messages, but it is not reproduced (it happens only when a SessionProxy causes the ResendRequest to be sent asynchronously).
I added a system test:
I guess the fix would be to modify the replaying code so that it sends a message to the library to instruct it to send the SequenceReset message, but I have no idea where to do this :-/ EDIT: after another look, the problem is probably that the saveValidResendRequest() is directly called by the session when receiving the ResendRequest. I will try to make it go through a new method in SessionProxy, so we can ensure that the replaying is triggered only after we have fully sent our own ResendRequest. |
I pushed a possible solution in master...pcdv:artio:master I reused the existing ResendRequestController interface and added a delay() method in ResendRequestResponse so the Session does not immediately execute the ResendRequest. It is then possible to call method Session#executeResendRequest(int, uk.co.real_logic.artio.util.AsciiBuffer, int, int, int) asynchronously. This method has been extracted from current code, whose default behavior hasn't changed. |
I submitted PR #505. Finally, I didn't modify the ResendRequestController (except its javadoc) but added a method in SessionProxy, and a public method in Session that can be invoked by the proxy when ready to process the ResendRequest. |
Consider the following scenario:
After the second connection ARTIO sent messages in the wrong order:
The sequence would be correct (IMHO) if the ResendRequest was sent before the SequenceReset, or if there was NewSeqNo = 7 in the SequenceReset. This sequence causes a disconnection because EXCHANGE detects a "MsgSeqNum too low" issue.
The problem seems related to the fact that the SequenceReset is sent directly by the Replayer when it is done replaying, while the ResendRequest is directly sent by the library when it detects the high MsgSeqNum in Logon message.
I should add that I use a SessionProxy to route all outbound messages through a cluster, and this is probably increasing the probability of such a bug as it delays the sending of the ResendRequest. NB: it seems that SessionProxy#sendSequenceReset() is never called, so I don't have the opportunity to override the value of NewSeqNo to a correct value. But SessionProxy#sendResendRequest() is called as expected.
I'm able to reproduce the bug easily using my custom setup/code. I will try to reproduce it using pure artio dependencies.
The text was updated successfully, but these errors were encountered: