Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JsonRpcConnection: Don't read any data on shutdown #10213

Merged
merged 1 commit into from
Nov 7, 2024

Conversation

yhabteab
Copy link
Member

@yhabteab yhabteab commented Oct 31, 2024

When the Desconnect() method is called, clients are not disconnected immediately. Instead, a new coroutine is spawned using the same strand as the other coroutines. This coroutine calls async_shutdown on the TCP socket, which might be blocking. However, in order not to block indefintely, the Timeout class cancels all operations on the socket after 10 seconds. Though, the timeout does not trigger the handler immediately; it spawns another coroutine using the same strand as in the JsonRpcConnection class. This can cause unexpected delays if e.g. HandleIncomingMessages gets resumed before the coroutine from the timeout class. Apart from that, the coroutine for writing messages uses the same condition, making the two symmetrical.

I noticed this while analyzing the logs from ref/NC/820479.

@yhabteab yhabteab added bug Something isn't working area/distributed Distributed monitoring (master, satellites, clients) labels Oct 31, 2024
@yhabteab yhabteab added this to the 2.15.0 milestone Oct 31, 2024
@cla-bot cla-bot bot added the cla/signed label Oct 31, 2024
lib/remote/jsonrpcconnection.cpp Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
lib/remote/jsonrpcconnection.cpp Outdated Show resolved Hide resolved
@yhabteab yhabteab added consider backporting Should be considered for inclusion in a bugfix release ref/NC labels Oct 31, 2024
@yhabteab yhabteab force-pushed the do-not-read-data-on-disconnect branch from b9cbe94 to 933abc3 Compare October 31, 2024 15:17
When the `Desconnect()` method is called, clients are not disconnected
immediately. Instead, a new coroutine is spawned using the same strand
as the other coroutines. This coroutine calls `async_shutdown` on the
TCP socket, which might be blocking. However, in order not to block
indefintely, the `Timeout` class cancels all operations on the socket
after `10` seconds. Though, the timeout does not trigger the handler
immediately; it creates spawns another coroutine using the same strand
as in the `JsonRpcConnection` class. This can cause unexpected delays if
e.g. `HandleIncomingMessages` gets resumed before the coroutine from the
timeout class. Apart from that, the coroutine for writing messages uses
the same condition, making the two symmetrical.
@yhabteab yhabteab force-pushed the do-not-read-data-on-disconnect branch from 933abc3 to 1c34610 Compare October 31, 2024 16:09
@yhabteab yhabteab requested review from Al2Klimov and julianbrost and removed request for julianbrost and Al2Klimov October 31, 2024 16:09
@Al2Klimov Al2Klimov merged commit 9a8620d into master Nov 7, 2024
27 checks passed
@Al2Klimov Al2Klimov deleted the do-not-read-data-on-disconnect branch November 7, 2024 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/distributed Distributed monitoring (master, satellites, clients) bug Something isn't working cla/signed consider backporting Should be considered for inclusion in a bugfix release ref/NC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants