Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure signer doesn't exit on connection change #524

Merged
merged 1 commit into from
Oct 7, 2024
Merged

Conversation

roeierez
Copy link
Collaborator

@roeierez roeierez commented Oct 6, 2024

We have noticed that when changing a network while greenlight (using breez sdk) is running we no longer able to create an invoice.
Digging into this I have noticed the signer exists with these logs:

[2024-10-06 14:26:41.256 DEBUG tonic::codec::decode:204] decoder inner stream error: Status { code: Unknown, message: "error reading a body from connection: address not available", source: Some(hyper::Error(Body, Error { kind: Io(Kind(AddrNotAvailable)) })) }
[2024-10-06 14:26:41.256 DEBUG gl_client::signer:855] Got an error from the scheduler status: Unknown, message: "error reading a body from connection: address not available", details: [], metadata: MetadataMap { headers: {} }
[2024-10-06 14:26:41.256 ERROR gl_client::signer:807] Scheduler signer loop exited unexpectedly: Err(Scheduler stream error Status { code: Unknown, message: "error reading a body from connection: address not available", source: Some(hyper::Error(Body, Error { kind: Io(Kind(AddrNotAvailable)) })) })
[2024-10-06 14:26:41.257 INFO gl_client::signer:812] Exiting the signer loop
[2024-10-06 14:26:41.257 INFO breez_sdk_core::greenlight::node_api:1220] signer exited gracefully

This PR ensures the signer won't exit and will indeed run forever as intended.
The same approach of run_forever_inner was applied to run_forever_scheduler

Note: while this prevents the signer from exiting and allow the app to recover there is still a gap of ~3 minutes before the new connected signer is receiving messages. Because I clearly see that after the fix the local signer reconnects successfully I wonder if some keep alive settings on the backend side makes the scheduler "think" the old signer is still connected...

@roeierez roeierez requested review from cdecker and nepet October 6, 2024 13:25
@JssDWt
Copy link
Collaborator

JssDWt commented Oct 6, 2024

Might fix breez/breez-sdk-greenlight#1090
Might fix #521

@cdecker
Copy link
Collaborator

cdecker commented Oct 7, 2024

Nice change, thanks @roeierez 👍

Note: while this prevents the signer from exiting and allow the app to recover there is still a gap of ~3 minutes before the new connected signer is receiving messages. Because I clearly see that after the fix the local signer reconnects successfully I wonder if some keep alive settings on the backend side makes the scheduler "think" the old signer is still connected...

We buffer all pending requests, and will redeliver them to any signer that connects, redundantly, exactly for this case where a signer dies silently. So if the signer isn't immediately getting any pending request on reconnect we need to take a look at it. Generally speaking we let the client drive as much as possible, since it most likely has the most information on its network and connectivity state (power safe mode, network switch, etc), whereas the server side is just always on.

@cdecker cdecker merged commit 4b54918 into main Oct 7, 2024
13 checks passed
@cdecker cdecker deleted the fix-scheduler-loop branch October 7, 2024 09:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants