feat: node recovery strategies #2076

weboko · 2024-07-18T13:56:49Z

This is a feature request

Problem

Prerequisite: #2070

Once we can tell what state a light node is in, we should provide clear strategies for consumers to recover from undesirable states.

Proposed Solutions

For each node health state we should develop a method that can be triggered to recover from it.
Additionally, we should introduce an option networkRecovery: boolean that makes triggering these recovery methods automatic.

This behavior should be off by default, and it should be tested before it is turned on by default.

Node health state:

- sufficiently healthy: no action needed;
- minimally healthy:
  - find and establish connections to new peers to meet the health requirements;
- unhealthy:
  - if possible, reuse the previous operation (check in practice whether it works);
  - if not, implement a hard reset operation that re-establishes connections to the bootstrap nodes and starts all over again;
- expose the API mentioned above;
- implement automatic triggering of these operations if the networkRecovery option was provided;

Note: the hard reset operation won't help if the node is offline (disconnected from the Internet), and we should be clear about that in our API, documentation, and behavior.
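
To make the proposal above concrete, here is a minimal sketch of how the health states could map to recovery operations, with automatic triggering gated behind networkRecovery. Every name in it (HealthState, RecoveryOps, planRecovery, recover, makeHealthHandler, isOnline) is hypothetical and chosen for illustration only; none of it is existing js-waku API. It also assumes the "previous operation" for an unhealthy node is the same peer-finding step, which the issue leaves to be checked in practice.

```typescript
// Hypothetical names throughout: nothing here exists in js-waku today.
type HealthState = "SufficientlyHealthy" | "MinimallyHealthy" | "Unhealthy";
type RecoveryAction = "none" | "findPeers" | "hardReset";
type RecoveryResult = RecoveryAction | "skippedOffline";

interface RecoveryOps {
  // Assumed connectivity check supplied by the host environment.
  isOnline(): boolean;
  // Find and dial new peers; resolves to true once requirements are met.
  findPeers(): Promise<boolean>;
  // Drop current state and reconnect to the bootstrap nodes.
  hardReset(): Promise<void>;
}

// Ordered list of actions to attempt for a given health state.
function planRecovery(state: HealthState): RecoveryAction[] {
  switch (state) {
    case "SufficientlyHealthy":
      return ["none"];
    case "MinimallyHealthy":
      return ["findPeers"];
    case "Unhealthy":
      return ["findPeers", "hardReset"];
  }
}

// Runs the plan and returns the last action attempted.
async function recover(
  state: HealthState,
  ops: RecoveryOps
): Promise<RecoveryResult> {
  let last: RecoveryResult = "none";
  for (const action of planRecovery(state)) {
    if (action === "none") break;
    // No action helps without Internet access, so report that instead.
    if (!ops.isOnline()) return "skippedOffline";
    last = action;
    if (action === "findPeers" && (await ops.findPeers())) break;
    if (action === "hardReset") await ops.hardReset();
  }
  return last;
}

// Automatic triggering is opt-in: with networkRecovery unset, nothing runs.
function makeHealthHandler(
  options: { networkRecovery?: boolean },
  ops: RecoveryOps
): (state: HealthState) => Promise<RecoveryResult> {
  return async (state) =>
    options.networkRecovery ? recover(state, ops) : "none";
}
```

Returning "skippedOffline" rather than attempting a reset is one way to be explicit in the API that a hard reset cannot fix a lost Internet connection.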


weboko · 2024-07-18T15:49:11Z

From discussion:

partially healthy strategy:
- from @danisharora099: the background approach should be sufficient

weboko · 2024-07-18T15:50:37Z

Pinging @vpavlin and @hackyguru for their perspective on the issue, as we are not sure whether it is needed.

vpavlin · 2024-07-18T16:46:07Z

I am a bit confused by

For each node health state we should develop a method

I'd expect there are basically 2 methods - "find new nodes" and "hard reset" which can potentially be used in any health state by the app dev in case their app notices anything weird?

And then automating it on waku level by setting networkRecovery: true

Or maybe this is what has been said?:)
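
For illustration, the two-method shape described in this comment might look like the sketch below, where the app developer escalates manually whenever the app notices something off. The interface RecoverableNode, its methods findNewPeers and hardReset, and the looksHealthy callback are all assumptions made up for this example, not existing js-waku API.

```typescript
// Hypothetical interface: js-waku does not expose these methods today.
interface RecoverableNode {
  // Soft recovery: discover and connect to additional peers.
  findNewPeers(): Promise<void>;
  // Hard recovery: reconnect to bootstrap nodes and start over.
  hardReset(): Promise<void>;
}

// App-defined signal, e.g. "did I receive a message recently?".
type LooksHealthy = () => boolean;

// Tries the cheaper method first and resets only if that did not help.
async function escalate(
  node: RecoverableNode,
  looksHealthy: LooksHealthy
): Promise<"ok" | "foundPeers" | "reset"> {
  if (looksHealthy()) return "ok";
  await node.findNewPeers();
  if (looksHealthy()) return "foundPeers";
  await node.hardReset();
  return "reset";
}
```

Setting networkRecovery: true would then amount to the library running an equivalent escalation itself, driven by its own health signal instead of an app-supplied one.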

weboko · 2024-07-22T08:50:07Z

@vpavlin there could be two methods or just one, depending on which we find works better.

But the question here is more: do we need it? Have you noticed such a need before when talking to people who use js-waku?

weboko · 2024-08-20T13:36:47Z

As we don't have enough evidence that it would be a significant improvement for developers, iceboxing for now.

weboko · 2024-10-16T22:27:39Z

From https://github.com/waku-org/support/issues/2

Maybe we need a full network wipe feature (zy0n)

Ideally js-waku should be able to recover from bad situations.
Keeping this iceboxed for now; we need more feedback after fixing the original problem with Filter #2158.

chair28980 mentioned this issue Jul 18, 2024

[Epic: js-waku] Reliability Protocol for Resource-Restricted Clients #2154

Closed

39 tasks

fryorcraken added this to Waku Jul 18, 2024

weboko mentioned this issue Jul 18, 2024

feat: introduce node health as a metric #2070

Closed

weboko moved this to Triage in Waku Jul 18, 2024

weboko moved this from Triage to Blocked in Waku Jul 18, 2024

weboko moved this from Blocked to Icebox in Waku Aug 20, 2024
