Use nats for pubsub instead of waku #295

Closed
wants to merge 1 commit into from

snormore
Contributor

@snormore commented Aug 31, 2023

Replaces use of waku for pubsub with nats.

The fork of go-waku used by this node is very out of date and is causing some issues (like #291). This node isn't really using waku/libp2p for anything besides a local/private pubsub network anyway, so we can swap it out for nats to get a more reliable, more mature pubsub solution.

This simplifies the node pretty significantly and also lets us use a more recent version of Go, but it does require a nats server/cluster running outside of these nodes, which they connect to via the --nats.url args.
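As a rough illustration of the new configuration surface, here is a minimal sketch of reading and sanity-checking a `--nats.url` flag. It uses only the Go standard library; the `parseNATSURL` helper, the default value, and the validation rules are assumptions for illustration and are not taken from this PR, which may well use a different flag library and handle comma-separated cluster URLs.

```go
package main

import (
	"flag"
	"fmt"
	"net/url"
	"os"
)

// parseNATSURL is a hypothetical helper: it reads --nats.url from args and
// checks that the value looks like a single NATS URL (nats://host:port).
// A real node may accept several comma-separated URLs; this sketch does not.
func parseNATSURL(args []string) (string, error) {
	fs := flag.NewFlagSet("node", flag.ContinueOnError)
	// Assumed default: a nats server listening locally on the standard port.
	natsURL := fs.String("nats.url", "nats://127.0.0.1:4222", "URL of the external nats server/cluster")
	if err := fs.Parse(args); err != nil {
		return "", err
	}

	u, err := url.Parse(*natsURL)
	if err != nil {
		return "", fmt.Errorf("invalid --nats.url: %w", err)
	}
	if (u.Scheme != "nats" && u.Scheme != "tls") || u.Host == "" {
		return "", fmt.Errorf("invalid --nats.url %q: expected nats://host:port", *natsURL)
	}
	return *natsURL, nil
}

func main() {
	natsURL, err := parseNATSURL(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// The node would hand this URL to its nats client at startup.
	fmt.Println("would connect to", natsURL)
}
```

Failing fast on a malformed URL matters more here than it did with waku, since a node without a reachable nats server has no pubsub at all.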

@neekolas
Collaborator

This is a scarily large diff. And it comes with new infrastructure we'd have to run with high availability.

How would we de-risk this project to make sure we haven't missed any subtle dependencies on libp2p (or for that matter go-waku) inside our code?

What do we think about setting this up as a separate cluster in dev/prod talking to the same DB? We can give it its own load balancers and URLs, point some of our internal apps at them, and do truly end-to-end tests. We'd also want to make sure we have sufficient observability to actually run this reliably and debug issues before moving prod traffic over.

That would also give us time to update all the places across our stack where we spin up local versions of the node so that they also have a nats cluster (xmtp-js, xmtp-ios, xmtp-android, bot-kit-pro).

Given that would take days to test, we'll want some sort of temporary band-aid to the current node issues. Maybe some sort of automatic restarts every ~6 hours?

@neekolas
Collaborator

Also, of course, amazing work @snormore getting this solution together so quickly.
