Enabling relay client deteriorates p2p connectivity #9751
Comments
Thanks for reporting @smrz2001! We expect this will be fixed by upgrading to go-libp2p 0.26.4 per https://github.com/libp2p/go-libp2p/releases/tag/v0.26.4 and libp2p/go-libp2p#2208. We will do this in 0.19.1 early in the week of 2023-03-27: #9754
I assigned this to you @Jorropo as I assume you'll do the go-libp2p update.
@BigLep this is a multifaceted bug.
Oh ok, thanks - good to know!
Not resolving, but for visibility: 0.19.1 did ship with the updated go-libp2p version, which "may help".
This seems to be blocked on the linked go-libp2p issues. If this is still causing you problems with the latest go-libp2p (which fixes some but not all of the related issues), please post back.
Triage update: we had go-libp2p updates recently, @Jorropo will check if he has time.
Things are dodgy while libp2p/go-libp2p#1603 is still not fixed; this can cause issues like this one. I think it's a better use of time to fix this in go-libp2p before trying to debug further.
Triage note: it seems that the last PR we are waiting for is libp2p/go-libp2p#2542, and then we need a go-libp2p release to fix this.
Triage notes:
Oops, seems like we needed more information for this issue, please comment with more details or this issue will be closed in 7 days.
Thanks everyone for working on this!! @lidel, connecting from a local Kubo 0.24.0 node with the Relay Client enabled to one of our IPFS nodes (on Kubo v0.19.1) appears to be working, and the swarm connection is stable. I'm also able to fetch recent CIDs from the infra node via the local node. It will be difficult to retest the latest release with the configuration we were running when we saw the issue, unfortunately, but from the above test, it feels reasonable to assume that the issue has, in fact, been fixed. Thanks again!
Checklist
Installation method
built from source
Version
For our partner's node:
For our partner's node:
Description
I'll preface the description by saying that we observed the issue on the latest version of Kubo at the time (`v0.18.1`). I could not find any relevant issues between then and the new release, and at this point will not be able to upgrade our nodes to try to recreate the issue, especially since pubsub is deprecated in `v0.19.0` and we need pubsub for the time being.
Our investigation ran as follows:
Side note: We spent several days upgrading all the IPFS HTTP client and related dependencies in our Ceramic code to be able to use Kubo `v0.18.1`. Enabling the Relay Client caused `quic-v1` multiaddrs to be used, which caused our Ceramic nodes to crash because our IPFS HTTP client did not like the new multiaddrs. We were unable to apply @lidel's recommended patches because our monorepo packaging tool `lerna` does not support applying patches to dependencies during builds 😭
We operated under the assumption that disabling the Relay Client would have been a regression and so forged ahead trying to keep it enabled despite the additional changes required in the Ceramic code.
On both nodes, when running `ipfs swarm peers | grep <multiaddr>` we were only able to see `p2p-circuit` multiaddrs, even though both nodes' advertised multiaddrs are publicly accessible.
We even added our node to our partner's node under the `Peering` section of the configuration, and their node to our node's configuration, but this also didn't help with either connectivity or with the multiaddrs showing up in `ipfs swarm peers`. (Note that the config above no longer has our partner node's peer ID in our configuration - we decided to have all partner nodes have our Kubo nodes in their `Peering` config.)
We tried `ipfs swarm disconnect <multiaddr>` on both sides as well to see if anything changed when they reconnected but the behavior remained the same.
We tried disabling the Relay Client and repeating the previous step, but again nothing changed.
When we then explicitly disconnected from the Circuit Relay bootstrap nodes after disabling the Relay Client, it all started working perfectly! We saw direct swarm connections from both sides and the lookup timeouts practically disappeared. (I'm a little fuzzy on which list of nodes exactly we disconnected from but I know for sure they were related to `p2p-circuit` because we just weren't getting direct swarm connections even after disabling the Relay Client.)
Our final setup (also codified in our wrapper image for Kubo) now has the Relay Client explicitly disabled in the configuration.
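
For anyone trying to reproduce this, the "only `p2p-circuit` multiaddrs" check from the steps above could be scripted rather than eyeballed with `grep`. The following is a minimal sketch, not something we ran at the time: it assumes the text output of `ipfs swarm peers` is one multiaddr per line ending in `/p2p/<peer ID>`, and the function names (`classify_peers`, `relay_only_peers`) and the peer IDs in the sample are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def classify_peers(swarm_peers_output: str) -> Dict[str, Dict[str, List[str]]]:
    """Group multiaddrs by peer ID and split them into direct vs. relayed.

    Assumption: each non-empty line is a multiaddr whose last /p2p/<id>
    (or legacy /ipfs/<id>) component identifies the remote peer.
    """
    peers: Dict[str, Dict[str, List[str]]] = defaultdict(
        lambda: {"direct": [], "relayed": []}
    )
    for line in swarm_peers_output.splitlines():
        addr = line.strip()
        if not addr.startswith("/"):
            continue  # skip blank lines or anything that is not a multiaddr
        parts = addr.split("/")
        peer_id = None
        # Walk backwards so that, for relayed addresses, we pick the target
        # peer (after /p2p-circuit) and not the relay in front of it.
        for i in range(len(parts) - 2, -1, -1):
            if parts[i] in ("p2p", "ipfs"):
                peer_id = parts[i + 1]
                break
        if peer_id is None:
            continue
        kind = "relayed" if "p2p-circuit" in parts else "direct"
        peers[peer_id][kind].append(addr)
    return dict(peers)


def relay_only_peers(swarm_peers_output: str) -> List[str]:
    """Return peer IDs that are connected only through a circuit relay."""
    return sorted(
        peer_id
        for peer_id, addrs in classify_peers(swarm_peers_output).items()
        if addrs["relayed"] and not addrs["direct"]
    )


if __name__ == "__main__":
    # Hypothetical sample; real peer IDs look like 12D3KooW... or Qm...
    sample = """
/ip4/203.0.113.7/udp/4001/quic-v1/p2p/PeerA
/ip4/198.51.100.2/tcp/4001/p2p/RelayX/p2p-circuit/p2p/PeerB
"""
    print(relay_only_peers(sample))
```

In practice the input could come from something like `subprocess.run(["ipfs", "swarm", "peers"], capture_output=True, text=True).stdout`. A partner node showing up in `relay_only_peers` while its advertised multiaddrs are publicly reachable would match the symptom described here.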
Please let us know if there is any additional information we can provide, or if someone would like to pair and run some tests on our nodes.
Sorry for not having more data here - we were doing all of this around midnight the night before our ComposeDB Beta launch at EthDenver. We just wanted to get things working and weren't thinking about collecting data for later :(
cc @Jorropo @lidel @BigLep