
Batching with array #108

Open · wants to merge 9 commits into new-index from batching_with_array
Conversation

RCasatta
Collaborator

@RCasatta RCasatta commented Sep 16, 2024

Currently electrs supports batching by separating requests with newlines. However, other implementations also support batching via a JSON array, as explained in the cited issue. This adds support for batching requests via a JSON array.

close Blockstream/esplora#516
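As an illustrative sketch of the distinction (not electrs' actual parsing code, and the function name is hypothetical): each newline-delimited message is either a single JSON-RPC request object or, with this change, a JSON array of requests, and the first non-whitespace byte tells them apart.

```rust
// Hypothetical sketch: classify one newline-delimited message as either a
// JSON-array batch or a single request by its first non-whitespace byte.
fn is_array_batch(line: &str) -> bool {
    line.trim_start().starts_with('[')
}

fn main() {
    // An array batch carries several requests in one message.
    assert!(is_array_batch(
        r#"[{"id":1,"method":"server.ping"},{"id":2,"method":"server.version"}]"#
    ));
    // A plain object is a single (possibly newline-batched) request.
    assert!(!is_array_batch(r#"{"id":1,"method":"server.ping"}"#));
    println!("ok");
}
```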

@RCasatta RCasatta requested a review from shesek September 16, 2024 13:33
src/electrum/server.rs (outdated review thread, resolved)
@shesek
Collaborator

shesek commented Oct 18, 2024

Should we limit the number of requests per batch?

tests/electrum.rs (outdated review thread, resolved)
@RCasatta
Collaborator Author

Should we limit the number of requests per batch?

Do we have limits for newline-separated batch requests? Is it different here? What limit do you propose?

@shesek
Collaborator

shesek commented Oct 24, 2024

Do we have limits for newline-separated batch requests? Is it different here?

We don't, but it is different here. Newline-separated requests are streamed, and individual responses are sent as they become available. We only hold up to 10 pending requests in the queue, then block and stop reading from the TCP socket until a pending request gets processed to free up room (a bounded mpsc::sync_channel).

With array batching, all the requests and all the responses have to be buffered in memory at once, which could use significant resources and be a DoS vector if there are no limits in place.
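The backpressure described above can be sketched with a stdlib mpsc::sync_channel (this is a minimal stand-alone sketch, not electrs' code; the function name and u32 payload are placeholders for real requests):

```rust
use std::sync::mpsc::sync_channel;
use std::thread;
use std::time::Duration;

// A sync_channel with capacity 10 blocks the sender once 10 items are
// pending, so a socket reader using it would naturally stop pulling from
// the TCP socket until the worker frees room in the queue.
fn drain_through_bounded_queue() -> usize {
    let (tx, rx) = sync_channel::<u32>(10); // at most 10 pending requests
    let producer = thread::spawn(move || {
        for i in 0..20 {
            tx.send(i).unwrap(); // blocks while the queue is full
        }
        // tx is dropped here, which ends the receiver's iterator below
    });
    // Give the producer time to fill the queue and block on item #10.
    thread::sleep(Duration::from_millis(50));
    let drained: Vec<u32> = rx.iter().collect(); // consuming unblocks the producer
    producer.join().unwrap();
    drained.len()
}

fn main() {
    assert_eq!(drain_through_bounded_queue(), 20);
    println!("ok");
}
```

With array batching there is no such intermediate point to apply backpressure: the whole array must be parsed before any request runs, hence the need for an explicit size limit.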

@RCasatta
Collaborator Author

RCasatta commented Oct 28, 2024

What limit do you propose?

Do you think we can afford 20 as the limit for batch requests? I think it's a common number because it's used as the gap limit...
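A guard for the proposed limit could look like the following (an illustrative sketch only; the constant and function names are hypothetical, not electrs' actual code):

```rust
// Hypothetical limit on JSON-array batches, using the value of 20
// proposed in this thread.
const MAX_BATCH_SIZE: usize = 20;

// Reject a parsed batch before executing any of its requests.
fn check_batch_size(batch_len: usize) -> Result<(), String> {
    if batch_len > MAX_BATCH_SIZE {
        Err(format!("batch too large: {} > {}", batch_len, MAX_BATCH_SIZE))
    } else {
        Ok(())
    }
}

fn main() {
    assert!(check_batch_size(20).is_ok()); // at the limit: accepted
    assert!(check_batch_size(21).is_err()); // over the limit: rejected
    println!("ok");
}
```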

@shesek
Collaborator

shesek commented Nov 3, 2024

20 seems fine, yes 👍

However, it seems the tests are failing. The failure isn't directly related to the batching itself but to the TestRunner used to test it.

@RCasatta RCasatta force-pushed the batching_with_array branch from 0bd3299 to d08cf38 Compare November 4, 2024 12:44
@RCasatta
Collaborator Author

RCasatta commented Nov 4, 2024

I didn't notice this locally because I ran the test on its own; I think we don't support running multiple tests concurrently.
My solution is to mark the test as ignored by default and run it separately.

@RCasatta RCasatta force-pushed the batching_with_array branch from c2458ae to 250fb9b Compare November 25, 2024 10:45
@RCasatta
Collaborator Author

Rebased on new-index and fixed a forgotten comment.

junderw pushed a commit to junderw/electrs that referenced this pull request Dec 6, 2024
Update cargo setup method and enable arm64 builds again
@shesek
Collaborator

shesek commented Jan 14, 2025

I think we don't support having multiple concurrent tests.

I tracked down the cause. It's not specific to the tests; it fails whenever electrs is run multiple times in the same process, even if not concurrently.

This appears to happen because the threads sending parallel RPC requests to bitcoind are reused across different electrs instances. Each RPC thread has a thread_local var with a cached Daemon instance (and TCP socket). The thread initially opens a Daemon connection to instance A's bitcoind; later, when the same thread is reused for instance B, the cached connection to A is returned instead, causing the mismatch errors.
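The caching behavior above can be reproduced in miniature with stdlib types (purely illustrative; a String stands in for the cached Daemon/TCP connection, and the names are made up):

```rust
use std::cell::OnceCell;
use std::thread;

// A thread_local OnceCell keeps whatever value the *first* call on a given
// thread stored, so a reused worker thread hands back instance A's
// connection even when instance B asks for one.
thread_local!(static CACHED: OnceCell<String> = OnceCell::new());

fn get_or_connect(wanted: &str) -> String {
    CACHED.with(|c| c.get_or_init(|| wanted.to_string()).clone())
}

fn simulate_thread_reuse() -> (String, String) {
    // Both "instances" run on the same OS thread, as happens when a rayon
    // pool thread is reused across electrs instances.
    thread::spawn(|| {
        let a = get_or_connect("instance-A"); // first call: initializes cache
        let b = get_or_connect("instance-B"); // stale: cache already holds A
        (a, b)
    })
    .join()
    .unwrap()
}

fn main() {
    let (a, b) = simulate_thread_reuse();
    assert_eq!(a, "instance-A");
    assert_eq!(b, "instance-A"); // instance B got A's cached connection
    println!("ok");
}
```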

If you apply this patch:

diff --git a/src/daemon.rs b/src/daemon.rs
index 6808907..48498c7 100644
--- a/src/daemon.rs
+++ b/src/daemon.rs
@@ -493,9 +493,9 @@ impl Daemon {
                 // Store a local per-thread Daemon, each with its own TCP connection. These will
                 // get initialized as necessary for the `rpc_threads` pool thread managed by rayon.
                 thread_local!(static DAEMON_INSTANCE: OnceCell<Daemon> = OnceCell::new());
 
                 DAEMON_INSTANCE.with(|daemon| {
-                    daemon
-                        .get_or_init(|| self.retry_reconnect())
-                        .retry_request(&method, &params)
+                    let d = daemon.get_or_init(|| self.retry_reconnect());
+                    debug!("rpc thread {:?} of {} got {}", std::thread::current().id(), self.daemon_dir.display(), d.daemon_dir.display());
+                    d.retry_request(&method, &params)
                 })
             })
         })

And run:

RUST_LOG=debug cargo test electrum -- --nocapture --test-threads 1 2>&1 | grep 'rpc thread'

You'll see:

DEBUG rpc thread ThreadId(17) of /tmp/.tmp8eAz3c/regtest got /tmp/.tmp8eAz3c/regtest
...
DEBUG rpc thread ThreadId(17) of /tmp/.tmptlPtvs/regtest got /tmp/.tmp8eAz3c/regtest

Note how the second time it's trying to get a Daemon connected to tmptlPtvs, but gets the previous tmp8eAz3c instead because it ran on the same thread.

I was expecting the threads not to be reused across electrs instances, because each instance uses its own rayon::ThreadPool. I can't quite figure out what's causing this. Any ideas?


Successfully merging this pull request may close these issues.

electrum rpc batch request not working