Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitswap/server: WANT_HAVE requests should use Blockstore.Has #657

Closed
Wondertan opened this issue Aug 7, 2024 · 12 comments · Fixed by #672
Closed

bitswap/server: WANT_HAVE requests should use Blockstore.Has #657

Wondertan opened this issue Aug 7, 2024 · 12 comments · Fixed by #672
Assignees
Labels
P1 High: Likely tackled by core team if no one steps up

Comments

@Wondertan
Copy link
Member

Currently, WANT_HAVE requests rely on GetSize to check for presence of the requested block and the learnt size gets discarded.

Basically, the Blockstore has to lookup data size that's unused. Depending on Blockstore implementation, this might have different consequences. Intuitively, looking up data size should be more expensive compared to checking data existence. In practice, this is what happens in our case.

Hence the proposal is to add new method to blockstore manager that performs only the Has check and refactor ReceivedMessage to decouple processing of WANT_HAVE and WANT_BLOCK messages.

@Wondertan Wondertan added the need/triage Needs initial labeling and prioritization label Aug 7, 2024
@Wondertan
Copy link
Member Author

This is relatively urgent for us and if there is no capacity from your side to implement this, we could submit a patch as well

@gammazero
Copy link
Contributor

@Wondertan Preliminary investigation does not indicate that there is any specific reason that GetSize was used over Has so I will see if it looks reasonable to get this done now.

@lidel
Copy link
Member

lidel commented Aug 13, 2024

Triage notes:

  • Blockstore.Has sounds very sensible (presence check often cheaper/billed differently than reading size)
  • we were looking into necessity of getting sizes, and plan to make maxBlockSizeReplaceHasWithBlock configurable, @gammazero will look into this
  • we also want to investigate block duplication related to this and other bitswap ENV flag to enable duplicate block metric from bitswap rainbow#159

@Wondertan
Copy link
Member Author

I would also take a look at how cached blockstores(like bloom) would behave with this change.

@gammazero gammazero added P1 High: Likely tackled by core team if no one steps up and removed need/triage Needs initial labeling and prioritization labels Aug 20, 2024
@gammazero
Copy link
Contributor

gammazero commented Aug 30, 2024

It is true that Want_Have requests rely on GetSize to check for presence of the requested block and the size gets discarded. So, using Blockstore.Has for Want_Have requests does offer an slightly faster lookup, however...

it also risks having to do both the Has and GetSize lookup for the same CID since it is possible for there to be both Want_Have and Want_Block requests for the same CID. This is why all the CID's from the wantlist are put into a unique set and then the sizes are looked up for al of them. This way there is only one query per CID, just they are all size lookups.

I think there are other improvements that can probably have a more significant performance improvement. Specifically the use of worker pools and communication between these goroutines can be streamlined. Additionally, high frequency metrics updates may also contribute non-trivial overhead. Improvements will be supplied as more profiling is done.

@Wondertan
Copy link
Member Author

it also risks having to do both the Has and GetSize lookup for the same CID since it is possible for there to be both Want_Have and Want_Block requests for the same CID. 

@gammazero, it possible for both requests to come for the same CID from a single peer? Or do you mean they are deduplicated across multiple peers?

@Wondertan
Copy link
Member Author

Gentle ping @gammazero.

We would like to understand the reasoning behind the risk a bit more.

@Wondertan
Copy link
Member Author

Last ping @gammazero

@gammazero
Copy link
Contributor

gammazero commented Sep 11, 2024

@Wondertan This is only across different peers, AFAIK. Any single peer should only issue a WantHave or a WantBlock for a CID.

Some things to consider...

A bitswap message may have a number of CID entries. Some are WantHave and some are WantBlock. Some WantHaves may be converted to WantBlock if the blocks are small enough

Using blockstore.Has for WantHave requests is only doable when automatically converting small WantHave requests to WantBlock is disabled (when e.maxBlockSizeReplaceHasWithBlock is zero). Getting block sizes is necessary to see if a block is small enough to do the conversion.

Getting block sizes is done as a concurrent batch, which is generally faster for more than a very small number of blocks, although it may also depend on backend store, cached values, etc. This means that we will want to separate the WantHaves from the WantBlocks in a bitswap message, so that those can both be done in concurrent batches.

At a later time the peer client may decide to send a message to retrieve the blocks for which WantHave responses were received. Now the bitswap server needs to get the block sizes for all those blocks, because it did not get the sizes previously. This is where the "extra" query happens.

I coded a prototype corresponding to the above description, and did not see a significant performance improvement. However, I may not have been testing at sufficient scale and/or was using a similar storage backend to what you are using.

Given specific storage backends, it may make a very significant difference in performance and/or cost to avoid reading block sizes. I propose two changes:

  1. Make maxBlockSizeReplaceHasWithBlock configurable. Setting to zero would disable automatic conversion of WantHave to WantBlock for small blocks.
  2. Offer the above code changes as a PR for you to test-drive and see if it meets your needs.

@gammazero
Copy link
Contributor

@gammazero gammazero added the need/author-input Needs input from the original author label Sep 17, 2024
@Wondertan
Copy link
Member Author

@gammazero, that's a great solution and fixes our concern! Thank you ❤️

Should we start testing the prototype now or after your PR gets ready for review?

@gammazero
Copy link
Contributor

@Wondertan You can start testing it now, and any feedback will be helpful. We would like to review and test in our cluster soon, but have not yet so cannot make any guarantees about how it works.

@lidel lidel removed need/author-input Needs input from the original author kind/stale labels Sep 26, 2024
@ipfs ipfs deleted a comment from github-actions bot Sep 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 High: Likely tackled by core team if no one steps up
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants