Introduce efficient slot pagination #169

gostkin · 2023-12-27T01:30:38Z

There is now a concept of `after`: if not everything was returned, the slot of the last event is returned as `after` in the response.

Events are returned per block to simplify integrations: either everything from a given block is returned, or nothing from it.

SebastienGllmt · 2023-12-27T14:32:18Z

webserver/shared/models/common.ts

+export type AfterSlotPagination = {
+  /**
+   * Minimal slot from which the events should be returned (not inclusive)
+   *
+   * @example 46154769
+   */
+  after: number | undefined,
+}
+export type UntilSlotPagination = {
+  /**
+   * Maximal slot from which the events should be returned (inclusive)
+   *
+   * @example 46154860
+   */
+  untilSlot: number | undefined,
+}
+export type SlotPagination = AfterSlotPagination & UntilSlotPagination;
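
As a rough illustration of how a client might consume these types, the sketch below pages through a slot range by feeding the returned `after` back into the next request. `SlotEvent`, `SlotPage`, `FetchPage`, and `fetchAll` are hypothetical names made up for this example and are not part of Carp's API; `FetchPage` stands in for whatever call actually hits the endpoint.

```typescript
// Mirrors the types in the diff above.
type SlotPagination = {
  after: number | undefined; // exclusive lower bound
  untilSlot: number | undefined; // inclusive upper bound
};

// Hypothetical response shapes, for illustration only.
type SlotEvent = { slot: number; payload: string };
type SlotPage = { events: SlotEvent[]; after: number | undefined };

// Assumed request function: a stand-in for the real HTTP call.
type FetchPage = (range: SlotPagination) => SlotPage;

// Keep requesting pages until the response no longer carries `after`.
function fetchAll(fetchPage: FetchPage, untilSlot: number): SlotEvent[] {
  const all: SlotEvent[] = [];
  let after: number | undefined = undefined;
  do {
    const page: SlotPage = fetchPage({ after, untilSlot });
    all.push(...page.events);
    after = page.after; // undefined means everything was returned
  } while (after !== undefined);
  return all;
}
```

Because `untilSlot` stays fixed for the whole loop, every page is cut from the same slot range, and only the lower bound moves.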


The reason we don't use slot numbers in the existing types like TransactionPaginationType and UntilPaginationType is that there is no guarantee a slot number still contains the same data between requests: a rollback can replace the block that used to be at a slot with a totally different block.

For our current use case that shouldn't really change anything, though, since we never put a slot we don't consider confirmed in `untilSlot`, and we are not handling or expecting rollbacks anyway.

I think even if we used tx ids as cursors here, it's still much simpler to keep the slot range and paginate inside that range only.
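
The rollback concern can be illustrated with a toy chain. The sketch below is an assumption-laden example, not Carp code: `Block`, `HashCursor`, `slotCursorMatches`, and `hashCursorMatches` are hypothetical names, and the hashes are placeholders. It shows that a bare slot number still resolves to a block after a rollback, while a cursor that also carries a hash notices the replacement.

```typescript
type Block = { slot: number; hash: string };
type HashCursor = { slot: number; hash: string };

// A slot-only cursor matches any block that happens to sit at that slot.
function slotCursorMatches(chain: Block[], slot: number): boolean {
  return chain.some((b) => b.slot === slot);
}

// A hash-carrying cursor matches only the exact block it was issued for.
function hashCursorMatches(chain: Block[], cursor: HashCursor): boolean {
  return chain.some((b) => b.slot === cursor.slot && b.hash === cursor.hash);
}

// Placeholder data: the block at slot 105 is replaced by a rollback.
const beforeRollback: Block[] = [
  { slot: 100, hash: "aa" },
  { slot: 105, hash: "bb" },
];
const afterRollback: Block[] = [
  { slot: 100, hash: "aa" },
  { slot: 105, hash: "cc" },
];

const issued: HashCursor = { slot: 105, hash: "bb" };
// The slot cursor cannot tell the two chains apart; the hash cursor can.
const slotOk = slotCursorMatches(afterRollback, issued.slot);
const hashOk = hashCursorMatches(afterRollback, issued);
console.log(slotOk, hashOk);
```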

> For our current use case that shouldn't really change anything, though, since we never put a slot we don't consider confirmed in `untilSlot`, and we are not handling or expecting rollbacks anyway.

Sure, but Carp is a general-purpose indexer, so we shouldn't ship something that can cause subtle bugs for the average user who wants this task.

> I think even if we used tx ids as cursors here, it's still much simpler to keep the slot range and paginate inside that range only.

I disagree. This is exactly what we want to move away from, because it leads to a slow presync. It means we would have to do a bunch of useless Carp queries for slot ranges that have nothing inside them, instead of just querying for the next batch of data and then consuming it as required on the Paima side.
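
To make the cost difference concrete, the sketch below counts requests under both strategies for a sparse set of events. The slot range, window width, event count, and both helper functions are assumptions invented for this illustration; they do not measure Carp or Paima.

```typescript
// Fixed slot windows: one request per window, whether or not it holds data.
function requestsWithFixedWindows(
  fromSlot: number,
  toSlot: number,
  windowSize: number,
): number {
  return Math.ceil((toSlot - fromSlot + 1) / windowSize);
}

// Cursor-style batches: each request returns up to `limit` events, and the
// response itself is assumed to say whether more data remains.
function requestsWithCursor(eventCount: number, limit: number): number {
  return Math.max(1, Math.ceil(eventCount / limit));
}

// Hypothetical numbers: 1,000,000 slots that contain only 50 events in total.
const windowed = requestsWithFixedWindows(0, 999_999, 10_000); // 100
const cursored = requestsWithCursor(50, 100); // 1
console.log({ windowed, cursored });
```

The gap grows with how sparse the events are, since the windowed count depends only on the width of the slot range and not on how much data it holds.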

I agree that it would be good to have that, and it would certainly be much faster to catch up the entire chain.

But then we would need to keep track of the tx id per cde in a table, since all of them will sync at different points and at different speeds (although we may need that anyway). And then there is the issue that you never know whether you need to fetch an extra page.

All of this doesn't really matter for the presync stage in Paima, which is where the pagination is more useful. During the sync stage, however, we will always end up fetching extra data that has to be kept around (possibly in memory) until the corresponding EVM block is fetched. At that point we may even need to request another page, so while doable, I think it brings some extra complexity and I'm not sure it has much benefit.

So I'm thinking, could we have both instead? We could use tx-based pagination to catch up during presync, but just use slot ranges during the sync stage.

Just querying by slot range will also break if a single slot contains more entries than the pagination limit, which is not so unrealistic: Cardano allows transactions with something like 100 outputs, and that often happens during NFT drops. So at the very least you would have to make sure the page limit is larger than potentially a few txs like this in a block, plus a margin to account for the fact that block sizes may increase in the future.

Avoiding this issue is why we also keep track of the tx id for the pagination in the other endpoints.

If a single slot contains more entries than the limit, they are all returned. The amount of data in one slot has an upper bound in any case, so returning more than the limit doesn't break anything, and it is never an unbounded amount of data.
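
The rules stated in this thread (whole blocks only, `after` returned when more data remains, and an over-limit slot returned in full) could be sketched as follows. This is an in-memory illustration of one possible reading of those rules, not the query the PR actually runs; `pageBySlot`, `SlotEvent`, and `SlotPage` are hypothetical names, and the input is assumed to be sorted by slot.

```typescript
type SlotEvent = { slot: number; payload: string };
type SlotPage = { events: SlotEvent[]; after: number | undefined };

function pageBySlot(
  sorted: SlotEvent[],
  after: number | undefined,
  untilSlot: number | undefined,
  limit: number,
): SlotPage {
  // `after` is exclusive, `untilSlot` is inclusive.
  const inRange = sorted.filter(
    (e) =>
      (after === undefined || e.slot > after) &&
      (untilSlot === undefined || e.slot <= untilSlot),
  );
  const events: SlotEvent[] = [];
  let i = 0;
  while (i < inRange.length) {
    // Find the end of the run of events that share this slot.
    let j = i;
    while (j < inRange.length && inRange[j]!.slot === inRange[i]!.slot) j++;
    // Never split a slot: stop before a group that would overflow the limit,
    // unless the page is still empty (an over-limit slot is returned whole).
    if (events.length > 0 && events.length + (j - i) > limit) break;
    events.push(...inRange.slice(i, j));
    i = j;
  }
  // If events remain in range, report the last returned slot as `after`.
  const hasMore = i < inRange.length;
  return {
    events,
    after: hasMore ? events[events.length - 1]!.slot : undefined,
  };
}
```

Since the first slot group is always accepted, a page is empty only when the range itself is empty, so a caller looping on `after` always makes progress.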

SebastienGllmt · 2023-12-27T14:34:07Z

webserver/server/app/controllers/ProjectedNftRangeController.ts

+                genErrorMessage(Errors.SlotRangeLimitExceeded, {
+                    limit: PROJECTED_NFT_LIMIT.MAX_LIMIT,
+                    found: limit,
+                })


The error message needs to be updated as well.

gostkin · 2024-01-29T19:43:10Z

Closing, since we decided not to go this way.

Introduce efficient slot pagination

790e109

gostkin requested review from SebastienGllmt and ecioppettini December 27, 2023 01:30

SebastienGllmt reviewed Dec 27, 2023

SebastienGllmt mentioned this pull request Dec 29, 2023

Update Carp pagination logic PaimaStudios/projected-nft-whirlpool#21

Open

gostkin closed this Jan 29, 2024
