Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to UTxO-HD: mempool snapshotting #1382

Open
wants to merge 6 commits into
base: utxo-hd-main
Choose a base branch
from

Conversation

jasagredo
Copy link
Contributor

@jasagredo jasagredo commented Feb 6, 2025

This PR accumulates some improvements for UTxO-HD that result in benefits for mempool snapshotting, which is part of the critical path for block minting (and therefore block diffusion). Running in a full-block synthetic chain shows these times for 50 consecutive mempool snapshots:

  • pre: at utxo-hd-main before this PR
  • caf: pre + moving the Cardano tx out translations to a CAF
  • no-diffs: caf + do not compute diffs when snapshotting the mempool
  • no-shtxin: no-diffs + remove the ShelleyTxIn newtype

plot

@jasagredo jasagredo changed the title WIP: improvements to UTxO-HD Improvements to UTxO-HD: mempool snapshotting Feb 7, 2025
@jasagredo jasagredo self-assigned this Feb 7, 2025
@jasagredo jasagredo marked this pull request as ready for review February 7, 2025 12:11
@nfrisby
Copy link
Contributor

nfrisby commented Feb 7, 2025

I edited the PR description a little.

Copy link
Contributor

@nfrisby nfrisby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly comments, fewer bangs, etc.

@@ -1114,6 +1116,14 @@ class ( Show (HardForkTxOut xs)
default txOutEjections :: CanHardFork xs => NP (K (NS WrapTxOut xs) -.-> WrapTxOut) xs
txOutEjections = composeTxOutTranslations $ ipTranslateTxOut hardForkEraTranslation

txOutTails :: Tails (InPairs.Fn2 WrapTxOut) xs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a comment on this method, like for txOutEjections just above it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little awkward that we never intend for any instance to override this default 🤔.

I suppose the only alternative would be to define the polymorphic "CAF" and give it a SPECIALISE pragma. But... then we'd have to make sure there's a SPECIALISE chain that reaches from the Consensus entry point all the way to every use, which is a real headache. So, the awkwardness seems worthwhile 👍

With the current approach, the dictionary apparently handles that specialization chain for us. But I'm not exactly sure why it's able to do so, which makes me a little nervous. We have instance CardanoHardForkConstraints c => HasHardForkTxOut (CardanoEras c), which means that instance function (ie the thing that builds the HasHardForkTxOut dictionary from the other dictionaries) might get used in some code location that needs HasHardForkTxOut and already has a CardanoHardForkConstraints context. Since the txOutTails default does depend on that instance context (since it has CanHardFork in its own context, and our Cardano instance of that also requires CardanoHardForkConstraints), it wouldn't be shared among different uses of this instance function.

But, your measurements show that it does help. I just worry that we might accidentally spoil that someday, since it's not entirely transparent to us why tucking this polymorphic "CAF" inside a type class does actually end up using a proper CAF.

(This same concern applies to txOutEjections --- I just hadn't thought through the details when originally suggesting it.)

Edit: I guess as of the Ledger Team's monomorphic crypto work, the explanation will become much simpler, since CardanoHardForkConstraints only has one index, c! At that point, it'll be obvious that a sufficient requirement is for CanHardFork XS to be in-scope for the HasHardForkTxOut XS instance.

--
-- When getting a mempool snapshot, we will revalidate all the
-- transactions but we won't do anything useful with the resulting
-- state. We can safely omit computing the differences in this case.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please amend this comment to also discuss:

  • This is a worthwhile optimization (and has been measured as such), since snapshotting the mempool is on the critical path of block minting.
  • Eventually, the UTxO HD plan has always been for the ledger rules to construct the differences, instead of the Consensus layer computing them retroactively via calculateDifferences. Once that's true, we hope this optimization will no longer be worthwhile. That's part of the reason that we're content to just use a boolean isomorph instead and and yield empty Diffs instead of making it a GADT that makes the precise codomain (or just having two different functions, one with Diffs in the codomain and one without).

-- transactions but we won't do anything useful with the resulting
-- state. We can safely omit computing the differences in this case.
data ComputeDiffs
= ComputeDiffs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment saying that this option should be used with resyncing the mempool with an updated selection.

-- state. We can safely omit computing the differences in this case.
data ComputeDiffs
= ComputeDiffs
| IgnoreDiffs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment saying that this option should be used when snapshotting the mempool when minting a block.


---

unionValues ::
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intuition is that we should have a comment here explaining that collisions are impossible for this first phase of UTxO HD.

@@ -82,6 +83,9 @@ data InternalState blk = IS {
-- This should always be in-sync with the transactions in 'isTxs'.
, isTxIds :: !(Set (GenTxId blk))

, isTxKeys :: !(LedgerTables (LedgerState blk) KeysMK)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two fields definitely need comments, which can probably state a pretty simple INVARIANT with respect to isTxs. I'd imagine it can be exact for isTxKeys and just a little hand-wavy for isTxValues.

in snapshotFromIS $ IS {
isTxs = TxSeq.fromList $ map unwrap val
, isTxIds = Set.fromList $ map (txId . txForgetValidated . fst) val
, isTxKeys = emptyLedgerTables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a comment could help explain why isTxKeys and isTxValues can soundly be empty here.

values
(isLastTicketNo is)
(TxSeq.toList $ isTxs is)
else if pointHash (isTip is) == castHash (getTipHash ticked)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be one call to computeSnapshot, where the values has been conditionally recovered either from isTxValues or from isTxKeys.

-> Map.Map (TxIn l) (TxOut l)
-> CBOR.Decoder s (Map.Map (TxIn l) (TxOut l))
go 0 m = pure m
go !len m = do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only place you need bangs are len and m, yeah? Since you're using the Data.Map.Strict interface. (Which would plug the leak in m here, also.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

3 participants