types: introduce `TypeInner` making `Type` the only public "incomplete Simplicity type" type #233

apoelstra · 2024-07-04T15:26:04Z

Previously we had two types, Type and Bound, which were mutually recursive. Type was also exposed in the public API as a type representing an incomplete Simplicity type. Bound was exposed in the public API for a couple of reasons which this PR eliminates.

However, because Type is both public and unusable without a live type inference context, this means that it needs to contain an Arc<Context>. But this means that if we have a self-referential type, this leaks to a somewhat-convoluted reference cycle in which a Type holds an Arc<Context> which owns a set of Bounds which contain the original Type. If you are skeptical of this, try running the test harness in valgrind before and after the last commit of this PR -- you will see that previously we were leaking memory in the memory_leak unit test but now we are not.

This completes the "type inference overhaul" needed to eliminate the memory leak in infinitely-sized types.

This avoids passing `hint` through all the layers of recursion, but more importantly, constructs the actual error from a `Context` rather than from a `LockedContext`. In the next commit, we will need extra information to do this, and not want our context to be locked at the time.

We will want to change how types are constructed and this will minimize the future diff.

We will need this elsewhere in the next commit.

src/types/context.rs

src/types/mod.rs

src/types/context.rs

src/types/mod.rs

uncomputable

Reviewed 15d69b2 Looks great. I left some comments.

We have the Bound::sum and Bound::product constructors which check whether the proposed children of the bound are complete, and if so, construct a complete bound. If not, they construct an incomplete one. This check involves calling Bound::final_data which locks the context and then inspects the bound. This limits our freedom of movement to call these constructors because we need to make sure that the context isn't already locked. Better to pull this logic into Context where it can keep track of locking.

We want to make the `Bound` type private to the types module, because it is conceptually an implementation detail of the type inferenece algorithm. It has a weird mutual recursion with `Type` that shouldn't be exposed in the public API. It will also become a problem in a later commit when we change it to be mutually recursive with a new `TypeInner` type instead. For a couple reasons: 1. `TypeInner` is a private type, so if `Bound` contains it, then `Bound` would need to hide its internals which is impossible for an enum. So we would need a `BoundInner` in addition to `TypeInner`, which would be a real pain to understand. 2. Neither `TypeInner` nor `Bound` will hold a reference to an inference context. This means that the API of `Bound` will be severely limited since it will no longer have the ability to introspect itself more than one level deep. This makes for an awkward API that should not be public. This type moves a bunch of text from the main module doccomment and makes it a non-doccomment. It is a move-only of this text; it doesn't change anything. The existing text is somewhat out-of-date but I don't want to change it until we're done with everything else. (It's not too bad; the main thing is that it's now missing the `Context` object.) I believe this is the only "controversial" commit in this PR, because the objects in the Error type actually are, conceptually speaking, individual Bounds rather than Types. But we are stuck between "users should not see or care about bounds" and "the error enum should contain all relevant context" and I think this is the best choice. We can file an issue to revisit this maybe, though I'm willing to live with it.

As part of our project to make `Bound` private, we need to drop the `Context::bind` method, which takes an arbitrary bound and applies it to a type. This method doesn't really need to be public in so much generality, because (for example) binding a type to a free bound is a no-op that should never be done. In fact, it turns out that in Simplicity's typing rules we only ever bind to product bounds. (In some future iteration of Simplicity with e.g. a copair combinator, we would bind to sums; but it's hard to imagine a combinator that would require binding to units or other complete types, since this would represent a combinator whose inputs needed to have exact types, which is not really in the spirit of Simplicity.) So we can get away with just exposing a bind_product method, which takes `Type`s and only constructs a Bound internally.

Previously we had two types, `Type` and `Bound`, which were mutually recursive. `Type` was also exposed in the public API as a type representing an incomplete Simplicity type. `Bound` was exposed in the public API for a couple of reasons which were eliminated over the last few commits. However, because `Type` is both public and unusable without a live type inference context, this means that it needs to contain an `Arc<Context>`. But this means that if we have a self-referential type, this leaks to a somewhat-convoluted reference cycle in which a `Type` holds an `Arc<Context>` which owns a set of `Bound`s which contain the original `Type`. If you are skeptical of this, try running the test harness in valgrind before and after this commit -- you will see that previously we were leaking memory in the `memory_leak` unit test but now we are not. This commit introduces a new private type `TypeInner` which is mutually recursive with (the now-private) `Bound`. Then `Type` contains both a `TypeInner` and an `Arc<Context>` and is not recursive, eliminating the possibility of cyclic references. As a side-effect, it greatly reduces the amount of Arc-cloning that is happening, which should improve performance.

apoelstra · 2024-07-12T16:34:03Z

Addressed the two actionable comments. Should be good to go.

uncomputable

ACK dada0f2

8d3eda8 update encoding/decoding API to split off witness data (Andrew Poelstra) d9a5e8d simplicity-sys: update libsimplicity to 1b85dc31d80d36dc012755e4369aeeac815476a6 (Andrew Poelstra) 3251bcd simplicity-sys: update simplicity to 952e126256dd31d7ed8f4b33b99ae8cd1edabe89 (Andrew Poelstra) cbc22ff fixups to vendor-simplicity.sh (Andrew Poelstra) 825c7dc clippy: new lint about ignoring return value of format (Andrew Poelstra) ae96e2f bit decoding: check that bit iterators are fully consumed (Andrew Poelstra) Pull request description: Draft til #233 goes in, and maybe will split into multiple PRs. ACKs for top commit: uncomputable: ACK 8d3eda8 Tree-SHA512: 62485da10137329f8d8f8e5fabc5685c216a7898df522ba2abab6b2317557ac187d0749e3532fce8ae7b602c303afd53aeb3de3465ba74b877997c3f8f17dda8

apoelstra added 3 commits July 4, 2024 14:01

types: refactor to reuse some code across constructors

f3e5230

We will want to change how types are constructed and this will minimize the future diff.

types: refactor to pull out some type destructuring logic from bind()

7c47778

We will need this elsewhere in the next commit.

apoelstra mentioned this pull request Jul 11, 2024

update staging branch #234

Merged

uncomputable self-assigned this Jul 11, 2024