Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding traits, Path + Entry impls #23

Merged
merged 25 commits into from
Jul 20, 2024
Merged

Encoding traits, Path + Entry impls #23

merged 25 commits into from
Jul 20, 2024

Conversation

sgwilym
Copy link
Contributor

@sgwilym sgwilym commented Jul 9, 2024

  • Add a compact_width module
  • Add a max_power module
  • Adds Encoder and Decoder traits which work with Ufotofu consumers and producers.
  • Implements Encoder + Decoder on PathRc
  • Implements Encoder + Decoder on Entry
  • Adds an earthstar crate with nascent NamespaceId and IdentityId structs
    • These are built on top of a wip Cinn25519 implementation with validated shortnames.
    • I did this to implement Arbitrary on these types so we could use realistic values in fuzz tests as per Willow parameters for testing? #17.

This is already enough to review, I will do the relative encodings in a separate branch off of this.

Still needs:

  • A non-local Ufotofu dependency

@sgwilym sgwilym added the enhancement New feature or request label Jul 9, 2024
@sgwilym sgwilym requested a review from AljoschaMeyer July 10, 2024 15:50
@sgwilym sgwilym marked this pull request as ready for review July 11, 2024 09:26
@sgwilym sgwilym mentioned this pull request Jul 14, 2024
6 tasks
@Frando
Copy link
Contributor

Frando commented Jul 18, 2024

Hi, I am having a look at this at the moment and the implications for usage in iroh-willow.

My question is: Is it necessary to couple the Encoder and Decoder traits to ufotofu?

My guess is that the reasoning is that with this design encode is async and can yield on-the-fly once the write buffer is full? In my encoder trait in iroh-wilow, the trait has a required method encoded_len instead, and when sending I yield if the buffer doesn't fit the full message. The cost is, I guess, that I have to calculate the message length in advance. However, I would think that this is reasonably cheap.

With the trait coupled to ufotofu traits it becomes near-impossible to use willow-rs struct encoders and decoders if the higher-level protocol implementation does not use ufotofu but other stream and channel primitives, no? I did not yet look into implementing the ufotofu traits for other channels though.

Edit: After another look at ufotofu, I think I can implement the traits for my channel struct. Will report back once I tried it out.

@AljoschaMeyer
Copy link
Contributor

@Frando

The cost is, I guess, that I have to calculate the message length in advance.

To me, the true cost is the write buffer itself. You need to allocate it, and you need to copy its contents into the channel primitive. The ufotofu design allows for zero-copy serialisation, where you write directly to the channel (even if its internal buffer is smaller than the total size of the encoding) without intermediate allocations.

This is a win, and the resulting code is also more (or, at least, quite) elegant (for example, the entry encoder).

The big drawback is having to adopt ufotofu, of course. For obvious reasons I don't consider that as a drawback conceptually, but I cannot talk away the real engineering cost (and risk?), especially since the ufotofu codebase is immature and still undergoing some changes. In particular, you'd need to code against the slice_helpers branch, which should more accurately be called refactor_all_the_things.

I'd be up for a call to help with ufotofu integration and just general high-bandwidth sharing.

@Frando
Copy link
Contributor

Frando commented Jul 18, 2024

You need to allocate it, and you need to copy its contents into the channel primitive.

Not necessarily. My encoder trait encodes into &mut [u8] with the invariant that the slice is at least self.encoded_len() bytes long.

My Channel::send takes item: impl Encoder, checks if the available buffer is at least item.encoded_len, yields if not, and then calls item.encode with a mut slice of the channel buffer.

It does not use MaybeUninitialized, but that is an ortogonal concern.

I do like the design of the ufotofu traits, I am just unclear if the coupling to these new and still unstable traits is required for the quite low-level encoder trait.

Edit: Links to the traits in iroh-willow:

traits: https://github.com/n0-computer/iroh/blob/willow/iroh-willow%2Fsrc%2Futil%2Fcodec.rs

usage in channel send:
https://github.com/n0-computer/iroh/blob/willow/iroh-willow%2Fsrc%2Futil%2Fchannel.rs#L234

(exact signatures are different from what I wrote pseudocodish above but conceptually the same)

Note that I am not opposed to the traits introduced here conceptually, but interested if the encoding can be made less coupled to the io traits without perf or ergonomic downsides.

@AljoschaMeyer
Copy link
Contributor

AljoschaMeyer commented Jul 18, 2024

My encoder trait encodes into &mut [u8] with the invariant that the slice is at least self.encoded_len() bytes long.

I'm not sure whether we are talking past each other. I'm basically interested in the setting where the buffer of the channel (say, the OS-managed buffer for to write tcp data into) is smaller than the encoding. As far as I'm understanding the sentence I quoted, you'd require a channel that can allocate arbitrarily large buffers on demand, which is what the ufotofu design avoids.

Then again, the code you linked uses io::Write and not &mut [u8], and io::Write does not have this problem. Which is why I feel like I'm misunderstanding something.

In any case, the ufotofu design is pretty much the same as that of io::Write, except it fixes a couple of glaring issues (hardcoded item and error type (the latter being important for encoding), inability to pipe from reader to writer without an intermediate buffer, weirdness around uninitialised memory).

It does not use MaybeUninitialized, but that is an ortogonal concern.

Agreed. I keep switching between being happy or unhappy with the ufotofu approach. But irrespective of those feelings, it will stay the way it is, because it should have the ability to work with uninitialised memory. I just wish it was less of a pain...

interested if the encoding can be made less coupled to the io traits without perf or ergonomic downsides

Yeah, I share the sentiment, but I didn't find a satisfying way to do it. I think the ability for the channel to dictate fragment sizes is crucial, and that is what the "io[-like] traits" are all about. Buf and BufMut from bytes are surprisingly clsoe in some respects, but they have no error handling...

In my opinion, there is simply a void in terms of popular traits/apis in the rust ecosystem for general, io-like traits. And ufotofu fills that void in a - to me - satisfying manner. It just isn't a popular API (yet?).


Feel free to reach out on discord if you want to schedule a call.

@sgwilym
Copy link
Contributor Author

sgwilym commented Jul 19, 2024

Now with zero-copy Path changes merged in.

@sgwilym sgwilym merged commit 06457bd into main Jul 20, 2024
1 check passed
@sgwilym sgwilym deleted the encodings branch July 20, 2024 09:44
@AljoschaMeyer
Copy link
Contributor

@Frando Just published ufotofu version 0.2.0 fyi, that one should be fairly stable to code against.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Development

Successfully merging this pull request may close these issues.

3 participants