Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(turbine): implement retransmit stage #247

Open
wants to merge 53 commits into
base: main
Choose a base branch
from
Open

Conversation

yewman
Copy link
Contributor

@yewman yewman commented Aug 23, 2024

Implementation appears to be working correctly.
Most logic is well tested, however, there are some areas that will receive some extra attention regarding unit testing for equivalence with agave.

Metrics from 1 hour runtime

We can make these a little nicer but are they at least demonstrate functionality for now.
Screenshot 2024-10-15 at 3 29 40 PM

✅ turbine/retransmit_service.sig

Core logic mirrored from agave.

  • core logic
  • comments
  • testing
    • completed all tests from agave
    • comprehensive unit tests for demo

✅ turbine/turbine_tree.zig

  • core logic
  • comments
  • testing
    • complete retransmit tests from agave
    • comprehensive unit tests for demo

✅ turbine/shred_deduper.zig

Closely mirrored from agave.
ShredDeduper provides a means to probabilistically deduplicate incoming shreds based on their shred id and the raw bytes. Shred id's are allowed up to a specified number of duplicates, raw bytes are not allowed any duplicates.

  • core logic
  • comments
  • testing
    • completed all tests from agave
    • completed additional tests to assert agave equivalence

✅ utils/deduper.zig

Closely mirrored from agave.
Deduper is a generic probabilistic deduplication implementation.

  • core logic
  • comments
  • testing
    • completed key tests from agave
    • more to implement but okay for now

✅ utils/ahash.zig

Closely mirrored from ahash rust crate.
Currently only includes functionality required for the Deduper.

  • core logic
  • comments
  • testing
    • completed basic unit tests derived from running rust implementation
    • more to implement but okay for now

✅ rand/weighted_shuffle.zig

Closely mirrored from agave.
WeightedShuffle implements an iterator over indices shuffled according to their weight in an underlying weights array.

  • core logic
  • comments
  • testing
    • completed all tests from agave

@0xNineteen 0xNineteen changed the title feat(turbine): Turbine implementation focusing on the retransmit stage first feat(turbine): implement retransmit stage Aug 24, 2024
src/rand/weighted_shuffle.zig Outdated Show resolved Hide resolved
@yewman yewman force-pushed the harnew/turbine branch 4 times, most recently from 2ddc787 to f56a27d Compare September 5, 2024 20:45
@yewman yewman force-pushed the harnew/turbine branch 5 times, most recently from 2cd6003 to 4a9a900 Compare October 7, 2024 17:32
@yewman yewman force-pushed the harnew/turbine branch 5 times, most recently from 96a8034 to 03fa6d5 Compare October 9, 2024 02:56
@dnut dnut modified the milestones: Networking, --, Repair Oct 11, 2024
@yewman yewman marked this pull request as ready for review October 15, 2024 04:32
Copy link
Contributor

@InKryption InKryption left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRNG naming convention: try to cooperate with the changes of #304, and use the appropriate naming conventions for prng-related declarations.

Named parameters: we need more of them, we have a lot of very long functions with parameter types that are not always unambiguous.

Naming & style guide: this review comments on a few improvable names.

Unmanaged data structures: I'd like to see more in here, I've written down a number of comments related to this. I recommend addressing other comments before this.

The rest is miscellaneous or adjacent to the above.

After addressing all of that, I think this PR will look great.

src/cmd/cmd.zig Outdated Show resolved Hide resolved
src/cmd/cmd.zig Show resolved Hide resolved
src/gossip/data.zig Outdated Show resolved Hide resolved
src/rand/rand.zig Outdated Show resolved Hide resolved
src/rand/rand.zig Outdated Show resolved Hide resolved
Comment on lines +113 to +132
pub fn pubkey(self: Node) Pubkey {
return switch (self.id) {
.contact_info => |ci| ci.pubkey,
.pubkey => |pk| pk,
};
}

pub fn contactInfo(self: Node) ?ThreadSafeContactInfo {
return switch (self.id) {
.contact_info => |ci| ci,
.pubkey => null,
};
}

pub fn tvuAddress(self: Node) ?SocketAddr {
return switch (self.id) {
.contact_info => |ci| ci.tvu_addr,
.pubkey => null,
};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make more sense for all three of these to be methods on NodeId.

Comment on lines +327 to +338
fn computeRetransmitParent(
fanout: usize,
index_: usize,
nodes: []const Node,
) ?Pubkey {
var index = index_;
const offset = (index -| 1) % fanout;
index = if (index == 0) return null else (index - 1) / fanout;
index = index - (index -| 1) % fanout;
index = if (index == 0) index else index + offset;
return nodes[index].pubkey();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing this to just return index, allowing the caller to index nodes themselves? This would play well with my other suggestion about using multiarraylist, where it'd allow the caller to just do nodes.items(.id)[computeRetransmitParent(fanout, index_)].pubkey() (in tandem with my other request to move the pubkey method into NodeId).

Comment on lines +468 to +470
num_known_nodes: usize,
num_unknown_staked_nodes: usize,
known_nodes_unstaked_ratio: struct { u64, u64 },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Named parameter struct for at least these three parameters. The four numbers in TestEnvironment.init(allocator, rng, 1_000, 100, .{ 1, 7 }) feel pretty magical.

};
}

pub fn hash(self: *AHasher, comptime T: type, data: *const T) void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function should be removed, write should be made pub and renamed to update, and the current update should be renamed to updateInt.
This will allow a user to use std.hash.autoHash[Strat] for more complex data.

Comment on lines +63 to +66
inline fn hashSlice(self: *AHasher, comptime T: type, data: []const T) void {
self.update(data.len);
self.write(@as([]const u8, data));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed in light of my other comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

4 participants