Discussion: Serialization of individual vs. batched Announcements #235

Open
wesbiggs opened this issue Feb 17, 2023 · 4 comments
@wesbiggs
Member

The spec allows implementations to define which Announcement Types can be used with Publish Announcement (singular announcement) and which can be used with Publish Batch (up to 131,072 announcements at a time).

The Parquet format was selected for use with off-chain batch publications for various good reasons, but some of these reasons (inclusion of a Bloom Filter, for example) are less useful (or detrimental) when dealing with an individual announcement.

At present the spec does not mandate a particular serialization for the Announcement parameter in the Publish Announcement Operation, presumably leaving this to the implementation.

With the proposal for user data operations (#233) we are bringing in usage of the Avro serialization format. We should discuss whether it is useful to define the Avro data types for individual announcements as well, and specify that individual announcements be serialized into this format at the DSNP spec level.
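To make the question concrete, here is a minimal sketch of what an Avro definition for a single announcement could look like, together with a hand-rolled encoder for Avro's binary format. The record name `ExampleAnnouncement` and its fields are hypothetical placeholders, not taken from the DSNP spec, and a real implementation would more likely use an existing Avro library than encode by hand.

```python
# Hypothetical Avro schema for one announcement. The record name and the
# field names are illustrative assumptions, not definitions from the spec.
EXAMPLE_SCHEMA = {
    "type": "record",
    "name": "ExampleAnnouncement",
    "fields": [
        {"name": "announcementType", "type": "int"},
        {"name": "fromId", "type": "long"},
        {"name": "url", "type": "string"},
    ],
}


def encode_long(n: int) -> bytes:
    """Avro int/long: zigzag-encode, then write as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z >= 0x80:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Avro string: byte length as a long, followed by the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_record(schema: dict, record: dict) -> bytes:
    """Avro record: field encodings concatenated in schema order."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] in ("int", "long"):
            out += encode_long(value)
        elif field["type"] == "string":
            out += encode_string(value)
        else:
            raise ValueError(f"unsupported type in this sketch: {field['type']}")
    return bytes(out)


# Made-up values, for illustration only.
example = {"announcementType": 2, "fromId": 1234, "url": "https://example.org/a"}
encoded = encode_record(EXAMPLE_SCHEMA, example)
```

The encoded record carries no per-file metadata or Bloom filter, which is the main contrast with a Parquet file holding a single row.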

@wilwade
Member

wilwade commented Apr 12, 2023

Should we break out the data serialization specifics from the announcement pages?

This would not be a spec change, just a re-org of where we specify the mapping from "spec type" to Parquet and Avro.

Worth a WIP PR for just one to see what it looks like?

@wesbiggs
Member Author

Suggestion: the spec should recommend which fields are important to index, even when they are not covered by a batch file's Bloom filter.

@wesbiggs
Member Author

  1. Do we move the Parquet encoding to a separate mapping page for simplicity/clarity?
  2. Do we publish Avro schemas for non-batched announcement types?
  3. Should we add a note on WHY some columns were suggested for the Bloom filter, i.e. that it is important to be able to search/index them?
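On point 3, a rough sketch of why a Bloom filter over a column helps with search: a reader can rule out a whole batch file without scanning it. This is a generic Bloom filter built on SHA-256 for illustration, not Parquet's actual split-block implementation, and the values added to it are made up.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter sketch (not Parquet's split-block variant)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bitset

    def _positions(self, value: str):
        # Derive num_hashes bit positions by salting the value with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


# One filter per batch file, built over a searchable column
# (the column values here are hypothetical).
batch_filter = BloomFilter()
for from_id in ["1234", "5678"]:
    batch_filter.add(from_id)

# Only read the batch file if the filter says the value may be present.
should_scan = batch_filter.might_contain("1234")
```

For a single announcement there is nothing to skip, so the same columns would need to be indexed some other way by whoever consumes them.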

@wesbiggs
Member Author

These were discussed on community call 2023-07-20 and no objections were raised; next step is to draft a PR for review.

@wilwade removed their assignment Apr 17, 2024