Adjacent NBT Reading #51

Open
Offroaders123 opened this issue Sep 11, 2024 · 0 comments
@Offroaders123
Owner

While using NBTify to parse NBT structures for Bedrock and New 3DS Edition, I've found that first-party support for reading adjacent NBT structures within a single file could be very helpful. It could also be exposed as an easy-to-use API, rather than requiring users to implement their own abstraction over read().

NBTify aims to return a value only when it knows the file has been read successfully. Part of this is throwing an error if parsing has 'completed' while there are still bytes left to be read. Sometimes those bytes are just dead data, and you don't have to worry about them. Other times they are meaningful trailing data, or even further NBT payloads that should be read as well. That last case is what this feature aims to make easier to handle.

Bedrock has Actor Storage entries, which place NBT compound tags directly adjacent to each other inside a single buffer. To parse these with NBTify, you have to explicitly disable strict mode so that it will not throw when bytes are remaining. My concern so far has been that this is risky, if not outright 'unsafe': if you don't know the exact format of the NBT data in the file, the read can return early and give you junk NBT data that merely happens to line up with the bytes in the file.
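
To make the strict vs. non-strict distinction concrete, here is a minimal sketch. It uses a toy one-record format (a length byte followed by that many payload bytes) rather than real NBT, and `readRecord`, `ReadResult`, and the `strict` parameter are hypothetical stand-ins for illustration, not NBTify's actual API.

```typescript
// Toy format (an assumption for illustration, not NBT): [length: u8][payload bytes...]
interface ReadResult {
  payload: Uint8Array;
  // Offset of the first byte after the record that was read.
  byteOffset: number;
}

function readRecord(data: Uint8Array, strict: boolean = true): ReadResult {
  if (data.byteLength < 1) {
    throw new Error("Unexpected end of data while reading the length byte");
  }
  const length = data[0]!;
  const end = 1 + length;
  if (end > data.byteLength) {
    throw new Error("Unexpected end of data while reading the payload");
  }
  // Strict mode: refuse to return a value while unread bytes remain.
  if (strict && end < data.byteLength) {
    const remaining = data.byteLength - end;
    throw new Error(`Encountered ${remaining} unexpected trailing byte(s) at offset ${end}`);
  }
  // Non-strict mode: return the record and report where reading stopped.
  return { payload: data.subarray(1, end), byteOffset: end };
}
```

With two records packed back to back, the strict call throws, while the non-strict call returns the first record together with the offset at which a second read would have to start.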

Here's the proposal for how this could work, which I originally wrote in Discord:

I still have to write a proper way to log out where NBTify successfully completes
Ideally it would return what offset it finished at, say like you passed strict: false
Then it would be a matter of reading with read(data.subarray(previousResult.byteOffset))

Ooooh! Maybe I could add a generator function for it!

/**
 * This needs a better name! :D
 */
export declare function readAdjacent<T extends RootTagLike = RootTag>(data: Uint8Array): AsyncGenerator<NBTData<T>>;

You'd use it like this:

// *edit: I was curious if it would be possible to just return a regular
// generator with Promises as the values, then you could await them in parallel,
// but in this case you especially have to find out where the first file ends
// before you are able to read the next one

for await (const nbt of readAdjacent(data)) {
  console.log(nbt);
}
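
As a rough sketch of how such a generator could be built on top of a single-item reader that reports its end offset: everything below is an assumption for illustration. `Parsed`, `ReadOne`, and `readToyRecord` are hypothetical names, the toy format is a length byte plus payload rather than NBT, and the reader is passed in as a second parameter only to keep the example self-contained (the declaration above takes just `data`).

```typescript
// Hypothetical result shape: the parsed value plus the offset where reading stopped.
interface Parsed<T> {
  value: T;
  byteOffset: number;
}

// Hypothetical single-item reader: parses one item from the start of `data`
// and tolerates trailing bytes (the equivalent of non-strict mode).
type ReadOne<T> = (data: Uint8Array) => Promise<Parsed<T>>;

async function* readAdjacent<T>(
  data: Uint8Array,
  readOne: ReadOne<T>
): AsyncGenerator<T, void, undefined> {
  let offset = 0;
  while (offset < data.byteLength) {
    // Each read has to finish before the next can start, because the next
    // start offset is only known once the previous item has ended.
    const { value, byteOffset } = await readOne(data.subarray(offset));
    if (byteOffset <= 0) {
      throw new Error(`Reader made no progress at offset ${offset}`);
    }
    offset += byteOffset;
    yield value;
  }
}

// Toy reader for the demo (not NBT): [length: u8][payload bytes...]
const readToyRecord: ReadOne<number[]> = async (data) => {
  const length = data[0]!;
  const end = 1 + length;
  if (end > data.byteLength) {
    throw new Error("Unexpected end of data");
  }
  return { value: Array.from(data.subarray(1, end)), byteOffset: end };
};
```

Because a failed read rejects the whole loop, leftover bytes that do not form a complete item surface as an error instead of a silently truncated result.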

(This next part is in regards to parsing things with a loose format, in non-strict mode)

Come to think of it, wrapping this in another handler actually mitigates this entire issue as a whole! It can still allow for auto-parsing/format detection too, because the entire loop still needs to run correctly in order for it to resolve peacefully

Here's an example of where strict mode can be inaccurate depending on the format you pass into it
The first file could be parsed correctly with rootName: false, because the second byte after the TAG_Compound byte happens to be the same as the final TAG_End needed to end the Compound
So even though there are two bytes remaining, you could still parse it as a different format, and it will 'happen' to work, even though the bytes may mean something totally different
So my thought is that if you succeed in reading a full NBT tree, you have likely also reached the end of the file; otherwise the reader might have misinterpreted the file as another format along the way

[Image: NBTify Bytes Comparison]

You could even read the second file as little endian, and it will still parse correctly, even though you wrote it in big endian
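
The ambiguity can be reproduced with four bytes. This is a toy sketch under stated assumptions: `readEmptyCompound` is a hypothetical parser that only understands an empty root compound, the option names merely mirror the ones discussed here, and the bytes are a constructed example (an empty compound with an empty root name), not the files from the screenshot.

```typescript
const TAG_END = 0x00;
const TAG_COMPOUND = 0x0a;

interface ToyOptions {
  rootName: boolean;     // whether a root name (u16 length + bytes) follows the tag byte
  littleEndian: boolean; // byte order of the u16 name length
}

interface ToyResult {
  nameLength: number | null;
  byteOffset: number; // where reading stopped
  remaining: number;  // unread bytes after the compound
}

// Parses ONLY an empty root compound; anything else throws.
function readEmptyCompound(data: Uint8Array, { rootName, littleEndian }: ToyOptions): ToyResult {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  let offset = 0;
  if (view.getUint8(offset++) !== TAG_COMPOUND) {
    throw new Error("Expected TAG_Compound");
  }
  let nameLength: number | null = null;
  if (rootName) {
    nameLength = view.getUint16(offset, littleEndian);
    offset += 2 + nameLength;
  }
  if (view.getUint8(offset++) !== TAG_END) {
    throw new Error("Expected TAG_End (toy parser handles empty compounds only)");
  }
  return { nameLength, byteOffset: offset, remaining: data.byteLength - offset };
}

// TAG_Compound, name length 0x0000, TAG_End
const bytes = new Uint8Array([0x0a, 0x00, 0x00, 0x00]);

const named = readEmptyCompound(bytes, { rootName: true, littleEndian: false });
const unnamed = readEmptyCompound(bytes, { rootName: false, littleEndian: false });
const namedLE = readEmptyCompound(bytes, { rootName: true, littleEndian: true });

console.log(named.remaining);   // 0: the intended reading consumes all four bytes
console.log(unnamed.remaining); // 2: also "succeeds", but stops two bytes early
console.log(namedLE.remaining); // 0: a zero name length reads the same in either byte order
```

All three reads complete without error, and only the count of remaining bytes hints that the `rootName: false` interpretation is the wrong one.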

So yeah, I have more to explore here in order to streamline the UX of the API. I'm curious whether this should be part of NBTify itself, or whether it's out of scope and belongs in another utility somewhere else. I guess it depends. I just don't want to add a feature for an issue that doesn't really exist. If NBTify's non-strict API were a little more transparent about where reading completes, a helper function might not be necessary, and it could be left up to users to configure things how they need. But then again, having first-party building blocks very much does help with making things quicker. Another example might be Promise.all() or Promise.allSettled(), both of which have been very helpful lately (those feel like helper utilities to me).

This is also related to #31.

Other helpful considerations: generators vs. Promise-based async code. (I was curious whether I specifically need an async generator for this; I think in this case it would have to be.)

@Offroaders123 Offroaders123 added the enhancement New feature or request label Sep 11, 2024
@Offroaders123 Offroaders123 self-assigned this Sep 11, 2024
Offroaders123 added a commit that referenced this issue Sep 12, 2024
Ziltoid the Omniscient!!!

#31
#51

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/fromAsync

This is a demo of this feature. I'm not sure if this is exactly how I will implement it, but it does seem to be working fairly well.

The `chunk91` file is from Dexrn, from the project to parse CDB entries from New 3DS Edition. It seems to be roughly the same format as the equivalent section in Bedrock's LevelDB implementation. That's where my `BlockEntity` file is from too, via the Bedrock-LevelDB project.
Offroaders123 added a commit that referenced this issue Sep 13, 2024
#31
#51

https://lifesigns1.bandcamp.com/album/lifesigns

Funny how in the first issue, I wasn't liking the new GitHub UI at the time, now I don't mind it, and it does look really nice :D

This first one is specifically for the persistence of the read offset when `strict` mode is disabled. It is only accessible with a getter on the NBTData object, and it will return `null` otherwise, when strict mode is enabled (still on by default).
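
A minimal sketch of the getter behavior described above, assuming a hypothetical container class. `ParsedData`, its constructor options, and the `endOffset` field are illustrative names only and do not reflect the actual NBTData implementation.

```typescript
// Hypothetical container (not NBTify's NBTData class).
class ParsedData<T> {
  readonly data: T;
  private readonly endOffset: number | null;

  constructor(data: T, options: { strict: boolean; endOffset: number }) {
    this.data = data;
    // The offset is only retained when strict mode was disabled for the read.
    this.endOffset = options.strict ? null : options.endOffset;
  }

  // Read-only access: a number after a non-strict read, `null` after a strict one.
  get byteOffset(): number | null {
    return this.endOffset;
  }
}

const loose = new ParsedData({ name: "first" }, { strict: false, endOffset: 12 });
const strictResult = new ParsedData({ name: "first" }, { strict: true, endOffset: 12 });

console.log(loose.byteOffset);        // 12
console.log(strictResult.byteOffset); // null
```

Returning `null` in strict mode keeps the two cases distinguishable: a strict read that succeeds has by definition consumed the whole buffer, so there is no follow-up offset to continue from.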

This initially sets things up to allow easier test-parsing of files with adjacent content, like Bedrock's LevelDB entries. NBTify currently doesn't make the previous offset easily accessible, which makes it hard to deduce where the next reading step should start from. This simply adds that transparency on its own.

I may rework how the NBTData object handles this, to make it a little tidier.

4664e4c