[NSFS] Fix Newline Reader to work with partial reads and improve its memory usage #8456
+183
−30
Explain the changes
This PR fixes a parsing issue in our NewLine reader. Previously, the reader performed partial reads (4096 bytes) and forcibly converted each chunk into a string. This works most of the time but breaks when a multi-byte UTF-8 character straddles the boundary of the 4096-byte buffer.
This PR removes such conversions while scanning the buffer. It also uses the buffer more efficiently: it avoids creating unnecessary strings and buffers and relies on a pre-allocated buffer instead.
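For reviewers who want to see the failure mode in isolation, here is a minimal sketch. It is not the code in this PR: `read_lines` is a hypothetical helper, and an in-memory `Buffer` stands in for partial file reads. It shows that decoding a chunk that ends mid-character yields replacement characters, while scanning the raw bytes for `0x0a` and decoding only complete lines does not. This is safe because the newline byte never appears inside a multi-byte UTF-8 sequence.

```javascript
'use strict';

// Naive approach: decode each partial read on its own.
// 'é' is two bytes in UTF-8; splitting it yields U+FFFD replacement characters.
const e_acute = Buffer.from('é', 'utf8');
const naive = e_acute.subarray(0, 1).toString('utf8') + e_acute.subarray(1).toString('utf8');

// Hypothetical sketch (not the PR's implementation): scan bytes for '\n'
// in one pre-allocated buffer and decode only complete lines.
function* read_lines(source, buf_size = 4096) {
    const buf = Buffer.alloc(buf_size); // allocated once, reused for every read
    let end = 0; // number of valid bytes currently in buf
    let src_pos = 0; // bytes consumed from source so far
    while (true) {
        // Simulated partial read: fill whatever free space buf has left.
        const src_end = Math.min(source.length, src_pos + (buf.length - end));
        const n = source.copy(buf, end, src_pos, src_end);
        src_pos += n;
        end += n;

        // Emit every complete line; ignore stale bytes at or beyond `end`.
        let start = 0;
        let nl = buf.indexOf(0x0a, start);
        while (nl !== -1 && nl < end) {
            yield buf.toString('utf8', start, nl);
            start = nl + 1;
            nl = buf.indexOf(0x0a, start);
        }

        if (src_pos >= source.length) {
            // Source exhausted: flush a final line that has no trailing newline.
            if (end > start) yield buf.toString('utf8', start, end);
            return;
        }
        if (start === 0 && end === buf.length) {
            throw new Error('line does not fit in the buffer');
        }
        // Move the unfinished tail (possibly half a character) to the front.
        buf.copyWithin(0, start, end);
        end -= start;
    }
}

// A 15-byte buffer makes the second read end in the middle of '日'.
const text = 'héllo wörld\nnaïve café\n日本語';
const lines = [...read_lines(Buffer.from(text, 'utf8'), 15)];
console.log(JSON.stringify(naive), lines);
```

The sketch throws when a single line exceeds the buffer; a real reader would need a policy for that case (skip, grow, or report), which is outside what this example covers.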
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
./node_modules/.bin/jest src/test/unit_tests/jest_tests/test_newline_reader.test.js