Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NSFS] Fix Newline Reader to work with partial reads and improve its memory usage #8456

Conversation

tangledbytes
Copy link
Member

Explain the changes

This PR intends to fix the parsing issue present in our NewLine reader. Previously, the reader would perform partial reads (4096 bytes) and would force covert it into string, this works most of the times but falls apart if there was a UTF8 character at the boundary of the 4096 buffer.

This PR removes any such conversions while scanning through the buffer and also ensures to efficiently use the buffer, doesn't creates unnecessary strings and buffers and rather tries to rely on a pre-allocated buffers.

Issues: Fixed #xxx / Gap #xxx

  1. A DBS3 issue

Testing Instructions:

  1. ./node_modules/.bin/jest src/test/unit_tests/jest_tests/test_newline_reader.test.js
  • Doc added/updated
  • Tests added

@tangledbytes tangledbytes changed the title [NSFS] Fix Newline Reader to work partial reads and improve it's memory usage [NSFS] Fix Newline Reader to work with partial reads and improve its memory usage Oct 10, 2024
@tangledbytes tangledbytes force-pushed the utkarsh/fix/file-reader-partial-encoded branch 2 times, most recently from 60720bd to 9421c83 Compare October 30, 2024 19:12
src/util/file_reader.js Outdated Show resolved Hide resolved
src/util/file_reader.js Outdated Show resolved Hide resolved
src/util/file_reader.js Outdated Show resolved Hide resolved
src/util/file_reader.js Show resolved Hide resolved
src/util/file_reader.js Outdated Show resolved Hide resolved
src/util/file_reader.js Outdated Show resolved Hide resolved
@tangledbytes tangledbytes force-pushed the utkarsh/fix/file-reader-partial-encoded branch from 9421c83 to 589c533 Compare October 30, 2024 19:42
Copy link
Member

@guymguym guymguym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found a small fix

src/util/file_reader.js Show resolved Hide resolved
@tangledbytes tangledbytes force-pushed the utkarsh/fix/file-reader-partial-encoded branch from 589c533 to 1905da6 Compare October 30, 2024 23:30
… usage

Signed-off-by: Utkarsh Srivastava <[email protected]>

cleanup code

Signed-off-by: Utkarsh Srivastava <[email protected]>

fix reset method and rename config params

Signed-off-by: Utkarsh Srivastava <[email protected]>

handle overflow case

Signed-off-by: Utkarsh Srivastava <[email protected]>
Co-authored-by: Guy Margalit <[email protected]>
@tangledbytes tangledbytes force-pushed the utkarsh/fix/file-reader-partial-encoded branch from 1905da6 to c2bbbe5 Compare October 31, 2024 07:53
@tangledbytes tangledbytes merged commit b27e63b into noobaa:master Oct 31, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants