forked from snapframework/io-streams
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Proof of Concept of new gunzip/gunzipOne impl
Needs more testing, review, and work: 1. Update the test suite 2. Research the issue of concatinating zlib streams 3. Review various semantic considerations, especially in relation to the unix command "gunzip" But, it works for me
- Loading branch information
Showing
2 changed files
with
99 additions
and
39 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we still need zlib-bindings if we merged that patch?
f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think zlib-bindings may still be needed for compression, but that can be fixed.
f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, another thing I am unclear on, is the intended meaning of null bytestrings (if allowed), and whether or not they should somehow be preserved. Also there was some discussion of upgrading the zlib interface I am using to a public one, possibly with revisions.
f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lpsmith: on compression, receiving a null bytestring means "flush the output" -- this is a convention of the library that we need to maintain, otherwise you'll break e.g. http framing.
f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lpsmith: another data point -- I maintain zlib-bindings now, so if anything needs to be exposed there we can do it
f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gregorycollins, I really like the new zlib interface that this revised decompression code is using. So I'm not sure that there's really any reason to update zlib-bindings at least for the decompression side.
Another data point is this behavior of gunzip/zcat:
So, this revised
Streams.gunzip
is consistent-ish with the behavior of the command linegunzip
, though it might be worth investigatinggunzip
's behavior in other cases of trailing junk. (e.g. what happens if the trailing junk starts with the magic bytes, but isn't a valid gz stream. Also, are there any other magic bytes that we might want to check for?)Since this behavior of zcat/gunzip is relatively obscure, however, I'm wondering if maybe it would be better from a UX/expectations perspective to have a
gunzip
that is strict about trailing "garbage", and agunzipMany
that exhibits the same behavior implemented in this proof of concept.Flushing is kind of what I expected. It would seem that this code probably ought to be revised to preserve null bytestrings (or at least preserve one for every consecutive sequence of null bytestrings) instead of effectively filtering them out. I don't know if there's any further flushing-related integration issues with
zlib
we need to consider on the decompression side. I seem to recall that @hvr was working on flushing issues with the newzlib
interfaces that I'm using here, at least on the compression side.f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But really, it's worth keeping in mind that the only thing that snapframework#56 needs in order to be resolved is this patch, (which may contain regressions, or may not be as correct as it should be).
HVR then brought up the possibility of eliminating the need for
zlib-bindings
entirely by also changing the compression code, which I personally haven't noticed any problems with (but I also haven't been using it directly myself very much.)f39178b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, zcat in the example above does return a non-zero return value: