Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add continue_with function to ZlibDecoder that allows refreshing of the stream without resetting the bufreader and fix resetting of ZlibDecoder #376

Closed
wants to merge 5 commits into from

Conversation

PierreV23
Copy link
Contributor

@PierreV23 PierreV23 commented Sep 13, 2023

Examples:

use flate2::{Decompress, read::ZlibDecoder};
use std::io::prelude::Read;

fn continue_with_example() {
    // Python ZLIB compressed with options: zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS

    // b'{"msgs":[{"msg": "ping"}]}'
    let msg_vec1 = vec![
        170, 86, 202, 45, 78, 47, 86, 178, 138, 174, 6, 49, 148, 172, 20, 148, 10, 50, 243, 210,
        149, 106, 99, 107, 1, 0, 0, 0, 255, 255,
    ];

    // b'{"msgs":[{"msg": "lobby_clear"},{"msg": "lobby_complete"}]}'
    let msg_vec2 = vec![
        170, 198, 144, 201, 201, 79, 74, 170, 140, 79, 206, 73, 77, 44, 82, 170, 213, 65, 23, 206,
        207, 45, 200, 73, 45, 73, 5, 105, 5, 0,
    ];

    let mut decompresser = ZlibDecoder::new_with_decompress(
        &msg_vec1[..],
        Decompress::new_with_window_bits(false, 15),
    );

    let mut msg1 = String::new();
    decompresser.read_to_string(&mut msg1).unwrap();
    println!("{}", msg1);

    decompresser.continue_read(&msg_vec2[..]);

    let mut msg2 = String::new();
    decompresser.read_to_string(&mut msg2).unwrap();
    println!("{}", msg2);
}

fn reset_example() {
    // b'{"msgs":[{"msg": "ping"}]}'
    let msg_vec1 = vec![
        170, 86, 202, 45, 78, 47, 86, 178, 138, 174, 6, 49, 148, 172, 20, 148, 10, 50, 243, 210,
        149, 106, 99, 107, 1, 0, 0, 0, 255, 255,
    ];

    let mut decompresser = ZlibDecoder::new_with_decompress(
        &msg_vec1[..],
        Decompress::new_with_window_bits(false, 15),
    );

    let mut msg1 = String::new();
    decompresser.read_to_string(&mut msg1).unwrap();
    println!("{}", msg1);

    decompresser.reset_on_custom(&msg_vec1[..], false);

    let mut msg2 = String::new();
    decompresser.read_to_string(&mut msg2).unwrap();
    println!("{}", msg2);
}

This should help #312 and #276

NOTE: This does not affect read&write ZlibEncoder and write ZlibDecoder. These are yet to be written.

@PierreV23 PierreV23 changed the title Add continue_with function to ZlibDecoder that allows refreshing of the stream without resetting the bufreader Add continue_with function to ZlibDecoder that allows refreshing of the stream without resetting the bufreader and fix resetting of ZlibDecoder Sep 13, 2023
Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for tackling this, even though I would have loved a conversation around alternatives first.

The way this library is structured, any method added as to be duplicated and adapted 6 times at least, which adds to the maintenance burden.

Solving a problem without changing the API of these types is thus preferred, and I believe that's possible here. My preference here would be to add a utility type which can be used to do these input-swaps.

@Byron Byron marked this pull request as draft September 13, 2023 09:00
@Byron Byron added the question label Sep 13, 2023
@PierreV23
Copy link
Contributor Author

I edited the the pull request. (I didn't mean to pull both continue_with and the reset fix at the same time, im a newbie...)

Yes, I agree a API change might be needed, but I was afraid of changing too much and I didn't want to change already existing code.

@PierreV23
Copy link
Contributor Author

The biggest issue at the moment is the fact that ZlibDecoder::reset uses bufread::reset_decoder_data.

pub fn reset_decoder_data<R>(zlib: &mut ZlibDecoder<R>) {
    zlib.data = Decompress::new(true);
}

It recreates an entirely new Decompress instance without taking into account possible custom parameters (zlib_header and window_bits). This causes deflate errors. reset_with_custom can be seen as a replacement.

continue_with allows for a refreshment of the stream, without touching the Decompress instance

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the examples, they are certainly a straightforward way of solving this particular issue. Due to the cost in maintenance that this additional API surface would incur, I'd like to investigate alternatives.

Can you instead try to create a Read implementation that allows to change an internal reader? Think along the lines of struct Foo(Rc<RefCell<impl std::io::Read>>). Then one can clone Foo to pass it as reader to the decompressor, but also call foo.reset(new_reader) on it. This has the same effect as this API, but is entirely under control of the user.

If properly implemented, tested and documented, I think it could start out as part of an example, similar to the one you already provide.

Thanks for your understanding and for considering continuing this PR despite the change in direction.

@PierreV23
Copy link
Contributor Author

PierreV23 commented Sep 13, 2023

Thanks for the examples, they are certainly a straightforward way of solving this particular issue. Due to the cost in maintenance that this additional API surface would incur, I'd like to investigate alternatives.

Can you instead try to create a Read implementation that allows to change an internal reader? Think along the lines of struct Foo(Rc<RefCell<impl std::io::Read>>). Then one can clone Foo to pass it as reader to the decompressor, but also call foo.reset(new_reader) on it. This has the same effect as this API, but is entirely under control of the user.

If properly implemented, tested and documented, I think it could start out as part of an example, similar to the one you already provide.

Thanks for your understanding and for considering continuing this PR despite the change in direction.

So if I understood you correctly, a new implementation of Read gets passed to bufreader::ZlibDecoder. Where you can directly manipulate the new Read instead manipulating it via the ZlibDecoder?

@Byron
Copy link
Member

Byron commented Sep 14, 2023

Yes. You can implement it as part of an official example/*.rs which could be what you presented here, but solving it with a Read wrapper that allows to swap out the inner Read instance.

@PierreV23
Copy link
Contributor Author

PierreV23 commented Sep 15, 2023

That would mean we would not need a reset function for any of the top layer wrappers right? I was thinking of how to remove some of the boilerplate I came by. What if we just make Config struct(s) that are mandatorily passed to ::new? This would also mean we could abandon the separate gz and zlib directories, generalizing them into a single Decoder and Encoder.

We can make default instances like ZLIB_DEFAULT and GZ_DEFAULT, or use functions: Config::zlib_default() and Config::gz_default().

I think there is a lot of code that can practically be discarded, with the result of creating a new interface that is prettier, more implicit and easier to use.

It would mean a move to a new major version though.

Update: I read more into the source, but GZ is more unique than i thought. I still think the codebase could use improvement.

@Byron
Copy link
Member

Byron commented Sep 16, 2023

It's great that you are taking the time to get acquainted, this will definitely be helpful!

That would mean we would not need a reset function for any of the top layer wrappers right?

I don't know what that means, but recommend trying to do as described by me earlier. There is no need to adjust anything in this crate to get your example to work.

It would mean a move to a new major version though.

Any breaking change could be explored in an own crate, and maybe from there one could conclude that it's worth making such a change here as well. This needs a lot of experimentation, time and discussion though to assure it's the right move, it has great implications.

I think to get this PR merged it has to pivot towards a utility type and an example as described earlier, and I urge to stay on topic or close this PR and work on a new crate that shows how to do better than what you find here. Thanks for your understanding.

@PierreV23
Copy link
Contributor Author

I think I will drop working on this now then. I am not experienced enough with Read to implement such things and I'd rather work on something else (i.e. generalized / config system).

I will take a break for now and look back once I have enough spare time again.

Good luck.

@Byron Byron closed this Sep 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants