Fix an unhandled error "There are some data after the end of the payload data" #538
base: master
@Mrw33554432 The patch breaks a feature in some cases. Please check the test results by
I'm not quite sure why that happens, because the only thing this code does is break a dead loop (when out_remaining > 0 and len(tmp) <= 0), and it should not be triggered by anything else. I may need to know why there was no else branch before I can fix it (I have no idea about that).
In some algorithms, there is a case where the first blocks of input data are consumed but no output is produced, because part of the second blocks of input data is required to complete a calculation. It is a case that
I will just try the out_remaining -= 1 version. If that one fails, I will add a mechanism that breaks only before the next run (in case some data is updated with a one-iteration delay).
You can run the tests with
Could you add a test case that reproduces the problem with minimal data and that fails without the fix? See https://github.com/miurahr/py7zr/blob/master/docs/contribution.rst
The code
If all parameters in the loop stay the same for more than one iteration, the loop is dead. The only thing I am trying to do here is break the loop when it is dead; perhaps that is what causes the error? But without it the code would run forever. I'm getting confused.
The decompressor probably stores some state internally, and we should also check the parameters inside the decompressor to make sure a dead loop is really happening. But I will leave that for later.
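A minimal sketch of the check described above, assuming the loop state can be summarised as a tuple such as (out_remaining, len(tmp), input position). The `StallDetector` class and its `limit` parameter are hypothetical and not part of py7zr; the sketch only shows a loop being treated as dead once the same snapshot repeats on consecutive iterations.

```python
class StallDetector:
    """Report a dead loop when the same state is seen on consecutive iterations.

    Hypothetical helper for illustration; not part of py7zr.
    """

    def __init__(self, limit=2):
        # limit: how many consecutive identical snapshots count as "dead".
        self.limit = limit
        self._last = None
        self._repeats = 0

    def update(self, snapshot):
        """Record one iteration's state; return True if the loop looks dead."""
        if snapshot == self._last:
            self._repeats += 1
        else:
            self._last = snapshot
            self._repeats = 1
        return self._repeats >= self.limit


detector = StallDetector(limit=2)
# (out_remaining, len(tmp), input_position) for four loop iterations
states = [(10, 4, 0), (6, 0, 4), (6, 0, 4), (6, 0, 4)]
flags = [detector.update(s) for s in states]
print(flags)
```

The snapshot only covers values visible in the loop. State held inside a decompressor is not captured, so an unchanged snapshot is a hint rather than proof that no progress is possible.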
If the previous version works fine, either the task is successfully killed by the timeout, or
returned
You can see that the decompressor has a buffer. It is required because some algorithm libraries sometimes produce more data than was requested for the given input, so the decompressor stores the excess for the next request. https://github.com/miurahr/py7zr/blob/master/py7zr/compressor.py#L680-L685
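The buffering idea can be sketched as follows. This is not the py7zr code at the link above; `BufferedDecompressor` and `produce` are hypothetical names, and `produce` stands in for an algorithm library that may return more bytes than were requested.

```python
class BufferedDecompressor:
    """Keep surplus output from one call and serve it on the next call.

    Hypothetical illustration; not the py7zr implementation.
    """

    def __init__(self, produce):
        # produce(data) -> bytes; may return more than the caller asked for.
        self._produce = produce
        self._buffer = b""

    def decompress(self, data, max_length):
        # Serve previously stored bytes first, then whatever the new input yields.
        out = self._buffer + self._produce(data)
        # Hand back at most max_length bytes and store the rest.
        self._buffer = out[max_length:]
        return out[:max_length]


# Stand-in "algorithm": doubles every byte, so it over-produces.
dec = BufferedDecompressor(lambda data: bytes(b for x in data for b in (x, x)))
first = dec.decompress(b"abc", 4)   # library yields 6 bytes, caller wants 4
second = dec.decompress(b"", 4)     # no new input; surplus comes from the buffer
print(first, second)
```

In this sketch a call with empty input can still return data, which is why an empty read on the input side does not by itself mean the output is finished.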
…into this section twice (delayed by one iteration to allow the code to perform everything else it needs to do)
Another point is that some algorithms have an internal state and buffer. They accept a max_length for the output and keep that internal state or buffer between calls.
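Python's standard `zlib` module shows both behaviours and is used here only as an analogy; the algorithms py7zr wraps may differ in detail. In the sketch, a partial input is consumed without producing any output, and `max_length` caps the output while the unprocessed input is kept on the decompressor object as `unconsumed_tail`.

```python
import zlib

payload = b"py7zr " * 1000
compressed = zlib.compress(payload)

# 1) Input consumed, no output yet: the first byte is only part of the header.
d1 = zlib.decompressobj()
head = d1.decompress(compressed[:1])
rest = d1.decompress(compressed[1:])

# 2) max_length caps the output; unprocessed input is kept in unconsumed_tail.
d2 = zlib.decompressobj()
chunks = [d2.decompress(compressed, 100)]
while d2.unconsumed_tail:
    chunks.append(d2.decompress(d2.unconsumed_tail, 100))
# flush() returns whatever output is still pending inside the object.
chunks.append(d2.flush())

print(len(head), len(rest), len(chunks[0]), len(b"".join(chunks)))
```

A loop that treats "no output from this call" as a failure would stop too early in the first case, which may be why the patch needs to tolerate at least one empty iteration.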
#536
Now, when the problem happens, the code raises a warning and breaks the dead loop. The other option, forcing one step forward, is also provided in the code. Choose whichever you prefer (I am not familiar with the decompression logic, but break works fine in my case; it may lead to data loss, or maybe not, I don't know). The warning is issued via warnings.warn because I want to leave most of the code unchanged. Otherwise you could perhaps use the report method, but the parameter self.q does not seem to be available in this method now.
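A rough sketch of the warn-and-break approach, together with one way a test could assert on the warning. `read_all` and `StuckSource` are hypothetical stand-ins, not py7zr code; the loop only mirrors the `out_remaining > 0` and `len(tmp) <= 0` condition discussed earlier in the thread, and it tolerates one empty read before giving up.

```python
import warnings


class StuckSource:
    """Hypothetical stand-in for a decompressor that stops producing data."""

    def __init__(self, data):
        self._data = data

    def read(self, n):
        chunk, self._data = self._data[:n], self._data[n:]
        return chunk


def read_all(source, size, chunk_size=4):
    out = b""
    out_remaining = size
    empty_reads = 0
    while out_remaining > 0:
        tmp = source.read(min(chunk_size, out_remaining))
        if len(tmp) > 0:
            empty_reads = 0
            out += tmp
            out_remaining -= len(tmp)
            continue
        empty_reads += 1
        if empty_reads > 1:
            # Nothing changed for more than one iteration: treat it as a dead loop.
            warnings.warn("dead loop detected; output may be incomplete", RuntimeWarning)
            break
    return out


# Capture the warning the way a test could, instead of letting it print to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_all(StuckSource(b"0123456789"), size=16)

print(result, [str(w.message) for w in caught])
```

Recording the warning this way keeps the rest of the code unchanged while still letting a minimal test case check that the dead loop was detected.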