Inconsistent ZIP format? #1260
My assumption is that the self-extracting zips (universal) have a consistent format that can be decompressed by other system tools. Is this a realistic assumption? It doesn't seem to be true in practice. I've tried Python's zipfile library and macOS unzip on the files, and they don't consistently decompress them. Sometimes it works fine, and when it fails, it fails for both zipfile and unzip. This is a page that consistently fails for me: https://austinkleon.com/2023/10/03/discovering-aphantasia/
unzip "Discovering aphantasia - Austin Kleon (2023-10-06 5_05_23 PM).html"
Archive: Discovering aphantasia - Austin Kleon (2023-10-06 5_05_23 PM).html
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of Discovering aphantasia - Austin Kleon (2023-10-06 5_05_23 PM).html or
Discovering aphantasia - Austin Kleon (2023-10-06 5_05_23 PM).html.zip, and cannot find Discovering aphantasia - Austin Kleon (2023-10-06 5_05_23 PM).html.ZIP, period.
Replies: 1 comment 3 replies
Thank you, I was able to reproduce the issue. I guess it is caused by the payload at the end of the file (JS + extra data), which is probably too large. The zip is valid, though. I'll try to find a solution to avoid this issue.
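To illustrate why a large trailing payload could produce this error, here is a minimal Python sketch. It assumes the reader locates the end-of-central-directory (EOCD) record by scanning only the tail of the file, within 22 bytes plus the maximum 65535-byte comment; Python's zipfile does this, and unzip appears to behave similarly. The function names are hypothetical, and a stray signature inside the trailing data would confuse this simple check.

```python
from typing import Optional

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory signature
EOCD_MIN_SIZE = 22              # fixed-size part of the EOCD record
MAX_COMMENT_SIZE = 0xFFFF       # the comment length is a 16-bit field
# Assumption: readers only scan this many bytes back from the end of the file.
MAX_SEARCH_WINDOW = EOCD_MIN_SIZE + MAX_COMMENT_SIZE


def eocd_distance_from_end(data: bytes) -> Optional[int]:
    """Return the number of bytes from the last EOCD signature to the end
    of the data, or None if no signature is present."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0:
        return None
    return len(data) - pos


def eocd_is_reachable(data: bytes) -> bool:
    """Return True if the last EOCD signature lies inside the tail window
    that a typical ZIP reader scans."""
    distance = eocd_distance_from_end(data)
    return distance is not None and distance <= MAX_SEARCH_WINDOW
```

Reading the saved page as bytes and passing it to `eocd_is_reachable` should return `False` for a file that fails in this way, if the explanation above is correct.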
I was able to reproduce the issue and can confirm its cause. I opened the saved page in https://hexed.it/, selected and deleted all the bytes after "<script>var e,t", and the resulting file can be unzipped without any warning. I'm fixing the issue.
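The manual hex-editor step can be approximated in Python. This is only a sketch of the workaround described above, not the fix being made in the tool itself. The function name is hypothetical, and it assumes the bytes to drop follow the last occurrence of the marker, which the thread does not state explicitly.

```python
def truncate_after_marker(data: bytes, marker: bytes = b"<script>var e,t") -> bytes:
    """Keep everything up to and including the last occurrence of `marker`
    and drop the bytes that follow it.

    Assumption: the oversized trailing payload starts right after the last
    occurrence of the marker. If the marker is absent, the data is returned
    unchanged.
    """
    idx = data.rfind(marker)
    if idx < 0:
        return data
    return data[: idx + len(marker)]
```

To try it, read the saved page in binary mode, pass the bytes through `truncate_after_marker`, and write the result to a new file before running unzip on it. Working on a copy keeps the original page intact, since the removed bytes are presumably needed for the page to display in a browser.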