Fix directory local header size fields corrupted at offset >4 GB #413

maennchen merged 1 commit into maennchen:main
Conversation
Thanks for the PR! Can we detect the corruption by unpacking? We could for example just add a directory at the end of the zip in …

I would prefer to have a test that tests the outcome and not the specific bytes. And, if possible, to not add another slow and resource-hungry test.
Very good idea!
@jquiaios This is not quite what I meant. The call to … If we produce corrupted zip files, I would then expect …

What I'm ultimately saying is that I don't want to test the fix itself, which you do with …

My assumption is, if you only keep the …
@maennchen Modifying the … At first I tried to add the following check:

```php
$zipArchive = new ZipArchive();
$result = $zipArchive->open($zipPath, ZipArchive::CHECKCONS);
```

But it then showed 2 failing tests despite my fix: … My understanding is that … I didn't find another way to test than using … I'm open to suggestions on how I could check this directly in …
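For illustration only (the real test belongs in the PHP suite): the "verify by unpacking, not by asserting on bytes" idea discussed above can be sketched with Python's stdlib `zipfile`:

```python
import io
import zipfile

# Build a small archive in memory, then verify it by unpacking it
# again, asserting on the outcome rather than on specific bytes.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("dir/", b"")              # explicit directory entry
    zf.writestr("dir/file.txt", b"hello")

buf.seek(0)
with zipfile.ZipFile(buf) as zf:
    assert zf.testzip() is None           # no CRC mismatches
    assert "dir/" in zf.namelist()        # directory entry survived
    assert zf.read("dir/file.txt") == b"hello"
```

Note that readers which trust the central directory (as most non-streaming tools do) can pass a check like this even when a local file header is corrupted, which is one reason different validators can disagree about the same archive.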
Interesting. Then I agree with your solution. Unfortunately zip clients often yield very different results. For future context: Which zip implementation complained that the directory is corrupt and led you to open this issue? |
The first one was … And to try to get more info for my investigation I used these other tools: …
Thanks a lot ❤️ I’ll do a patch release soon. |
Hello,
I'm using this wonderful library to create ZIPs that can sometimes contain thousands of directories and files.
Some of these ZIPs can reach 10 GB.
I've been having issues with bigger ZIPs that were reported as corrupted when I tried to open them on my computer.
I've been talking with Claude to try to understand what I was doing wrong, and after quite a lot of digging, I think I found a bug that occurs when adding a directory once the zip is already >4 GB.
I'll try to explain as clearly as possible.
I had to use Claude to help me explain with the proper technical terms, since I'm not super knowledgeable with the inner tech of zip.
Hopefully it makes sense, let me know if you need any more information.
Thanks!
Problem
When building a ZIP archive larger than 4 GB that contains directory entries, any directory whose local file header starts past the 4 GB boundary ends up with corrupted size fields.
Streaming ZIP readers (those that cannot seek back to the Central Directory) misread the directory's start offset (~4+ GB) as its file size.
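The misread can be sketched in a few lines of Python. This is an illustration of the ZIP64 extra-field layout from the spec, not the library's code: when a 32-bit size field holds the `0xFFFFFFFF` sentinel, a reader takes the next 8-byte value from the ZIP64 extra data, in the fixed order uncompressed size, compressed size, header offset.

```python
import struct

SENTINEL = 0xFFFFFFFF

def streaming_read(uncomp32, comp32, zip64_extra):
    """Spec-following reader: every 0xFFFFFFFF field is replaced by
    the next 8-byte value in the ZIP64 extra data, in header order
    (uncompressed size, compressed size, header offset)."""
    pos = 0
    def take():
        nonlocal pos
        val = struct.unpack_from("<Q", zip64_extra, pos)[0]
        pos += 8
        return val
    uncomp = take() if uncomp32 == SENTINEL else uncomp32
    comp = take() if comp32 == SENTINEL else comp32
    return uncomp, comp

offset = 5 * 1024**3  # directory whose local header starts past 4 GB

# Buggy writer: sizes written as 0xFFFFFFFF sentinels, but the ZIP64
# extra block only carries relativeHeaderOffset (8 bytes).
extra = struct.pack("<Q", offset)
try:
    print(streaming_read(SENTINEL, SENTINEL, extra))
except struct.error:
    # the first take() already swallowed the offset as the uncompressed
    # size; the second read runs off the end of the extra block
    print("reader misread the offset as a size, then ran out of data")

# Correct writer: the real sizes (0 for a directory) stay in the
# 32-bit fields, so the extra block is never consumed for sizes.
print(streaming_read(0, 0, extra))  # → (0, 0)
```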
Root Cause
`addDirectory()` always passes `enableZeroHeader: false`, which means `forceEnableZip64` is always `false` for directory entries.

When only `startOffset` overflows 4 GB (and not the sizes, which are always 0 for a directory), `buildZip64ExtraBlock()` correctly emits a ZIP64 extra block containing only `relativeHeaderOffset`.

However, `addFileHeader()` used `$zip64Enabled` as a proxy for "sizes need ZIP64", and since the extra block is non-empty, `$zip64Enabled = true`, causing both size fields in the local file header to be written as `0xFFFFFFFF`.

The result is a local file header that signals "look in the ZIP64 extra field for the sizes" (via the `0xFFFFFFFF` sentinels), but the ZIP64 extra block only contains the offset value. A reader following the spec interprets the first 8 bytes of the ZIP64 extra field as `originalSize`, reading the ~4 GB offset as the file size instead of 0.

Proposed fix
In `addFileHeader()`, the size fields in the local header should only use the `0xFFFFFFFF` sentinel when the sizes actually need ZIP64, not merely because the file's offset does. `$forceEnableZip64` preserves the existing behavior for `enableZeroHeader=true` (streaming mode), where sizes genuinely are unknown upfront and must be written as placeholders.

Script to reproduce
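The original PHP reproduction script and its output are not preserved in this excerpt. As a hedged stand-in, here is a Python sketch of the decision the fix describes; names such as `local_header_sizes` and `force_zip64` are illustrative, not the library's identifiers:

```python
SENTINEL = 0xFFFFFFFF
FOUR_GB = 1 << 32

def local_header_sizes(uncomp, comp, offset, force_zip64=False):
    """Decide what goes into the 32-bit size fields of the local file
    header. Only sizes that genuinely need ZIP64 (or streaming mode,
    where they are unknown upfront) get the sentinel. The offset
    parameter is deliberately unused: a large start offset alone must
    not force the size sentinels."""
    sizes_need_zip64 = force_zip64 or uncomp >= FOUR_GB or comp >= FOUR_GB
    if sizes_need_zip64:
        return SENTINEL, SENTINEL
    return uncomp, comp

# Directory entry whose local header starts past 4 GB: sizes are 0
# and must stay 0 (the buggy code wrote the sentinels here).
assert local_header_sizes(0, 0, offset=5 * 1024**3) == (0, 0)

# Streaming mode (enableZeroHeader=true): sizes unknown upfront,
# so the placeholders are correct.
assert local_header_sizes(0, 0, offset=0, force_zip64=True) == (SENTINEL, SENTINEL)

# A genuinely >4 GB file still gets the sentinel.
assert local_header_sizes(5 * FOUR_GB, 5 * FOUR_GB, offset=0) == (SENTINEL, SENTINEL)
```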
Output