
Fix directory local header size fields corrupted at offset >4 GB#413

Merged
maennchen merged 1 commit into maennchen:main from jquiaios:main
Apr 11, 2026

Conversation

@jquiaios
Contributor

Hello,

I'm using this wonderful library to create ZIPs that can sometimes contain thousands of directories and files.
Some of these ZIPs can reach 10GB.
I've been having issues with the larger ZIPs: my computer reported them as corrupted when I tried to open them.

I've been talking with Claude to try to understand what I was doing wrong, and after quite a lot of digging, I think I found a bug that occurs when adding a directory once the zip is already >4 GB.

I'll try to explain as clearly as possible.
I used Claude to help me find the proper technical terms, since I'm not very familiar with the internals of the ZIP format.

Hopefully it makes sense, let me know if you need any more information.
Thanks!

Problem

When building a ZIP archive larger than 4 GB that contains directory entries, any directory whose local file header starts past the 4 GB boundary ends up with corrupted size fields.
Streaming ZIP readers (those that cannot seek back to the Central Directory) misread the directory's start offset (~4+ GB) as its file size.

Root Cause

addDirectory() always passes enableZeroHeader: false, which means forceEnableZip64 is always false for directory entries:

// File.php, line 183
$forceEnableZip64 = $this->enableZeroHeader && $this->enableZip64;
// = false && true

When only startOffset overflows 4 GB (and not the sizes, which are always 0 for a directory), buildZip64ExtraBlock() correctly emits a ZIP64 extra block containing only relativeHeaderOffset.
However, addFileHeader() used $zip64Enabled as a proxy for "sizes need ZIP64", and since the extra block is non-empty, $zip64Enabled = true, causing both size fields in the local file header to be written as 0xFFFFFFFF.

The result is a local file header that signals "look in ZIP64 extra for sizes" (via the 0xFFFFFFFF sentinels), but the ZIP64 extra block only contains the offset value. A reader following the spec interprets the first 8 bytes of the ZIP64 extra as originalSize, reading the ~4 GB offset as the file size instead of 0.
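To illustrate why a reader misreads the offset as a size, here is a small sketch of how a spec-following reader consumes the ZIP64 extra block of a local file header. The function and variable names are illustrative, not ZipStream code; the field order follows PKWARE's APPNOTE, where the 64-bit values appear in a fixed order and a value is only present when its 32-bit header field holds the 0xFFFFFFFF sentinel.

```php
<?php
// Hypothetical reader-side sketch (not ZipStream code): parse the
// ZIP64 extra block (ID 0x0001) of a local file header.

function parseLocalZip64Extra(string $data, int $uncomp32, int $comp32): array
{
    $pos = 0;
    $out = ['uncompressedSize' => null, 'compressedSize' => null, 'warnings' => []];

    $read64 = function () use ($data, &$pos): ?int {
        if ($pos + 8 > strlen($data)) {
            return null; // truncated extra block
        }
        $v = unpack('P', substr($data, $pos, 8))[1]; // 64-bit LE
        $pos += 8;
        return $v;
    };

    // Per APPNOTE, originalSize comes first -- but only if the 32-bit
    // field was the 0xFFFFFFFF sentinel.
    if ($uncomp32 === 0xFFFFFFFF) {
        $out['uncompressedSize'] = $read64();
    }
    if ($comp32 === 0xFFFFFFFF) {
        $out['compressedSize'] = $read64();
        if ($out['compressedSize'] === null) {
            $out['warnings'][] = 'extra field (type: 0x0001) corrupt';
        }
    }
    return $out;
}

// The buggy directory header: both sizes sentineled, but the extra
// block only carries the ~4 GB relativeHeaderOffset (8 bytes).
$extra = pack('P', 0x120000057);
$r = parseLocalZip64Extra($extra, 0xFFFFFFFF, 0xFFFFFFFF);
// $r['uncompressedSize'] is now 0x120000057 -- the offset, read as a size,
// and the reader runs out of data for compressedSize.
```

The first 8 bytes are consumed as originalSize, and the reader then runs out of data for compressedSize, which matches the unzip warnings quoted further down in this thread.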

Proposed fix

In addFileHeader(), size fields in the local header should only use the 0xFFFFFFFF sentinel when those sizes actually need ZIP64, not merely because the file's offset does:

// Before
compressedSize: $zip64Enabled ? 0xFFFFFFFF : $this->compressedSize,
uncompressedSize: $zip64Enabled ? 0xFFFFFFFF : $this->uncompressedSize,

// After
compressedSize: ($forceEnableZip64 || $this->compressedSize > 0xFFFFFFFF) ? 0xFFFFFFFF : $this->compressedSize,
uncompressedSize: ($forceEnableZip64 || $this->uncompressedSize > 0xFFFFFFFF) ? 0xFFFFFFFF : $this->uncompressedSize,

$forceEnableZip64 preserves the existing behavior for enableZeroHeader=true (streaming mode), where sizes genuinely are unknown upfront and must be written as placeholders.
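The fixed condition can be isolated as a tiny pure function (the helper name is hypothetical; it just mirrors the ternaries above): a size field gets the 0xFFFFFFFF sentinel only when sizes are unknown upfront (zero-header streaming) or genuinely exceed 32 bits.

```php
<?php
// Hypothetical helper mirroring the fixed ternaries above.

function sizeFieldValue(bool $forceEnableZip64, int $actualSize): int
{
    return ($forceEnableZip64 || $actualSize > 0xFFFFFFFF)
        ? 0xFFFFFFFF
        : $actualSize;
}

// Directory past the 4 GB boundary: sizes are 0, only the offset is
// large, so the size fields stay 0 after the fix.
var_dump(sizeFieldValue(false, 0));           // int(0)
// Streaming mode: sizes unknown upfront, placeholder required.
var_dump(sizeFieldValue(true, 0));            // int(4294967295)
// A file larger than 4 GB needs the sentinel regardless of mode.
var_dump(sizeFieldValue(false, 0x120000000)); // int(4294967295)
```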

Script to reproduce

<?php

require __DIR__ . '/vendor/autoload.php';

use ZipStream\CompressionMethod;
use ZipStream\OperationMode;
use ZipStream\ZipStream;

$zipPath = sys_get_temp_dir() . '/zipstream_bug_demo.zip';
$output  = fopen($zipPath, 'w+b');

$zip = new ZipStream(
    operationMode: OperationMode::SIMULATE_STRICT,
    outputStream: $output,
    enableZip64: true,
    defaultEnableZeroHeader: true,
    sendHttpHeaders: false,
);

// A ~4.5 GB file pushes the next entry past the 4 GB offset boundary.
$zip->addFileFromCallback(
    fileName: 'large.bin',
    callback: static function () { return fopen('/dev/zero', 'rb'); },
    compressionMethod: CompressionMethod::STORE,
    exactSize: 0x120000000,
);

// This directory entry lands at offset 0x120000057, which is > 4 GB.
$zip->addDirectory('mydir/');

$zip->finish();
$zip->executeSimulation();
fclose($output);

// The directory local file header starts at offset 0x120000057:
//   63 bytes  = large.bin local header (30 fixed + 9 filename + 20 ZIP64 extra + 4 Zs extra)
//   + 0x120000000 bytes = file data (STORE)
//   + 24 bytes = ZIP64 data descriptor
//
// compressedSize is at header offset +18, uncompressedSize at +22.
// Both must be 0 for a directory. Before the fix they are 0xFFFFFFFF.
$dirHeaderOffset = 0x120000057;

$f = fopen($zipPath, 'rb');
fseek($f, $dirHeaderOffset + 18);
$fields = unpack('Vcomp/Vuncomp', fread($f, 8));
fclose($f);

echo 'compressedSize  : ' . $fields['comp']   . ($fields['comp']   === 0xFFFFFFFF ? ' <-- BUG' : ' (OK)') . "\n";
echo 'uncompressedSize: ' . $fields['uncomp'] . ($fields['uncomp'] === 0xFFFFFFFF ? ' <-- BUG' : ' (OK)') . "\n";

unlink($zipPath);

Output

# Before fix:
compressedSize  : 4294967295 <-- BUG
uncompressedSize: 4294967295 <-- BUG

# After fix:
compressedSize  : 0 (OK)
uncompressedSize: 0 (OK)
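As a complement to the byte-poking script above, here is a standalone sketch of the same consistency check that does not depend on hard-coded offsets: it parses a single local file header and verifies that every 0xFFFFFFFF size sentinel is backed by enough bytes in the ZIP64 extra block. The helper name and structure are my own; the field layout follows APPNOTE (the fixed part of a local file header is 30 bytes).

```php
<?php
// Hypothetical checker (not ZipStream code): given the raw bytes of
// one local file header, report whether its size sentinels are
// consistent with its ZIP64 extra block.

function zip64SizeFieldsConsistent(string $localHeader): bool
{
    $h = unpack(
        'Vsig/vversion/vflags/vmethod/vtime/vdate/Vcrc/'
        . 'Vcsize/Vusize/vnamelen/vextralen',
        substr($localHeader, 0, 30)
    );
    if ($h['sig'] !== 0x04034b50) { // "PK\x03\x04"
        return false;
    }
    $extra = substr($localHeader, 30 + $h['namelen'], $h['extralen']);

    // Bytes the ZIP64 extra must carry for the sentineled size fields.
    $needed = 8 * ((int) ($h['usize'] === 0xFFFFFFFF)
                 + (int) ($h['csize'] === 0xFFFFFFFF));
    if ($needed === 0) {
        return true; // no sentinel, nothing to back up
    }

    // Walk the extra fields looking for the ZIP64 block (ID 0x0001).
    for ($pos = 0; $pos + 4 <= strlen($extra); ) {
        $f = unpack('vid/vlen', substr($extra, $pos, 4));
        if ($f['id'] === 0x0001) {
            return $f['len'] >= $needed;
        }
        $pos += 4 + $f['len'];
    }
    return false; // sentinel set but no ZIP64 extra block at all
}
```

Against the buggy output, a directory header with both size sentinels set but only an 8-byte ZIP64 extra block fails this check; the fixed header (sizes written as 0) passes it.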

@maennchen
Owner

Thanks for the PR! Can we detect the corruption by unpacking? We could for example just add a directory at the end of the zip in testLongZipWith64 and assert that this worked properly.

I would prefer to have a test that tests the outcome and not the specific bytes. And if possible to not add another slow and resource hungry test.

@jquiaios jquiaios force-pushed the main branch 2 times, most recently from b880fa9 to d02c801 Compare April 10, 2026 23:08
@jquiaios
Contributor Author

Very good idea!
I edited the existing testLongZipWith64 to add a directory and check by unpacking as you suggested.

@maennchen
Owner

@jquiaios This is not quite what I meant. The call to validateAndExtractZip unpacks the zip file to disk. So essentially we can add the directory to the zip (which you already did) and then just assert that the patch exists and is a directory and not a file.

If we produce corrupted zip files, I would then expect validateAndExtractZip to fail.

What I'm ultimately saying is that I don't want to test the fix itself, which you do with assertFileDoesNotContain, but rather if it behaves correctly.

My assumption is, if you only keep the $zip->addDirectory('mydir/'); line in the test and revert the fix, the test will already fail. Once you apply the fix again it should pass.

@jquiaios
Contributor Author

@maennchen Modifying validateAndExtractZip was my first idea, but validateAndExtractZip uses ZipArchive, and ZipArchive reads the central directory to extract entries, not the local headers.

At first I tried to add the ZipArchive::CHECKCONS flag when opening a zip in validateAndExtractZip to perform additional consistency checks, like this:

$zipArchive = new ZipArchive();
$result = $zipArchive->open($zipPath, ZipArchive::CHECKCONS);

But it then showed 2 failing tests despite my fix:

There were 2 failures:

1) ZipStream\Test\ZipStreamTest::testLongZipWith64
Failed to open /tmp/zipstreamtestgm4rXp. Code: 21 (ER_INCONS)

/app/test/Util.php:64
/app/test/ZipStreamTest.php:556

2) ZipStream\Test\ZipStreamTest::testSimulationWithLargeZip64AndZeroHeader
Failed to open /tmp/zipstreamtestHVRTnI. Code: 21 (ER_INCONS)

/app/test/Util.php:64
/app/test/ZipStreamTest.php:1193

FAILURES!
Tests: 72, Assertions: 186, Failures: 2, Warnings: 1.

My understanding is that ZipArchive::CHECKCONS is too strict for zero-header mode on files at offsets >4 GB: it treats the ZIP64 extra sizes (0) vs. the central directory sizes (actual sizes) as inconsistent.

I didn't find a better way to test this than using assertFileDoesNotContain.

I'm open to suggestions on how I could check this directly in validateAndExtractZip if you have any.
Thanks!

@maennchen
Owner

Interesting. Then I agree with your solution. Unfortunately zip clients often yield very different results.

For future context: Which zip implementation complained that the directory is corrupt and led you to open this issue?

@jquiaios
Contributor Author

For future context: Which zip implementation complained that the directory is corrupt and led you to open this issue?

The first one was Archive Utility - Version 10.15 (174), which is the default tool for opening zip archives on my MacBook Pro (M3) when double-clicking a .zip file. It gave me Error 79 - Inappropriate file type or format.

And to try to get more info for my investigation I used these other tools:

  • CLI tool unzip: UnZip 6.00 of 20 April 2009, by Info-ZIP, with modifications by Apple Inc.
    I used it to try to get more info and investigate, and the errors were:

    • ucsize xxxxxx <> csize 4294967295 for STORED entry. Continuing with "compressed" size value
    • warning: extra field (type: 0x0001) corrupt
    • bad CRC xxxxxx (should be 00000000)
  • CLI 7-Zip version 26.00 (arm64),
    giving me the error ERROR: Headers Error: /my/full/path/myfolder/

@maennchen maennchen merged commit 7eab0a0 into maennchen:main Apr 11, 2026
13 checks passed
@maennchen
Owner

Thanks a lot ❤️ I’ll do a patch release soon.
