Incorrect assumptions about block sizes and data/hole sizes #59

@Lalufu

When trying to detect holes using the SEEK_DATA/SEEK_HOLE method, bmaptool assumes that sizes of data blocks and holes are multiples of a predetermined block size:

        while True:
            start = _lseek(self._f_image, end, whence1)
            if start == -1 or start >= limit or start == self.image_size:
                break

            end = _lseek(self._f_image, start, whence2)
            if end == -1 or end == self.image_size:
                end = self.blocks_cnt * self.block_size
            if end > limit:
                end = limit

            start_blk = start // self.block_size
            end_blk = end // self.block_size - 1
            _log.debug("FilemapSeek: yielding range (%d, %d)"
                       % (start_blk, end_blk))
            yield (start_blk, end_blk)

The problem is in the calculations of start_blk and end_blk. self.block_size is obtained by calling the FIGETBSZ ioctl, with a fallback to stat.st_blksize. At least the latter is not suitable for this purpose: it is the preferred I/O size for the file, and it can differ from (be larger than) the allocation size of the underlying file. When it does, the floor divisions used for start_blk and end_blk produce incorrect block ranges.
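To check whether the alignment assumption holds for a particular file, the extents can be listed in bytes and compared against a candidate block size. The following is a minimal sketch, not bmaptool code: `data_extents` and `unaligned` are hypothetical helper names, and it assumes a platform where Python exposes `os.SEEK_DATA` and `os.SEEK_HOLE` (Linux, for example).

```python
import errno
import os


def data_extents(fd, size):
    """Yield (start, end) byte offsets of data extents; end is exclusive."""
    offset = 0
    while offset < size:
        try:
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as err:
            if err.errno == errno.ENXIO:  # no data at or after offset
                return
            raise
        if start >= size:
            return
        # Every file has an implicit hole at EOF, so SEEK_HOLE finds an end.
        end = min(os.lseek(fd, start, os.SEEK_HOLE), size)
        if end <= start:  # defensive: avoid looping without progress
            return
        yield (start, end)
        offset = end


def unaligned(extents, block_size):
    """Return the extents whose start or end is not a multiple of block_size."""
    return [(s, e) for (s, e) in extents
            if s % block_size or e % block_size]
```

Calling `unaligned(data_extents(fd, st.st_size), st.st_blksize)` with `st = os.fstat(fd)` on an open file descriptor would list the extents that the block-index arithmetic cannot represent exactly.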

See the example below, where a range of (0, -1) is calculated because the initial data extent in the file is shorter than the value of stat.st_blksize, so end // self.block_size is 0 and end_blk becomes -1:

$ bmaptool -d create --no-checksum /mnt/testfile
bmaptool: debug: opened image "/mnt/testfile"
bmaptool: debug: block size 1048576, blocks count 1831421, image size 1920383410176
bmaptool: debug: FilemapFiemap: initializing
bmaptool: debug: FilemapFiemap: the FIEMAP ioctl is not supported by the file-system
bmaptool: debug: opened image "/mnt/testfile"
bmaptool: debug: block size 1048576, blocks count 1831421, image size 1920383410176
bmaptool: debug: FilemapSeek: initializing
bmaptool: debug: FilemapSeek: get_mapped_ranges(0,  1831421(1831420))
bmaptool: debug: FilemapSeek: yielding range (0, -1)
bmaptool: debug: FilemapSeek: yielding range (1, 8)
[...]
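The `(0, -1)` range in the log above can be reproduced with plain integer arithmetic. The sketch below copies the index calculation from the quoted loop into a hypothetical helper, `quoted_block_range`, and contrasts it with one possible alternative, `covering_block_range`, which rounds outward to every block the extent touches. The 4096-byte extent length is an assumed value for illustration (the real extent length on the NFS mount is not shown in the log); only the 1048576 block size comes from the debug output.

```python
BLOCK_SIZE = 1048576  # block size reported in the debug output


def quoted_block_range(start, end, block_size):
    """Block indices as computed in the quoted loop (end is exclusive)."""
    return (start // block_size, end // block_size - 1)


def covering_block_range(start, end, block_size):
    """Inclusive indices of all blocks that overlap bytes [start, end)."""
    return (start // block_size, (end - 1) // block_size)


# Assumed extent: 4096 bytes of data at the start of the file.
print(quoted_block_range(0, 4096, BLOCK_SIZE))    # (0, -1)
print(covering_block_range(0, 4096, BLOCK_SIZE))  # (0, 0)
# For block-aligned extents the two calculations agree.
print(quoted_block_range(0, 2 * BLOCK_SIZE, BLOCK_SIZE))    # (0, 1)
print(covering_block_range(0, 2 * BLOCK_SIZE, BLOCK_SIZE))  # (0, 1)
```

Rounding outward marks some hole bytes as mapped, which costs extra copying but does not drop data. Whether that trade-off suits bmaptool is a decision for the maintainers.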

Stat for this file:

$ stat /mnt/testfile
  File: /mnt/testfile
  Size: 1920383410176   Blocks: 66548192   IO Block: 1048576 regular file
Device: 35h/53d Inode: 97          Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2026-04-19 00:02:53.189324295 +0000
Modify: 2026-04-19 00:03:51.812166696 +0000
Change: 2026-04-19 00:03:51.812166696 +0000
 Birth: -

/mnt is an NFS file system.
