Draft: Improve hdf5Reader with random sampling and type handling #50

Open
varunviswapriyan wants to merge 13 commits into idtlab:develop from varunviswapriyan:develop

@varunviswapriyan varunviswapriyan commented Nov 23, 2025

Enhanced the hdf5Reader class to randomly sample rows while reading HDF5 files. Improved handling of NumPy types and added error handling for various data processing steps. Also introduced chunked reading, flattening of multidimensional datasets, and structured type handling.
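As a rough illustration of the type handling described above, here is a minimal sketch of converting NumPy values to plain Python objects and flattening multidimensional rows. The helper names `to_python` and `flatten_rows` are hypothetical and are not taken from this PR; the actual hdf5Reader code may be organized differently.

```python
import numpy as np


def to_python(value):
    """Convert NumPy scalars, arrays, and structured records to plain Python objects.

    Hypothetical helper for illustration; not the PR's actual implementation.
    """
    # A structured record (np.void with named fields) becomes a dict.
    if isinstance(value, np.void) and value.dtype.names:
        return {name: to_python(value[name]) for name in value.dtype.names}
    if isinstance(value, np.ndarray):
        # A structured array becomes a list of dicts, one per record.
        if value.dtype.names:
            return [to_python(rec) for rec in value.reshape(-1)]
        # Plain arrays become (nested) lists; byte strings inside are left as bytes.
        return value.tolist()
    # Fixed-length HDF5 strings usually arrive as bytes.
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    # Remaining NumPy scalars (np.int32, np.float64, ...) map to Python scalars.
    if isinstance(value, np.generic):
        item = value.item()
        return item.decode("utf-8", errors="replace") if isinstance(item, bytes) else item
    return value


def flatten_rows(array):
    """Reshape an N-D array to 2-D, keeping the first axis as the row axis."""
    arr = np.asarray(array)
    if arr.ndim <= 1:
        return arr
    n_cols = int(np.prod(arr.shape[1:]))
    return arr.reshape(arr.shape[0], n_cols)


record = np.array([(1, 2.5, b"ab")], dtype=[("id", "i4"), ("x", "f8"), ("tag", "S2")])
print(to_python(record[0]))
print(flatten_rows(np.zeros((4, 2, 3))).shape)
```

Keeping the first axis as the row axis means a sampled row index still refers to the same record after flattening.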

Related Issues / Pull Requests

None.

Description

The HDF5 reader now randomly samples 2000 rows and reads data in chunks, which better supports massive datasets. The code also expands structured NumPy types into Python dictionaries, properly flattens multidimensional datasets, and handles 1D arrays more efficiently.
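The sampling and chunked reading described above could look roughly like the following sketch. The function name `sample_rows`, the `chunk_size` default, and the `seed` parameter are assumptions made for illustration; only the 2000-row sample size comes from this description. The function accepts any object with a `shape` attribute and slice indexing, which includes NumPy arrays and `h5py.Dataset` objects.

```python
import numpy as np


def sample_rows(dataset, n_samples=2000, chunk_size=10_000, seed=None):
    """Return up to n_samples randomly chosen rows, reading one chunk at a time.

    Hypothetical helper for illustration; not the PR's actual implementation.
    """
    n_rows = dataset.shape[0]
    k = min(n_samples, n_rows)
    if k == 0:
        return dataset[0:0]

    rng = np.random.default_rng(seed)
    # Draw distinct row indices and sort them so each chunk is visited once, in order.
    picked = np.sort(rng.choice(n_rows, size=k, replace=False))

    parts = []
    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        # Binary search for the sampled indices that fall inside [start, stop).
        lo, hi = np.searchsorted(picked, [start, stop])
        if lo == hi:
            continue  # no sampled rows in this chunk, so skip reading it
        chunk = dataset[start:stop]  # only this slice is held in memory
        parts.append(chunk[picked[lo:hi] - start])
    return np.concatenate(parts)


data = np.arange(50_000 * 3).reshape(50_000, 3)
sample = sample_rows(data, seed=0)
print(sample.shape)
```

Reading a contiguous slice per chunk and selecting rows in memory keeps peak memory bounded by `chunk_size` rows, at the cost of reading some rows that are not kept. The sampled rows come back in file order rather than in random order.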

What changes are proposed in this pull request?

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code modifies existing public API, or introduces new public API, and I updated or wrote documentation
  • I have commented my code
  • My code requires documentation updates, and I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@jeanbez jeanbez changed the title Improve hdf5Reader with random sampling and type handling Draft: Improve hdf5Reader with random sampling and type handling Nov 25, 2025