Update NanoVDB with the latest changes #2031
Merged
swahtz merged 43 commits on Apr 23, 2025
kmuseth approved these changes on Apr 21, 2025
kmuseth (Contributor) left a comment:
Looks good to me, though this is so huge I can't say I carefully studied every single change.
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* improved change notes and updated version number
* improved documentation
Signed-off-by: Ken Museth <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* improved nanovdb_convert
* fixed typo
* added unit tests
* review feedback
Signed-off-by: Ken Museth <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* Minor fix to unit-test of NanoVDB
* minor change
Signed-off-by: Ken Museth <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* Add device mesh
* Restore current device in constructor and destructor
* Enable range-based for loops
* Use structured bindings inside for loop
* Remove unneeded accessors and modularize functions
* Statically initialize entry point
* Encapsulate
* Switch to lightweight wrapper class for DeviceNodes vector
* Cleanup
* Minor changes needed for DistributedPointsToGrid
* Add RAII DeviceGuard
* More cleanup
* Clean up NCCL dependency
* Remove parallel_for to modularize
* Implement and test move constructor/assignment
* Add docs
* Expand test and documentation
* Move non-reinit comment to function description
* Switch to using aliases and modern type names
* Use size_type instead of int for return
* Separate out DeviceGuard and clean up API
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Co-authored-by: Mark Harris <783069+harrism@users.noreply.github.com>
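The "Restore current device" and "Add RAII DeviceGuard" commits above describe a save-and-restore pattern. Below is a minimal sketch of that pattern, not the class added in this PR: the `fake` namespace is a hypothetical stand-in for `cudaGetDevice`/`cudaSetDevice` so the example builds without CUDA, and `visitDevices` is an illustrative helper.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for cudaGetDevice/cudaSetDevice (not part of NanoVDB),
// so that this sketch compiles without the CUDA runtime.
namespace fake {
inline int& currentDevice() { static int device = 0; return device; }
inline void getDevice(int* d) { *d = currentDevice(); }
inline void setDevice(int d) { currentDevice() = d; }
} // namespace fake

// RAII guard: records the current device on construction and
// restores it on destruction, including on early return or exception.
class DeviceGuard
{
public:
    DeviceGuard() { fake::getDevice(&mPrevious); }
    ~DeviceGuard() { fake::setDevice(mPrevious); }
    DeviceGuard(const DeviceGuard&) = delete;
    DeviceGuard& operator=(const DeviceGuard&) = delete;

private:
    int mPrevious = 0;
};

// Illustrative helper: switches to each device in turn and records which
// device was current, then leaves the caller's device untouched.
inline std::vector<int> visitDevices(int deviceCount)
{
    std::vector<int> visited;
    DeviceGuard guard; // restores the caller's device when this scope ends
    for (int id = 0; id < deviceCount; ++id) {
        fake::setDevice(id);
        visited.push_back(fake::currentDevice());
    }
    return visited;
}
```

With a real CUDA runtime the two `fake` calls would be replaced by the corresponding runtime calls plus error checking; the guard itself stays the same.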
* Convert MGPU convolution example to use DeviceMesh
* Fix typos and clarify
* Check for errors and switch to for_each
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Co-authored-by: Mark Harris <783069+harrism@users.noreply.github.com>
* Fixed issue in signedFloodFill
* partially addressed review comments
* added RootData::TileIterator
* cleanup
* fixed assert bug
* snapshot
* fixed bug and improved unit-test
* improved unit test and documentation
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Co-authored-by: Ken Museth <1495380+kmuseth@users.noreply.github.com>
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
…dBlindMetaData
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Co-authored-by: Ken Museth <1495380+kmuseth@users.noreply.github.com>
* major improvements to GridBlindMetaData
* minor changes to fix clang issue
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* minor simplification in CreateNanoGrid
* fixed typo
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* introducing new magic numbers for grids and files
* removed two redundant magic numbers
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
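For readers unfamiliar with the idea, here is a hedged sketch of how a family of magic numbers can stay compatible in both directions: every tag shares a prefix and only the last byte varies, so a reader that compares just the prefix also accepts tags introduced later, while a reader that inspects the last byte can tell grids from files. The tag strings (`DemoTag0`, `DemoTag1`, `DemoTag2`), the `MagicKind` enum, and both functions are hypothetical and chosen for illustration; they are not the values or API defined in NanoVDB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of an 8-byte tag (illustration only).
enum class MagicKind { Invalid, Legacy, Grid, File };

// Packs the first 8 characters of a tag into a uint64_t,
// with the first character in the lowest byte.
inline uint64_t packTag(const char* tag)
{
    uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value |= uint64_t(static_cast<unsigned char>(tag[i])) << (8 * i);
    }
    return value;
}

// Accepts any tag with the shared 7-character prefix, then uses the
// final character to distinguish the kind of data that follows.
inline MagicKind classify(uint64_t magic)
{
    const uint64_t prefixMask = 0x00FFFFFFFFFFFFFFULL; // ignores the last character
    if ((magic & prefixMask) != (packTag("DemoTag0") & prefixMask)) {
        return MagicKind::Invalid;
    }
    switch (static_cast<char>(magic >> 56)) {
    case '0': return MagicKind::Legacy;
    case '1': return MagicKind::Grid;
    case '2': return MagicKind::File;
    default:  return MagicKind::Invalid;
    }
}
```

A stricter or looser policy for unknown final bytes is a design choice; this sketch rejects them, whereas a prefix-only check would accept them.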
* Add distributed implementation of PointsToGrid
* Fix race condition in merge and single GPU case
* Fix kernel/cudaFree race conditions
* Add DistributedPointsToGrid unittest
* Clean up example
* Use range-based for loops
* Use structured bindings inside for loop
* Fix Windows build and some warnings
* Update for refactored DeviceMesh
* Fix copyright/include
* Add parallelForEach helper
* Templatize kernels to align with CUB better
* Shorten TemporaryDevicePool to TempDevicePool
* Refactor to use TempDevicePool
* Switch to class
* Parallelize kernel dispatch in build
* Fix race condition with respect to root node processing
* Speed up synchronization
* Add comments and fix deprecated TransformInputIterator
* Address review comments
* Address more review comments
* Add more cudaCheck calls and restrict EqualityIndicator type
* Fix license identifiers and remove unused code
* Add missing include
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Co-authored-by: = <=>
* Silence CMake warning about FindBoost module deprecation
* Fix typo
Signed-off-by: Matthew Cong <mcong@nvidia.com>
Co-authored-by: Ken Museth <1495380+kmuseth@users.noreply.github.com>
…uild failures
* Adding /bigobj to TestNanoVDB.cu compilation options to fix Windows build failure
Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* Start support for more than two GPUs in key merge
* Flip flop storage
* Generalize concurrent leftIntervals/rightIntervals usage
* Fix median search
* Fix rebalancing
* Fix recursive MGPU merge
* Update TODO
* Add guards for zero-tile GPUs
Signed-off-by: Matthew Cong <mcong@nvidia.com>
…corresponding tests)
* Fix memory leaks in PointsToGrid tests
* Fix leapfrogging across recursion levels
* Parallelize kernel dispatch for different levels
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* minor cleanup
* fixed issue in get/set random access methods
* cleanup
* deleted white space
* cleanup
* added unit tests
* improved unit tests
* improved unit tests
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* Add timer for MGPU PointsToGrid
* Avoid blocking kernel launches due to cudaFree
* Fix overlapping of copies
* Pipeline pointsPerVoxelPrefix sum
* Cleanup and optimization
* Further improve pipelining
* Remove unnecessary sync point
* Revert "Remove unnecessary sync point" (this reverts commit f01b36ad852f32ec65517b3baa08ea267d5edc2d)
* Switch to GPU sync
* Fix event destruction race condition and reduce host thread latency
* Revert timer add
* More fine-grained pipelining
* Fix event wait/destruction race condition
Signed-off-by: Matthew Cong <mcong@nvidia.com>
* minor cleanup
* minor improvements to nanovdb::tools::cuda::PointsToGrid
Signed-off-by: Ken <ken.museth@gmail.com>
Signed-off-by: Matthew Cong <mcong@nvidia.com>
This PR brings several improvements to NanoVDB.
From @kmuseth:
* Added a missing constructor for GridBlindMetaData, required for C++20
* Fixed incorrect accessor recursion for non-leaf nodes
* Fixed signedFloodFill for the (rare) case when tiles are missing in the root node
* Added IndexGrid support to nanovdb_convert
* Introduced new magic numbers for grids and files (backwards and forwards compatible)
From @matthewdcong:
* Added a DeviceMesh utility class for (multi-)GPU stream and communication management
* Added a multi-GPU implementation of PointsToGrid
* Fixed TBB template deduction in C++20
* Cleaned up const-correctness in UnifiedBuffer
* Fixed the missing relocatable device flag for CUDA unit tests