Hi there!
I stumbled across the Dynamic VRAM blog post today and got really excited: this is exactly what I've been needing for my AMD setup. The post mentions that AMD support is planned for the future, but I couldn't wait, so I fired up my slop generator (Claude Code), and about an hour later it came back with a working HIP port.
What I did
The entire port is a change to a single file, src/plat.h: an #ifdef __HIP_PLATFORM_AMD__ block at the top that maps the CUDA Driver API types and functions the code uses to their HIP equivalents. All .c files remain completely unmodified.
Key points:
- CUdeviceptr is kept as unsigned long long (matching CUDA), with cast macros for HIP APIs that expect void*. This preserves all upstream pointer arithmetic.
- cuMemAllocHost is wrapped to supply HIP's extra flags parameter.
- cuGetErrorString is wrapped to adapt HIP's direct-return API style.
- pyt-cu-plug-alloc-async.c is excluded from the build (the cudaMalloc/cudaFree symbol interposition is not portable to HIP, but the VBAR path that ComfyUI uses doesn't depend on it).
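To make the key points above more concrete, here is a minimal sketch of what such a shim can look like. It is not the actual content of src/plat.h in the fork. The cast macro names (DPTR_TO_VOID, VOID_TO_DPTR) and the shim_demo helper are made up for illustration, and the three hip* functions are local stand-ins that only mirror the real HIP signatures, so the sketch compiles without ROCm installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the HIP runtime (assumption: signatures mirror hipHostMalloc,
 * hipGetErrorString and hipMemUnmap; the bodies are placeholders only). */
typedef int hipError_t;
enum { hipSuccess = 0, hipErrorOutOfMemory = 2 };

static hipError_t hipHostMalloc(void **ptr, size_t size, unsigned int flags) {
    (void)flags;
    *ptr = malloc(size);
    return *ptr ? hipSuccess : hipErrorOutOfMemory;
}
static const char *hipGetErrorString(hipError_t err) {
    return err == hipSuccess ? "no error" : "out of memory";
}
static void *last_unmapped;              /* records the stub's last argument */
static hipError_t hipMemUnmap(void *ptr, size_t size) {
    (void)size;
    last_unmapped = ptr;
    return hipSuccess;
}

/* Hypothetical shim, in the spirit of the #ifdef block described above. */
typedef hipError_t CUresult;
typedef unsigned long long CUdeviceptr;  /* integer type, as in CUDA */
#define CUDA_SUCCESS hipSuccess
static_assert(sizeof(CUdeviceptr) >= sizeof(void *), "CUdeviceptr too small");

/* Cast helpers for HIP entry points that take void* (illustrative names). */
#define DPTR_TO_VOID(p) ((void *)(uintptr_t)(p))
#define VOID_TO_DPTR(p) ((CUdeviceptr)(uintptr_t)(p))

/* Same-shape call: forward through a macro and convert the pointer. */
#define cuMemUnmap(ptr, size) hipMemUnmap(DPTR_TO_VOID(ptr), (size))

/* CUDA's cuMemAllocHost has no flags argument, so pass 0 (HIP's default). */
static inline CUresult cuMemAllocHost(void **pp, size_t bytes) {
    return hipHostMalloc(pp, bytes, 0);
}

/* CUDA hands the string back via an out-parameter; HIP returns it directly. */
static inline CUresult cuGetErrorString(CUresult err, const char **str) {
    *str = hipGetErrorString(err);
    return CUDA_SUCCESS;
}

/* Exercises the shim the way unmodified CUDA-style code would. Returns 0 on success. */
int shim_demo(void) {
    CUdeviceptr base = 0x10000ULL;
    CUdeviceptr page = base + 2 * 4096ULL;   /* plain integer arithmetic */
    if (cuMemUnmap(page, 4096) != CUDA_SUCCESS) return 1;
    if (VOID_TO_DPTR(last_unmapped) != page) return 2;

    void *host = NULL;
    if (cuMemAllocHost(&host, 64) != CUDA_SUCCESS) return 3;
    free(host);

    const char *msg = NULL;
    if (cuGetErrorString(CUDA_SUCCESS, &msg) != CUDA_SUCCESS || !msg) return 4;
    return 0;
}
```

Calling shim_demo() from a small main exercises all three adaptations. Keeping CUdeviceptr an integer means expressions like base + offset in the upstream sources compile unchanged, and the void* conversion happens only at the HIP call boundary.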
Test results (RX 9070, gfx1201, ROCm 7.2)
All HIP VMM APIs work correctly:
- hipMemAddressReserve: zero VRAM cost, as expected
- hipMemCreate / hipMemMap / hipMemSetAccess: physical VRAM allocated and accessible
- hipMemUnmap / hipMemRelease: clean deallocation
- Full VBAR allocate → fault → unpin lifecycle works
ComfyUI startup log:
comfy-aimdo inited for GPU: AMD Radeon RX 9070 (VRAM: 16304 MB)
DynamicVRAM support detected and enabled
Workflows that previously crashed with OOM (multi-model, full 1 megapixel resolution) now complete successfully with dynamic weight eviction working as intended.
The fork
I've pushed the change to a branch on my fork: spheenik/comfy-aimdo@hip-rocm-support
Happy to open a PR if you're interested. I understand you might want to approach this differently for an official implementation (CI, wheel builds, etc.). I just wanted to share that the HIP VMM APIs are there and working on RDNA 4, in case it's useful for your roadmap.
Build instructions (for anyone who wants to try)
gcc -shared -fPIC -O2 -D__HIP_PLATFORM_AMD__ -Isrc \
-I/opt/rocm/include -L/opt/rocm/lib \
src/vrambuf.c src/model-vbar.c src/control.c src/hostbuf.c \
src/debug.c src/pyt-cu-plug-alloc.c src-posix/model-mmap.c \
-lamdhip64 -o aimdo.so
Then replace the CUDA build of aimdo.so inside the comfy_aimdo pip package directory with the one you just built, and start ComfyUI with --enable-dynamic-vram.
Thanks for the great work on AIMDO. It's a game changer!