CI Fix GPTQModel install in Docker build (#3188)

BenjaminBossan · web-flow · commit 1d8e124424be · 2026-04-24T15:13:25.000+02:00
Building the GPU Docker image fails because GPTQModel cannot be installed.
The cause is the --no-build-isolation flag, which is no longer required
according to the GPTQModel README, so the flag is dropped.

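The next issue was a CUDA version conflict when building EETQ, which I fixed
by bumping the base image from nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04 to
nvidia/cuda:13.2.1-cudnn-devel-ubuntu24.04. ninja is now installed before
EETQ, which should speed up its build.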

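Finally, Transformer Engine (TE) wouldn't build until
NVTE_BUILD_USE_NVIDIA_WHEELS=1 was set and /usr/local/cuda/include was
prepended to CPATH for its install step.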
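
As a rough sketch of how the variables for the TE step are passed: in the Dockerfile they are written in front of the command inside a single RUN instruction, so they should apply to that one install command only and not to later steps. The snippet below illustrates this shell behaviour. `sh -c` is a stand-in for the real `conda run ... pip install` call, and the temporary file exists only for the illustration.

```shell
# Sketch: variables written in front of a command are scoped to that command.
# `sh -c` stands in for the real install command; nothing is installed here.
unset CPATH
tmpfile="$(mktemp)"

# Same shape as the RUN line: VAR=value ... command
NVTE_BUILD_USE_NVIDIA_WHEELS=1 \
CPATH="/usr/local/cuda/include:${CPATH:-}" \
    sh -c 'printf %s "$CPATH"' > "$tmpfile"

# What the command saw, and what the surrounding shell sees afterwards.
seen_inside="$(cat "$tmpfile")"
seen_after="${CPATH:-<unset>}"
rm -f "$tmpfile"

echo "inside the command: ${seen_inside}"
echo "after the command:  ${seen_after}"
```

Because the assignment is not an `ENV` instruction, the rest of the image is left with whatever CPATH it had before.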

The Docker build now finishes, as can be seen in the CI.
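
The diff also puts quotes around the fbgemm-gpu-genai>=1.2.0 requirement. A minimal sketch of why that presumably matters: without quotes, the shell treats `>` as an output redirection, so the command only receives the bare package name and its output lands in a file called `=1.2.0`. In the snippet, `echo` is a stand-in for `pip install` and the scratch directory is only for illustration.

```shell
# Sketch: an unquoted "pkg>=1.2.0" argument is split by the shell.
# `echo` stands in for `pip install` so nothing is actually installed.
workdir="$(mktemp -d)"
cd "$workdir"

# Unquoted: `>` is parsed as a redirection. The command only sees
# "fbgemm-gpu-genai", and its stdout goes to a file named "=1.2.0".
echo fbgemm-gpu-genai>=1.2.0
unquoted_arg="$(cat ./=1.2.0)"

# Quoted: the whole requirement specifier reaches the command as one argument.
quoted_arg="$(echo "fbgemm-gpu-genai>=1.2.0")"

echo "unquoted argument seen by the command: ${unquoted_arg}"
echo "quoted argument seen by the command:   ${quoted_arg}"
```

With the quotes in place, the version constraint is passed through to pip instead of being consumed by the shell.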
diff --git a/docker/peft-gpu/Dockerfile b/docker/peft-gpu/Dockerfile
@@ -26,7 +26,7 @@ RUN chsh -s /bin/bash
 SHELL ["/bin/bash", "-c"]
 
 # Stage 2
-FROM nvidia/cuda:12.8.1-cudnn-devel-ubuntu22.04 AS build-image
+FROM nvidia/cuda:13.2.1-cudnn-devel-ubuntu24.04 AS build-image
 COPY --from=compile-image /opt/conda /opt/conda
 ENV PATH=/opt/conda/bin:$PATH
 
@@ -48,14 +48,16 @@ RUN conda run -n peft pip install --no-cache-dir bitsandbytes optimum
 # to have compute hardware available we use the information from the CI runner (which hosts
 # a NVIDIA L4). So we fix the compute capability to 8.9. In the future we might extend this
 # to a list of compute capabilities (separated by ;).
-RUN CUDA_ARCH_LIST=8.9 conda run -n peft pip install --no-build-isolation gptqmodel
+RUN CUDA_ARCH_LIST=8.9 conda run -n peft pip install gptqmodel
 
 RUN \
     # Add eetq for quantization testing; needs to run without build isolation since the setup
     # script directly imports torch from the environment which would fail with isolation.
-    conda run -n peft pip install --no-build-isolation git+https://github.com/NetEase-FuXi/EETQ.git
+    # Ninja should speed up build time.
+    conda run -n peft pip install ninja && conda run -n peft pip install --no-build-isolation git+https://github.com/NetEase-FuXi/EETQ.git
 
-RUN \
+RUN NVTE_BUILD_USE_NVIDIA_WHEELS=1 \
+    CPATH="/usr/local/cuda/include:${CPATH}" \
     conda run -n peft pip install --no-build-isolation "transformer_engine[pytorch]"
 
 # Activate the conda env and install transformers + accelerate from source
@@ -64,7 +66,7 @@ RUN conda run -n peft pip install -U --no-cache-dir \
         "soundfile>=0.12.1" \
         scipy \
         torchao \
-        fbgemm-gpu-genai>=1.2.0 \
+        "fbgemm-gpu-genai>=1.2.0" \
         git+https://github.com/huggingface/transformers \
         git+https://github.com/huggingface/accelerate \
         peft[test]@git+https://github.com/huggingface/peft \