Fix/vulkan adreno crashes #3719

Open
simaotwx wants to merge 3 commits into ggml-org:master from toowoxx:fix/vulkan-adreno-crashes

whisper.cpp crashes on certain Adreno GPUs (such as this Adreno 642L), so it should fall back to the CPU to avoid the shader linking issue, among other things.
These patches only aim to fix the crashes, not the underlying driver issues. Running on the CPU is pretty slow, especially on older ARM CPUs from the era when Snapdragon mobile platform model numbers still had three digits.

I'm not very good at C++, nor do I have much insight into whisper.cpp, so any improvements are welcome.

Leaving these here so others can test these patches:
Fixes #2411
Fixes #3035
Fixes #2765
Related: #2415
Related: #3168
Related: ggml-org/llama.cpp#12421
Related: ggml-org/llama.cpp#6395
Related: ggml-org/llama.cpp#13450

Some Vulkan drivers (observed on Adreno, Qualcomm build 923a446bf8,
driver date 09/05/24) report bufferDeviceAddress support in
VkPhysicalDeviceVulkan12Features but crash with SIGSEGV when
vkGetBufferDeviceAddress is actually called:

    Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
    #1 ggml_vk_create_buffer+4084
    #2 ggml_vk_create_buffer_device+148
    #3 ggml_backend_vk_buffer_type_alloc_buffer+240
    #7 whisper_model_load+5996

The crash occurs inside ggml_vk_create_buffer when
device->device.getBufferAddress() is called: the driver dereferences
a null internal function pointer.

After creating the logical device, verify that the function pointer
resolves via vkGetDeviceProcAddr and that a test call returns a
non-zero address. If either check fails, disable
buffer_device_address so all guarded code paths skip BDA.
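
The two-step check described above can be sketched without any Vulkan dependency. This is only an illustration of the control flow, not the patch itself: `device_caps`, `get_address_fn`, `probe_bda` and `validate_bda` are hypothetical stand-ins for the real device struct, the pointer returned by vkGetDeviceProcAddr, and the test call on a scratch buffer.

```cpp
#include <cstdint>

// Hypothetical stand-ins; the real code works on Vulkan handles and
// the function pointer resolved through vkGetDeviceProcAddr.
using device_address = std::uint64_t;
using get_address_fn = device_address (*)(const void * buffer);

struct device_caps {
    bool buffer_device_address; // what the driver's feature query reported
};

// True only if the entry point resolved and a trial call on a scratch
// buffer returned a non-zero address.
bool probe_bda(get_address_fn fn, const void * scratch_buffer) {
    if (fn == nullptr) {
        return false; // entry point did not resolve
    }
    return fn(scratch_buffer) != 0;
}

// Clears the capability flag when the probe fails, so that every code
// path guarded by the flag skips BDA.
void validate_bda(device_caps & caps, get_address_fn fn, const void * scratch_buffer) {
    if (caps.buffer_device_address && !probe_bda(fn, scratch_buffer)) {
        caps.buffer_device_address = false;
    }
}
```

Keeping the result in a single flag means callers do not need to know why BDA is unavailable; they only test the flag they already check.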

Some Vulkan drivers (observed on Adreno, Qualcomm build 923a446bf8)
fail to compile compute shaders at runtime, reporting
"Failed to link shaders" and returning ErrorUnknown from
createComputePipeline. Previously this threw a C++ exception that
propagated as an uncaught abort, or the resulting null pipeline was
dispatched causing SIGSEGV:

    AdrenoVK-0: Failed to link shaders.
    Fatal signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0xe8
    #1 ggml_vk_dispatch_pipeline<vk_mat_mat_push_constants>+360
    #2 ggml_vk_mul_mat_q_f16+6616
    #3 ggml_backend_vk_graph_compute+41780

Three changes:

1. ggml_vk_create_pipeline_func: catch the exception, increment
   device->pipeline_failures, clean up the shader module, and return
   instead of rethrowing. Also handle null pipeline after creation.

2. ggml_vk_dispatch_pipeline: early-return if the pipeline is null or
   not compiled (safety net against dispatching broken pipelines).

3. ggml_backend_vk_device_supports_op: return false for all ops when
   pipeline_failures > 0, causing the backend scheduler to route
   everything to the CPU backend. The GPU is still used for buffer
   allocation but all compute runs on CPU.
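
A minimal sketch of how the three changes fit together, using simplified stand-in types rather than the real ggml-vulkan structures. `device_state`, `pipeline`, `create_pipeline`, `dispatch` and `supports_op` are hypothetical names, and the `compile` callback stands in for the createComputePipeline call that may throw.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>

struct device_state {
    int pipeline_failures = 0; // counts pipelines that failed to build
};

struct pipeline {
    bool compiled = false;
};

// Change 1: catch the failure, record it, and return instead of rethrowing.
void create_pipeline(device_state & dev, pipeline & p,
                     const std::function<void()> & compile) {
    try {
        compile();
        p.compiled = true;
    } catch (const std::exception &) {
        dev.pipeline_failures++;
        p.compiled = false; // the real code would also free the shader module
    }
}

// Change 2: never dispatch a null or uncompiled pipeline.
// Returns whether the dispatch was actually recorded.
bool dispatch(const pipeline * p) {
    if (p == nullptr || !p->compiled) {
        return false;
    }
    // ... record the dispatch here ...
    return true;
}

// Change 3: once any pipeline has failed, report every op as unsupported
// so the scheduler assigns it to another backend.
bool supports_op(const device_state & dev) {
    if (dev.pipeline_failures > 0) {
        return false;
    }
    return true; // the real function continues with per-op checks
}
```

The counter makes the fallback all-or-nothing: a single failed pipeline disables GPU compute entirely, which is coarse but avoids having to track which ops depend on which pipeline.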

Previously, missing storageBuffer16BitAccess threw
std::runtime_error("Unsupported device") which crashed the process on
platforms where C++ exceptions propagate as uncaught aborts (Android).

Some drivers also report the feature bit but don't enumerate
VK_KHR_16bit_storage as a device extension — pushing it into
device_extensions then causes vkCreateDevice to fail with
ErrorExtensionNotPresent (another fatal abort on Android).

Instead of throwing, set pipeline_failures so supports_op returns
false for all ops and the backend scheduler routes everything to CPU.
Only push VK_KHR_16bit_storage when the extension is actually
enumerated.
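
The extension handling above might look roughly like the following. It is a simplified sketch under stated assumptions: `device_setup`, `has_extension` and `configure_16bit_storage` are hypothetical names, `available` stands in for the list the driver enumerates, and marking the device unusable is modeled as incrementing `pipeline_failures`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct device_setup {
    std::vector<std::string> enabled_extensions; // passed on to device creation
    int pipeline_failures = 0;                   // > 0 routes all ops to the CPU
};

// True if the driver actually enumerated the named device extension.
bool has_extension(const std::vector<std::string> & available,
                   const std::string & name) {
    return std::find(available.begin(), available.end(), name) != available.end();
}

void configure_16bit_storage(device_setup & setup, bool feature_reported,
                             const std::vector<std::string> & available) {
    if (!feature_reported) {
        // Mark the device as unusable for compute instead of throwing.
        setup.pipeline_failures++;
        return;
    }
    // Request the extension only when it is really enumerated; asking for a
    // missing extension would make device creation fail.
    if (has_extension(available, "VK_KHR_16bit_storage")) {
        setup.enabled_extensions.push_back("VK_KHR_16bit_storage");
    }
}
```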