ggml-rpc: fix 32-bit ARM (ILP32) serialization bugs by rovmo · Pull Request #21828 · ggml-org/llama.cpp

rovmo · 2026-04-12T23:25:31Z

Fix RPC backend on 32-bit ARM (ILP32) platforms

Problem

The RPC backend is completely non-functional on 32-bit ARM (armv7, ILP32) platforms. Any attempt to use --rpc fails in one of four ways: a silent hang, server-side "Expected HELLO command" errors, set_tensor out-of-bounds crashes, or graph_compute failures with corrupted tensor IDs.

Root cause: several places in the RPC serialization code assume size_t and pointers are 8 bytes. On 32-bit platforms both are 4 bytes, so the sender writes undersized or misplaced fields that the receiving side (which reads fixed uint64_t fields) cannot parse.

Fixes

1. send_msg / send_rpc_cmd: message size field is 4 bytes instead of 8

send_msg() sends sizeof(msg_size) bytes for the size header, where msg_size is size_t (4 bytes on ILP32). The receiver recv_msg() reads uint64_t (8 bytes). The receiver blocks forever waiting for the remaining 4 bytes, causing a silent hang on the very first RPC command (HELLO).
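The idea behind this fix can be sketched as follows. This is a minimal illustration, not the PR's actual diff: frame_message is a hypothetical helper that builds the framed bytes in memory instead of writing to a socket, and it assumes the header is sent in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: builds the framed message [8-byte size][payload].
// The header is always a uint64_t, so its width on the wire does not
// depend on sizeof(size_t) of the sending platform.
static std::vector<uint8_t> frame_message(const void * msg, size_t msg_size) {
    // Widen before serializing; on ILP32 this turns a 4-byte value into 8 bytes.
    const uint64_t wire_size = static_cast<uint64_t>(msg_size);
    static_assert(sizeof(wire_size) == 8, "size header must be 8 bytes on the wire");

    std::vector<uint8_t> out(sizeof(wire_size) + msg_size);
    std::memcpy(out.data(), &wire_size, sizeof(wire_size));
    if (msg_size > 0) {
        assert(msg != nullptr);
        std::memcpy(out.data() + sizeof(wire_size), msg, msg_size);
    }
    return out;
}
```

With a header of this shape, a receiver that reads a uint64_t always finds all 8 bytes, regardless of whether the sender is a 32-bit or 64-bit build.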

2. ggml_backend_rpc_buffer_set_tensor: offset field is 4 bytes instead of 8

The offset parameter is size_t but the wire format specifies 8 bytes. The memcpy copies only sizeof(offset) = 4 bytes, and the subsequent data copy starts at the wrong position. The server reads a garbage 64-bit offset and rejects with "out of buffer bounds".
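A sketch of the corrected packing, under stated assumptions: pack_set_tensor is a hypothetical name, the leading header bytes stand in for whatever precedes the offset in the real message, and the layout shown (header, 8-byte offset, data) is only what the description above implies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical packer for a payload laid out as [header][offset: 8 bytes][data].
static std::vector<uint8_t> pack_set_tensor(const std::vector<uint8_t> & header,
                                            size_t offset,
                                            const void * data,
                                            size_t size) {
    assert(data != nullptr || size == 0);

    // Serialize the offset as a fixed 64-bit value, not as a native size_t.
    const uint64_t wire_offset = static_cast<uint64_t>(offset);

    std::vector<uint8_t> out(header.size() + sizeof(wire_offset) + size);
    uint8_t * p = out.data();

    if (!header.empty()) {
        std::memcpy(p, header.data(), header.size());
    }
    p += header.size();

    std::memcpy(p, &wire_offset, sizeof(wire_offset));
    p += sizeof(wire_offset); // advance by 8 bytes, never by sizeof(size_t)

    if (size > 0) {
        std::memcpy(p, data, size);
    }
    return out;
}
```

Both halves of the bug are addressed by deriving the copy length and the data position from the same 64-bit variable, so the two cannot drift apart.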

3. serialize_graph: node pointer memcpy reads 8 bytes from a 4-byte pointer

memcpy(dest, &cgraph->nodes[i], sizeof(uint64_t)) reads 8 bytes starting at the address of a 4-byte pointer. The extra 4 bytes are adjacent memory, producing corrupted tensor IDs that fail with "failed to create graph node".
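The widening step might look like the sketch below. tensor_stub and write_node_id are hypothetical stand-ins for illustration, not names from ggml-rpc.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct tensor_stub { int dummy; }; // hypothetical stand-in for a graph node

// Writes the node's address as an 8-byte ID. The pointer is first converted
// to a 64-bit integer, so memcpy reads from an object that really is 8 bytes
// wide instead of reading past the end of a 4-byte pointer.
static void write_node_id(uint8_t * dest, const tensor_stub * node) {
    assert(dest != nullptr);
    const uint64_t id = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(node));
    std::memcpy(dest, &id, sizeof(id));
}
```

On a 64-bit build the conversion is a no-op; on ILP32 it zero-extends the address, so the upper 4 bytes of the ID are deterministic rather than whatever happens to sit next to the pointer in memory.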

4. serialize_tensor: zero-initialize rpc_tensor struct

Move memset before the null check so all serializations start zeroed, preventing uninitialized upper bytes in uint64_t fields assigned from narrower types.
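The ordering described here can be illustrated with a small sketch. wire_tensor, source_tensor, and fill_wire_tensor are hypothetical, and the field set is invented for the example; only the "zero first, then check for null, then populate" order is taken from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct source_tensor { int type; void * data; };                  // hypothetical input
struct wire_tensor   { uint64_t id; uint32_t type; uint64_t data; }; // hypothetical wire struct

// Fills *out from t. The memset comes first so that every path, including
// the early return for a null tensor, leaves all bytes (padding included)
// in a defined state.
static void fill_wire_tensor(wire_tensor * out, const source_tensor * t) {
    assert(out != nullptr);
    std::memset(out, 0, sizeof(*out));
    if (t == nullptr) {
        return; // null tensor serializes as all zeros
    }
    out->id   = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(t));
    out->type = static_cast<uint32_t>(t->type);
    out->data = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(t->data));
}
```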

Platforms affected

All 32-bit platforms where sizeof(size_t) == 4:

- ARMv7 / armhf (Android Termux, Raspberry Pi 32-bit, etc.)
- Potentially i686 / x86 32-bit builds

Testing

Tested on Android ARMv7 (Termux, clang 21) with Qwen2.5-0.5B-Instruct Q4_K_M:

- Before: llama-cli --rpc hangs indefinitely with zero output
- After: model loads and runs inference successfully over local RPC

The RPC backend assumes size_t and pointers are 8 bytes in its wire protocol serialization. On ILP32 platforms (ARMv7, x86-32) where size_t is 4 bytes, this causes silent hangs, data corruption, and server crashes, making RPC completely non-functional.

Four fixes:

- send_msg/send_rpc_cmd: use uint64_t for the size header instead of size_t, matching the uint64_t the receiver reads via recv_msg
- set_tensor: serialize offset as uint64_t (8 bytes) instead of size_t (4 bytes), and fix the subsequent data placement offset
- serialize_graph: widen node pointers to uint64_t before memcpy instead of reading sizeof(uint64_t) bytes from a 4-byte pointer
- serialize_tensor: zero-initialize rpc_tensor before populating fields to avoid uninitialized upper bytes in uint64_t fields assigned from narrower types

Tested on Android ARMv7 (Termux, clang 21) with Qwen2.5-0.5B over RPC: previously hung on HELLO handshake, now loads and runs inference.

ggml-gh-bot · 2026-04-12T23:29:38Z

Hi @rovmo, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

AI-generated content: This project does not accept PRs, descriptions or commit messages that are fully or predominantly AI-generated. If you have used AI to assist you in writing code, please make sure to disclose that explicitly.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

rovmo requested a review from a team as a code owner April 12, 2026 23:25

github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) on Apr 12, 2026

rovmo wants to merge 1 commit into ggml-org:master from rovmo:fix/rpc-32bit-arm-ilp32
