
ggml-rpc: fix 32-bit ARM (ILP32) serialization bugs#21828

Open
rovmo wants to merge 1 commit into ggml-org:master from rovmo:fix/rpc-32bit-arm-ilp32
Conversation


@rovmo rovmo commented Apr 12, 2026

Fix RPC backend on 32-bit ARM (ILP32) platforms

Problem

The RPC backend is completely non-functional on 32-bit ARM (armv7, ILP32) platforms. Any attempt to use --rpc results in a silent hang, a server-side "Expected HELLO command" error, a set_tensor out-of-bounds crash, or a graph_compute failure with corrupted tensor IDs.

Root cause: several places in the RPC serialization code assume size_t and pointers are 8 bytes. On 32-bit platforms size_t is 4 bytes and pointers are 4 bytes, causing the wire protocol to produce undersized or misaligned data that the receiving side (which reads fixed uint64_t fields) cannot parse.
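A minimal illustration of the underlying portability issue (not code from the PR): `size_t` width depends on the ABI, while fixed-width types do not, so a cross-platform wire protocol must type every header field as `uint64_t` rather than `size_t` or a pointer.

```cpp
#include <cstdint>
#include <cstddef>

// size_t changes width across ABIs; uint64_t does not. Any wire field
// typed size_t therefore changes size between LP64 and ILP32 builds.
constexpr size_t wire_field_width  = sizeof(uint64_t); // 8 on every platform
constexpr size_t native_size_width = sizeof(size_t);   // 8 on LP64, 4 on ILP32
```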

Fixes

1. send_msg / send_rpc_cmd: message size field is 4 bytes instead of 8

send_msg() sends sizeof(msg_size) bytes for the size header, where msg_size is size_t (4 bytes on ILP32). The receiver recv_msg() reads uint64_t (8 bytes). The receiver blocks forever waiting for the remaining 4 bytes, causing a silent hang on the very first RPC command (HELLO).
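A sketch of the fix pattern (helper names `write_bytes` and `send_msg_fixed` are illustrative, not the actual ggml-rpc API): widen the size to `uint64_t` before serializing, so the header is 8 bytes on every platform and matches what `recv_msg()` reads.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>
#include <cassert>

// Append raw bytes to a simulated wire buffer.
static void write_bytes(std::vector<uint8_t> & wire, const void * p, size_t n) {
    const uint8_t * b = static_cast<const uint8_t *>(p);
    wire.insert(wire.end(), b, b + n);
}

static void send_msg_fixed(std::vector<uint8_t> & wire, const void * msg, size_t msg_size) {
    // Bug pattern: write_bytes(wire, &msg_size, sizeof(msg_size));
    //   -> only 4 header bytes on ILP32, but the receiver reads 8 and blocks.
    // Fix: widen to a fixed-width uint64_t so the header is 8 bytes everywhere.
    uint64_t size64 = (uint64_t) msg_size;
    write_bytes(wire, &size64, sizeof(size64));
    write_bytes(wire, msg, msg_size);
}
```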

2. ggml_backend_rpc_buffer_set_tensor: offset field is 4 bytes instead of 8

The offset parameter is size_t but the wire format specifies 8 bytes. The memcpy copies only sizeof(offset) = 4 bytes, and the subsequent data copy starts at the wrong position. The server reads a garbage 64-bit offset and rejects with "out of buffer bounds".
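A simplified sketch of the fix, assuming a `[remote_ptr:8][offset:8][data]` request layout (the exact ggml-rpc layout may differ): widen the offset to `uint64_t` before the memcpy and compute the data position from the fixed field widths, not from `sizeof(size_t)`.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>
#include <cassert>

// Illustrative request packing: [remote_ptr:8][offset:8][data:n].
static std::vector<uint8_t> pack_set_tensor(uint64_t remote_ptr, size_t offset,
                                            const void * data, size_t n) {
    std::vector<uint8_t> req(sizeof(uint64_t) + sizeof(uint64_t) + n);
    // Bug pattern: memcpy(..., &offset, sizeof(offset)) writes only 4 bytes
    // on ILP32, and the data copy then starts 4 bytes too early.
    uint64_t offset64 = (uint64_t) offset;  // widen first
    std::memcpy(req.data(), &remote_ptr, sizeof(remote_ptr));
    std::memcpy(req.data() + sizeof(remote_ptr), &offset64, sizeof(offset64));
    // Data starts after BOTH fixed 8-byte fields on every platform.
    std::memcpy(req.data() + sizeof(remote_ptr) + sizeof(offset64), data, n);
    return req;
}
```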

3. serialize_graph: node pointer memcpy reads 8 bytes from a 4-byte pointer

memcpy(dest, &cgraph->nodes[i], sizeof(uint64_t)) reads 8 bytes starting at the address of a 4-byte pointer. The extra 4 bytes are adjacent memory, producing corrupted tensor IDs that fail with "failed to create graph node".
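A sketch of the fix pattern (the helper name `pack_node_ids` is illustrative): convert each pointer through `uintptr_t` to a local `uint64_t` first, then memcpy the widened value, so no read ever extends past the 4-byte pointer on ILP32.

```cpp
#include <cstdint>
#include <cstring>
#include <cassert>

// Serialize an array of node pointers as fixed 8-byte tensor IDs.
static void pack_node_ids(uint8_t * dst, void * const * nodes, int n_nodes) {
    for (int i = 0; i < n_nodes; i++) {
        // Bug pattern: memcpy(dst, &nodes[i], sizeof(uint64_t)) over-reads a
        // 4-byte pointer on ILP32, pulling in adjacent garbage bytes.
        uint64_t id = (uint64_t)(uintptr_t) nodes[i]; // safe on 32- and 64-bit
        std::memcpy(dst + (size_t) i * sizeof(uint64_t), &id, sizeof(id));
    }
}
```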

4. serialize_tensor: zero-initialize rpc_tensor struct

Move memset before the null check so all serializations start zeroed, preventing uninitialized upper bytes in uint64_t fields assigned from narrower types.
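A reduced sketch of the ordering fix, using simplified stand-in structs (the real `rpc_tensor` has more fields): zeroing before the null check guarantees every return path yields a fully-zeroed struct, so assignments from narrower types leave no stale upper bytes.

```cpp
#include <cstdint>
#include <cstring>
#include <cassert>

// Simplified stand-ins for the real structs; field set is illustrative.
struct tensor_src { void *   data; uint32_t ne0; };
struct rpc_tensor { uint64_t data; uint64_t ne0; uint64_t padding; };

static rpc_tensor serialize_tensor_sketch(const tensor_src * t) {
    rpc_tensor out;
    std::memset(&out, 0, sizeof(out)); // zero FIRST: a null tensor (or any
                                       // early return) is still fully zeroed
    if (t == nullptr) {
        return out;
    }
    out.data = (uint64_t)(uintptr_t) t->data; // narrow-to-wide assignments:
    out.ne0  = t->ne0;                        // upper bytes already zeroed
    return out;
}
```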

Platforms affected

All 32-bit platforms where sizeof(size_t) == 4:

  • ARMv7 / armhf (Android Termux, Raspberry Pi 32-bit, etc.)
  • Potentially i686 / x86 32-bit builds

Testing

Tested on Android ARMv7 (Termux, clang 21) with Qwen2.5-0.5B-Instruct Q4_K_M:

  • Before: llama-cli --rpc hangs indefinitely with zero output
  • After: model loads and runs inference successfully over local RPC

The RPC backend assumes size_t and pointers are 8 bytes in its wire
protocol serialization. On ILP32 platforms (ARMv7, x86-32) where
size_t is 4 bytes, this causes silent hangs, data corruption, and
server crashes — making RPC completely non-functional.

Four fixes:

- send_msg/send_rpc_cmd: use uint64_t for the size header instead of
  size_t, matching the uint64_t the receiver reads via recv_msg

- set_tensor: serialize offset as uint64_t (8 bytes) instead of
  size_t (4 bytes), and fix the subsequent data placement offset

- serialize_graph: widen node pointers to uint64_t before memcpy
  instead of reading sizeof(uint64_t) bytes from a 4-byte pointer

- serialize_tensor: zero-initialize rpc_tensor before populating
  fields to avoid uninitialized upper bytes in uint64_t fields
  assigned from narrower types

Tested on Android ARMv7 (Termux, clang 21) with Qwen2.5-0.5B over
RPC: previously hung on HELLO handshake, now loads and runs inference.
@rovmo rovmo requested a review from a team as a code owner April 12, 2026 23:25
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Apr 12, 2026
@ggml-gh-bot

ggml-gh-bot bot commented Apr 12, 2026

Hi @rovmo, thanks for your contribution!

Per our contribution guidelines, the automated PR checker found the following issue(s) that need your attention:

  • AI-generated content: This project does not accept PRs, descriptions or commit messages that are fully or predominantly AI-generated. If you have used AI to assist you in writing code, please make sure to disclose that explicitly.

Please note that maintainers reserve the right to make final decisions on PRs. If you believe there is a mistake, please comment below.

