Problem
The label_response function in the Rust WASM guard panics when a tool response contains multi-byte UTF-8 characters (CJK, emoji, accented characters, etc.) and the byte index 500 falls in the middle of a multi-byte code point.
The root cause is two instances of string slicing at a fixed byte index in lib.rs, neither of which verifies that the index is a character boundary:
// lib.rs line 808 (path-specific output preview)
let output_preview = if output_json.len() > 500 {
    &output_json[..500] // panics if byte 500 is mid-character
} else {
    &output_json
};
// lib.rs line 939 (general output preview)
let output_preview = if output_json.len() > 500 {
    &output_json[..500] // same panic
} else {
    &output_json
};
In Rust, slicing a &str at a byte position that is not a character boundary panics with the message "byte index N is not a char boundary". Since the slicing happens inside the WASM guest, the panic causes a WASM trap that permanently poisons the guard instance for the rest of the session.
Impact
Severity: High — this is a session-killing bug.
- Immediate: The label_response call panics and returns an error to the gateway
- Cascading: The WASM instance is trapped and cannot recover. Every subsequent MCP tool call routed through that guard fails with:
  "WASM guard 'github' is unavailable after a previous trap"
- Scope: Affects any workflow that processes content containing non-ASCII characters within the first ~500 bytes of a tool response JSON. This includes:
  - PRs/issues with CJK (Chinese, Japanese, Korean) text in title or body
  - Content with emoji in the first 500 bytes of serialized JSON
  - Any Unicode content where multi-byte sequences cross the 500-byte boundary
Evidence
Discovered in the moeru-ai/airi repository during a PR triage workflow (run #24311673575). PR #1649 has a Chinese-language body — pull_request_read with method: "get" returns the PR body in the tool result, and the serialized JSON has CJK characters within the first 500 bytes.
Log analysis from mcp-gateway.log:
- Last successful log: "generated 1 labeled items" (lib.rs ~L930)
- The output_preview=... log line is absent, confirming the crash occurs between JSON serialization and the preview log statement
- Next log lines are dealloc calls (Go defer cleanup after WASM trap)
- All subsequent MCP calls fail: "WASM guard 'github' is unavailable after a previous trap"
Additional note
There is a third byte-slicing pattern at line 752 that slices &[u8] (not &str) and uses from_utf8() which returns Result — this one does not panic, but it silently drops the preview log if the truncation splits a character. It is not a crash bug but could be improved for consistency.
let preview_len = std::cmp::min(500, input_bytes.len());
if let Ok(preview) = std::str::from_utf8(&input_bytes[..preview_len]) {
    // safe: from_utf8 returns Err instead of panicking
}
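One possible way to keep the preview at this third site instead of dropping it is to fall back to the longest valid prefix reported by Utf8Error::valid_up_to(). This is only a sketch: the helper name utf8_preview and its signature are hypothetical and not taken from lib.rs.

```rust
/// Hypothetical helper (not from lib.rs): returns the longest valid UTF-8
/// prefix of `bytes` that is at most `max` bytes long.
fn utf8_preview(bytes: &[u8], max: usize) -> &str {
    let end = bytes.len().min(max);
    match std::str::from_utf8(&bytes[..end]) {
        Ok(s) => s,
        // valid_up_to() is the length of the longest valid prefix, so the
        // second decode is expected to succeed; fall back to "" regardless.
        Err(e) => std::str::from_utf8(&bytes[..e.valid_up_to()]).unwrap_or(""),
    }
}
```

With this shape, a cut that lands inside a character shortens the preview by a few bytes rather than suppressing the log line. Note that it also stops at the first invalid byte if the input is not valid UTF-8 to begin with.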
Fix
Replace byte-index slicing with str::floor_char_boundary(500), which returns the closest valid character boundary at or before the given byte index. Note that this method was stabilized in Rust 1.91, so the guard must be built with that toolchain or newer:
let preview_end = output_json.floor_char_boundary(500);
let output_preview = &output_json[..preview_end];
See PR #3690 for the fix.
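If the guard has to build on a toolchain where floor_char_boundary is not available, the same truncation can be sketched with the long-stable str::is_char_boundary. The helper name truncate_at_char_boundary is hypothetical and is not claimed to match what the PR does.

```rust
/// Hypothetical helper (not from lib.rs): truncates `s` to at most `max`
/// bytes without splitting a multi-byte code point.
fn truncate_at_char_boundary(s: &str, max: usize) -> &str {
    if s.len() <= max {
        return s;
    }
    let mut end = max;
    // Walk backwards until we hit a boundary. Index 0 is always a boundary,
    // so the loop terminates; a UTF-8 code point is at most 4 bytes, so it
    // steps back at most 3 times.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

A call such as truncate_at_char_boundary(&output_json, 500) would then stand in for the &output_json[..500] expressions at both call sites.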
Reproduction
Any workflow that calls a GitHub MCP tool returning content with multi-byte UTF-8 characters positioned such that byte index 500 falls within a multi-byte sequence. Example: pull_request_read on a PR with a CJK body, or search_code returning source files with Unicode comments.
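The condition can also be checked in isolation, without a live workflow. The sketch below builds a synthetic payload (an assumption for illustration, not the actual tool response) in which a 3-byte CJK character occupies bytes 499..502, and uses str::get, the non-panicking counterpart of index slicing, to observe the failure.

```rust
/// Builds a hypothetical payload in which byte index 500 falls inside a
/// 3-byte CJK character (499 ASCII bytes followed by CJK text).
fn payload_splitting_at_500() -> String {
    let mut s = "a".repeat(499);
    s.push_str("日本語");
    s
}

/// Non-panicking equivalent of `&s[..500]`: returns None when the string is
/// shorter than 500 bytes or when byte 500 is not a character boundary.
fn checked_preview(s: &str) -> Option<&str> {
    s.get(..500)
}
```

On this payload checked_preview returns None, while the indexing form &s[..500] used in lib.rs panics on the same input.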