Commit ec4b7c2

Holberg7 and claude committed
bridge: add ASCII arrow -> as SAL frame operator
Prior to this change, the SAL frame-boundary operator was Unicode-only. `H:HR>130->U:ALERT` was parsed as a single frame rather than split into [H:HR>130, U:ALERT] with an implied THEN operator. This affected validate_composition, MacroRegistry validation, and SALDecoder NL annotation.

This change adds `->` as a first-class alternate for `→` across the five coordinated locations in sdk/python/osmp/protocol.py and the two equivalent locations in the legacy sdk/python/src/osmp.py. Unicode behavior is unchanged.

Tests: tests/test_bridge_fix.py (T1–T4, 4 passed).

Field verification: RTP-012-B matrix 2026-04-24 on Gemma-4-E4B Q4_K_M. Patched-substrate cells used `->` across 135 opportunities with zero parser defects. Ctrl-substrate cells produced 0/9 acquisition across all priming conditions, providing supporting evidence that the pre-fix parser silently gated valid SAL chains.

Versions: osmp 2.3.2 -> 2.3.3 (Python PyPI), osmp-protocol 2.3.2 -> 2.3.3 (npm).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 477e220 commit ec4b7c2
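
The parsing difference described in the commit message can be reproduced with the two regular expressions from the diff alone. The sketch below is a stand-in, not the SDK code: `split_frames`, `OLD_SPLIT_RE`, `NEW_SPLIT_RE`, and `OPERATORS` are hypothetical names chosen for illustration, and only the two patterns are taken from the commit.

```python
import re

# Patterns copied from the diff: pre-fix (Unicode-only) and post-fix.
OLD_SPLIT_RE = re.compile(r'([→∧∨↔∥;])')
NEW_SPLIT_RE = re.compile(r'(->|[→∧∨↔∥;])')

# Operator tokens to drop after splitting (hypothetical helper constant).
OPERATORS = ("→", "∧", "∨", "↔", "∥", ";", "->")


def split_frames(pattern, sal):
    """Split a SAL string on operators and discard the operator tokens."""
    parts = pattern.split(sal)
    return [p.strip() for p in parts if p.strip() and p.strip() not in OPERATORS]


# Pre-fix: the ASCII arrow is not recognised, so the chain stays one frame.
old_frames = split_frames(OLD_SPLIT_RE, "H:HR>130->U:ALERT")

# Post-fix: ASCII and Unicode arrows yield the same two frames.
new_frames = split_frames(NEW_SPLIT_RE, "H:HR>130->U:ALERT")
unicode_frames = split_frames(NEW_SPLIT_RE, "H:HR>130→U:ALERT")

print(old_frames)
print(new_frames)
print(unicode_frames)
```

Note that the comparison `>` in `H:HR>130` is left alone in both cases, because the new alternative only matches the two-character sequence `->`.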

7 files changed

Lines changed: 209 additions & 10 deletions

CHANGELOG.md

Lines changed: 34 additions & 0 deletions
@@ -6,6 +6,40 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 
 ---
 
+## [v2.3.3] — 2026-04-24
+
+Patch release. Bridge-fix: ASCII arrow `->` is now a first-class SAL frame-boundary operator equivalent to Unicode `→`.
+
+**Package versions shipped with this release:**
+- `osmp` (Python / PyPI): 2.3.2 → **2.3.3**
+- `osmp-protocol` (npm): 2.3.2 → **2.3.3**
+
+### Fixed
+
+- **Bridge:** `->` (ASCII) is now treated as a SAL frame-boundary operator equivalent to `→` (Unicode). Previously, any SAL string using the ASCII shorthand (e.g. `H:HR>130->U:ALERT`) was parsed as a single frame rather than split into constituent frames, causing `validate_composition` to false-negative on valid chains, `MacroRegistry.register` to reject ASCII-arrow chain templates, and `SALDecoder.decode_natural_language` to skip the "then" NL mapping. Unicode behavior is unchanged.
+
+Coordinated edits across two files:
+- `sdk/python/osmp/protocol.py` — five edits: line 1787 (`_FRAME_SPLIT_RE` adds `->` alternate), line 2046 (`validate_composition` filter whitelist adds `"->"`), line 2908 (`MacroRegistry.register` chain-validation filter adds `"->"`), line 3052 (`MacroRegistry` consequence-class-inheritance filter adds `"->"`), line 3215 (`SALDecoder._OPERATOR_NL` adds `"->": " then "`).
+- `sdk/python/src/osmp.py` (legacy single-file distribution) — two edits: line 1610 (frame-split regex), line 1862 (validator filter). Three edits from the modular package have no legacy counterpart because `MacroRegistry` and `_OPERATOR_NL` are not exported from the legacy surface. See `LEGACY_PARITY_AUDIT.md` for the full parity picture.
+
+### Tests
+
+- Added `sdk/python/tests/test_bridge_fix.py` — T1 (NL annotation round-trip ASCII↔Unicode byte-identical, "then" present), T2 (validator parity — same issue set across arrow forms), T3 (macro chain with ASCII arrow validates against ASD), T4 (Unicode corpus regression: 10 golden frames decode byte-identical). 4/4 pass.
+- TypeScript suite (existing): 97/97 pass.
+- Go suite (existing): `osmp` tests ok.
+
+### Field verification
+
+Verified on-device during RTP-012-B (2026-04-24, Gemma-4-E4B Q4_K_M on RedMagic 10S Pro via llama-server in Termux). Patched-substrate cells used the `->` operator across 9 cells × 15 rounds = 135 opportunities with zero parser defects. Ctrl-substrate cells (pre-fix) produced 0/9 acquisition across all priming conditions — supporting evidence that the pre-fix parser silently gated valid SAL chains.
+
+### Not changed
+
+- No API additions, no breaking changes, no migration required.
+- `osmp-mcp` server not re-released; it picks up `osmp==2.3.3` automatically at install time.
+- `server.json` unchanged; MCP registry entry trails the package version per project convention.
+
+---
+
 ## [v2.1.0] — 2026-04-21
 
 Additive release. No breaking changes. Patent-pending UBOT evaluator integrated across all three SDKs + MCP server.
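
The "then" NL mapping mentioned in the v2.3.3 "Fixed" entry can be illustrated with a toy annotator. This is a rough sketch under stated assumptions, not `SALDecoder.decode_natural_language`: the `annotate` function and its bracket-wrapping `decode_frame` stub are hypothetical, and `OPERATOR_NL` contains only the mapping entries visible in this commit's diff.

```python
import re

# Post-fix split pattern, copied from the diff.
FRAME_SPLIT_RE = re.compile(r'(->|[→∧∨↔∥;])')

# Only the entries visible in the diff; the real table may hold more.
OPERATOR_NL = {
    "\u2192": " then ",  # Unicode arrow
    "->": " then ",      # ASCII shorthand added by this commit
    "\u2227": " and ",
    "\u2228": " or ",
    "\u2194": " iff ",
}


def annotate(sal, decode_frame=lambda frame: f"[{frame}]"):
    """Join decoded frames with operator words (hypothetical stand-in).

    decode_frame is a stub that wraps the raw frame in brackets; the real
    decoder expands opcodes into natural language instead.
    """
    out = []
    for part in FRAME_SPLIT_RE.split(sal):
        token = part.strip()
        if not token:
            continue
        if token in OPERATOR_NL:
            out.append(OPERATOR_NL[token])
        else:
            out.append(decode_frame(token))
    return "".join(out)


ascii_out = annotate("H:HR>130->U:ALERT")
unicode_out = annotate("H:HR>130\u2192U:ALERT")
print(ascii_out)
print(unicode_out)
```

With both arrow forms present in the mapping, the two inputs produce the same annotated string, which is the property T1 checks against the real decoder.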

sdk/python/osmp/protocol.py

Lines changed: 5 additions & 4 deletions
@@ -1784,7 +1784,7 @@ def encode_broadcast(self, namespace: str, opcode: str) -> str:
 _OPCODE_PATTERN = r'[A-Z§][A-Z0-9§]*' # Opcode body, includes § for I:§
 
 # Operators that split compound SAL instructions into frames
-_FRAME_SPLIT_RE = re.compile(r'([→∧∨↔∥;])')
+_FRAME_SPLIT_RE = re.compile(r'(->|[→∧∨↔∥;])')
 # Pattern matching namespace:opcode after @ (prohibited: namespace-as-target)
 _NS_TARGET_RE = re.compile(rf'@({_NS_PATTERN}):({_OPCODE_PATTERN})')
 # Pattern extracting namespace:opcode from a SAL frame
@@ -2043,7 +2043,7 @@ def validate_composition(
 
     # ── Split into frames and validate each ──────────────────────────────
     parts = _FRAME_SPLIT_RE.split(sal)
-    frames = [p.strip() for p in parts if p.strip() and p.strip() not in "→∧∨↔∥;"]
+    frames = [p.strip() for p in parts if p.strip() and p.strip() not in ("→", "∧", "∨", "↔", "∥", ";", "->")]
 
     has_r_namespace = False
     has_r_hazardous_or_irreversible = False
@@ -2905,7 +2905,7 @@ def register(self, template: MacroTemplate) -> None:
         clean = _re.sub(r'\{[^}]+\}', 'X', chain)
         parts = _FRAME_SPLIT_RE.split(clean)
         frames = [p.strip() for p in parts
-                  if p.strip() and p.strip() not in "\u2192\u2227\u2228\u2194\u2225;"]
+                  if p.strip() and p.strip() not in ("\u2192", "\u2227", "\u2228", "\u2194", "\u2225", ";", "->")]
         for frame in frames:
             m = _FRAME_NS_OP_RE.match(frame)
             if m:
@@ -3049,7 +3049,7 @@ def _compute_inherited_cc(self, clean_chain: str) -> str | None:
         parts = _FRAME_SPLIT_RE.split(clean_chain)
         for part in parts:
             part = part.strip()
-            if not part or part in "\u2192\u2227\u2228\u2194\u2225;":
+            if not part or part in ("\u2192", "\u2227", "\u2228", "\u2194", "\u2225", ";", "->"):
                 continue
             # Check if this frame is R namespace
             m = _FRAME_NS_OP_RE.match(part)
@@ -3213,6 +3213,7 @@ def decode_frame(self, encoded: str) -> DecodedInstruction:
     # Operator glyph to readable NL word mapping
     _OPERATOR_NL: dict[str, str] = {
         "\u2192": " then ", # → THEN
+        "->": " then ", # → ASCII shorthand
         "\u2227": " and ", # ∧ AND
         "\u2228": " or ", # ∨ OR
         "\u2194": " iff ", # ↔ IFF

sdk/python/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "osmp"
-version = "2.3.2"
+version = "2.3.3"
 description = "OSMP -- Octid Semantic Mesh Protocol. Deterministic agentic instruction encoding."
 readme = "README.md"
 license = { file = "LICENSE" }

sdk/python/src/osmp.py

Lines changed: 2 additions & 2 deletions
@@ -1607,7 +1607,7 @@ def encode_broadcast(self, namespace: str, opcode: str) -> str:
 # ─────────────────────────────────────────────────────────────────────────────
 
 # Operators that split compound SAL instructions into frames
-_FRAME_SPLIT_RE = re.compile(r'([→∧∨↔∥;])')
+_FRAME_SPLIT_RE = re.compile(r'(->|[→∧∨↔∥;])')
 # Pattern matching namespace:opcode after @ (prohibited: namespace-as-target)
 _NS_TARGET_RE = re.compile(r'@([A-Z]{1,2}):([A-Z][A-Z0-9]+)')
 # Pattern extracting namespace:opcode from a SAL frame
@@ -1859,7 +1859,7 @@ def validate_composition(
 
     # ── Split into frames and validate each ──────────────────────────────
     parts = _FRAME_SPLIT_RE.split(sal)
-    frames = [p.strip() for p in parts if p.strip() and p.strip() not in "→∧∨↔∥;"]
+    frames = [p.strip() for p in parts if p.strip() and p.strip() not in ("→", "∧", "∨", "↔", "∥", ";", "->")]
 
     has_r_namespace = False
     has_r_hazardous_or_irreversible = False
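
The filter edits in both files replace a string operand of `not in` with a tuple. The commit does not state why, but one plausible reading is that `in` on a `str` is a substring test, whereas `in` on a tuple compares whole elements. The sketch below shows the difference; `OLD_FILTER`, `NEW_FILTER`, and `NAIVE_FILTER` are illustrative names, with only the two literal values taken from the diff.

```python
# Filter operands from the diff: a str (pre-fix) and a tuple (post-fix).
OLD_FILTER = "→∧∨↔∥;"
NEW_FILTER = ("→", "∧", "∨", "↔", "∥", ";", "->")

# What the post-fix regex split yields for "A:BAR->B:QUX".
parts = ["A:BAR", "->", "B:QUX"]

# With the old str filter, "->" is not a substring of the operator string,
# so the arrow token would survive as a bogus frame.
old_frames = [p for p in parts if p and p not in OLD_FILTER]

# With the tuple filter, "->" is matched as a whole element and dropped.
new_frames = [p for p in parts if p and p not in NEW_FILTER]

# Hypothetical alternative: appending "->" to the str would also drop the
# arrow, but substring semantics would then treat "-" and ">" as operators.
NAIVE_FILTER = OLD_FILTER + "->"
naive_matches_gt = ">" in NAIVE_FILTER
tuple_matches_gt = ">" in NEW_FILTER

print(old_frames, new_frames, naive_matches_gt, tuple_matches_gt)
```

The tuple form is stricter: a multi-character token only counts as an operator when it equals one of the listed entries exactly.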

sdk/python/tests/test_bridge_fix.py

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
+# Last run: 2026-04-24, 4 passed in 0.26s (pytest-9.0.3, Python 3.14.3).
+# tests/test_bridge_fix.py::test_t1_nl_annotation_roundtrip_ascii_unicode_byte_identical PASSED
+# tests/test_bridge_fix.py::test_t2_validator_parity_ascii_unicode_equivalent_issue_sets PASSED
+# tests/test_bridge_fix.py::test_t3_macro_chain_ascii_arrow_validates_against_asd PASSED
+# tests/test_bridge_fix.py::test_t4_unicode_corpus_byte_identical_to_golden PASSED
+"""
+Bridge-fix unit tests — T1..T4
+
+Covers the 2026-04-24 bridge fix that added ASCII `->` as a frame-boundary
+operator alongside the Unicode `→` in osmp/protocol.py and legacy src/osmp.py.
+
+- T1: NL annotation round-trip. ASCII `->` and Unicode `→` produce byte-
+      identical NL output; "then" appears in both.
+- T2: Validator parity. validate_composition(...) on ASCII-arrow and
+      Unicode-arrow forms produces equivalent issue sets (same rules, same
+      severities, same frame identifiers).
+- T3: Macro chain with ASCII arrow. Registering a MacroTemplate whose
+      chain_template uses `->` validates against an ASD containing the
+      referenced opcodes. Pre-fix, the chain was treated as one frame.
+- T4: Regression on Unicode corpus. 10 Unicode-arrow SAL frames decode to
+      byte-identical golden NL outputs, ensuring the ASCII-arrow addition
+      did not regress Unicode handling.
+
+Run: pytest tests/test_bridge_fix.py -v
+"""
+from __future__ import annotations
+
+import os
+import sys
+
+# Make the sdk/python source tree importable without install.
+_SDK_PYTHON = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+if _SDK_PYTHON not in sys.path:
+    sys.path.insert(0, _SDK_PYTHON)
+
+from osmp.protocol import (
+    AdaptiveSharedDictionary,
+    MacroRegistry,
+    MacroTemplate,
+    SALDecoder,
+    validate_composition,
+)
+
+
+# ---------------------------------------------------------------------------
+# T1 — NL annotation round-trip
+# ---------------------------------------------------------------------------
+
+def test_t1_nl_annotation_roundtrip_ascii_unicode_byte_identical():
+    """ASCII `->` and Unicode `→` must decode to byte-identical NL strings,
+    both containing the operator word 'then'."""
+    decoder = SALDecoder()
+    ascii_out = decoder.decode_natural_language("H:HR>130->U:ALERT")
+    unicode_out = decoder.decode_natural_language("H:HR>130\u2192U:ALERT")
+
+    assert "then" in ascii_out, f"ASCII arrow output missing 'then': {ascii_out!r}"
+    assert "then" in unicode_out, f"Unicode arrow output missing 'then': {unicode_out!r}"
+    assert ascii_out == unicode_out, (
+        f"ASCII-vs-Unicode divergence:\n ascii={ascii_out!r}\n uni ={unicode_out!r}"
+    )
+
+
+# ---------------------------------------------------------------------------
+# T2 — Validator parity
+# ---------------------------------------------------------------------------
+
+def _issue_key(issue):
+    # Ignore .message text (may vary on arrow form); compare rule + severity + frame.
+    return (issue.rule, issue.severity, issue.frame)
+
+
+def test_t2_validator_parity_ascii_unicode_equivalent_issue_sets():
+    """validate_composition on `A:BAR->B:QUX` and `A:BAR→B:QUX` must fire
+    the same rules with the same severities on the same frames."""
+    ascii_result = validate_composition("A:BAR->B:QUX")
+    unicode_result = validate_composition("A:BAR\u2192B:QUX")
+
+    ascii_keys = sorted(_issue_key(i) for i in ascii_result.issues)
+    unicode_keys = sorted(_issue_key(i) for i in unicode_result.issues)
+    assert ascii_keys == unicode_keys, (
+        f"Issue-set divergence:\n ascii={ascii_keys}\n uni ={unicode_keys}"
+    )
+
+
+# ---------------------------------------------------------------------------
+# T3 — Macro chain with ASCII arrow
+# ---------------------------------------------------------------------------
+
+def test_t3_macro_chain_ascii_arrow_validates_against_asd():
+    """A macro whose chain_template uses the ASCII arrow must validate
+    successfully when the referenced opcodes exist in the ASD. Pre-fix,
+    the whole string was parsed as one frame and failed lookup."""
+    asd = AdaptiveSharedDictionary()
+    # Add test opcodes A:BAR and B:QUX to the ASD for this macro.
+    asd.apply_delta(
+        "A", "BAR", "test opcode A:BAR",
+        AdaptiveSharedDictionary.UpdateMode.ADDITIVE, "test",
+    )
+    asd.apply_delta(
+        "B", "QUX", "test opcode B:QUX",
+        AdaptiveSharedDictionary.UpdateMode.ADDITIVE, "test",
+    )
+    registry = MacroRegistry(asd)
+    template = MacroTemplate(
+        macro_id="TEST:MACRO",
+        chain_template="A:BAR->B:QUX",
+        slots=(),
+        description="Test macro using ASCII arrow frame separator.",
+    )
+    # Pre-fix: "A:BAR->B:QUX" parsed as single frame "A:BAR->B:QUX", ASD lookup
+    # for namespace "A" opcode "BAR->B:QUX" would fail → ValueError.
+    # Post-fix: split into ["A:BAR", "B:QUX"] → both lookups succeed.
+    registry.register(template) # must not raise
+
+    # Verify the template is registered.
+    assert "TEST:MACRO" in registry._macros
+
+
+# ---------------------------------------------------------------------------
+# T4 — Regression on Unicode corpus
+# ---------------------------------------------------------------------------
+
+# Golden dict: Unicode-arrow SAL frames → expected NL decoder output.
+# Captured 2026-04-24 against sdk/python/osmp/protocol.py (post-fix).
+# If any of these fail byte-equal, the bridge fix regressed Unicode handling.
+T4_GOLDEN = {
+    "H:HR>130\u2192U:ALERT":
+        "(clinical) [clinical] heart rate above 130 then [operator] urgent operator alert",
+    "H:HR>130\u2227H:SPO2<90":
+        "[clinical] heart rate above 130 and [clinical] oxygen saturation below 90",
+    "H:HR>130\u2228H:SPO2<90":
+        "[clinical] heart rate above 130 or [clinical] oxygen saturation below 90",
+    "A:ACK\u2194U:CONFIRM":
+        "(protocol) [protocol] positive acknowledgment iff [operator] request human confirmation",
+    "H:VITALS\u2192U:ALERT":
+        "(clinical) [clinical] composite vitals status then [operator] urgent operator alert",
+    "H:SPO2<90\u2192U:ALERT":
+        "(clinical) [clinical] oxygen saturation below 90 then [operator] urgent operator alert",
+    "B:FIRE\u2192M:EVA":
+        "(building) [building] FIRE then [emergency] evacuation",
+    "W:WIND>60\u2192M:EVA":
+        "(weather) [weather] wind speed and direction above 60 then [emergency] evacuation",
+    "X:STORE<10\u2192U:ESCALATE":
+        "(energy) [energy] storage state below 10 then [operator] escalate to human decision maker",
+    "E:GPS\u2192U:ALERT":
+        "(sensor) [sensor] gps coordinates then [operator] urgent operator alert",
+}
+
+
+def test_t4_unicode_corpus_byte_identical_to_golden():
+    """Decode 10 Unicode-arrow SAL frames; each NL output must be byte-
+    identical to the golden string. Catches regression in Unicode handling
+    introduced by the ASCII-arrow frame-split fix."""
+    decoder = SALDecoder()
+    mismatches = []
+    for sal, expected in T4_GOLDEN.items():
+        actual = decoder.decode_natural_language(sal)
+        if actual != expected:
+            mismatches.append((sal, expected, actual))
+    assert not mismatches, (
+        "Unicode decoder regression(s):\n"
+        + "\n".join(f" {sal!r}\n expected={exp!r}\n actual ={act!r}"
+                    for sal, exp, act in mismatches)
+    )
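
T4 exercises the Unicode corpus only. An ASCII mirror of the same corpus could, in principle, be derived by substituting `->` for `→` and checking that both forms split into the same frames. The sketch below is not part of this commit and does not call the SDK: `frames`, `ascii_mirror`, and `UNICODE_CORPUS` are hypothetical names, the split pattern is copied from the diff, and the corpus is a subset of the arrow-bearing keys of `T4_GOLDEN`.

```python
import re

# Post-fix split pattern, copied from the diff.
FRAME_SPLIT_RE = re.compile(r'(->|[→∧∨↔∥;])')
OPERATORS = ("→", "∧", "∨", "↔", "∥", ";", "->")

# Subset of the T4_GOLDEN keys that use the Unicode arrow.
UNICODE_CORPUS = [
    "H:HR>130\u2192U:ALERT",
    "H:VITALS\u2192U:ALERT",
    "B:FIRE\u2192M:EVA",
    "W:WIND>60\u2192M:EVA",
    "X:STORE<10\u2192U:ESCALATE",
]


def frames(sal):
    """Stand-in for the SDK's split-and-filter step."""
    return [p.strip() for p in FRAME_SPLIT_RE.split(sal)
            if p.strip() and p.strip() not in OPERATORS]


def ascii_mirror(sal):
    """Rewrite every Unicode arrow as the ASCII shorthand."""
    return sal.replace("\u2192", "->")


# Collect every corpus entry whose ASCII form splits differently.
mismatches = [
    (sal, frames(sal), frames(ascii_mirror(sal)))
    for sal in UNICODE_CORPUS
    if frames(sal) != frames(ascii_mirror(sal))
]
print(mismatches)
```

Collecting mismatches into a list before asserting follows the same pattern T4 uses, so a failure would report every divergent entry at once rather than stopping at the first.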

sdk/typescript/package-lock.json

Lines changed: 2 additions & 2 deletions
Diff not rendered (generated file).

sdk/typescript/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "osmp-protocol",
-  "version": "2.3.2",
+  "version": "2.3.3",
   "description": "OSMP — Octid Semantic Mesh Protocol. Agentic AI instruction encoding for any channel.",
   "type": "module",
   "main": "./dist/index.js",
