This guide covers testing MERLIN without a running flight simulator, verifying the command pipeline, running the automated test suite, and using the health check script.
The mock adapter (tools/mock_adapter.py) simulates the MSFS SimConnect bridge. It connects to the telemetry service, streams fake telemetry, and executes any commands it receives by updating its internal state. No MSFS, no .NET, no Windows required.
The simplest way to run in mock mode is through the startup script:
./scripts/start.sh --mock
This starts all services (ChromaDB, telemetry service, mock adapter, web server) and displays a MOCK MODE banner when ready.
If you prefer to manage services individually, start the mock adapter directly:
python tools/mock_adapter.py
Prerequisite: the telemetry service must already be running and accepting connections at ws://localhost:8080/ws/ingest.
List the available options with --help:
python tools/mock_adapter.py --help
| Flag | Default | Description |
|---|---|---|
| --url | ws://localhost:8080/ws/ingest | Telemetry service ingest WebSocket URL |
| --adapter-id | msfs-adapter | Adapter ID to register as |
| --aircraft | Cessna 172 Skyhawk | Simulated aircraft name |
| --altitude | 3000.0 | Starting altitude MSL in feet |
| --airspeed | 110.0 | Starting indicated airspeed in knots |
| --hz | 2.0 | Telemetry update frequency in Hz |
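The flags above map naturally onto a standard argument parser. The following is a minimal sketch, assuming the script uses argparse; the function name build_parser is illustrative and is not taken from tools/mock_adapter.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Mock MSFS adapter (sketch)")
    parser.add_argument("--url", default="ws://localhost:8080/ws/ingest",
                        help="Telemetry service ingest WebSocket URL")
    parser.add_argument("--adapter-id", default="msfs-adapter",
                        help="Adapter ID to register as")
    parser.add_argument("--aircraft", default="Cessna 172 Skyhawk",
                        help="Simulated aircraft name")
    parser.add_argument("--altitude", type=float, default=3000.0,
                        help="Starting altitude MSL in feet")
    parser.add_argument("--airspeed", type=float, default=110.0,
                        help="Starting indicated airspeed in knots")
    parser.add_argument("--hz", type=float, default=2.0,
                        help="Telemetry update frequency in Hz")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args(["--aircraft", "Boeing 747-8", "--altitude", "35000"])
print(args.aircraft, args.altitude, args.hz)
```

Flags that are not passed keep the defaults from the table, so the call above still reports an update rate of 2.0 Hz.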
Examples:
# Simulate a 747 at cruise
python tools/mock_adapter.py --aircraft "Boeing 747-8" --altitude 35000 --airspeed 250
# High-frequency telemetry updates for stress testing
python tools/mock_adapter.py --hz 10
# Custom telemetry service URL (e.g., remote host)
python tools/mock_adapter.py --url ws://192.168.1.50:8080/ws/ingest
The mock adapter maintains a MockAircraftState that starts with realistic default values:
- Position: 40.6413°N, 73.7781°W (near JFK)
- Altitude: 3,000 ft MSL, 2,500 ft AGL
- Airspeed: 110 kt indicated
- Heading: 270 magnetic
- Engine: 2,300 RPM, 24.0 manifold pressure
- Gear: DOWN, Flaps: 0%
Telemetry values drift subtly over time (heading wobble, airspeed oscillation) to simulate a live flight. The altitude changes based on vertical speed, so the data looks realistic to the orchestrator's flight phase detector.
When commands arrive, the mock adapter applies them to its state. For example, a GEAR_UP command sets gear_handle to False, and the next telemetry frame reflects the change.
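The behavior described above can be sketched as a small state class. This is an illustration under assumptions, not the code in tools/mock_adapter.py: only the class name MockAircraftState, the gear_handle field, and the GEAR_UP and GEAR_DOWN command names come from this guide. The other field names, the drift amplitudes, and the step method are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class MockAircraftState:
    """Illustrative stand-in; field names other than gear_handle are assumed."""
    altitude_ft: float = 3000.0
    airspeed_kt: float = 110.0
    heading_deg: float = 270.0
    vertical_speed_fpm: float = 0.0
    gear_handle: bool = True  # True means the gear handle is down
    elapsed_s: float = 0.0

    def step(self, dt: float) -> dict:
        """Advance the simulation by dt seconds and return one telemetry frame."""
        self.elapsed_s += dt
        # Altitude integrates vertical speed (feet per minute to feet per second).
        self.altitude_ft += self.vertical_speed_fpm / 60.0 * dt
        # Small sinusoidal offsets stand in for heading wobble and airspeed oscillation.
        wobble = 0.5 * math.sin(self.elapsed_s / 7.0)
        oscillation = 1.0 * math.sin(self.elapsed_s / 11.0)
        return {
            "altitude": round(self.altitude_ft, 1),
            "airspeed": round(self.airspeed_kt + oscillation, 1),
            "heading": round((self.heading_deg + wobble) % 360.0, 1),
            "gear_handle": self.gear_handle,
        }

    def apply_command(self, command: str) -> None:
        """Apply a command to the state; only the two gear commands are sketched."""
        if command == "GEAR_UP":
            self.gear_handle = False
        elif command == "GEAR_DOWN":
            self.gear_handle = True


state = MockAircraftState(vertical_speed_fpm=600.0)
state.apply_command("GEAR_UP")
frame = state.step(dt=0.5)  # one frame at the default 2 Hz
print(frame["gear_handle"], frame["altitude"])
```

Keeping the command handler and the frame generator on the same object is what makes a command visible in the very next frame.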
The full command pipeline without MSFS is:
Voice/Text -> STT -> Claude -> set_aircraft_control tool -> telemetry service -> mock adapter
- Start in mock mode: ./scripts/start.sh --mock
- Open http://localhost:3838
- Type or speak a command: "Lower the gear" or "Set heading to one eight zero"
- Watch Claude call the set_aircraft_control tool in the chat response
- Check the mock adapter log for the executed command: tail -f logs/mock_adapter.log
The set_aircraft_control tool supports 11 aircraft systems:
| System | Example Voice Commands | Actions |
|---|---|---|
| flaps | "Give me full flaps", "Flaps up", "Set flaps to 20" | up, full, 1, 2, 3, set, incr, decr |
| gear | "Gear down", "Retract the gear" | up, down, toggle |
| autopilot | "Engage autopilot", "Set heading to 270", "Set altitude 5000" | on, off, heading, heading_hold, altitude, altitude_hold, vertical_speed, vs_hold, speed, speed_hold, nav, approach |
| throttle | "Set throttle to 75 percent" | set (0-100%) |
| radio | "Tune COM1 to 121.5" | com1, com2, nav1, nav2 |
| barometer | "Set altimeter to 29.92" | set |
| trim | "Set trim to nose up" | set |
| parking_brake | "Set the parking brake" | toggle |
| spoilers | "Deploy spoilers", "Spoilers to 50 percent" | toggle, set |
| mixture | "Set mixture to full rich" | set (0-100%) |
| propeller | "Set prop to 2500 RPM" | set (0-100%) |
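The table above can also be expressed as a lookup for sanity-checking tool arguments before they are sent. This is a sketch only: the ALLOWED_ACTIONS dictionary and the is_valid_control helper are hypothetical names, and the real tool's validation may differ.

```python
# Actions per system, copied from the table above.
ALLOWED_ACTIONS = {
    "flaps": {"up", "full", "1", "2", "3", "set", "incr", "decr"},
    "gear": {"up", "down", "toggle"},
    "autopilot": {
        "on", "off", "heading", "heading_hold", "altitude", "altitude_hold",
        "vertical_speed", "vs_hold", "speed", "speed_hold", "nav", "approach",
    },
    "throttle": {"set"},
    "radio": {"com1", "com2", "nav1", "nav2"},
    "barometer": {"set"},
    "trim": {"set"},
    "parking_brake": {"toggle"},
    "spoilers": {"toggle", "set"},
    "mixture": {"set"},
    "propeller": {"set"},
}


def is_valid_control(system: str, action: str) -> bool:
    """Return True when the action is listed for the given system."""
    return action in ALLOWED_ACTIONS.get(system, set())


print(is_valid_control("gear", "down"))
print(is_valid_control("gear", "full"))
```

A check like this catches mismatched pairs, such as a flaps action sent to the gear system, before they reach the adapter.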
When Claude executes a command, it responds with a brief, aviation-style confirmation. The tool description instructs Claude to confirm without unnecessary verbosity:
- "Gear down" -> Claude executes set_aircraft_control(system="gear", action="down") -> "Gear down, three green."
- "Give me full flaps" -> Claude executes set_aircraft_control(system="flaps", action="full") -> "Flaps full."
- "Set heading to 270" -> Claude executes set_aircraft_control(system="autopilot", action="heading", value=270) -> "Heading bug two seven zero."
You can send a command directly to the telemetry service, bypassing Claude, with a short inline Python script. This is useful for verifying the path from the telemetry service to the adapter in isolation. The script requires the websockets package.
python3 -c "
import asyncio, json, websockets
async def send():
    async with websockets.connect('ws://localhost:8080/ws/telemetry') as ws:
        cmd = {
            'type': 'command',
            'command': 'GEAR_DOWN',
            'value': 0
        }
        await ws.send(json.dumps(cmd))
        print('Sent:', json.dumps(cmd))
        ack = await asyncio.wait_for(ws.recv(), timeout=5.0)
        print('Ack:', ack)
asyncio.run(send())
"
If the mock adapter is running, you will see the command logged in logs/mock_adapter.log and receive an acknowledgment JSON message with "success": true.
The mock adapter prints every received command with a colored prefix:
>>> COMMAND RECEIVED: Gear DOWN (id: a3b2c1d4)
>>> COMMAND RECEIVED: Flaps FULL (100%) (id: e5f6a7b8)
>>> COMMAND RECEIVED: HDG Bug -> 270 (id: c9d0e1f2)
>>> COMMAND RECEIVED: AP Master -> ON (id: 34a5b6c7)
Each entry includes the human-readable description and the first 8 characters of the command ID. Commands also update the mock state, so the next telemetry frame will reflect the change (e.g., after GEAR_DOWN, telemetry reports gear_handle: true).
The full command history is stored in MockAircraftState.command_log as a list of dictionaries with time, command, value, and description fields.
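As an illustration of the log format and history described above, the sketch below records a command and renders the matching log line. The record_command helper and the use of uuid4 for command IDs are assumptions; only the four dictionary fields and the 8-character ID prefix come from this guide.

```python
import time
import uuid


def record_command(command_log: list, command: str, value: float,
                   description: str, command_id: str) -> str:
    """Append one history entry and return its log line (hypothetical helper)."""
    command_log.append({
        "time": time.time(),
        "command": command,
        "value": value,
        "description": description,
    })
    # Only the first 8 characters of the command ID appear in the log.
    return f">>> COMMAND RECEIVED: {description} (id: {command_id[:8]})"


command_log = []
line = record_command(command_log, "GEAR_DOWN", 0, "Gear DOWN", uuid.uuid4().hex)
print(line)

# The history is a plain list of dictionaries, so it can be filtered directly.
gear_commands = [entry for entry in command_log if entry["command"].startswith("GEAR")]
print(len(gear_commands))
```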
The orchestrator has a comprehensive test suite using pytest and pytest-asyncio.
cd orchestrator
python3 -m pytest tests/ -v
Key test categories:
- Unit tests: config, flight phase state machine, tools, Claude client, Whisper client, context store, TTS preprocessor, screen capture
- Integration tests: WebSocket reconnection, health monitor, delta detection, query classification, orchestrator end-to-end, tool chain, Whisper pipeline
Run a specific test file:
python3 -m pytest tests/test_tools.py -v
Run tests matching a keyword:
python3 -m pytest tests/ -v -k "flight_phase"
The SimConnect bridge has xUnit tests. Run them from Windows, or from WSL using the Windows dotnet executable:
cd adapters/msfs
dotnet test
The telemetry service has its own pytest suite:
cd telemetry-service
python3 -m pytest tests/ -v
The health check script verifies that all subsystems are operational. It checks API keys, service connectivity, the test suite, and v2 module imports.
./scripts/healthcheck.sh
Sample output:
MERLIN v2 Health Check
===================================================
API Keys:
PASS Anthropic API key
PASS Deepgram API key
PASS Cartesia API key
Services:
PASS Web server (port 3838)
PASS ChromaDB (port 8000)
PASS Telemetry service (port 8080)
Test Suite:
PASS Python tests (247 passed)
v2 Modules:
PASS Deepgram STT client
PASS Cartesia TTS client
PASS Emergency detector
PASS Response validator
PASS Aviation chunker
PASS Cross-encoder reranker
PASS Aviation tools
===================================================
13 passed 0 failed 0 warnings (13 total)
===================================================
The script exits with code 0 if all required checks pass, or code 1 if any required check fails. Use it as an acceptance test after deployment or configuration changes.
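Because the result is reported through the exit code, the script is easy to call from other tooling. The following is a minimal sketch; the checks_pass helper is a hypothetical name, and the demonstration uses stand-in commands so that it runs outside the repository.

```python
import subprocess
import sys


def checks_pass(command: list) -> bool:
    """Run a command and report whether it exited with code 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


# Stand-in commands that exit with 0 and 1 respectively.
# From the repository root you would pass ["./scripts/healthcheck.sh"] instead.
print(checks_pass([sys.executable, "-c", "raise SystemExit(0)"]))
print(checks_pass([sys.executable, "-c", "raise SystemExit(1)"]))
```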
| Category | Checks |
|---|---|
| API Keys | ANTHROPIC_API_KEY, DEEPGRAM_API_KEY (if STT=deepgram), CARTESIA_API_KEY (if TTS=cartesia) |
| Services | Web server (3838), ChromaDB (8000), Telemetry service (8080), Whisper (9090, if STT=whisper) |
| Test Suite | Runs pytest tests/ -q in the orchestrator directory |
| v2 Modules | Import checks for Deepgram, Cartesia, emergency detector, validator, chunker, reranker, aviation tools |
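The conditional entries in the table can be read as a small decision rule. The sketch below assumes the providers are selected by settings named STT and TTS, as the table's conditions suggest. The function names are hypothetical, and the shell script's actual logic may differ.

```python
def required_api_keys(stt: str, tts: str) -> list:
    """Return the API key variables that apply to the chosen providers."""
    keys = ["ANTHROPIC_API_KEY"]  # always checked
    if stt == "deepgram":
        keys.append("DEEPGRAM_API_KEY")
    if tts == "cartesia":
        keys.append("CARTESIA_API_KEY")
    return keys


def missing_keys(env: dict, stt: str, tts: str) -> list:
    """List the required variables that are unset or empty in env."""
    return [name for name in required_api_keys(stt, tts) if not env.get(name)]


# With Whisper for STT, the Deepgram key is not required.
print(missing_keys({"ANTHROPIC_API_KEY": "set"}, stt="whisper", tts="cartesia"))
```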