Feat/sandbox by huanghuoguoguo · Pull Request #2072 · langbot-app/LangBot

huanghuoguoguo · 2026-03-22T03:30:33Z

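LangBot Box: Sandbox Execution System

Overview

This PR introduces LangBot Box, which lets LLM Agents, MCP Servers, and later Skill/tool executions run shell commands, Python scripts, and long-lived processes in an isolated environment.

The current implementation no longer keeps most of its code in the LangBot main repository. Responsibilities are now split into two layers:

LangBot main repository: the product integration layer, covering exposure of the sandbox_exec tool, Profile and host-path policy, MCP Box-stdio integration, status endpoints, and runtime connection management.
langbot-plugin-sdk: the Box Runtime foundation, covering the protocol, models, error types, Session lifecycle, Backend abstraction, the Docker/Podman/nsjail execution backends, and the standalone Box Server.

In other words, this PR is now essentially a cross-repository integration of sandbox capability: the LangBot side handles integration and policy, and the SDK side carries most of the reusable runtime implementation.

Branch: feat/sandbox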

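Features

sandbox_exec native tool: the LLM Agent gets a native tool that runs shell commands and Python scripts in an isolated environment, for exact computation, structured parsing, temporary file handling, and code execution.
MCP Server isolation (Box-stdio): when Box is available, stdio-mode MCP Servers run inside the sandbox automatically, with support for dependency installation, path rewriting, and a stdio-over-WebSocket bridge.
Configurable security boundaries: network on/off switch, CPU/memory/PID limits, read-only root filesystem, host mount allowlist, and blocking of dangerous paths.
Pluggable execution backends: the runtime currently supports three Backends (Podman, Docker, and nsjail), all managed through the same BoxRuntime lifecycle.
Observability endpoints: LangBot exposes /api/v1/box/status, /api/v1/box/sessions, and /api/v1/box/errors for operations and debugging.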

Architecture

                     LangBot main process
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
  NativeToolLoader      RuntimeMCPSession       BoxService
  (sandbox_exec)        (MCP Box-stdio)   (policy / Profile / validation)
         │                     │                     │
         └─────────────────────┼─────────────────────┘
                               │
                BoxRuntimeConnector (LangBot)
                               │
                ActionRPCBoxClient (SDK)
                               │
               ┌───────────────┴────────────────┐
               │ stdio (default)                │ WebSocket (explicit runtime_url)
               │ subprocess (local & Docker)    │ ws://<remote-host>:5411
               ▼                                ▼
          langbot_plugin.box.server (standalone SDK service)
                               │
         ┌─────────────────────┼─────────────────────┐
         │                     │                     │
  BoxServerHandler       BoxRuntime            aiohttp WS Relay
  (Action RPC)  (Session / process management) (:5410, MCP attach)
                               │
                  ┌────────────┼────────────┐
                  │            │            │
             PodmanBackend DockerBackend NsjailBackend
                               │
             container / nsjail process isolation

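Layers and responsibilities

Layer	Repository	Main module	Responsibility
Integration layer	LangBot	`pkg/box/service.py`	Profile application, host path validation, output truncation, public API
Connection layer	LangBot	`pkg/box/connector.py`	Chooses between a stdio subprocess and a remote WebSocket connection to the Box Runtime
Tool integration layer	LangBot	`provider/tools/loaders/native.py`	Exposes `sandbox_exec` to the model
MCP integration layer	LangBot	`provider/tools/loaders/mcp.py`	Connects `stdio` MCP Servers to a Box Session / managed process
HTTP observability layer	LangBot	`api/http/controller/groups/box.py`	Exposes the status, Session, and error list endpoints
Protocol layer	SDK	`langbot_plugin.box.actions` / `client.py`	Action RPC action definitions and client calls
Model layer	SDK	`langbot_plugin.box.models`	Shared models such as `BoxSpec`, `BoxProfile`, and `BoxSessionInfo`
Runtime layer	SDK	`langbot_plugin.box.runtime`	Session TTL, reuse, and managed process lifecycle
Backend layer	SDK	`backend.py` / `nsjail_backend.py`	Docker / Podman / nsjail execution abstraction
Service layer	SDK	`langbot_plugin.box.server`	Standalone Box Server + MCP WebSocket Relay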

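Core design decisions

1. Runtime foundation moved down into the SDK

The core of Box no longer lives in the LangBot main repository; it has moved down to langbot-plugin-sdk/src/langbot_plugin/box/. The reasons are:

The Box Runtime is a service that can run on its own, so it naturally belongs in the shared infrastructure layer.
LangBot and the Box Runtime reuse the SDK's existing Action RPC / IO abstractions, so the main repository does not need to maintain a second protocol stack.
Box's models, errors, client, backend probing, and standalone service entry point are all closer to a "runtime foundation" and should not be coupled to LangBot product logic.

What stays in the LangBot main repository is the product-level behavior: whether the tool is exposed, how Profiles are applied, which host paths may be mounted, how MCP is integrated, and how the HTTP API reports state.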

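2. Child-process architecture

Box Runtime runs as a child process of LangBot and communicates with the LangBot main process over stdio. Behavior is the same for local development and Docker deployment:

LangBot uses BoxRuntimeConnector to start a python -m langbot_plugin.box.server --port 5410 child process and connects to it over stdio.
The Box Runtime process is purely a scheduler: it creates and manages sandboxes through the docker socket or the nsjail command, does not execute any user code, and does not operate on the filesystem directly. It therefore does not need its own container the way Plugin Runtime does.
In a Docker deployment, the LangBot container only needs to mount docker.sock; the Box Runtime child process then talks directly to the host Docker engine.

To deploy Box Runtime on a separate host, set runtime_url explicitly in config.yaml; LangBot then connects to the remote Runtime over WebSocket.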

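3. Session reuse

The Session is Box's core scheduling unit. BoxRuntime maintains a session_id -> RuntimeSession map:

sandbox_exec uses query_id as the session_id by default
MCP Box-stdio holds its own Session, named mcp-{uuid}
Multiple executions within the same Session reuse the existing isolated environment instead of recreating the container / nsjail working directory each time

Each Session has a TTL (300 seconds by default). A Session is reclaimed when:

last_used_at is older than the TTL
and no managed process is currently running

This guarantees that:

sandbox_exec can perform multi-step stateful execution within a single conversation
an MCP Server is not reclaimed by mistake because of the idle TTL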
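
The reclamation rule above can be sketched in a few lines. This is a minimal illustration, not the SDK's code: the `RuntimeSession` fields, `is_reclaimable`, and `reap` are hypothetical names chosen for the example, and only the 300-second default and the two conditions come from the description.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SESSION_TTL_SEC = 300  # default TTL stated in the description


@dataclass
class RuntimeSession:
    """Hypothetical stand-in for the SDK's per-session record."""
    session_id: str
    last_used_at: datetime
    managed_process_running: bool = False


def is_reclaimable(session: RuntimeSession, now: datetime,
                   ttl_sec: int = SESSION_TTL_SEC) -> bool:
    """Reclaim only if idle past the TTL AND no managed process is running."""
    idle_for = now - session.last_used_at
    return idle_for > timedelta(seconds=ttl_sec) and not session.managed_process_running


def reap(sessions: dict[str, RuntimeSession], now: datetime) -> list[str]:
    """Drop reclaimable sessions from the map and return their ids."""
    expired = [sid for sid, s in sessions.items() if is_reclaimable(s, now)]
    for sid in expired:
        del sessions[sid]
    return expired


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
stale = now - timedelta(seconds=301)
sessions = {
    "query-1": RuntimeSession("query-1", stale),
    "mcp-abc": RuntimeSession("mcp-abc", stale, managed_process_running=True),
    "query-2": RuntimeSession("query-2", now - timedelta(seconds=10)),
}
reaped = reap(sessions, now)
```

Here only `query-1` is reaped: `mcp-abc` is just as idle, but its running managed process protects it, which is the property that keeps MCP Servers alive.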

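4. Profiles take effect in the LangBot layer

sandbox_exec does not expose every isolation parameter directly to the model. Parameters first pass through LangBot's BoxService, which applies the Profile:

Fields that are not passed are filled from the Profile defaults
Locked fields forcibly override values passed by the user or the model
timeout_sec is clamped to profile.max_timeout_sec

The built-in Profiles are currently:

Profile	Network	CPU	Memory	Root filesystem	Mount	Max timeout
`default`	OFF	1.0	512MB	read-only	read-write	120s
`offline_readonly`	OFF (locked)	0.5	256MB	read-only (locked)	read-only (locked)	60s
`network_basic`	ON	1.0	512MB	read-only	read-write	120s
`network_extended`	ON	2.0	1024MB	writable	read-write	300s

MCP Box-stdio does not use these Profiles. It uses the separate MCPServerBoxConfig, because its trust model differs from that of LLM-generated code.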
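
The three Profile rules (fill defaults, enforce locked fields, clamp the timeout) could look roughly like the sketch below. The `Profile` dataclass and `apply_profile` are hypothetical simplifications, not the `BoxProfile` model from the SDK; the numbers for `offline_readonly` are taken from the table in this section, and the 30-second default timeout from `BoxSpec`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Profile:
    """Hypothetical, simplified profile: defaults, locked field names, timeout cap."""
    defaults: dict
    locked: frozenset = field(default_factory=frozenset)
    max_timeout_sec: int = 120


OFFLINE_READONLY = Profile(
    defaults={
        "network": "off", "cpus": 0.5, "memory_mb": 256,
        "read_only_rootfs": True, "host_path_mode": "ro", "timeout_sec": 30,
    },
    locked=frozenset({"network", "read_only_rootfs", "host_path_mode"}),
    max_timeout_sec=60,
)


def apply_profile(params: dict, profile: Profile) -> dict:
    merged = {**profile.defaults, **params}      # 1. fill in fields the caller omitted
    for name in profile.locked:                  # 2. locked fields always win
        merged[name] = profile.defaults[name]
    merged["timeout_sec"] = min(merged["timeout_sec"], profile.max_timeout_sec)  # 3. clamp
    return merged


spec = apply_profile(
    {"cmd": "python3 -c 'print(1)'", "network": "on", "timeout_sec": 600},
    OFFLINE_READONLY,
)
```

In this example the model asked for network access and a 600-second timeout; the result keeps `network` off and caps `timeout_sec` at 60, while the unlocked `cmd` passes through unchanged.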

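5. Backend abstraction and probing order

BoxRuntime in the SDK probes for an available Backend in the following order:

PodmanBackend
DockerBackend
NsjailBackend

All three implement the same BaseSandboxBackend interface, so the layers above (BoxService / BoxRuntimeConnector / ActionRPCBoxClient) do not need to know whether the underlying backend is a container or nsjail.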
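
Ordered probing of this kind is commonly written as "return the first candidate whose availability check passes". The sketch below assumes that shape; `Backend`, `FakeBackend`, and `select_backend` are illustrative names, and the real `BaseSandboxBackend` has more methods than the single `is_available()` shown here.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Optional, Sequence


class Backend(ABC):
    """Hypothetical minimal interface: only the availability probe."""
    name: str

    @abstractmethod
    async def is_available(self) -> bool: ...


class FakeBackend(Backend):
    """Test double whose availability is fixed at construction time."""

    def __init__(self, name: str, available: bool) -> None:
        self.name = name
        self._available = available

    async def is_available(self) -> bool:
        return self._available


async def select_backend(candidates: Sequence[Backend]) -> Optional[Backend]:
    """Return the first available backend in list order, or None if none is usable."""
    for backend in candidates:
        if await backend.is_available():
            return backend
    return None


candidates = [
    FakeBackend("podman", False),
    FakeBackend("docker", True),
    FakeBackend("nsjail", True),
]
chosen = asyncio.run(select_backend(candidates))
```

Because callers only see the returned object, adding a fourth backend would mean appending one more candidate to the list, which matches the extension story given in the Q&A.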

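6. MCP Box-stdio mode

When RuntimeMCPSession in LangBot detects a stdio MCP Server and Box is available, it runs the following chain:

Create a Session with BoxService.create_session()
Install dependencies automatically based on pyproject.toml / requirements.txt
Rewrite host paths to /workspace/... inside the container
Start the MCP process with start_managed_process()
Connect to the process's stdin/stdout through the WebSocket Relay exposed by Box Runtime
Let LangBot's internal MCP Client complete protocol initialization and tool discovery

MCP protocol semantics stay on the LangBot side; the Box Runtime in the SDK is only responsible for "running a managed process safely and providing the ability to attach to it".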
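
The path-rewriting step in the chain above can be illustrated with a small sketch. It assumes POSIX host paths and a single mounted project root; `rewrite_arg` is a hypothetical helper written for this example and is not the function used in mcp.py.

```python
from pathlib import PurePosixPath

WORKSPACE = PurePosixPath("/workspace")  # fixed mount point inside the sandbox


def rewrite_arg(arg: str, host_root: str) -> str:
    """Map an absolute path under host_root to /workspace/...; leave anything else alone."""
    if not arg.startswith("/"):
        return arg  # flags and relative paths are passed through unchanged
    try:
        relative = PurePosixPath(arg).relative_to(PurePosixPath(host_root))
    except ValueError:
        return arg  # absolute path outside the mounted project root
    return str(WORKSPACE / relative)


host_root = "/srv/mcp/my-server"  # illustrative host project root
args = ["/srv/mcp/my-server/server.py", "--verbose", "/opt/other/data.json"]
rewritten = [rewrite_arg(a, host_root) for a in args]
```

Only the first argument lies under the mounted root, so only it is rewritten; paths outside the mount are left untouched rather than guessed at.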

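7. Host path mounting

Box mounts a host directory at the fixed path /workspace inside the sandbox:

sandbox_exec: defaults to box.default_host_workspace in config.yaml
MCP Box-stdio: LangBot infers the project root from the MCP command/args, or uses box.host_path from the MCP configuration

In a Docker deployment, the LangBot container mounts a host directory (for example ./data/box:/workspaces). The Box Runtime child process runs in the same container, accesses that mounted directory directly, and creates the actual container mounts from it. LangBot is responsible for validating paths against the allowlist.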

Core interfaces

LangBot: `BoxService`

class BoxService:
    available: bool

    async def execute_sandbox_tool(
        self,
        parameters: dict,
        query: Query,
    ) -> dict: ...

    async def execute_skill_tool(
        self,
        skill_data: dict,
        tool_def: dict,
        parameters: dict,
        query: Query,
    ) -> dict: ...

    async def create_session(
        self,
        spec_payload: dict,
        skip_host_mount_validation: bool = False,
    ) -> dict: ...

    async def start_managed_process(
        self,
        session_id: str,
        process_payload: dict,
    ) -> dict: ...

    def get_managed_process_websocket_url(
        self,
        session_id: str,
    ) -> str: ...

SDK: `BoxSpec`

class BoxSpec(pydantic.BaseModel):
    cmd: str = ''
    workdir: str = '/workspace'
    timeout_sec: int = 30
    network: BoxNetworkMode = OFF
    session_id: str
    env: dict[str, str] = {}
    image: str = 'python:3.11-slim'
    host_path: str | None = None
    host_path_mode: BoxHostMountMode = RW
    cpus: float = 1.0
    memory_mb: int = 512
    pids_limit: int = 128
    read_only_rootfs: bool = True

SDK: `BaseSandboxBackend`

class BaseSandboxBackend(ABC):
    name: str

    async def is_available(self) -> bool: ...
    async def start_session(self, spec: BoxSpec) -> BoxSessionInfo: ...
    async def exec(self, session: BoxSessionInfo, spec: BoxSpec) -> BoxExecutionResult: ...
    async def stop_session(self, session: BoxSessionInfo) -> None: ...
    async def start_managed_process(self, session, spec) -> asyncio.subprocess.Process: ...
    async def cleanup_orphaned_containers(self, instance_id: str) -> None: ...

Communication

Action RPC

Box reuses the Action RPC / Connection / Handler infrastructure in langbot_plugin.runtime.io. The actions currently exposed by Box Runtime are:

Action	Meaning
`box_health`	Health check
`box_status`	Get runtime status
`box_exec`	Execute a command in a Session
`box_create_session`	Create a Session
`box_get_session`	Get a single Session
`box_get_sessions`	Get all Sessions
`box_delete_session`	Delete a Session
`box_start_managed_process`	Start a managed process
`box_get_managed_process`	Get managed process status
`box_get_backend_info`	Get information about the current Backend
`box_shutdown`	Shut down the Runtime gracefully

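Transport modes

Mode	Scenario	Implementation
stdio (default)	Local development, Docker deployment	LangBot starts a `langbot_plugin.box.server` child process and communicates over stdio
WebSocket	Remote deployment with `runtime_url` set explicitly	LangBot connects to `ws://<remote-host>:5411`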

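WebSocket Relay

Box Runtime also starts a lightweight aiohttp service on :5410 through which clients attach to MCP managed processes:

GET /v1/sessions/{session_id}/managed-process/ws

This endpoint bridges WebSocket text messages to the managed process's stdin/stdout.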
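
As a rough illustration of how a client might derive the attach URL for this endpoint, the sketch below maps an HTTP relay base URL to its WebSocket form and appends the route shown above. `managed_process_ws_url` is a hypothetical helper for this example; it is not claimed to match `BoxService.get_managed_process_websocket_url()`.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def managed_process_ws_url(relay_base: str, session_id: str) -> str:
    """Build the attach URL for a session's managed process from the relay base URL."""
    parts = urlsplit(relay_base)
    # http -> ws, https -> wss; ws/wss inputs are kept as they are.
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme, parts.scheme)
    # Percent-encode the session id so it cannot inject extra path segments.
    path = f"/v1/sessions/{quote(session_id, safe='')}/managed-process/ws"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


url = managed_process_ws_url("http://127.0.0.1:5410", "mcp-1234")
```

With the local relay address used elsewhere in this description, the result is a `ws://127.0.0.1:5410/...` URL that a WebSocket client can connect to.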

Deployment

Local development

No extra service orchestration is needed. LangBot starts a local Box Runtime child process automatically.

box:
  profile: 'default'
  default_host_workspace: './data/box-workspaces/default'
  allowed_host_mount_roots:
    - './data/box-workspaces'
    - '/tmp'

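The host needs at least one available backend: Podman, Docker, or nsjail.

Docker Compose

Box Runtime runs as a child process inside the LangBot container, so no separate container is needed. The LangBot container must mount the container runtime socket: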

services:
  langbot:
    image: rockchin/langbot:latest
    volumes:
      - ./data:/app/data
      - ./data/box:/workspaces
      # Mount container runtime socket for Box sandbox (Docker backend).
      - /var/run/docker.sock:/var/run/docker.sock

On startup LangBot launches the Box Runtime child process, communicates with it over stdio, and reaches the managed-process relay at http://127.0.0.1:5410.

Remote deployment (optional)

To deploy Box Runtime on a separate host, set runtime_url in config.yaml:

box:
  runtime_url: 'http://remote-box-host:5410'

LangBot then connects to the remote Runtime over WebSocket and no longer starts a local child process.

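Security model

Blocked mount paths: /etc, /proc, /sys, /dev, /root, /boot, the container runtime sockets, and similar paths are blocked by a hard-coded list. On Windows, system paths such as C:\Windows and C:\Program Files are blocked as well.
Allowlist of mount roots: only paths under allowed_host_mount_roots may be mounted at /workspace.
Profile locking: security-critical fields can be locked by the administrator and cannot be overridden from the model side.
Resource limits: CPU, memory, and PID limits are enforced by the Backend at the container / nsjail level.
Read-only root filesystem: enabled by default in the container Backends; the nsjail Backend is likewise built around read-only system mounts.
Output truncation: raw stdout and stderr are each capped at 1MB so that high-throughput commands cannot exhaust memory.
Session TTL: idle Sessions are reclaimed automatically after 300 seconds by default, but a Session with a running managed process is never reclaimed.
Orphan cleanup: on startup, the container Backend removes langbot.box=true containers left behind by a previous instance.
Windows support: Windows is supported through Docker Desktop (Docker backend only; Podman and nsjail are Linux-only).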
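
The first two rules in this list (hard-coded blocked paths, then an allowlist of mount roots) combine into a two-stage check. The sketch below is an assumption about how such a check can be written, not the code in security.py: `is_mount_allowed` and `_is_under` are hypothetical names, and the blocked list is only the subset of prefixes named above.

```python
from pathlib import Path

# Subset of the blocked prefixes listed above; the real list also covers runtime sockets.
BLOCKED_PREFIXES = ("/etc", "/proc", "/sys", "/dev", "/root", "/boot")


def _is_under(path: Path, root: Path) -> bool:
    """True if path equals root or lies somewhere beneath it."""
    return path == root or root in path.parents


def is_mount_allowed(host_path: str, allowed_roots: list[str]) -> bool:
    """Reject blocked prefixes first, then require the path to sit under an allowed root."""
    resolved = Path(host_path).resolve()  # collapses ".." and follows symlinks
    if any(_is_under(resolved, Path(p).resolve()) for p in BLOCKED_PREFIXES):
        return False
    return any(_is_under(resolved, Path(r).resolve()) for r in allowed_roots)


ok = is_mount_allowed("/tmp/box/job-1", ["/tmp"])
escaped = is_mount_allowed("/tmp/../etc", ["/tmp", "/"])
```

Resolving the path before comparing matters: `/tmp/../etc` looks like it starts under `/tmp`, but it resolves to the blocked system directory, so it is rejected even when `/` is on the allowlist.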

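How Skills / plugins integrate

1. Through `sandbox_exec`

The simplest integration is still to put sandbox_exec in the model's tool list and let the model call it when needed.

2. Calling `BoxService` directly

Suitable when a plugin, Skill, or internal platform logic needs to run a fixed, known command: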

result = await ap.box_service.execute_sandbox_tool(
    parameters={'cmd': 'python3 -c "print(42)"', 'timeout_sec': 10},
    query=query,
)

3. MCP Server in Box

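When Box is available, a stdio MCP Server runs inside the sandbox automatically, and the box field can override the image, network, mount mode, startup timeout, and other parameters: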

{
  "name": "my-mcp-server",
  "mode": "stdio",
  "command": "python",
  "args": ["server.py"],
  "box": {
    "image": "node:20",
    "network": "on",
    "host_path_mode": "ro",
    "startup_timeout_sec": 180
  }
}

File structure

LangBot main repository

src/langbot/pkg/box/
├── __init__.py
├── connector.py        # BoxRuntimeConnector, selects the stdio / ws connection
└── service.py          # BoxService, Profile / security policy / public API

src/langbot/pkg/provider/tools/loaders/
├── native.py           # sandbox_exec tool definition
└── mcp.py              # MCP Box-stdio integration

src/langbot/pkg/api/http/controller/groups/
└── box.py              # /api/v1/box/status /sessions /errors

`langbot-plugin-sdk`

src/langbot_plugin/box/
├── __init__.py
├── __main__.py
├── actions.py          # Box Action RPC action enum
├── backend.py          # BaseSandboxBackend + Docker / Podman Backend
├── client.py           # BoxRuntimeClient / ActionRPCBoxClient
├── errors.py           # Box error types
├── models.py           # BoxSpec / BoxProfile / BoxSessionInfo, etc.
├── nsjail_backend.py   # nsjail Backend
├── runtime.py          # BoxRuntime, Session / managed process lifecycle
├── security.py         # Host path and security validation
└── server.py           # Standalone Box Server + WebSocket Relay

Deployment and tests

LangBot/docker/docker-compose.yaml                       # Container orchestration (Box Runtime embedded in the LangBot container)
LangBot/src/langbot/templates/config.yaml               # box config section

LangBot/tests/unit_tests/box/                           # BoxService / Connector unit tests
LangBot/tests/unit_tests/provider/test_mcp_box_integration.py
LangBot/tests/integration_tests/box/                    # End-to-end integration tests
langbot-plugin-sdk/tests/box/test_nsjail_backend.py     # nsjail Backend unit tests

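Test coverage

LangBot unit tests: cover BoxService, BoxRuntimeConnector, sandbox_exec integration, MCP Box configuration, path rewriting, and related logic.
LangBot integration tests: cover end-to-end execution, Session persistence, timeouts, network isolation, managed process lifecycle, and MCP Server in Box.
SDK unit tests: cover probing, execution, Session cleanup, and isolation behavior of the nsjail Backend.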

Q&A

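Q: Is the Profile global? Which parameters can the model override?

Yes. It is a global setting that comes from box.profile in config.yaml. Unlocked fields can be overridden by the model; locked fields always fall back to the Profile value.

Q: Why doesn't the MCP Server use a Profile?

Because an MCP Server is a trusted process that the administrator configures explicitly, and its needs differ from those of LLM-generated code. By default it needs more permissive settings, such as network access to install dependencies, so it uses the separate MCPServerBoxConfig.

Q: Can the Session TTL clean up an MCP Server too early?

No. As long as the Session still has a running managed process, the TTL reclamation logic skips it.

Q: What if neither Docker nor Podman is installed?

The Runtime probes for an available Backend in the order Podman -> Docker -> nsjail. If none of the three is available, BoxService.available = False, sandbox_exec is not exposed to the model, and stdio MCP falls back to running directly on the host.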

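Q: What is the current state of `nsjail`?

It is wired into the current code path and is no longer just planned. It is one of BoxRuntime's official candidate Backends; whether a given deployment actually selects it depends on whether nsjail is installed and usable on the host.

Q: How do I add a new Backend?

Implement the BaseSandboxBackend interface and add it to the BoxRuntime.backends probe list. The LangBot integration layer, the Action RPC protocol, and the tool definitions need no changes.

Q: Why doesn't Box Runtime need its own container?

The Box Runtime process is purely a scheduler: it creates and manages sandboxes through the docker socket or the nsjail command, does not execute any user code, and does not operate on the filesystem directly. Unlike Plugin Runtime (where plugins operate on the filesystem directly, install dependencies, and run third-party code), Box Runtime has no isolation requirement. Running it as a child process inside the LangBot container is simpler and avoids cross-container path mapping and network hops.

Q: What is the state of Windows support?

Windows supports only the Docker backend (through Docker Desktop). Podman and nsjail depend on Linux kernel features (namespaces, cgroups, and so on) and can be used only on Linux.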

codecov · 2026-03-22T03:41:21Z

Codecov Report

❌ Patch coverage is 60.97712% with 631 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/langbot/pkg/provider/tools/loaders/mcp.py	47.30%	127 Missing ⚠️
src/langbot/pkg/box/backend.py	35.35%	117 Missing ⚠️
src/langbot/pkg/box/server.py	40.90%	104 Missing ⚠️
src/langbot/pkg/box/runtime.py	77.59%	54 Missing ⚠️
src/langbot/pkg/box/connector.py	48.38%	48 Missing ⚠️
src/langbot/pkg/box/service.py	83.78%	36 Missing ⚠️
src/langbot/pkg/box/client.py	67.41%	29 Missing ⚠️
src/langbot/pkg/utils/managed_runtime.py	39.02%	25 Missing ⚠️
src/langbot/pkg/provider/tools/loaders/native.py	40.54%	22 Missing ⚠️
src/langbot/pkg/pipeline/process/handler.py	11.11%	16 Missing ⚠️
... and 9 more


@@ -1,5 +1,5 @@
 import React, { Suspense } from 'react';
-import { createBrowserRouter, Navigate } from 'react-router-dom';
+import { createBrowserRouter, Navigate, Outlet } from 'react-router-dom';


+
+        if not is_stream:
+            yield final_msg
+            initial_response_emitted = True


+            initial_response_emitted = True
+        elif not initial_response_emitted:
+            yield final_msg
+            initial_response_emitted = True


+    current = Path(__file__).resolve()
+    for parent in current.parents:
+        if (parent / 'pyproject.toml').exists() and (parent / 'main.py').exists():
+            _source_root = parent


+            _source_root = parent
+            return parent
+
+    _source_root = None


+            return
+        result = reload_skills()
+        if inspect.isawaitable(result):
+            await result


 import langbot_plugin.api.entities.builtin.resource.tool as resource_tool
 import langbot_plugin.api.entities.builtin.provider.message as provider_message
 from ....entity.persistence import mcp as persistence_mcp
+from .mcp_stdio import BoxStdioSessionRuntime, MCPServerBoxConfig, MCPSessionErrorPhase


…uncation
- Implement head+tail output truncation (60/40 split) so LLM sees both beginning and final results; add streaming byte-limited reads in backend to prevent unbounded memory usage (_MAX_RAW_OUTPUT_BYTES = 1MB)
- Define BoxProfile model with locked fields and max_timeout_sec clamping
- Add four built-in profiles: default, offline_readonly, network_basic, network_extended with differentiated resource and security constraints
- Add resource limit fields to BoxSpec (cpus, memory_mb, pids_limit, read_only_rootfs) and pass corresponding container CLI flags (--cpus, --memory, --pids-limit, --read-only, --tmpfs)
- Profile loaded from config (box.profile), applied in service layer before BoxSpec validation; locked fields cannot be overridden by tool-call parameters
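
The head+tail truncation with a 60/40 split mentioned in this commit can be sketched as follows. This is an illustrative version that counts characters rather than bytes; `truncate_head_tail` and the wording of the omission marker are assumptions, not the code in the service layer.

```python
def truncate_head_tail(text: str, limit: int, head_percent: int = 60) -> str:
    """Keep the first 60% and last 40% of the budget, dropping the middle."""
    if len(text) <= limit:
        return text  # short output is returned untouched
    head_len = limit * head_percent // 100
    tail_len = limit - head_len
    omitted = len(text) - head_len - tail_len
    marker = f"\n... [{omitted} characters omitted] ...\n"
    tail = text[len(text) - tail_len:] if tail_len else ""
    return text[:head_len] + marker + tail


noisy = "H" * 60 + "x" * 500 + "T" * 40
clipped = truncate_head_tail(noisy, limit=100)
```

Keeping both ends is the point of the design: the head usually shows how a command started, and the tail carries the final result or the error that ended it.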

management

After the architecture settled on always using an independent Box Runtime service, several pieces of compatibility code and design shortcuts were left behind. This commit cleans them up:
- Remove `LocalBoxRuntimeClient` and `create_box_runtime_client` from production code (moved to test-only helper).
- Remove unused `_clip_bytes` method from backend.
- Remove `__langbot_session_placeholder__` hack by making `BoxSpec.cmd` default to empty and validating non-empty only in `runtime.execute()`.
- Extract `get_box_config()` helper to eliminate 5× duplicated config access boilerplate.
- Remove `session_id`/`host_path`/`host_path_mode` from the LLM-facing tool schema to enforce request-scoped session isolation.
- Fix dual shutdown path: `NativeToolLoader.shutdown()` no longer calls `box_service.shutdown()` (handled by `Application.dispose()`).
- Simplify `_assert_session_compatible` with a loop.
- Inline client creation in `BoxRuntimeConnector`.
- Remove redundant `BOX__RUNTIME_URL` env var from docker-compose (auto-detected by code).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… dep install, security

## Summary

When Podman/Docker is available, all stdio-mode MCP servers now automatically run inside Box containers with dependency installation, path rewriting, and lifecycle management. When no container runtime exists, LangBot starts normally and stdio MCP falls back to host-direct execution.

## What changed

### MCP stdio → Box integration (mcp.py)
- Add `MCPServerBoxConfig` pydantic model for structured box configuration with validation and defaults (network, host_path_mode, timeouts, resources)
- Auto-infer `host_path` from command/args with venv detection: recognizes `.venv/bin/python` patterns and walks up to the project root
- Rewrite host paths to container `/workspace` paths transparently
- Replace venv python commands with container-native `python`
- Auto-detect `pyproject.toml`/`setup.py`/`requirements.txt` and run `pip install` inside the container before starting the MCP server
- Copy project to `/tmp` before install to handle read-only mounts
- Add retry with exponential backoff (3 retries, 2s/4s/8s delays)
- Add Box managed process health monitoring (poll every 5s)
- Fix session leak: `_cleanup_box_stdio_session()` now runs in `finally` block of `_lifecycle_loop`, covering all exit paths
- Fix retry logic: `_ready_event` is only set after all retries exhaust or on success, not on first failure
- Enhance `get_runtime_info_dict()` with `box_session_id` and `box_enabled`

### Box security (security.py, new)
- `validate_sandbox_security()` blocks dangerous host paths: `/etc`, `/proc`, `/sys`, `/dev`, `/root`, `/boot`, `/run`, docker.sock, podman socket
- Called at the start of `CLISandboxBackend.start_session()`

### Box models (models.py)
- Add `BoxHostMountMode.NONE`: skips volume mount entirely
- Adjust `validate_host_mount_consistency` to allow arbitrary workdir when `host_path_mode=NONE`

### Box backend (backend.py)
- Add `validate_sandbox_security()` call in `start_session()`
- Add `langbot.box.config_hash` label on containers for drift detection
- Handle `BoxHostMountMode.NONE`: skip `-v` mount arg
- Add `cleanup_orphaned_containers()` to base class (no-op default) and CLI implementation (single batched `rm -f` command)

### Box runtime (runtime.py)
- Call `cleanup_orphaned_containers()` during `initialize()` to remove lingering containers from previous runs

### Box service (service.py)
- Graceful degradation: `initialize()` catches runtime errors and sets `available=False` instead of crashing LangBot startup
- Add `available` property and guard on `execute_sandbox_tool()`
- Add `skip_host_mount_validation` parameter to `build_spec()` and `create_session()`: MCP paths are admin-configured and trusted, bypassing `allowed_host_mount_roots` restrictions meant for LLM-generated sandbox_exec commands

### Default behavior
- stdio MCP servers automatically use Box when `box_service.available` is True (Podman/Docker detected); no explicit `box` config needed
- When no container runtime exists, falls back to host-direct stdio
- MCP Box defaults: `network=on` (for pip install), `read_only_rootfs=false` (for site-packages), `host_path_mode=ro`, `startup_timeout=120s`

### Tests
- `test_box_security.py`: blocked paths, safe paths, subpath rejection
- `test_mcp_box_integration.py`: config model, path rewriting, venv unwrap, host_path inference, payload building, runtime info, box availability check
- `test_box_service.py`: `BoxHostMountMode.NONE` validation tests

…ession API, and integration tests

## Changes

### Precise orphan container cleanup
- Runtime generates a unique instance_id on startup
- Every container gets a `langbot.box.instance_id` label
- `cleanup_orphaned_containers()` only removes containers from previous instances, preserving containers owned by the current one
- Containers from older versions (no label) are also cleaned up
- `cleanup_orphaned_containers` added to `BaseSandboxBackend` as a no-op default method, removing hasattr duck-typing

### Fine-grained MCP error classification
- New `MCPSessionErrorPhase` enum with 7 phases: session_create, dep_install, process_start, relay_connect, mcp_init, runtime, tool_call
- Each phase in `_init_box_stdio_server()` sets the error phase before re-raising, enabling precise failure diagnosis
- `retry_count` tracked across retry attempts
- `get_runtime_info_dict()` exposes `error_phase` and `retry_count`

### GET /v1/sessions/{id} API
- `BoxRuntime.get_session()` returns session details including managed process info when present
- `handle_get_session` HTTP handler + route in server.py
- `BoxRuntimeClient.get_session()` abstract method + remote impl

### stdio defaults to Box when runtime is available
- `_uses_box_stdio()` checks `box_service.available` instead of requiring explicit `box` key in server_config
- `BoxService.initialize()` catches runtime errors gracefully, sets `available=False` instead of crashing LangBot startup
- When no container runtime exists, stdio MCP falls back to host-direct execution

### Code quality (from /simplify review)
- Extracted `_VENV_DIRS` / `_VENV_BIN_DIRS` module-level constants
- Removed dead `_box_network_mode()` method and unused `bc` variable
- Fixed broken import `from ....box.models` → `from ...box.models`
- Cached `_resolve_host_path()` result: computed once, passed through
- Config hash now includes `host_path` field
- Batched orphan cleanup into single `rm -f` command

### Session leak fix
- `_cleanup_box_stdio_session()` now runs in `_lifecycle_loop`'s finally block, covering all exit paths (normal shutdown, error, retry, final failure)

### Integration tests
- 6 end-to-end tests covering managed process lifecycle, WebSocket stdio bidirectional IO, session cleanup verification, single session query, process exit detection, and orphan cleanup safety

- Fix O(n²) stderr trimming in runtime.py with running length tracker
- Remove dead code: RESERVED_CONTAINER_PATHS, _subprocess_wait_task, unused config_hash computation, unused imports
- Deduplicate connection callback in BoxRuntimeConnector, parse URL once
- Use enum comparison instead of stringly-typed spec.network.value check
- Replace manual _result_to_dict/_session_to_dict with model_dump()
- Cache NativeToolLoader tool definition and sandbox system guidance
- Extract _is_path_under() helper to eliminate duplicated path checks
- Import SANDBOX_EXEC_TOOL_NAME from native.py instead of redefining
- Add JSON startswith guard in logging_utils to skip futile json.loads
- Fix ruff lint errors (F401 unused imports, F841 unused variables)

- Move sandbox system-prompt guidance from LocalAgentRunner into BoxService.get_system_guidance() so all box domain knowledge stays in the box module.
- Remove standalone logging_utils.py; merge format_result_log() into MessageHandler base class alongside cut_str().
- Strip sandbox-specific JSON parsing from log formatting; tool results now use generic truncation.
- Revert TYPE_CHECKING changes in stage.py and runner.py that were unrelated to this feature.
- Skip two test files affected by a pre-existing circular import (runner ↔ app) until the import cycle is resolved in a separate PR.

Replace the per-message session_id with a template-based system configurable per pipeline via 'Sandbox Scope' in the local-agent panel. Default scope is per-chat ({launcher_type}_{launcher_id}).

Unify skill exec into the same container as default exec: skills are mounted at /workspace/.skills/{name}/ via extra_mounts instead of getting separate containers. All pipeline-bound skills are injected at container creation time.
- Add box-session-id-template to pipeline metadata (select, 4 options, 8 languages)
- Add resolve_box_session_id() and build_skill_extra_mounts() to BoxService
- Rewrite native.py skill exec path to use execute_tool with shared session
- Update tests for new session_id format
- Add design doc: docs/review/box-session-scope.md
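
One plausible way to resolve such a template is plain `str.format_map` over the query variables. The sketch below assumes exactly that; `resolve_session_id` is a hypothetical stand-in for `resolve_box_session_id()`, and the variable values are invented for illustration. The constant `{global}` variable mirrors the "Global" scope option described in a later commit.

```python
def resolve_session_id(template: str, variables: dict[str, str]) -> str:
    """Fill a scope template such as '{launcher_type}_{launcher_id}' from query variables."""
    # '{global}' is a constant variable that always resolves to 'global'.
    values = {"global": "global", **variables}
    return template.format_map(values)  # raises KeyError on an unknown variable


# Illustrative variable values; real ones come from the incoming query.
per_chat = resolve_session_id(
    "{launcher_type}_{launcher_id}",
    {"launcher_type": "group", "launcher_id": "12345"},
)
shared = resolve_session_id("{global}", {})
```

Two messages from the same chat resolve to the same session id and therefore share one sandbox, while the `{global}` template collapses every chat onto a single session.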

Display sandbox count and a detailed list of active sessions including session ID, image, backend, resources (CPU/memory), network mode, and last used time. Fetched from GET /api/v1/box/sessions in parallel. Includes i18n for all 8 supported languages.

Log Box runtime initialization result (success with profile info, or failure warning). Log native tool availability status at ToolManager startup so it's immediately clear whether exec/read/write/edit tools are registered for the LLM.

Add 'image' field to box config section. When set, it overrides the profile default image (python:3.11-slim) for all sandbox containers. Priority: caller-specified > config.yaml image > profile default.
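
The stated priority (caller-specified > config.yaml image > profile default) amounts to taking the first value that is set. A minimal sketch, with `resolve_image` as a hypothetical helper name:

```python
from typing import Optional

PROFILE_DEFAULT_IMAGE = "python:3.11-slim"  # profile default named in the commit message


def resolve_image(caller_image: Optional[str],
                  config_image: Optional[str],
                  profile_image: str = PROFILE_DEFAULT_IMAGE) -> str:
    """Return the first non-empty image in priority order: caller, config.yaml, profile."""
    return caller_image or config_image or profile_image


image = resolve_image(None, "python:3.12-slim")
```

Using `or` also treats an empty string in config.yaml as "not set", which is usually what an operator expects from a blank field.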

Add 20-second heartbeat ping loop to detect silent Box runtime disconnections. On disconnect, set available=false and attempt reconnection after 3 seconds via the disconnect callback chain.
- BoxRuntimeConnector: heartbeat loop, disconnect callback parameter, disconnect detection in connection callback and WS failure handler
- BoxService: wire disconnect callback to toggle available state and re-initialize the connector on reconnection

…pover Add SystemStatusCards component to the monitoring dashboard showing Plugin Runtime and Box Runtime connection status with details (backend, profile, sandbox count). Remove all Box/session status from the plugin page debug popover — it now only shows debug URL and key. Includes i18n for all 8 supported languages.

…rics Replace the separate two-card row with a single compact 'System Status' card placed as the 5th column in the metrics grid. Shows green/red dots for Plugin Runtime and Box Runtime. Click to expand a popover with connection details (backend, profile, sandbox count).

Record Box connector error in BoxService and expose it as 'connector_error' in GET /api/v1/box/status when unavailable. Display error messages in the dashboard System Status popover for both Plugin Runtime (plugin_connector_error) and Box Runtime (connector_error) when they are disconnected.

…al time Poll Plugin Runtime and Box Runtime status every 30 seconds so the dashboard reflects disconnections without a manual page refresh. Also re-fetch when the popover is opened for immediate feedback.

When the Box runtime disconnects, there is a race between the heartbeat flipping _available=false and the frontend polling get_status(). If the poll arrives first, client.get_status() throws a ConnectionClosedError which propagated as a 500, causing the frontend to show a grey dot (null status) instead of a red dot with error details. Now get_status() catches RPC errors and returns available=false with the exception message as connector_error. get_sessions() returns an empty list when unavailable or on RPC failure.

The previous disconnect handler only retried once and then gave up. Now spawns a background task that retries with exponential backoff (3s, 6s, 12s, ... up to 60s) until the Box runtime is reachable again. Uses a _reconnecting guard to prevent duplicate loops. Calls connector.dispose() before each retry to clean up stale tasks.
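
The retry schedule described here (3s, 6s, 12s, ... capped at 60s, repeated until the runtime is reachable) can be sketched as a delay generator plus a loop. This is an illustration under assumptions: `reconnect_delays` and `reconnect_until_ok` are hypothetical names, the `connect` callable stands in for re-initializing the connector, and the `_reconnecting` guard and `connector.dispose()` call from the commit are left out.

```python
import asyncio
from itertools import islice
from typing import Awaitable, Callable, Iterator


def reconnect_delays(initial: float = 3.0, factor: float = 2.0,
                     cap: float = 60.0) -> Iterator[float]:
    """Yield 3, 6, 12, 24, 48, 60, 60, ... seconds, never exceeding the cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


async def reconnect_until_ok(
    connect: Callable[[], Awaitable[None]],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> int:
    """Wait, try to connect, and repeat with growing delays; return the attempt count."""
    for attempt, delay in enumerate(reconnect_delays(), start=1):
        await sleep(delay)
        try:
            await connect()
            return attempt
        except ConnectionError:
            continue  # still unreachable, back off further
    raise AssertionError("unreachable: the delay generator is infinite")


first_delays = list(islice(reconnect_delays(), 7))
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; production code would simply use the `asyncio.sleep` default.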

The generic Handler.run() catches ConnectionClosedError and breaks out of its loop (normal return) instead of raising, because it has no disconnect_callback. The old code only triggered reconnection in the except branch, so a clean WebSocket close was never detected. Now treat handler.run() returning normally (after successful handshake) as a disconnect event, triggering the reconnection callback.

Pass a refreshKey prop through OverviewCards to SystemStatusCard that increments on each Refresh Data click, triggering a re-fetch of Plugin and Box runtime status alongside the monitoring data refresh.

fetchStatus(showLoading=false) never called setLoading(false), so the initial loading=true was never cleared. Simplify to always setLoading in the finally block — the spinner only shows on the very first load since subsequent fetches complete near-instantly.

…e config DynamicFormComponent's select renderer uses option.name as the value and key, but the YAML had 'value' fields. This caused the dropdown to render blank labels for all options.

Fetch box sessions alongside status and display each active sandbox in the popover with session ID, image, resources (CPU/memory), and last used time.

Add a 'Global (shared by all)' option to the sandbox scope selector. Uses a constant '{global}' template variable that always resolves to 'global', so all users and chats share one sandbox container.

Replace the dropdown popover with a proper Dialog for runtime status details. Add a small info button on the System Status card that opens the dialog. Session details now show in a spacious 2-column grid layout with full image name, backend, CPU/memory, network, mount path, and created/last-used timestamps.

Use max-w-2xl (matching other dialogs) instead of max-w-lg. Move overflow-y-auto to an inner container with overflow-hidden on DialogContent to prevent padding bleed at scroll edges.

Wrap session_id, image, and mount path fields with Tooltip components so hovering over truncated text shows the full value.

…opovers
1. Fix provider type select showing blank when editing: await loadRequesters() before loadProvider() to ensure options are populated before setting the selected value.
2. Split 'Add Model' into two separate entries: a '+ Add Model' button for manual add and a Radar icon button for scan. Each opens its own popover with only one layer of tabs (model type for manual, no tabs for scan since types are auto-detected).
3. Fix popover position: side='bottom' instead of 'left'.
4. Fix popover scroll: model type tabs stay fixed at top, content area scrolls independently when it overflows.
5. Scan mode now fetches all model types at once (no modelType filter), and routes each scanned model to the correct API based on its own type field.

…tips When the scan popover opens, automatically trigger scanning. Remove the manual 'Scan Models' button and hint text since they are no longer needed. Show a spinner while scanning, and the 'Add Selected' button only appears after scan completes and models are selected.

Remove 'Scanned Models' label and reduce top spacing in scan popover. Add a refresh icon button next to 'Add Selected' to re-trigger scanning without closing and reopening the popover.

…te_app
- test_box_mcp_integration: import create_app instead of create_ws_relay_app
- test_box_integration: add query.variables for template session_id
- test_skill_tools: mock box_service.execute_tool instead of execute_spec_payload since skill exec now goes through the unified execute_tool path

Display remaining time before each sandbox is cleaned up, calculated from (session_ttl_sec - elapsed since last_used_at). Shows amber text when under 60 seconds remaining. TTL is sourced from the Box runtime status API. Includes i18n for all 8 supported languages.

huanghuoguoguo mentioned this pull request Mar 22, 2026

feat(box): add box runtime package and lbp box CLI command langbot-app/langbot-plugin-sdk#44

Open

RockChinQ force-pushed the master branch from 2947b25 to 4d6f109 Compare March 25, 2026 13:11

huanghuoguoguo force-pushed the feat/sandbox branch 2 times, most recently from 6007b79 to 726da24 Compare March 28, 2026 01:11

huanghuoguoguo force-pushed the feat/sandbox branch from 726da24 to 1f958e8 Compare April 8, 2026 01:52

RockChinQ force-pushed the feat/sandbox branch from f10891b to e49a1b7 Compare April 14, 2026 14:51

github-code-quality bot found potential problems Apr 17, 2026

Comment thread web/src/router.tsx

@@ -1,5 +1,5 @@
 import React, { Suspense } from 'react';
-import { createBrowserRouter, Navigate } from 'react-router-dom';
+import { createBrowserRouter, Navigate, Outlet } from 'react-router-dom';

github-code-quality bot found potential problems Apr 17, 2026

RockChinQ force-pushed the feat/sandbox branch from 6cbdc4a to 4201647 Compare April 17, 2026 12:28

huanghuoguoguo and others added 16 commits April 19, 2026 20:20

feat(box): add sandbox_exec tool loop for local-agent calculations

938a155

feat(box): add host workspace mounting and sandbox_exec guidance

c0a43a8

feat(box): add obs

94f8148

refactor(box): unify box service lifecycle and local runtime management

64e312c

feat: add test

b64da58

fix: fix box intergration test

06ceaf3

refactor: use rpc

70b114f

fix: import

3b5647b

fix: ruff

64dd19f

fix: ruff

a07923b

RockChinQ added 20 commits April 19, 2026 20:21

feat(box): support custom sandbox container image via config.yaml

4f81d39

Add 'image' field to box config section. When set, it overrides the profile default image (python:3.11-slim) for all sandbox containers. Priority: caller-specified > config.yaml image > profile default.
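The priority order stated above (caller-specified, then config.yaml image, then profile default) amounts to a first-non-empty lookup. The sketch below is a minimal illustration under that reading; `resolve_image` and its parameter names are hypothetical and not taken from the PR, and treating an empty string as "not set" is an assumption.

```python
from typing import Optional

# Profile default named in the commit message.
PROFILE_DEFAULT_IMAGE = "python:3.11-slim"


def resolve_image(
    caller_image: Optional[str],
    config_image: Optional[str],
    profile_default: str = PROFILE_DEFAULT_IMAGE,
) -> str:
    """Pick the sandbox image: caller-specified > config.yaml image > profile default.

    Empty strings and None are both treated as unset (an assumption).
    """
    for candidate in (caller_image, config_image):
        if candidate:
            return candidate
    return profile_default


if __name__ == "__main__":
    print(resolve_image(None, None))
    print(resolve_image(None, "my-registry/sandbox:latest"))
    print(resolve_image("node:20-slim", "my-registry/sandbox:latest"))
```

The image names other than python:3.11-slim are placeholders chosen for the example.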

fix(web): auto-refresh system status and show disconnect errors in re…

e30e2b2

…al time Poll Plugin Runtime and Box Runtime status every 30 seconds so the dashboard reflects disconnections without a manual page refresh. Also re-fetch when the popover is opened for immediate feedback.

fix(web): refresh system status card when clicking Refresh Data button

bd436b7

Pass a refreshKey prop through OverviewCards to SystemStatusCard that increments on each Refresh Data click, triggering a re-fetch of Plugin and Box runtime status alongside the monitoring data refresh.

fix: use 'name' instead of 'value' for select options in sandbox scop…

c35e4d5

…e config DynamicFormComponent's select renderer uses option.name as the value and key, but the YAML had 'value' fields. This caused the dropdown to render blank labels for all options.

feat(web): show active sandbox details in dashboard Box status popover

69f0f2b

Fetch box sessions alongside status and display each active sandbox in the popover with session ID, image, resources (CPU/memory), and last used time.

feat(box): add global sandbox scope option

8f47027

Add a 'Global (shared by all)' option to the sandbox scope selector. Uses a constant '{global}' template variable that always resolves to 'global', so all users and chats share one sandbox container.
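One way to picture the constant '{global}' variable is as an extra entry in the template substitution map whose value never changes. The sketch below assumes the session ID is produced by plain string templating; `resolve_session_id` and the variable names `launcher_type` and `launcher_id` are hypothetical and used only for illustration.

```python
from typing import Mapping


def resolve_session_id(template: str, variables: Mapping[str, str]) -> str:
    """Fill a session-ID template from per-query variables.

    The 'global' variable is injected as a constant, so a '{global}'
    template yields the same session ID for every user and chat.
    """
    values = dict(variables)
    values["global"] = "global"  # constant: always resolves to 'global'
    return template.format_map(values)


if __name__ == "__main__":
    per_chat = {"launcher_type": "group", "launcher_id": "42"}
    print(resolve_session_id("{launcher_type}_{launcher_id}", per_chat))
    print(resolve_session_id("{global}", per_chat))
```

Because every query resolves '{global}' to the same string, all of them map onto one session and therefore one sandbox container, which matches the sharing behavior the commit describes.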

fix(web): widen system status dialog and fix scroll border issue

10a2772

Use max-w-2xl (matching other dialogs) instead of max-w-lg. Move overflow-y-auto to an inner container with overflow-hidden on DialogContent to prevent padding bleed at scroll edges.

feat(web): add tooltips for truncated fields in system status dialog

fa74c75

Wrap session_id, image, and mount path fields with Tooltip components so hovering over truncated text shows the full value.

RockChinQ force-pushed the feat/sandbox branch from 0d18fae to fa74c75 Compare April 19, 2026 12:26

RockChinQ added 3 commits April 19, 2026 20:47

fix(web): tighten scan popover layout and add rescan button

c1340c2

Remove 'Scanned Models' label and reduce top spacing in scan popover. Add a refresh icon button next to 'Add Selected' to re-trigger scanning without closing and reopening the popover.

github-code-quality bot found potential problems Apr 19, 2026

Comment thread web/src/app/home/components/models-dialog/components/AddModelPopover.tsx Fixed

RockChinQ added 5 commits April 19, 2026 20:54

fix(web): reduce top spacing in scan popover

44e2c8b

fix(web): remove all top spacing from scan tab content

dc1ba9a

Implement feature X to enhance user experience and optimize performance

3bd0ce2

huanghuoguoguo wants to merge 70 commits into master from feat/sandbox

3 participants

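LangBot Box: Sandbox Execution System

Overview

Features

Architecture

Layers and Responsibilities

Core Design Decisions

1. Runtime foundation moved down into the SDK

2. Same-process architecture

3. Session reuse

4. Profile system takes effect at the LangBot layer

5. Backend abstraction and probe order

6. MCP Box-stdio mode

7. Host path mounting

Core Interfaces

LangBot: BoxService

SDK: BoxSpec

SDK: BaseSandboxBackend

Communication

Action RPC

Transport modes

WebSocket Relay

Deployment

Local development

Docker Compose

Remote deployment (optional)

Security Model

How Skills / Plugins Integrate

1. Via sandbox_exec

2. Calling BoxService directly

3. MCP Server in Box

File Structure

LangBot main repository

langbot-plugin-sdk

Deployment and Testing

Test Coverage

Q&A

Q: Are Profiles global? Which parameters can the model override?

Q: Why doesn't the MCP Server go through a Profile?

Q: Can the Session TTL clean up an MCP Server prematurely?

Q: What if Docker / Podman is not available?

Q: What is the current status of nsjail?

Q: How do I integrate a new Backend?

Q: Why doesn't the Box Runtime need its own container?

Q: What about Windows support?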
