
MCP Vault: File Synchronization Architecture

The Problem

Claude cannot read files that live on your personal server. It only has access to what you provide in the conversation (copy-paste, upload) or through remote MCP tools. The goal is to give it permanent access to a collection of markdown files without depending on any specific application (no Obsidian, no Notion).

The Solution: Three Independent Layers

┌─────────────┐      rclone       ┌──────────────┐     MCP Worker     ┌─────────┐
│   Server    │  ──────sync─────> │  Cloudflare  │  <─────tools─────  │ Claude  │
│  (files)    │  <─────sync────── │  R2 Bucket   │                    │  (MCP)  │
└─────────────┘                   └──────────────┘                    └─────────┘
     local                            storage                           access

Layer 1: Storage (Cloudflare R2)

An R2 bucket named mcp-vault. This is object storage (S3-compatible) hosted by Cloudflare. It stores markdown files as-is, with folder hierarchy preserved.

Why R2 instead of a classic server? It's accessible from the Cloudflare Worker with negligible latency (same infrastructure), has no outbound bandwidth fees, and requires zero server maintenance.

Layer 2: Synchronization (rclone on your server)

rclone is a command-line tool that synchronizes files between a local folder and remote storage (here, R2). It works like rsync, but for cloud storage.

It runs on your server as a scheduled task (cron every 5 minutes). When a markdown file changes locally, rclone pushes it to R2 on the next cycle. And vice versa: if Claude writes a file via the Worker, rclone pulls it to your server.

Your server doesn't need the Worker code, Node.js, or npm. It just needs rclone and an R2 API key.
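
For reference, rclone's R2 remote lives in ~/.config/rclone/rclone.conf. A minimal definition might look like the following sketch; the remote name r2-vault matches the rest of this document, and the account ID and keys are placeholders you create in the Cloudflare dashboard:

[r2-vault]
type = s3
provider = Cloudflare
access_key_id = <R2_ACCESS_KEY_ID>
secret_access_key = <R2_SECRET_ACCESS_KEY>
endpoint = https://<YOUR_ACCOUNT_ID>.r2.cloudflarestorage.com
acl = private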

Layer 3: MCP Worker (Cloudflare Workers)

The Worker is the code in this repository (second-brain-vault). It's a tiny serverless server that exposes MCP tools to Claude:

  • alive: verify the server is operational
  • tree: display vault directory structure (like unix tree), with configurable depth and subfolder filter
  • list_files: list files in the R2 bucket (with folder filter)
  • search_files: search files by name, extension, and/or folder (fast, no content scanning)
  • read_file: read text file content
  • write_file: write or update a file (with auto content-type and metadata)
  • delete_file: delete a file (one at a time, confirm: true required, no wildcards or entire folders)
  • get_upload_url: generate a temporary URL (60s) to upload files via curl (max 50 MB)
  • get_download_url: generate a temporary URL (60s) to download files via curl

The Worker is protected by GitHub OAuth. Only the GitHub account matching the ALLOWED_GITHUB_ID and ALLOWED_GITHUB_LOGIN environment variables is allowed; everyone else receives a 403. Claude calls it via HTTPS with a token, the Worker reads from or writes to R2, and returns the result.

The Worker code deploys from any machine (your laptop, your server, anywhere) with npm run deploy. Once deployed, it runs on Cloudflare infrastructure; nothing to host.

What Runs Where

Component            Where                                          Installation
Markdown files       Your server (~/documents/vault/ or similar)    Your files, nothing to install
rclone (sync)        Your server (cron every 5 min)                 rclone + R2 config
R2 bucket mcp-vault  Cloudflare (ENAM)                              Nothing (managed service)
MCP Worker           Cloudflare Workers                             Nothing (already deployed)
Worker source code   Your machine                                   Node.js + npm (for wrangler deploy)

Data Flow

Claude reads a file:

  1. Claude sends an MCP request to the Worker (read_file, key = notes/todo.md)
  2. Worker verifies OAuth token
  3. Worker reads the file from R2
  4. Worker returns content to Claude
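
Conceptually, steps 3 and 4 reduce to one call on the Worker's R2 binding. A minimal TypeScript sketch (illustrative, not the repo's actual code; VAULT is the binding name mentioned later in this document, and R2Bucket comes from @cloudflare/workers-types):

interface Env {
  VAULT: R2Bucket; // R2 binding declared in wrangler.jsonc
}

// Simplified read path: fetch the object from R2 and return its text,
// or a "not found" message if the key does not exist.
async function readFile(env: Env, key: string): Promise<string> {
  const object = await env.VAULT.get(key);
  if (object === null) {
    return `File not found: ${key}`;
  }
  return await object.text();
}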

Claude writes a small text file:

  1. Claude sends an MCP request to the Worker (write_file, key + content)
  2. Worker writes to R2 with correct content-type
  3. On the next rclone cycle (max 5 min), file syncs to your server
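
A sketch of step 2, assuming a simple extension-based content-type guess (the mapping below is illustrative; the actual tool may cover more types):

// Guess a content-type from the file extension; default to plain text.
function guessContentType(key: string): string {
  if (key.endsWith(".md")) return "text/markdown";
  if (key.endsWith(".json")) return "application/json";
  if (key.endsWith(".html")) return "text/html";
  return "text/plain";
}

// Simplified write path: store the content in R2 with an explicit content-type.
async function writeFile(env: { VAULT: R2Bucket }, key: string, content: string): Promise<void> {
  await env.VAULT.put(key, content, {
    httpMetadata: { contentType: guessContentType(key) },
  });
}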

Claude uploads a large or binary file:

  1. Claude calls get_upload_url with desired key
  2. Worker generates an R2 presigned URL valid for 60 seconds (max 50 MB)
  3. Claude executes curl -X PUT -T file "<url>" to send the file directly to R2
  4. On the next rclone cycle, file syncs to your server
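
A sketch of how the Worker can mint the presigned URL in step 2 with the AWS SDK v3 (the secret names match those listed in the chronological summary below; note the requestChecksumCalculation option, which is the R2 compatibility fix described there):

import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// Presign a 60-second PUT URL for a key in the mcp-vault bucket.
async function getUploadUrl(
  env: { CF_ACCOUNT_ID: string; R2_ACCESS_KEY_ID: string; R2_SECRET_ACCESS_KEY: string },
  key: string
): Promise<string> {
  const s3 = new S3Client({
    region: "auto",
    endpoint: `https://${env.CF_ACCOUNT_ID}.r2.cloudflarestorage.com`,
    credentials: {
      accessKeyId: env.R2_ACCESS_KEY_ID,
      secretAccessKey: env.R2_SECRET_ACCESS_KEY,
    },
    // Recent AWS SDK versions add a checksum header by default, which breaks
    // presigned URLs on R2; WHEN_REQUIRED avoids adding it here.
    requestChecksumCalculation: "WHEN_REQUIRED",
  });
  const command = new PutObjectCommand({ Bucket: "mcp-vault", Key: key });
  return getSignedUrl(s3, command, { expiresIn: 60 });
}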

You modify a file locally:

  1. You edit a markdown file in ~/documents/vault/ on your server
  2. rclone detects the change on next cycle and pushes to R2
  3. Claude can read the updated version via read_file

Security

  • Worker is protected by OAuth 2.1 via GitHub (dual verification: username + numeric ID)
  • Any other GitHub account is rejected with 403
  • R2 API keys (for rclone) are in ~/.config/rclone/rclone.conf on your server
  • Worker secrets (GitHub OAuth, cookie encryption, R2 keys) are in Cloudflare Secrets (not in code)
  • R2 is not publicly exposed; only the Worker accesses it via internal binding
  • File key validation: no traversal (..), no hidden files, no control characters, max 512 chars (see the sketch after this list)
  • Presigned URLs: fixed 60-second TTL (not configurable by client), upload limited to 50 MB
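
As a sketch of the key validation rules listed above (a hypothetical helper, not the repo's exact code):

// Reject keys that could escape the vault, hide files, or smuggle control characters.
function isValidKey(key: string): boolean {
  if (key.length === 0 || key.length > 512) return false;        // length limit
  const parts = key.split("/");
  if (parts.some((part) => part === "..")) return false;         // no traversal
  if (parts.some((part) => part.startsWith("."))) return false;  // no hidden files
  for (const ch of key) {
    const code = ch.codePointAt(0)!;
    if (code < 0x20 || code === 0x7f) return false;              // no control characters
  }
  return true;
}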

Reference Identifiers and URLs

Resource                Value
Worker URL              https://<your-worker-name>.<your-domain>.workers.dev
MCP URL (for Claude)    https://<your-worker-name>.<your-domain>.workers.dev/mcp
Cloudflare Account ID   <YOUR_ACCOUNT_ID>
KV Namespace ID         <YOUR_KV_NAMESPACE_ID>
R2 Bucket               mcp-vault
rclone remote           r2-vault:mcp-vault
Local folder (server)   ~/documents/vault/ or similar
GitHub OAuth callback   https://<your-worker-name>.<your-domain>.workers.dev/callback

What Has Been Done (Chronological Summary)

  1. MCP Worker + OAuth: created the second-brain-vault project with an alive tool and GitHub OAuth authentication via @cloudflare/workers-oauth-provider. Deployed and tested via MCP Inspector and Claude.

  2. Access control: added filter in github-handler.ts that verifies GitHub ID and login before authorizing access. All other accounts are rejected.

  3. R2 bucket: created mcp-vault bucket on Cloudflare. Added VAULT binding in wrangler.jsonc.

  4. Basic file tools: added list_files, read_file, write_file. Each tool accesses R2 bucket via internal binding.

  5. rclone on server: installed rclone, configured remote r2-vault with R2 API keys, initialized bidirectional sync (bisync --resync).

  6. Cron job: added cron on server that runs rclone bisync every 5 minutes. Log stored outside the sync folder.

The cron line:

*/5 * * * * /path/to/rclone bisync ~/documents/vault r2-vault:mcp-vault --create-empty-src-dirs --exclude "*.log" --exclude ".DS_Store" >> ~/documents/.vault-sync.log 2>&1

Key points:

  • Log is stored outside the sync folder to prevent it from syncing to R2
  • --exclude "*.log" prevents any .log files from syncing (additional security)
  • --exclude ".DS_Store" prevents macOS system files from syncing
  • Absolute path to rclone is necessary since cron doesn't inherit the Homebrew PATH

  7. Enhanced security: file key validation (anti-traversal, anti-hidden-files, length limit), auto content-type on write_file.

  8. Presigned URLs: added get_upload_url and get_download_url for file transfers via curl (TTL 60s, max 50 MB). Requires R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, CF_ACCOUNT_ID.

  9. Search: added search_files to find files by name, extension, and/or folder without scanning content.

  10. Directory tree: added tree tool that displays the vault structure like Unix tree. Uses R2 list() with delimiter to build the tree efficiently (a minimal sketch appears after this list). Configurable depth (default 3), subfolder filter, file sizes, and counts. Version 1.2.0.

  11. Deletion: added delete_file tool with safeguards (one file at a time, no wildcards, no folders, confirm: true required, existence check). Version 1.3.0.

  12. Fix presigned URLs: corrected two bugs in get_upload_url. (a) ContentLength: MAX_UPLOAD_SIZE in PutObjectCommand forced an exact 50 MB size instead of a maximum; removed. (b) AWS SDK v3 (3.700+) auto-adds an x-amz-checksum-crc32 header with a placeholder value that breaks presigned URLs on R2; fixed by adding requestChecksumCalculation: "WHEN_REQUIRED" to S3Client. Version 1.3.1.

  13. write_file safeguards: updated the tool description to steer Claude toward get_upload_url for content beyond 1,000 characters; a hard block in code rejects writes over 2,000 characters (margin for LLM overshoot). get_upload_url is now the default for vault writes.

  14. Git versioning: initialized git versioning on the project. .gitignore excludes node_modules/, .wrangler/, .env, .dev.vars, *.log, .DS_Store.
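
As a complement to item 10, a minimal sketch of listing one level of the hierarchy with the R2 binding's delimiter option, the building block the tree tool relies on (simplified; the real tool recurses to the configured depth and adds sizes and counts):

// List the immediate children of a prefix: subfolders come back as
// delimitedPrefixes, files as objects.
async function listLevel(env: { VAULT: R2Bucket }, prefix: string) {
  const result = await env.VAULT.list({ prefix, delimiter: "/" });
  return {
    folders: result.delimitedPrefixes,        // e.g. ["notes/projects/"]
    files: result.objects.map((o) => o.key),  // e.g. ["notes/todo.md"]
  };
}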

How to Verify Everything Works

Test 1: Worker responds

In MCP Inspector (npx @modelcontextprotocol/inspector), connect to https://<your-worker-name>.<your-domain>.workers.dev/mcp and call the alive tool. Expected result: success message.

Test 2: Write via Worker

In Inspector, call write_file with key: "test/verify.md" and content: "# Verification\nIt works.". Expected result: write confirmation.

Test 3: Read via Worker

In Inspector, call read_file with key: "test/verify.md". Expected result: the file content.

Test 4: Sync R2 to server

On your server, force a sync:

rclone bisync ~/documents/vault r2-vault:mcp-vault --create-empty-src-dirs

Then verify:

cat ~/documents/vault/test/verify.md

Expected result: "# Verification\nIt works."

Test 5: Sync server to R2

On your server, create a file:

echo "# From server" > ~/documents/vault/test/local.md
rclone bisync ~/documents/vault r2-vault:mcp-vault --create-empty-src-dirs

In Inspector, call read_file with key: "test/local.md". Expected result: "# From server"

Test 6: Cron is running

Wait 5 minutes, then check the log:

cat ~/documents/.vault-sync.log

Expected result: recent log lines (no errors).

Uninstall

To cleanly remove everything:

  • Cron: crontab -e then delete the rclone line
  • rclone: rm /path/to/rclone && rm -rf ~/.config/rclone/
  • Worker: wrangler delete from the project folder on your machine
  • R2 bucket: delete via Cloudflare dashboard (R2 > mcp-vault > Delete)
  • KV Namespace: delete via dashboard (Workers & Pages > KV > namespace > Delete)
  • R2 API key: revoke via dashboard (R2 > Manage R2 API Tokens)
  • GitHub OAuth App: delete in GitHub (Settings > Developer settings > OAuth Apps)