A RAG-Powered Conversational AI for Ayachi Nene
"メンカタカラメヤサイダブルニンニクアブラマシマシ!"
A RAG conversational AI combining FAISS semantic retrieval with pluggable LLM backends (Claude, DeepSeek, or local Ollama). Features multi-turn session memory, SSE token streaming, a modern Vue 3 immersive Galgame UI, and similarity-threshold filtering that curbs hallucinations in character reproduction.
中文文档 (Chinese Version) | Vision | Features | Quick Start | Architecture | FAQ
Traditional AI role-playing bots often suffer from two fatal flaws: "Hallucinations" (making up fake lore) and "OOC" (Out of Character responses). While conventional Fine-tuning can help, it is hardware-intensive and rarely eradicates these issues completely.
NeneBot is an attempt to bring RAG (Retrieval-Augmented Generation) architecture to Galgame character simulation:
- External Memory Engine: By slicing and vectorizing the original script of Sanoba Witch, we give the AI "true" memories.
- Authentic Reproduction: The LLM is forced to reference retrieved original dialogue, perfectly capturing Nene's gentle and shy personality.
- Pluggable LLM Backend: Swap between local Ollama and cloud APIs (Claude, DeepSeek) with a single environment variable — no code changes required.
- Ultimate Front-end Aesthetics: Ditching clunky terminal interfaces for an immersive, modern visual novel (Galgame) UI.
- Flexible LLM Backend: Use local Ollama (Qwen 2.5, Llama, etc.) for full privacy, or plug in a cloud API (Claude, DeepSeek) via a single `LLM_PROVIDER` env var for higher quality.
- Multi-Turn Memory: Per-session conversation history (sliding window) keeps Nene contextually aware across turns.
- Real-Time Token Streaming: SSE-based streaming delivers a native typewriter effect — responses appear word by word.
- Millisecond Semantic Retrieval: Utilizes Meta's FAISS vector database alongside the `bge-small-zh` embedding model to pinpoint relevant historical scripts.
- Threshold Fallback Mechanism: Features a custom `match_threshold` filter (cosine similarity, default `0.55`). If the topic is unfamiliar, Nene seamlessly transitions to zero-shot character play rather than forcing irrelevant memories (see the retrieval sketch after this list).
- Immersive Visual Experience: A stunning Vue 3 + Vite front-end featuring a dark glassmorphism UI, typewriter effects, and dynamic breathing layouts.
- Out-of-the-Box Automation: Includes 1-click installation and startup scripts for both Windows and Linux. No terminal anxiety required.
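To make the threshold fallback concrete, here is a minimal sketch of similarity-gated retrieval, assuming an `IndexFlatIP` built from L2-normalized `bge-small-zh` embeddings. The actual logic lives in `src/services/rag_pipeline.py`; the names here are illustrative:

```python
# Minimal sketch of similarity-gated retrieval (illustrative only; the real
# pipeline lives in src/services/rag_pipeline.py). Assumes the index was
# built from L2-normalized vectors, so FAISS inner product equals cosine sim.
import faiss
import numpy as np

MATCH_THRESHOLD = 0.55  # project default; tunable via MATCH_THRESHOLD in .env

def retrieve(index: faiss.Index, query_vec: np.ndarray,
             docs: list[str], k: int = 3) -> list[str]:
    """Return up to k script lines whose cosine similarity clears the gate."""
    query = np.asarray(query_vec, dtype="float32").reshape(1, -1)
    faiss.normalize_L2(query)  # normalized query -> inner product == cosine
    scores, ids = index.search(query, k)
    return [docs[i] for score, i in zip(scores[0], ids[0])
            if i != -1 and score >= MATCH_THRESHOLD]

# An empty result means "unfamiliar topic": the prompt builder then skips the
# retrieved-script section and falls back to zero-shot persona prompting.
```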
Deploy to Railway in under 5 minutes. Users only need a browser URL.
- Fork this repo and connect it to Railway.
- In Railway's Variables panel, set:
  ```
  LLM_PROVIDER=deepseek
  OPENAI_COMPAT_API_KEY=sk-...
  ```
- Railway builds the frontend, installs deps, and starts the server automatically.
- Share the generated `*.railway.app` URL — done.
Before starting locally, decide which LLM backend you want to use:
- Local Ollama: zero API cost, fully local, recommended for offline/private use.
- Cloud API (Claude / DeepSeek / OpenAI-compatible): better quality and easier setup on lower-end machines.
Create a local .env file before your first run:
```bash
cp .env.example .env
```

Then edit only the fields relevant to your provider:

```
# Option 1: Local Ollama
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://127.0.0.1:11434
LLM_MODEL_NAME=qwen2.5
# Option 2: DeepSeek
# LLM_PROVIDER=deepseek
# OPENAI_COMPAT_API_KEY=sk-...
# OPENAI_COMPAT_BASE_URL=https://api.deepseek.com
# OPENAI_COMPAT_MODEL=deepseek-chat
# Option 3: Claude
# LLM_PROVIDER=claude
# ANTHROPIC_API_KEY=sk-ant-...
# CLAUDE_MODEL_NAME=claude-haiku-4-5-20251001
```

Good to know: This repo already includes `data/raw/train.jsonl` and a prebuilt `vector_store/` directory. On a fresh machine, `scripts/setup.sh` will also rebuild the FAISS index if needed.
Step 1: Install Prerequisites (Skip if already installed)
- Download and install Python 3.10+. [CRITICAL]: Ensure you check Add Python to PATH at the bottom of the installer!
- Download and install Node.js (LTS version).
- (Only if using local Ollama) Download and install Ollama for Windows.
Step 2: Download NeneBot
Click the green Code button on this GitHub page and select Download ZIP. Extract it to a folder on your PC (e.g., D:\NeneBot).
Step 2.5: Configure your provider
Copy .env.example to .env, then fill in the keys only if you are using a cloud provider.
Step 3: One-Click Ignition!
Open the extracted folder and double-click start_windows.bat.
- Grab a coffee. The script will automatically download dependencies, wake up the AI engine, and launch your browser.
- Once the UI pops up, Nene is ready to chat!
Open your terminal and execute the following elegant commands:
```bash
# 1. Clone the repository
git clone https://github.com/your-username/NeneBot.git
cd NeneBot
# 2. Create your local environment file
cp .env.example .env
# 3. Grant execution permissions to scripts
chmod +x scripts/setup.sh scripts/run.sh
# 4. Run the automated setup (Only required once)
./scripts/setup.sh
# 5. Ignite the engines!
./scripts/run.sh
```

Tip: Once started, visit `http://localhost:5173` in your browser for the UI. The backend API Swagger docs are located at `http://localhost:8000/docs`.

Important: Do not open `frontend/index.html` directly in the browser. The application requires a running FastAPI backend (`/v1/*`) and should be accessed through the Vite dev server (port 5173) or the compiled production build served by FastAPI.
For day-to-day usage, prefer the unified launcher instead of remembering multiple raw commands:
```bash
# Full local development mode: backend + frontend
python scripts/launch.py dev
# Backend only
python scripts/launch.py local
# Telegram polling bot
python scripts/launch.py telegram
```

If you already created a local virtual environment, use:

```bash
./venv/bin/python scripts/launch.py dev
```

If you want to access the full app from one URL only instead of running Vite separately:
```bash
# 1. Backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# 2. Frontend build
cd frontend
npm install
npm run build
cd ..
# 3. Serve both API and frontend from FastAPI
python -m uvicorn src.main:app --host 0.0.0.0 --port 8000
```

Then open:

- `http://localhost:8000` → Full application
- `http://localhost:8000/admin` → Read-only admin console
- `http://localhost:8000/docs` → API docs
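Once served this way, you can smoke-test the streaming endpoint without the UI. A rough Python client is sketched below; the request body fields (`message`, `session_id`) are assumptions, so check the live schema at `http://localhost:8000/docs` first:

```python
# Rough smoke test for the SSE chat endpoint. The payload field names are
# assumptions; verify them against http://localhost:8000/docs.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/stream",
    json={"message": "你好", "session_id": "smoke-test"},
    stream=True,
)
resp.raise_for_status()
for raw in resp.iter_lines(decode_unicode=True):
    if raw and raw.startswith("data: "):  # SSE frames look like "data: <chunk>"
        print(raw[len("data: "):], flush=True)
```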
For a more production-like local stack with persistent session storage:
```bash
cp .env.example .env
docker compose -f deploy/docker_compose.yml up --build
```

This stack starts:

- `app` on `http://localhost:8000`
- `redis` on `localhost:6379`
Recommended `.env` settings for this mode:

```
SESSION_BACKEND=redis
REDIS_URL=redis://redis:6379/0
```
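For intuition, a sliding-window chat memory on Redis can be as small as a capped list per session. This sketch is illustrative only; the key layout and window size are assumptions, not the project's actual schema:

```python
# Illustrative sketch of sliding-window session memory on Redis. The key
# naming and window size are assumptions, not NeneBot's actual schema.
import json
import redis

WINDOW_TURNS = 10  # keep the last 10 user/assistant turns

r = redis.Redis.from_url("redis://localhost:6379/0", decode_responses=True)

def append_message(session_id: str, role: str, content: str) -> None:
    key = f"session:{session_id}:history"
    r.rpush(key, json.dumps({"role": role, "content": content}))
    r.ltrim(key, -WINDOW_TURNS * 2, -1)  # 2 entries (user + assistant) per turn

def load_history(session_id: str) -> list[dict]:
    key = f"session:{session_id}:history"
    return [json.loads(item) for item in r.lrange(key, 0, -1)]
```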
Operational endpoints:

- `GET /health` → service health, vector index status, frontend mode, session backend status
- `GET /health/live` / `GET /health/ready` → split liveness/readiness probes
- `GET /metrics` → Prometheus-style metrics text
- Response header `X-Request-ID` → request correlation id for logs and API debugging
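A quick way to exercise these endpoints from Python (plain `requests`, nothing project-specific):

```python
# Quick operational check: hit /health and echo the correlation id header.
import requests

resp = requests.get("http://localhost:8000/health", timeout=5)
print("status:", resp.status_code)
print("X-Request-ID:", resp.headers.get("X-Request-ID"))
print(resp.json())  # health payload: vector index status, session backend, ...
```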
Optional API protection:
```
API_AUTH_ENABLED=true
API_AUTH_TOKENS=frontend|chat:dev-chat-token,ops|ops:dev-ops-token
API_AUTH_REGISTRY_PATH=./config/api_tokens.json
```

When enabled, requests must include either:

- `Authorization: Bearer <token>`
- `X-API-Key: <token>`
Scope rules:
- `/v1/*` requires `chat`
- `/health*` and `/metrics` require `ops`
- Legacy tokens without explicit scopes still get `chat` + `ops` for backward compatibility
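With auth enabled, a scoped request carries one of those headers. Using the sample dev tokens from the block above (the `/v1` payload fields are assumptions; see `/docs`):

```python
# Scoped requests using the sample dev tokens from the .env block above.
# The /v1 payload field names are assumptions; check /docs for the real schema.
import requests

# chat scope covers /v1/*
r1 = requests.post(
    "http://localhost:8000/v1/chat/stream",
    headers={"Authorization": "Bearer dev-chat-token"},
    json={"message": "你好", "session_id": "auth-demo"},
)
print("/v1 with chat token:", r1.status_code)  # 200 when scoped correctly

# ops scope covers /health* and /metrics
r2 = requests.get("http://localhost:8000/metrics",
                  headers={"X-API-Key": "dev-ops-token"})
print("/metrics with ops token:", r2.status_code)
```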
You can also move tokens into a local registry file instead of `.env`:

```json
[
{ "name": "frontend", "token": "replace-with-chat-token", "scopes": ["chat"] },
{ "name": "ops-dashboard", "token": "replace-with-ops-token", "scopes": ["ops"] }
]
```

See `config/api_tokens.json.example` for the full format.
Server logs also emit JSON audit fields such as `action`, `endpoint`, `session_id`, `auth_subject`, `auth_scopes`, and `request_id`.
Optional tracing:
```
TRACING_ENABLED=true
TRACING_SERVICE_NAME=nenebot
TRACING_EXPORTER=console
```

Current spans:

- `http.request`
- `rag.retrieve`
- `llm.request`
How to verify tracing works locally:
- Install updated dependencies: `pip install -r requirements.txt`
- Enable tracing in `.env`: `TRACING_ENABLED=true` and `TRACING_EXPORTER=console`
- Start the API server and send one chat request.
- Check the server stdout. You should see span output containing names like `http.request`, `rag.retrieve`, and `llm.request`.
- Confirm span attributes include fields such as `request_id`, `http_path`, `provider_name`, `rag.retrieved_count`, and `auth.subject` when a token is used.
If you see normal API responses and span dumps in the terminal, tracing is wired correctly.
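For reference, the console-exporter output shape can be reproduced standalone with the OpenTelemetry SDK. This sketch only mimics the span names listed above; it is not NeneBot's actual instrumentation:

```python
# Standalone demo of console-exported spans with names matching the ones
# above. Mimics the output shape only; this is not NeneBot's wiring.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("nenebot-demo")
with tracer.start_as_current_span("http.request") as span:
    span.set_attribute("http_path", "/v1/chat/stream")
    with tracer.start_as_current_span("rag.retrieve"):
        pass  # each nested span is dumped to stdout as JSON on exit
    with tracer.start_as_current_span("llm.request"):
        pass
```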
Read-only admin MVP:
- `GET /admin/api/overview` → runtime, health, LLM, retrieval, auth, integrations summary
- `GET /admin/api/metrics/summary` → preview of rendered Prometheus metrics
- `GET /admin/api/knowledge/overview` → dataset summary and vector store summary
- `POST /admin/api/knowledge/import` → import JSONL dataset content
- `POST /admin/api/knowledge/rebuild` → rebuild vector index from current dataset
- `http://localhost:8000/admin` → browser admin console
How to verify the admin console works:
- Configure an `ops` token in `.env` or `config/api_tokens.json`.
- Build the frontend and start the app.
- Open `http://localhost:8000/admin`.
- When prompted, paste the `ops` token.
- Confirm the page shows:
  - service/environment/version
  - health status
  - LLM provider/model
  - configured auth identities
  - metrics preview lines
If the page loads and these cards populate, the admin MVP is working.
Knowledge base operations:
- The admin console now includes a knowledge panel for:
  - viewing dataset preview
  - validating JSONL before writing
  - importing JSONL content
  - rebuilding the vector index
- Imported content must be JSONL, one JSON object per line, with a `messages` list (see the sketch below).
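The admin console's VALIDATE button performs the real dry run; as a standalone illustration, a format check for that contract could look like this:

```python
# Standalone sanity check for the import format: one JSON object per line,
# each carrying a "messages" list. The admin VALIDATE button is the real check.
import json

def validate_jsonl(text: str) -> list[str]:
    errors = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {lineno}: invalid JSON ({exc})")
            continue
        if not isinstance(obj.get("messages"), list):
            errors.append(f"line {lineno}: missing 'messages' list")
    return errors  # empty list -> safe to import
```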
How to verify knowledge import and rebuild:
- Open `http://localhost:8000/admin` with an `ops` token.
- Paste one valid JSONL line into the knowledge textarea, for example:
  `{"messages":[{"role":"system","content":"You are Nene."},{"role":"user","content":"你好"},{"role":"assistant","content":"你好呀,保科君。"}]}`
- Click `VALIDATE` first and confirm the dry-run succeeds.
- Click `IMPORT + REBUILD`.
- Confirm the page updates:
  - dataset line count changes
  - preview shows the imported user/assistant pair
  - vector store summary refreshes
- Send a normal chat request and confirm the service still answers normally.
If the dataset summary updates and rebuild completes without error, the knowledge workflow is connected correctly.
Release gate:
- See RELEASE_CHECKLIST.md before tagging a preview release.
The simplest IM integration path is Telegram via the official Bot API.
- Create a bot with `@BotFather`
- Copy the token into `.env`
- Start the polling adapter:

```bash
cp .env.example .env
# Fill in:
# TELEGRAM_BOT_TOKEN=123456:ABC...
# LLM_PROVIDER=deepseek  # or ollama / claude / openai
python scripts/launch.py telegram
```

Notes:

- Telegram messages are mapped to internal session ids like `telegram:<chat_id>`
- Built-in commands: `/start`, `/help`, `/reset`, `/model`
- `/reset` clears that chat's memory window
- Default mode is long polling, so you do not need a public webhook URL yet
To switch to webhook mode later:
```
TELEGRAM_MODE=webhook
TELEGRAM_PUBLIC_BASE_URL=https://your-domain.com
TELEGRAM_WEBHOOK_PATH=/integrations/telegram/webhook
TELEGRAM_WEBHOOK_SECRET=your-secret
```

Then start the normal API server. Telegram will POST updates to:

```
POST /integrations/telegram/webhook
```

If `TELEGRAM_WEBHOOK_SECRET` is set, the server verifies the `X-Telegram-Bot-Api-Secret-Token` header.
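Conceptually, that verification is just a constant-time comparison of the incoming header against the configured secret. A minimal FastAPI sketch, illustrative rather than the project's actual handler:

```python
# Minimal illustration of the webhook secret check; not NeneBot's actual
# handler. Telegram sends X-Telegram-Bot-Api-Secret-Token on every POST.
import os
import secrets

from fastapi import FastAPI, Header, HTTPException, Request

app = FastAPI()
WEBHOOK_SECRET = os.environ.get("TELEGRAM_WEBHOOK_SECRET", "")

@app.post("/integrations/telegram/webhook")
async def telegram_webhook(
    request: Request,
    secret: str = Header(default="", alias="X-Telegram-Bot-Api-Secret-Token"),
):
    if WEBHOOK_SECRET and not secrets.compare_digest(secret, WEBHOOK_SECRET):
        raise HTTPException(status_code=403, detail="bad webhook secret")
    update = await request.json()  # a Telegram Update object
    return {"ok": True}
```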
This project strictly adheres to microservice and front/back-end decoupling standards:
```
NeneBot/
├── 📂 data/ # Raw script corpus (for vectorization)
├── 📂 vector_store/ # FAISS persistent vector index
├── 📂 frontend/ # Vue 3 + Vite immersive UI
├── 📂 src/ # FastAPI core backend service
│ ├── api/ # Routing and Pydantic data validation
│ ├── core/ # pydantic-settings config and global exceptions
│ ├── infrastructure/ # External adapters (FAISS, Ollama, Claude, DeepSeek)
│ └── services/ # Core business logic (RAG pipeline, Embeddings, Sessions)
├── 📂 scripts/ # DevOps toolbox (Setup, Run, Linters)
├── 📄 railway.toml # One-click Railway deployment config
├── 📄 .env.example # Environment variable template
├── 📄 pyproject.toml # Industrial linter configs (Ruff & Mypy)
└── 📄 requirements.txt # Python dependency list
```
For developers who want to tweak the bot, you can easily customize Nene:
- Switch LLM Provider: Set `LLM_PROVIDER` in `.env` to `ollama`, `claude`, `deepseek`, or `openai`. See `.env.example` for the full list of required keys.
- Adjust Strictness: Modify `MATCH_THRESHOLD` (default `0.55`) in `.env` or directly in `src/services/rag_pipeline.py`. Lower values make her stick strictly to the script; higher values allow more creative freedom.
- Change Sprites & Backgrounds: Replace `nene_sprite.png` and `bg_room.png` in the `frontend/public/` directory. Changes apply instantly in dev thanks to Vite HMR.
- Modify Character Persona: Edit the `_CHARACTER_CARD` constant in `src/services/rag_pipeline.py` to add new personality traits or instructions.
- Rebuild the Memory Index: If you replace `data/raw/train.jsonl`, run `python scripts/init_vector_db.py` to regenerate `vector_store/`.
- Session Persistence: Set `SESSION_BACKEND=redis` and configure `REDIS_URL` to persist chat memory across restarts.
- Operations: Use `/health` for diagnostics and `X-Request-ID` to correlate client failures with server logs.
- Unified Launcher: Use `python scripts/launch.py dev`, `local`, or `telegram` depending on the runtime mode you want.
- Telegram Adapter: Set `TELEGRAM_BOT_TOKEN` and run `python scripts/launch.py telegram` to attach the bot to Telegram via long polling.
1. "Python / Node is not recognized as an internal or external command" on Windows?
You either haven't installed Python/Node.js, or forgot to add them to your environment variables. Reinstall them and ensure you check the "Add to PATH" option.
2. The chat shows a connection error or "Nene's thoughts disconnected"?
First verify that the backend is actually running:
- Frontend dev mode: open `http://localhost:5173`
- Backend health check: open `http://localhost:8000/health`
- API docs: open `http://localhost:8000/docs`
If using Ollama: The Ollama service may not be running, or your machine ran out of VRAM/RAM. Try running `ollama run qwen2.5` manually. On Linux/WSL, also ensure no system proxy is intercepting localhost traffic (`unset http_proxy`).

If using a cloud API: Verify that your `ANTHROPIC_API_KEY` or `OPENAI_COMPAT_API_KEY` is set correctly in `.env` and that `LLM_PROVIDER` matches.
3. Can I open the page directly without starting anything?
No. This is not a static HTML demo.
The Vue page depends on the FastAPI backend for `/v1/chat/stream`, session memory, and RAG retrieval. Use one of these two modes instead:
- Development mode: `./scripts/run.sh`, then open `http://localhost:5173`
- Single-port mode: build `frontend/dist`, start FastAPI, then open `http://localhost:8000`
4. Can I swap the character to someone else (e.g., Ayase Mitsukasa)?
Absolutely! This is a universal architecture. Simply:
- Replace `data/raw/train.jsonl` with Ayase's dialogue data.
- Run `python scripts/init_vector_db.py` to rebuild the memory database.
- Replace the sprite assets in `frontend/public/`.
- Update the character name and persona in the system prompt.
We welcome contributions from the community! Whether it's fixing bugs, improving the CSS aesthetics, or providing better script datasets, please follow these steps:
- Fork the Project.
- Create your Feature Branch: `git checkout -b feature/AmazingFeature`
- Commit your Changes: `git commit -m 'feat: Add some AmazingFeature'`
- Push to the Branch: `git push origin feature/AmazingFeature`
- Open a Pull Request.
(Note: Before submitting PRs, please run `./scripts/run_linter.sh` to ensure your code passes our strict Ruff and Mypy checks.)
Distributed under the GNU General Public License v3.0. This project is for technical exploration and learning purposes only. The copyright of the character sprites, background art, and game scripts belongs to the original creator (Yuzusoft). Please do not use them for commercial purposes.