From READY 2026 Tech Exchange #42: AI-Powered Support: Mining Knowledge from Ticket Data
This repository demonstrates how semantic clustering turns a support ticket backlog into KB articles, and how IRIS GraphRAG enables hybrid vector + graph retrieval at scale.
Uses PlanetCare, a fictional EMR system — all data is synthetic, safe to share and adapt.
| Notebook | What it shows |
|---|---|
| `planetcare_clustering_demo.ipynb` | MDS gap → HDBSCAN clustering → resolution anchoring → KB article → wiki augmentation + Graph_KG audit |
| `planetcare_system_demo.ipynb` | iris-vector-graph API: `kg_KNN_VEC`, `kg_VECTOR_GRAPH_SEARCH`, `kg_GRAPH_WALK`, Cypher |
Both notebooks require IRIS Community and an OpenAI API key (or compatible alternative).
Prerequisites:
- Docker (for IRIS Community — free container)
- OpenAI API key (or see `.env.example` for OpenRouter / Ollama alternatives)
- Python 3.10+
Step 1 — Clone and install

```bash
git clone https://github.com/intersystems-community/ready2026-knowledge-graph-demo
cd ready2026-knowledge-graph-demo
pip install -r requirements.txt
```

Step 2 — Start IRIS Community

```bash
cd docker && docker compose up -d && cd ..
```

The first run pulls the IRIS image (~2 GB); subsequent starts are fast. IRIS is ready when `docker ps` shows `healthy` for `planetcare-iris`.
Step 3 — Set your API key

```bash
export OPENAI_API_KEY=sk-...
# Or: cp .env.example .env (edit for OpenRouter, Ollama, etc.)
```

Step 4 — Initialize Graph_KG

```bash
python setup/setup_iris.py
```

This loads 276 PlanetCare tickets into IRIS Graph_KG, builds 384-dim embeddings via iris-vector-graph, and uses GPT-4o-mini to extract entity edges (AFFECTS / EXHIBITS / FIXED_BY). Takes ~5 minutes.
Expected output:

```
PlanetCare Demo — IRIS Setup
IRIS connection: OK
Embedder: all-MiniLM-L6-v2 (384-dim)
LLM: gpt-4o-mini
276 ticket nodes created in Graph_KG
276 embeddings stored
606 graph edges created
Setup complete!
```
Step 5 — Open the notebooks

```bash
jupyter notebook notebooks/
```

Start with `planetcare_clustering_demo.ipynb` — that's the one that generated the most discussion at READY.

Using OpenRouter or Ollama instead? Copy `.env.example` to `.env` and set `LLM_PROVIDER`. Any OpenAI-compatible endpoint works.
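For reference, a minimal `.env` might look like the fragment below. Only `LLM_PROVIDER` and `OPENAI_API_KEY` are mentioned in this README; check `.env.example` for the actual variable names and accepted provider values.

```ini
# Illustrative .env — variable names and values beyond LLM_PROVIDER
# and OPENAI_API_KEY are assumptions; see .env.example
LLM_PROVIDER=openrouter
OPENAI_API_KEY=sk-or-...
```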
- The data quality problem — 276 PlanetCare tickets, 58% with no resolution text. The norm, not the exception.
- HDBSCAN clustering — embed problem descriptions with `all-MiniLM-L6-v2`; clusters emerge automatically without any labeling.
- Why it works — the model encodes semantic intent, not keywords. Similar problems group together across different phrasings.
- Resolution-anchored discovery — the 42% with solutions become anchors. Each anchor suggests fixes for similar unresolved tickets in its cluster.
- KB article generation — GPT-4o-mini synthesizes a structured article from the cluster's anchor tickets.
- Wiki augmentation — the article is appended to `data/planetcare_wiki/billing.md` (a pre-existing doc with documented gaps). AUTHORED_KB and SOURCED_KB edges are recorded in Graph_KG.
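The cluster-then-anchor idea above can be sketched in a few lines. This is a simplified stand-in, not the notebook's code: toy 2-D vectors and a cosine-similarity threshold replace MiniLM embeddings and HDBSCAN, and the ticket values are invented; only the field names follow the demo's schema.

```python
import numpy as np

# Invented toy tickets — "vec" stands in for a 384-dim MiniLM embedding
tickets = [
    {"ticket_id": "PC-00001", "Problem": "claim rejected", "Solution": "re-submit with payer code", "vec": [1.0, 0.1]},
    {"ticket_id": "PC-00002", "Problem": "claim denied",   "Solution": None,                        "vec": [0.9, 0.2]},
    {"ticket_id": "PC-00003", "Problem": "label printer",  "Solution": None,                        "vec": [0.1, 1.0]},
]

def cos(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Greedy threshold grouping — a crude stand-in for HDBSCAN's density clustering
clusters = []
for t in tickets:
    for c in clusters:
        if cos(t["vec"], c[0]["vec"]) > 0.8:
            c.append(t)
            break
    else:
        clusters.append([t])

# Resolution anchoring: resolved tickets suggest fixes for
# open tickets in the same cluster
suggestions = {}
for c in clusters:
    anchors = [t for t in c if t["Solution"]]
    for t in c:
        if t["Solution"] is None and anchors:
            suggestions[t["ticket_id"]] = anchors[0]["Solution"]

print(suggestions)  # PC-00002 inherits PC-00001's fix
```

The real pipeline works the same way, just with learned embeddings and density-based clusters instead of a fixed threshold.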
Uses iris-vector-graph directly:

```python
engine.kg_KNN_VEC(query_json, k=8)                 # K-nearest by embedding similarity
engine.kg_VECTOR_GRAPH_SEARCH(query_json)          # hybrid: vector + graph expansion
ops.kg_GRAPH_WALK("pc_ticket:PC-00001", depth=2)   # traverse entity edges
ops.kg_NEIGHBORHOOD_EXPANSION(entity_nodes)        # find tickets sharing same entities
engine.execute_cypher("MATCH (t:PCTicket)...")     # direct Cypher queries
```

```
notebooks/
  planetcare_clustering_demo.ipynb   ← The main demo
  planetcare_system_demo.ipynb       ← iris-vector-graph API showcase
data/
  planetcare_demo_tickets.json       ← 276 synthetic PlanetCare tickets
  planetcare/
    questionnaire_clusters_anon.csv  ← 295 anonymized questionnaire tickets
    questionnaire_kb_articles_anon.json
  planetcare_wiki/
    billing.md                       ← Pre-existing KB, documented gaps
    laboratory.md                    ← Pre-existing KB, documented gaps
    pharmacy.md                      ← Mostly empty — gaps noted
setup/
  setup_iris.py                      ← Initializes Graph_KG via IRISGraphEngine, loads tickets, builds embeddings
  embedder.py                        ← Embedding provider abstraction (local / OpenAI / OpenRouter / custom)
docker/
  docker-compose.yml                 ← IRIS Community container
.env.example                         ← All configuration options
requirements.txt
```
All data is synthetic — no real patient data, no real hospital names.
| File | Description |
|---|---|
| `data/planetcare_demo_tickets.json` | 276 synthetic PlanetCare tickets · 7 categories · 42% resolved / 58% open |
| `data/planetcare/questionnaire_clusters_anon.csv` | 295 anonymized questionnaire tickets with cluster labels |
| `data/planetcare_wiki/*.md` | Pre-existing KB articles with documented knowledge gaps |
data/planetcare_wiki/ is the fictional PlanetCare KB.
Pre-existing articles document what's known, with explicit "Knowledge Gaps" sections.
Running the clustering notebook augments these articles — AI-generated sections are appended, clearly marked with source ticket count, agent ID, timestamp, and a human-review warning.
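As an illustration, an appended section might look like the fragment below. The README specifies only that the source ticket count, agent ID, timestamp, and a human-review warning are recorded, so the exact layout and all values here are invented.

```markdown
## [AI-generated] Resolving rejected claims after a payer update

> AI-generated from 12 source tickets · agent kb-writer-01 · 2026-01-15T10:30Z
> ⚠️ Requires human review before publication.

Re-submit the affected claim with the updated payer code...
```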
Provenance is recorded in Graph_KG:
```cypher
MATCH (agent)-[:AUTHORED_KB]->(kb:KBArticle)<-[:SOURCED_KB]-(ticket:PCTicket)
RETURN agent.agent_id, kb.article_id, kb.category, ticket.ticket_id
LIMIT 20
```

This demo is built on iris-vector-graph.
See the iris-vector-graph examples for more API patterns.
What setup_iris.py does:
- `engine.initialize_schema()` — sets up Graph_KG tables and indexes
- `engine.create_node()` — creates `pc_ticket:PC-#####` nodes
- `engine.store_embeddings()` — stores vectors in `kg_NodeEmbeddings` for `kg_KNN_VEC`
- `engine.create_edge()` — builds AFFECTS / EXHIBITS / FIXED_BY / SOURCED_KB / AUTHORED_KB edges
The pipeline works on any ticket system:
- Export tickets to JSON: `[{ticket_id, Summary, Problem, Solution, Classification, Status}, ...]`
- Replace `data/planetcare_demo_tickets.json`
- Run `python setup/setup_iris.py`
- Open the notebooks — clustering and retrieval work on any EMR data
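The steps above can be tried with a hand-built export. A single record could look like this sketch; the field names match the schema above, while all values are invented for illustration:

```python
import json

# Hypothetical ticket record — field names from the expected schema,
# values invented for illustration
ticket = {
    "ticket_id": "T-1042",
    "Summary": "Invoice totals wrong after update",
    "Problem": "Billing module shows incorrect invoice totals since the last patch.",
    "Solution": "",  # leave empty for unresolved tickets
    "Classification": "Billing",
    "Status": "Open",
}

payload = json.dumps([ticket], indent=2)  # the pipeline expects a JSON array
print(payload)
```

Write the array to `data/planetcare_demo_tickets.json` and re-run the setup script to index your own data.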
InterSystems READY 2026 — Tech Exchange #42
AI-Powered Support: Mining Knowledge from Ticket Data
Thomas Dyar, Sr. Manager AI Platform & Ecosystems, InterSystems
MIT