Problem
MemoryService.add() is invoked at the end of every chat turn with the
entire LangGraph message history (response["messages"] /
state.values["messages"]). Two failure modes stack:
- Repeated re-extraction. Mem0's fact extractor runs against every
prior turn on every call, so the same messages are paid for N times
and near-duplicate rows slip past dedup into pgvector.
- Permissive default prompt. Mem0's out-of-the-box extractor
captures greetings, acknowledgements, transient task instructions,
and assistant chit-chat — exactly the noise we don't want persisted.
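The second failure mode has a hook on mem0's side: the default extractor prompt can be overridden (recent mem0 releases accept a `custom_fact_extraction_prompt` config key; verify against the installed version). A restrictive prompt sketch:

```python
# Sketch of a restrictive extractor prompt for mem0. The config key
# "custom_fact_extraction_prompt" is per recent mem0 docs; verify it
# against the installed version before relying on it.
RESTRICTIVE_EXTRACTION_PROMPT = """\
Extract only durable facts about the user: stable preferences,
biographical details, and long-lived project context.
Ignore greetings, acknowledgements, one-off task instructions,
arithmetic questions, and assistant small talk.
Return an empty list when nothing qualifies."""

mem0_config = {
    "custom_fact_extraction_prompt": RESTRICTIVE_EXTRACTION_PROMPT,
    # llm / vector_store sections omitted; they are unchanged by this fix
}
```

This only filters what gets written; it does not stop the repeated re-extraction, which is the first failure mode.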
Result: longterm_memory fills with rows like "2 + 2",
"Assistant offered to help", duplicate restatements of the same
preference, etc. Memory search quality degrades, and per-turn token spend
on the extractor grows linearly with session length.
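The cost blow-up is easy to see in isolation. In the sketch below, the service class is a counting stand-in, not the real MemoryService; only the call pattern (full history handed to add() every turn) is taken from the bug:

```python
# Stand-in for a mem0-style service whose extractor processes every
# message passed to add(); the class is illustrative.
class CountingMemoryService:
    def __init__(self):
        self.extracted = 0  # total messages the extractor has processed

    def add(self, messages):
        self.extracted += len(messages)

history = []
svc = CountingMemoryService()
for turn in range(10):
    history.append({"role": "user", "content": f"turn {turn}"})
    history.append({"role": "assistant", "content": "..."})
    svc.add(history)  # bug: full history every turn, not just the delta

# 20 messages were sent, but the extractor processed 2+4+...+20 = 110,
# i.e. O(turns^2) total work instead of O(turns).
print(svc.extracted)  # → 110
```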
Reproduction
- Start a session and send "what is the sum of 2 + 2" → a memory row
"Asked how much is 2 + 2" is written.
- Send any follow-up → the extractor re-processes the earlier sum
question plus the new turn; more near-duplicate rows land in pgvector.
- Inspect the longterm_memory rows for the user — pointless entries
dominate the legitimate facts.
Impact
- Polluted long-term memory → worse retrieval relevance on
memory_service.search.
- Unnecessary LLM spend on
LONG_TERM_MEMORY_MODEL (extractor re-runs
the full history).
- Cost scales O(turns²) over a session instead of O(turns).
Proposed fix in PR #66
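The diff itself is not reproduced here. Whatever PR #66 actually changes, the quadratic term disappears once add() only ever sees messages appended since the previous call; a minimal sketch of that delta-only pattern (all names hypothetical):

```python
class DeltaMemoryWriter:
    """Wraps a memory service so the extractor only sees new messages.

    Hypothetical helper, not taken from the PR: it tracks a per-session
    offset into the LangGraph message history and forwards only the
    unseen tail to the wrapped service's add().
    """

    def __init__(self, memory_service):
        self.memory_service = memory_service
        self._seen = {}  # session_id -> count of messages already handed over

    def add(self, session_id, messages):
        offset = self._seen.get(session_id, 0)
        delta = messages[offset:]  # only the unseen tail of the history
        if delta:
            self.memory_service.add(delta)
        self._seen[session_id] = len(messages)
        return delta
```

With this shape, each message reaches the extractor at most once per session, total extractor work is O(turns), and the repeated re-extraction that produced the near-duplicate rows stops at the source.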