The Letta team’s August 2025 research on LoCoMo (a long-term conversational memory benchmark) provides some empirical evidence here:
Key finding: Letta achieved 74.0% using simple filesystem tools (grep, search_files, open, close) with GPT-4o mini, outperforming Mem0’s specialized graph tools at 68.5%.
Why simpler tools won:
Training data advantage: Agents are extremely effective with filesystem tools because coding tasks are heavily represented in LLM training data; the models already know how to navigate files.
Agent capability matters more than mechanism: The critical factor is whether the agent can effectively use the retrieval tool (when/how to call it), not the underlying data structure (graphs vs vectors vs files).
Complexity paradox: More sophisticated structures like knowledge graphs may actually be harder for LLMs to understand and use effectively.
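To make the "simple filesystem tools" idea concrete, here is a minimal sketch of what tools like `grep` and `search_files` could look like when exposed to an agent. The names and signatures are assumptions for illustration, not Letta's actual implementation:

```python
from pathlib import Path

# Illustrative filesystem-style retrieval tools, in the spirit of the ones the
# benchmark describes (grep, search_files). Names/signatures are assumptions,
# not Letta's actual API.

def search_files(root: str, pattern: str) -> list[str]:
    """Return paths under `root` whose filename contains `pattern`."""
    return [str(p) for p in Path(root).rglob("*") if pattern in p.name]

def grep(path: str, needle: str) -> list[str]:
    """Return lines in `path` that contain `needle`."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return [line.rstrip("\n") for line in f if needle in line]

# An agent framework would register these as tool schemas; the model then
# decides when and how to call them -- which is exactly the capability the
# benchmark found to matter more than the storage mechanism.
```

Because these mirror shell idioms the models have seen millions of times in training data, the agent needs no special prompting to use them well.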
From the blog post: “What matters: Can the agent effectively USE the retrieval tool? Exact retrieval mechanism (graphs vs vectors) matters less than agent capabilities.”
Where KGs might still have value:
Explicit relationship modeling for specific domains (medical ontologies, etc.)
Human-interpretable reasoning chains
Cases where you need formal inference rules
But for general agent memory? The research suggests that simple tools with good agent orchestration beat complex infrastructure.
The blog post refers to “Mem0’s reported 68.5% score for their top-performing graph variant”, which is where I got that characterization.
But you’re right to question it: the blog also notes that Mem0’s LoCoMo results were “controversial” and that Mem0 never clarified its methodology or provided code when asked.
So I may have overstated the graph connection based on that single phrase. Do you know if Mem0 actually uses knowledge graphs, or was “graph variant” referring to something else?
“Mem0 extracts entities and relationships from memory writes, storing embeddings in a vector database while maintaining relationships in a graph backend.”
They support Neo4j and Memgraph as graph backends. The workflow: on each memory write, entities and relationships are extracted, the embedding goes into the vector store, and the extracted relationships are written into the graph backend.
So the “graph variant” in the blog post was referring to their actual knowledge graph implementation. The Medium article “Demystifying the brilliant architecture of mem0” lists “Graph Storage for Relationships” as one of their 5 core architectural pillars.
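That hybrid write path can be sketched in a few lines of pure Python. Everything here (class names, the stub `embed` and `extract_triples` functions) is illustrative, not Mem0's actual API; in Mem0 the extraction is LLM-driven and the stores are a real vector database plus Neo4j/Memgraph:

```python
# Toy sketch of a hybrid vector+graph memory write, per the quoted description:
# embeddings go to a vector store, relationships go to a graph store.
# All names are illustrative, not Mem0's actual API.

class VectorStore:
    def __init__(self):
        self.rows = []  # (text, embedding) pairs

    def add(self, text: str, embedding: list[float]) -> None:
        self.rows.append((text, embedding))

class GraphStore:
    def __init__(self):
        self.edges = []  # (subject, relation, object) triples

    def add_relation(self, subj: str, rel: str, obj: str) -> None:
        self.edges.append((subj, rel, obj))

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model.
    return [float(len(text))]

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    # Stand-in for LLM-based entity/relationship extraction:
    # naively treat "subject relation object" text as one triple.
    subj, rel, obj = text.split(" ", 2)
    return [(subj, rel, obj)]

def write_memory(text: str, vectors: VectorStore, graph: GraphStore) -> None:
    vectors.add(text, embed(text))         # semantic-recall path
    for s, r, o in extract_triples(text):  # relationship path
        graph.add_relation(s, r, o)
```

The point of the dual write is that later reads can choose a path: similarity search over the vector rows, or relationship traversal over the edges.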
FalkorDB’s GraphRAG-SDK is an interesting counterpoint. It’s built specifically for the FalkorDB graph database and takes a different architectural approach than Mem0:
Key features:
Ontology-first: Auto-generates ontologies from unstructured data or lets you define them manually
Multi-agent orchestrator: Supports specialized agents (e.g., “restaurant agent”, “attractions agent”) coordinated by an orchestrator
Graph-native: Built specifically for FalkorDB’s graph database, not a hybrid vector+graph approach
Example workflow:
Create ontology from sources (URLs, PDFs, etc.)
Build knowledge graph using that ontology
Query via chat interface
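The three-step workflow above can be sketched in pure Python. In the real SDK the ontology is auto-generated by an LLM and the graph lives in FalkorDB; this toy version hand-defines the ontology, and every name is illustrative rather than the SDK's API:

```python
# Toy version of: (1) define an ontology, (2) build a graph constrained
# by it, (3) query. Names are illustrative, not GraphRAG-SDK's API.

# Step 1: ontology = allowed entity types and relation signatures.
ontology = {
    "entities": {"Restaurant", "City"},
    "relations": {("Restaurant", "LOCATED_IN", "City")},
}

graph: list[tuple[str, str, str]] = []

# Step 2: only facts that conform to the ontology may enter the graph.
def add_fact(subj: str, subj_type: str, rel: str, obj: str, obj_type: str) -> None:
    assert (subj_type, rel, obj_type) in ontology["relations"], "not in ontology"
    graph.append((subj, rel, obj))

add_fact("Noma", "Restaurant", "LOCATED_IN", "Copenhagen", "City")

# Step 3: a query is a traversal over the stored triples.
def restaurants_in(city: str) -> list[str]:
    return [s for s, rel, o in graph if rel == "LOCATED_IN" and o == city]
```

The ontology-first discipline is the distinctive part: the schema exists before any data is loaded, so malformed facts are rejected at write time rather than polluting retrieval later.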
The orchestrator pattern is interesting for domain-specific use cases (like the trip planning example in their docs). Each agent becomes an expert in its graph, and the orchestrator routes questions.
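A minimal sketch of that routing pattern, using a toy keyword router in place of the SDK's actual (LLM-driven) orchestration; the class and agent names are mine, not GraphRAG-SDK's:

```python
from typing import Callable

# Toy orchestrator: each agent is the expert for its own knowledge graph,
# and the orchestrator routes each question to the first agent whose
# keywords match. Keyword matching stands in for LLM-based routing.

class DomainAgent:
    def __init__(self, name: str, keywords: set[str],
                 answer: Callable[[str], str]):
        self.name = name
        self.keywords = keywords
        self.answer = answer  # would query this agent's graph in a real system

class Orchestrator:
    def __init__(self, agents: list[DomainAgent]):
        self.agents = agents

    def route(self, question: str) -> str:
        words = set(question.lower().split())
        for agent in self.agents:
            if agent.keywords & words:
                return agent.answer(question)
        return "no specialist agent matched this question"

restaurant = DomainAgent("restaurant", {"eat", "restaurant", "dinner"},
                         lambda q: "restaurant agent: querying dining graph")
attractions = DomainAgent("attractions", {"see", "visit", "museum"},
                          lambda q: "attractions agent: querying sights graph")
orchestrator = Orchestrator([restaurant, attractions])
```

Keeping each graph small and single-domain is what makes this work: the hard problem shifts from "can the LLM traverse one huge graph" to "can the router pick the right small one."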
Worth noting: It’s tightly coupled to FalkorDB, so you’d be committing to that database stack. But if you’re looking for a practical KG implementation to explore, it’s actively maintained and has decent documentation.
Have you tried it yet, or just evaluating options?