Eval tools for Letta

cameron · July 28, 2025, 6:18pm

Hey y’all, we’re trying to figure out what kind of eval tooling we want to support.

If you’re interested in giving feedback, could you share your thoughts on:

What tools do you use currently, if any?
What do you want to know when you use eval tools?
What do you evaluate now? Memory persistence, conversation quality, retrieval accuracy, something else?
What’s broken? Where do existing eval tools fail you when testing stateful agents?
What’s your biggest memory eval pain point? Cross-session consistency? Memory management decisions? Long-term coherence?
What would convince you to adopt new eval tooling? Better memory-specific metrics? Easier integration? Cost?
Where are you headed? Planning production deployments? Need compliance/safety evals?

Comments welcome!
