Here’s a focused memory architecture design skill:
## SKILL.md
---
name: letta-memory-architect
description: Guide for designing effective memory architectures in Letta agents. Use when users need help structuring memory blocks, choosing between memory types, or optimizing memory management patterns.
license: MIT
---
# Letta Memory Architect
This skill guides the design of effective memory architectures for Letta agents, including memory block structure, memory type selection, and concurrency patterns.
## When to Use This Skill
Use this skill when users are:
- Designing memory block structure for a new agent
- Choosing between core memory, archival memory, and conversation history
- Optimizing memory block organization for performance
- Implementing shared memory between agents
- Debugging memory-related issues (size limits, concurrency)
## Memory Architecture Process
### 1. Memory Type Selection
Consult `references/memory-types.md` for detailed comparison. Quick guidance:
**Core Memory (in-context):**
- Always accessible in agent's context window
- Use for: current state, active context, frequently referenced information
- Limit: Keep total core memory under 80% of context window
**Archival Memory (out-of-context):**
- Semantic search over vector database
- Use for: historical records, large knowledge bases, past interactions
- Access: Agent must explicitly call archival_memory_search
- Note: NOT automatically populated from context overflow
**Conversation History:**
- Past messages from current conversation
- Retrieved via conversation_search tool
- Use for: referencing earlier discussion, tracking conversation flow
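The quick guidance above can be read as a small decision rule. The following is a minimal sketch only: `suggest_memory_type`, its parameters, and the 5000-character default are hypothetical planning aids, not part of Letta.

```python
from enum import Enum


class MemoryType(Enum):
    CORE = "core memory"
    ARCHIVAL = "archival memory"
    CONVERSATION = "conversation history"


def suggest_memory_type(is_past_message: bool,
                        needed_every_turn: bool,
                        size_chars: int,
                        block_limit: int = 5000) -> MemoryType:
    """Hypothetical heuristic mirroring the quick guidance above."""
    if is_past_message:
        # Earlier messages are already stored; the agent retrieves them
        # with conversation_search rather than copying them elsewhere.
        return MemoryType.CONVERSATION
    if needed_every_turn and size_chars <= block_limit:
        # Small, frequently referenced data belongs in the context window.
        return MemoryType.CORE
    # Large or rarely needed data: stored explicitly by the agent and
    # fetched on demand with archival_memory_search.
    return MemoryType.ARCHIVAL


print(suggest_memory_type(False, True, 300).value)
print(suggest_memory_type(False, False, 80_000).value)
print(suggest_memory_type(True, False, 1_200).value)
```

Treat the result as a starting point: data that is needed every turn but too large for one block usually calls for a short summary in core memory and the full record in archival memory.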
### 2. Memory Block Design
**Core principle:** One block per distinct functional unit.
**Essential blocks:**
- `persona`: Agent identity, behavioral guidelines, capabilities
- `human`: User information, preferences, context
**Add domain-specific blocks based on use case:**
For customer support:
```yaml
company_policies:
  description: "Company policies and procedures. Reference when handling customer requests."
  read_only: true
product_knowledge:
  description: "Product features and common issues. Update when learning new solutions."
  read_only: false
customer:
  description: "Current customer's context and history. Update as you learn more about them."
  read_only: false
```
For coding assistants:
```yaml
project_context:
  description: "Current project architecture and active tasks. Update as project evolves."
coding_standards:
  description: "Team's coding standards and review checklist. Reference before code suggestions."
  read_only: true
current_task:
  description: "Active task and implementation progress. Update as work progresses."
```
See `references/memory-patterns.md` for more domain examples.
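Before creating blocks, it can help to sanity-check a planned block set against the "one block per functional unit" principle. The sketch below is an illustration under assumptions: `BlockSpec` and `check_block_set` are hypothetical helper names, and the field names simply follow the YAML examples in this section.

```python
from dataclasses import dataclass

# The two blocks this skill treats as essential for every agent.
ESSENTIAL_LABELS = ("persona", "human")


@dataclass(frozen=True)
class BlockSpec:
    """Hypothetical planning record for one memory block."""
    label: str
    description: str
    read_only: bool = False


def check_block_set(blocks: list[BlockSpec]) -> list[str]:
    """Return a list of problems; an empty list means the set looks sane."""
    problems = []
    labels = [b.label for b in blocks]
    for required in ESSENTIAL_LABELS:
        if required not in labels:
            problems.append(f"missing essential block: {required}")
    for label in sorted(set(labels)):
        if labels.count(label) > 1:
            # Two blocks with one label usually means one functional unit was split.
            problems.append(f"duplicate label: {label}")
    return problems


coding_blocks = [
    BlockSpec("persona", "Agent identity, behavioral guidelines, capabilities."),
    BlockSpec("human", "User information, preferences, context."),
    BlockSpec("project_context", "Current project architecture and active tasks."),
    BlockSpec("coding_standards", "Team's coding standards and review checklist.", read_only=True),
    BlockSpec("current_task", "Active task and implementation progress."),
]
print(check_block_set(coding_blocks))
print(check_block_set(coding_blocks[1:]))
```

The first call reports no problems; the second drops `persona` and reports it as missing.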
### 3. Label and Description Best Practices
**Labels:**
- Use underscores, not spaces: `brand_guidelines` not `brand guidelines`
- Keep short and descriptive: `customer_profile`, `project_context`
- Think like variable names
**Descriptions:**
Use instructional style for blocks the agent actively manages:
**Good:**
"Brand tone and style guidelines. Reference this when generating content to ensure consistency with brand identity."
**Poor:**
"Contains brand information"
**Template for active blocks:**
`[What this block contains]. [When to reference it]. [When/how to update it].`
Consult `references/description-patterns.md` for examples.
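The label rules and the description template can be checked mechanically. This is a minimal sketch: `is_valid_label` and `build_description` are hypothetical helpers, and restricting labels to lowercase letters, digits, and underscores is an assumption drawn from the "think like variable names" advice, not a documented Letta constraint.

```python
import re

# Assumed convention: lowercase snake_case, starting with a letter.
LABEL_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_label(label: str) -> bool:
    """True if the label reads like a snake_case variable name."""
    return LABEL_PATTERN.fullmatch(label) is not None


def build_description(contains: str,
                      when_to_reference: str,
                      when_to_update: str = "") -> str:
    """Fill the template: [What it contains]. [When to reference]. [When/how to update]."""
    parts = [contains, when_to_reference, when_to_update]
    # Normalize each non-empty part to end with exactly one period.
    return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


print(is_valid_label("brand_guidelines"))
print(is_valid_label("brand guidelines"))
print(build_description(
    "Brand tone and style guidelines",
    "Reference this when generating content",
))
```

Leaving `when_to_update` empty suits read-only blocks, where the agent only needs to know what the block holds and when to consult it.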
### 4. Size Management
**Character limits per block:**
- Typical limit: 2000-5000 characters
- Monitor via block size in ADE or API
**When approaching limits:**
- Split by topic: `customer_profile` → `customer_business`, `customer_preferences`
- Split by time: `interaction_history` → `recent_interactions`, archive older to archival memory
- Archive historical data: Move old information to archival memory
- Consolidate with `memory_rethink`: Summarize and rewrite block
See `references/size-management.md` for strategies.
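The "split by time" strategy can be sketched as a function that peels the oldest entries off a block until it fits a budget. Everything here is an assumption for illustration: `archive_oldest` is a hypothetical helper, entries are assumed to be newline-separated with the oldest first, and the 0.8 threshold is an arbitrary safety margin rather than a Letta setting.

```python
def usage_ratio(value: str, limit: int) -> float:
    """Fraction of the block's character limit currently in use."""
    return len(value) / limit


def archive_oldest(entries: list[str], limit: int,
                   threshold: float = 0.8) -> tuple[list[str], list[str]]:
    """Move the oldest entries out until the joined block fits the budget.

    Returns (kept, archived). The most recent entry is always kept.
    """
    kept = list(entries)
    archived = []
    budget = int(limit * threshold)
    while len(kept) > 1 and len("\n".join(kept)) > budget:
        archived.append(kept.pop(0))  # oldest entry is first in the list
    return kept, archived


history = ["a" * 40, "b" * 40, "c" * 40]  # three 40-character entries, oldest first
kept, archived = archive_oldest(history, limit=100)
print(len(kept), "kept;", len(archived), "to archive")
print(f"block now at {usage_ratio(chr(10).join(kept), 100):.0%} of its limit")
```

In an agent, the `archived` entries would be the ones written to archival memory before the block is rewritten with the `kept` entries.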
### 5. Concurrency Patterns
When multiple agents share memory blocks, or a single agent processes concurrent requests:
**Safest operations:**
- `memory_insert`: Append-only, minimal race conditions
- Database uses PostgreSQL row-level locking
**Risk of race conditions:**
- `memory_replace`: Target string may change before write
- `memory_rethink`: Last-writer-wins, no merge
**Best practices:**
- Design for append operations when possible
- Use `memory_insert` for concurrent writes
- Reserve `memory_rethink` for single-agent exclusive access
Consult `references/concurrency.md` for patterns.
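The difference between the three operations is easiest to see with a toy model. The `Block` class below is a hypothetical in-memory stand-in, not the Letta implementation; its methods only imitate the semantics described above (append, targeted replace, whole-block rewrite), and the interleaving is written out by hand instead of using threads.

```python
class Block:
    """Toy stand-in for a shared memory block (illustrative only)."""

    def __init__(self, value: str = ""):
        self.value = value

    def insert(self, text: str) -> None:
        # Append-only: independent writers cannot clobber each other.
        self.value = f"{self.value}\n{text}" if self.value else text

    def replace(self, old: str, new: str) -> None:
        # Depends on the target string still being present at write time.
        if old not in self.value:
            raise ValueError(f"target not found: {old!r}")
        self.value = self.value.replace(old, new, 1)

    def rethink(self, new_value: str) -> None:
        # Whole-block rewrite: last writer wins, nothing is merged.
        self.value = new_value


shared = Block("status: open")
snapshot_a = shared.value  # agent A reads the block
snapshot_b = shared.value  # agent B reads the same state

shared.insert("agent_a: customer prefers email")
shared.insert("agent_b: refund approved")
after_inserts = shared.value  # both appends survive

shared.replace("status: open", "status: pending")  # agent A succeeds
replace_error = None
try:
    shared.replace("status: open", "status: closed")  # agent B's target is stale
except ValueError as err:
    replace_error = str(err)

shared.rethink(snapshot_a + "\nsummary by A")
shared.rethink(snapshot_b + "\nsummary by B")  # discards A's rewrite and both appends

print(after_inserts)
print(replace_error)
print(shared.value)
```

In this model both appends survive, the second replace fails on a stale target, and the second rewrite silently discards everything written since agent B's snapshot. This is why the best practices above favor append-style writes for shared blocks.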
## Validation Questions
Before finalizing memory architecture:
- Is core memory total under 80% of context window?
- Is each block focused on one functional area?
- Are descriptions clear about when to read/write?
- Have you planned for size growth and overflow?
- If multi-agent, are concurrency patterns considered?
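Some of these questions can be turned into an automated pre-flight check. The sketch below is a rough illustration under stated assumptions: `validate_architecture` is a hypothetical helper, the context window is expressed as a character budget supplied by the caller (converting tokens to characters is left out), and the five-word minimum for descriptions and the ten-block ceiling are arbitrary thresholds.

```python
def validate_architecture(blocks: dict[str, dict],
                          context_window_chars: int,
                          max_blocks: int = 10,
                          min_description_words: int = 5) -> list[str]:
    """Return warnings for a planned set of core memory blocks."""
    warnings = []
    total = sum(len(spec.get("value", "")) for spec in blocks.values())
    if total >= 0.8 * context_window_chars:
        warnings.append(
            f"core memory uses {total} of {context_window_chars} chars (over the 80% budget)"
        )
    if len(blocks) >= max_blocks:
        warnings.append(f"{len(blocks)} blocks defined; consider consolidating")
    for label, spec in blocks.items():
        # Very short descriptions rarely say when to read or write the block.
        if len(spec.get("description", "").split()) < min_description_words:
            warnings.append(f"{label}: description too short to guide reads/writes")
    return warnings


plan = {
    "persona": {
        "description": "Agent identity and behavioral guidelines. Reference every turn.",
        "value": "x" * 500,
    },
    "data": {"description": "Contains data", "value": "y" * 700},
}
for warning in validate_architecture(plan, context_window_chars=1000):
    print(warning)
```

The remaining questions, such as whether each block covers a single functional area or whether concurrency has been considered, still need human judgment.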
## Common Antipatterns to Avoid
**Too few blocks:**
Bad (everything in one block): `agent_memory: "Agent is helpful. User is John..."`
Split into focused blocks instead.
**Too many blocks:**
Creating 10+ blocks when 3-4 would suffice. Start minimal, expand as needed.
**Poor descriptions:**
Bad: `data: "Contains data"`
Provide actionable guidance instead.
**Ignoring size limits:**
Letting blocks grow indefinitely until they hit limits. Monitor and manage proactively.
## Next Steps
After architecture design:
- Create memory blocks via ADE or API
- Test agent behavior with representative queries
- Monitor memory tool usage patterns
- Iterate on structure based on actual usage
## references/memory-types.md
```markdown
# Memory Types in Letta
## Core Memory (In-Context)
**What it is:**
- Structured sections of agent's context window
- Always visible to agent
- Persists across all interactions
**When to use:**
- Current state and active context
- Frequently referenced information
- User preferences and agent identity
- Information that should be immediately accessible
**Structure:**
- Label: unique identifier
- Description: guides agent usage
- Value: the actual data
- Limit: character limit per block
**Access:**
- Always in context (no tool call needed)
- Agent can edit via memory_insert, memory_replace, memory_rethink
**Best for:**
- User profile information
- Agent personality and guidelines
- Current project/task context
- Active conversation metadata
## Archival Memory (Out-of-Context)
**What it is:**
- Vector database storing embedded content
- Semantic search over historical data
- Not automatically included in context
**When to use:**
- Large knowledge bases
- Historical interaction records
- Past project notes
- Reference documentation
**Access:**
- Agent must explicitly call archival_memory_search
- Results brought into context on demand
- Agent can add via archival_memory_insert
**Important notes:**
- NOT auto-populated from context overflow
- Must be explicitly added by agent
- Separate from memory blocks (not connected)
**Best for:**
- Past conversation summaries
- Historical customer interactions
- Large documentation sets
- Long-term knowledge accumulation
## Conversation History
**What it is:**
- Past messages from current conversation
- Moves out of context window as conversation grows
- Stored in database, searchable
**When to use:**
- Referencing earlier discussion
- Tracking conversation flow
- Finding specific past exchanges
**Access:**
- Agent calls conversation_search tool
- Can filter by date range, role, query
**Automatic behavior:**
- Messages automatically move to history when context full
- Agent can trigger summarization when needed
**Best for:**
- Multi-turn conversation context
- Tracking what was already discussed
- Finding specific user requests
## Memory Type Selection Guide
| Use Case | Core Memory | Archival Memory | Conversation History |
|----------|-------------|-----------------|----------------------|
| User name and preferences | ✓ | | |
| Agent personality | ✓ | | |
| Current task status | ✓ | | |
| Last 10 messages | (in context) | | |
| Messages 50+ turns ago | | | ✓ |
| Past project notes | | ✓ | |
| Large documentation | | ✓ | |
| Company policies | ✓ | | |
| Historical customer data | | ✓ | |
## Common Misconceptions
**Myth:** "Archival memory is automatically populated when context overflows"
**Reality:** Archival memory must be explicitly added via archival_memory_insert tool
**Myth:** "Memory blocks and archival memory are connected"
**Reality:** They are completely separate systems
**Myth:** "Conversation history is lost when context fills"
**Reality:** It's stored in database and accessible via conversation_search
```
## references/memory-patterns.md
# Memory Block Patterns by Domain
## Customer Support Agent
```yaml
persona:
  label: persona
  description: "Your role as customer support agent, including communication style and escalation criteria."
  value: |
    I am a customer support agent for [Company]. I respond professionally,
    empathetically, and efficiently. I escalate to humans when: [criteria].
company_policies:
  label: company_policies
  description: "Company policies and procedures. Reference when handling returns, warranties, or service requests."
  read_only: true
  value: |
    Return Policy: [details]
    Warranty: [details]
    Service Terms: [details]
product_knowledge:
  label: product_knowledge
  description: "Product features and common solutions. Update when learning new troubleshooting steps."
  value: |
    Product A: [features, common issues]
    Product B: [features, common issues]
customer:
  label: customer
  description: "Current customer's information and interaction history. Update as you learn more about them."
  value: |
    Name: [extracted from conversation]
    Issue: [current problem]
    History: [relevant past interactions]
```
## Coding Assistant
```yaml
persona:
  label: persona
  description: "Your approach to coding assistance and communication with developers."
  value: |
    I help write clean, maintainable code. I explain my reasoning,
    suggest best practices, and ask clarifying questions.
project_context:
  label: project_context
  description: "Current project architecture and goals. Update as you learn about the codebase."
  value: |
    Project: [name]
    Stack: [technologies]
    Architecture: [patterns]
    Current Focus: [active work]
coding_standards:
  label: coding_standards
  description: "Team coding standards and review checklist. Reference before making suggestions."
  read_only: true
  value: |
    Style Guide: [details]
    Testing Requirements: [coverage, patterns]
    Review Checklist: [items]
current_task:
  label: current_task
  description: "Active coding task and implementation progress. Update as work advances."
  value: |
    Task: [description]
    Approach: [planned solution]
    Progress: [completed steps]
    Blockers: [current issues]
```
## Personal Assistant
```yaml
persona:
  label: persona
  description: "Your role as personal assistant and communication preferences."
  value: |
    I help manage your schedule, remind you of tasks, and provide
    proactive assistance. I'm concise unless you ask for detail.
```