SKILL.md for Letta

Here’s a focused memory architecture design skill:

SKILL.md

---
name: letta-memory-architect
description: Guide for designing effective memory architectures in Letta agents. Use when users need help structuring memory blocks, choosing between memory types, or optimizing memory management patterns.
license: MIT
---

# Letta Memory Architect

This skill guides the design of effective memory architectures for Letta agents, including memory block structure, memory type selection, and concurrency patterns.

## When to Use This Skill

Use this skill when users are:
- Designing memory block structure for a new agent
- Choosing between core memory, archival memory, and conversation history
- Optimizing memory block organization for performance
- Implementing shared memory between agents
- Debugging memory-related issues (size limits, concurrency)

## Memory Architecture Process

### 1. Memory Type Selection

Consult `references/memory-types.md` for a detailed comparison. Quick guidance:

**Core Memory (in-context):**
- Always accessible in agent's context window
- Use for: current state, active context, frequently referenced information
- Limit: Keep total core memory under 80% of context window

**Archival Memory (out-of-context):**
- Semantic search over vector database
- Use for: historical records, large knowledge bases, past interactions
- Access: Agent must explicitly call archival_memory_search
- Note: NOT automatically populated from context overflow

**Conversation History:**
- Past messages from current conversation
- Retrieved via conversation_search tool
- Use for: referencing earlier discussion, tracking conversation flow
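
The 80% guideline for core memory can be checked mechanically. The sketch below is a rough estimate only: it assumes about 4 characters per token (a rule of thumb, not a Letta constant), and `core_memory_budget` and its parameters are hypothetical names, not part of the Letta API.

```python
# Rough core-memory budget check. The 4-characters-per-token ratio is an
# assumption; use a real tokenizer when accurate numbers matter.
CHARS_PER_TOKEN = 4  # heuristic, not a Letta constant


def core_memory_budget(blocks: dict[str, str], context_window_tokens: int,
                       max_fraction: float = 0.8) -> dict:
    """Estimate core memory usage and compare it to a share of the context window."""
    total_chars = sum(len(value) for value in blocks.values())
    estimated_tokens = total_chars // CHARS_PER_TOKEN
    budget_tokens = int(context_window_tokens * max_fraction)
    return {
        "estimated_tokens": estimated_tokens,
        "budget_tokens": budget_tokens,
        "within_budget": estimated_tokens <= budget_tokens,
    }


# Hypothetical example: two 2000-character blocks in an 8000-token window.
report = core_memory_budget(
    {"persona": "x" * 2000, "human": "y" * 2000},
    context_window_tokens=8000,
)
print(report)
```

Running a check like this whenever blocks are added or resized gives an early warning before core memory crowds out the conversation itself.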

### 2. Memory Block Design

**Core principle:** One block per distinct functional unit.

**Essential blocks:**
- `persona`: Agent identity, behavioral guidelines, capabilities
- `human`: User information, preferences, context

**Add domain-specific blocks based on use case:**

For customer support:
```yaml
company_policies:
  description: "Company policies and procedures. Reference when handling customer requests."
  read_only: true

product_knowledge:
  description: "Product features and common issues. Update when learning new solutions."
  read_only: false

customer:
  description: "Current customer's context and history. Update as you learn more about them."
  read_only: false
```

For coding assistants:
```yaml
project_context:
  description: "Current project architecture and active tasks. Update as project evolves."

coding_standards:
  description: "Team's coding standards and review checklist. Reference before code suggestions."
  read_only: true

current_task:
  description: "Active task and implementation progress. Update as work progresses."
```

See `references/memory-patterns.md` for more domain examples.

### 3. Label and Description Best Practices

**Labels:**
- Use underscores, not spaces: `brand_guidelines`, not `brand guidelines`
- Keep short and descriptive: `customer_profile`, `project_context`
- Think like variable names

**Descriptions:**
Use instructional style for blocks the agent actively manages:

**Good:**

"Brand tone and style guidelines. Reference this when generating content to ensure consistency with brand identity."

**Poor:**

"Contains brand information"

**Template for active blocks:**

[What this block contains]. [When to reference it]. [When/how to update it].

Consult `references/description-patterns.md` for examples.
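
The label rules and the description template can be turned into small checks. This is a minimal sketch: `is_valid_label`, `build_description`, and the 30-character cap are hypothetical choices for illustration, not Letta requirements.

```python
import re

# Labels should read like variable names: lowercase words joined by underscores.
LABEL_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def is_valid_label(label: str, max_length: int = 30) -> bool:
    """Return True if the label is a short snake_case identifier.

    max_length is an arbitrary illustrative cap, not a Letta limit.
    """
    return len(label) <= max_length and LABEL_PATTERN.fullmatch(label) is not None


def build_description(contains: str, reference_when: str, update_when: str = "") -> str:
    """Fill the template: [What it contains]. [When to reference]. [When/how to update]."""
    parts = [contains, reference_when, update_when]
    # Skip empty parts, e.g. read-only blocks have no update guidance.
    return " ".join(p.strip().rstrip(".") + "." for p in parts if p.strip())


label_ok = is_valid_label("brand_guidelines")
label_bad = is_valid_label("brand guidelines")
description = build_description(
    "Brand tone and style guidelines",
    "Reference this when generating content",
)
print(label_ok, label_bad)
print(description)
```

Building descriptions from three named parts makes it harder to end up with a bare "Contains data" style description.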

### 4. Size Management

**Character limits per block:**
- Typical limit: 2000-5000 characters
- Monitor via block size in ADE or API

**When approaching limits:**
1. **Split by topic:** `customer_profile` → `customer_business`, `customer_preferences`
2. **Split by time:** `interaction_history` → `recent_interactions`, archive older entries to archival memory
3. **Archive historical data:** Move old information to archival memory
4. **Consolidate with `memory_rethink`:** Summarize and rewrite the block

See `references/size-management.md` for strategies.
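
As one way to picture the "split by time" strategy, the sketch below keeps the newest entries that fit a character limit and returns the rest as candidates for archival memory. `split_by_time` is a hypothetical helper and the entries are made-up sample data; actually writing the overflow to archival memory is left to the agent's own tools.

```python
def split_by_time(entries: list[str], char_limit: int) -> tuple[list[str], list[str]]:
    """Keep the newest entries that fit within char_limit; return (kept, overflow).

    Entries are assumed to be ordered oldest to newest.
    """
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest so recent entries get the space first.
    for entry in reversed(entries):
        cost = len(entry) + 1  # +1 for the newline that joins entries
        if used + cost > char_limit:
            break
        kept.append(entry)
        used += cost
    kept.reverse()
    overflow = entries[: len(entries) - len(kept)]
    return kept, overflow


# Made-up sample entries, oldest first.
history = [
    "2024-01: opened account",
    "2024-03: billing question",
    "2024-06: upgrade request",
]
recent, to_archive = split_by_time(history, char_limit=55)
print(recent)
print(to_archive)
```

Here `recent` would stay in the core memory block, while `to_archive` holds the older entries to move out before the block hits its limit.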

### 5. Concurrency Patterns

When multiple agents share memory blocks, or a single agent processes concurrent requests:

**Safest operations:**
- `memory_insert`: Append-only, minimal race conditions
- Database uses PostgreSQL row-level locking

**Risk of race conditions:**
- `memory_replace`: Target string may change before the write
- `memory_rethink`: Last-writer-wins, no merge

**Best practices:**
- Design for append operations when possible
- Use `memory_insert` for concurrent writes
- Reserve `memory_rethink` for single-agent exclusive access

Consult `references/concurrency.md` for patterns.
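
The difference between these operations can be illustrated with a toy model. The `Block` class below is a hypothetical in-memory stand-in, not the Letta API, and the "concurrency" is simulated by interleaving two writers sequentially after both have read the same snapshot.

```python
class Block:
    """In-memory stand-in for a shared memory block (hypothetical)."""

    def __init__(self, value: str = "") -> None:
        self.value = value

    def insert(self, text: str) -> None:
        # Append-only: does not depend on what the caller read earlier.
        self.value = (self.value + "\n" + text).strip()

    def rethink(self, new_value: str) -> None:
        # Full rewrite: whatever was stored before is discarded.
        self.value = new_value

    def replace(self, old: str, new: str) -> bool:
        # Fails if the target string changed since the caller last read it.
        if old not in self.value:
            return False
        self.value = self.value.replace(old, new, 1)
        return True


# Two agents read the same snapshot, then both rewrite the whole block.
shared = Block("status: open")
snapshot_a = shared.value
snapshot_b = shared.value
shared.rethink(snapshot_a + "\nagent_a: contacted customer")
shared.rethink(snapshot_b + "\nagent_b: issued refund")
rethink_result = shared.value  # agent_a's note is lost (last writer wins)

# The same two updates as appends both survive.
shared = Block("status: open")
shared.insert("agent_a: contacted customer")
shared.insert("agent_b: issued refund")
insert_result = shared.value

# A replace based on a stale read finds nothing to replace.
shared.replace("status: open", "status: pending")
stale_replace_ok = shared.replace("status: open", "status: closed")

print(rethink_result)
print(insert_result)
print(stale_replace_ok)
```

In this toy model the rewrite path silently drops one agent's update, the append path keeps both, and the stale replace fails visibly, which is why append-style writes are the safer default for shared blocks.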

## Validation Questions

Before finalizing memory architecture:

1. Is core memory total under 80% of context window?
2. Is each block focused on one functional area?
3. Are descriptions clear about when to read/write?
4. Have you planned for size growth and overflow?
5. If multi-agent, are concurrency patterns considered?

## Common Antipatterns to Avoid

**Too few blocks:**
```yaml
# Bad: Everything in one block
agent_memory: "Agent is helpful. User is John..."
```
Split into focused blocks instead.

**Too many blocks:**
Creating 10+ blocks when 3-4 would suffice. Start minimal, expand as needed.

**Poor descriptions:**
```yaml
# Bad
data: "Contains data"
```
Provide actionable guidance instead.

**Ignoring size limits:**
Letting blocks grow indefinitely until they hit limits. Monitor and manage proactively.

## Next Steps

After architecture design:

1. Create memory blocks via ADE or API
2. Test agent behavior with representative queries
3. Monitor memory tool usage patterns
4. Iterate on structure based on actual usage

## references/memory-types.md

# Memory Types in Letta

## Core Memory (In-Context)

**What it is:**
- Structured sections of agent's context window
- Always visible to agent
- Persists across all interactions

**When to use:**
- Current state and active context
- Frequently referenced information
- User preferences and agent identity
- Information that should be immediately accessible

**Structure:**
- Label: unique identifier
- Description: guides agent usage
- Value: the actual data
- Limit: character limit per block

**Access:**
- Always in context (no tool call needed)
- Agent can edit via memory_insert, memory_replace, memory_rethink

**Best for:**
- User profile information
- Agent personality and guidelines
- Current project/task context
- Active conversation metadata

## Archival Memory (Out-of-Context)

**What it is:**
- Vector database storing embedded content
- Semantic search over historical data
- Not automatically included in context

**When to use:**
- Large knowledge bases
- Historical interaction records
- Past project notes
- Reference documentation

**Access:**
- Agent must explicitly call archival_memory_search
- Results brought into context on demand
- Agent can add via archival_memory_insert

**Important notes:**
- NOT auto-populated from context overflow
- Must be explicitly added by agent
- Separate from memory blocks (not connected)

**Best for:**
- Past conversation summaries
- Historical customer interactions
- Large documentation sets
- Long-term knowledge accumulation

## Conversation History

**What it is:**
- Past messages from current conversation
- Moves out of context window as conversation grows
- Stored in database, searchable

**When to use:**
- Referencing earlier discussion
- Tracking conversation flow
- Finding specific past exchanges

**Access:**
- Agent calls conversation_search tool
- Can filter by date range, role, query

**Automatic behavior:**
- Messages automatically move to history when context full
- Agent can trigger summarization when needed

**Best for:**
- Multi-turn conversation context
- Tracking what was already discussed
- Finding specific user requests

## Memory Type Selection Guide

| Use Case | Core Memory | Archival Memory | Conversation History |
|----------|-------------|-----------------|----------------------|
| User name and preferences | ✓ | | |
| Agent personality | ✓ | | |
| Current task status | ✓ | | |
| Last 10 messages | (in context) | | |
| Messages 50+ turns ago | | | ✓ |
| Past project notes | | ✓ | |
| Large documentation | | ✓ | |
| Company policies | ✓ | | |
| Historical customer data | | ✓ | |

## Common Misconceptions

**Myth:** "Archival memory is automatically populated when context overflows"
**Reality:** Archival memory must be explicitly added via archival_memory_insert tool

**Myth:** "Memory blocks and archival memory are connected"
**Reality:** They are completely separate systems

**Myth:** "Conversation history is lost when context fills"
**Reality:** It's stored in database and accessible via conversation_search

## references/memory-patterns.md

# Memory Block Patterns by Domain

## Customer Support Agent

```yaml
persona:
  label: persona
  description: "Your role as customer support agent, including communication style and escalation criteria."
  value: |
    I am a customer support agent for [Company]. I respond professionally,
    empathetically, and efficiently. I escalate to humans when: [criteria].

company_policies:
  label: company_policies
  description: "Company policies and procedures. Reference when handling returns, warranties, or service requests."
  read_only: true
  value: |
    Return Policy: [details]
    Warranty: [details]
    Service Terms: [details]

product_knowledge:
  label: product_knowledge
  description: "Product features and common solutions. Update when learning new troubleshooting steps."
  value: |
    Product A: [features, common issues]
    Product B: [features, common issues]

customer:
  label: customer
  description: "Current customer's information and interaction history. Update as you learn more about them."
  value: |
    Name: [extracted from conversation]
    Issue: [current problem]
    History: [relevant past interactions]
```

## Coding Assistant

```yaml
persona:
  label: persona
  description: "Your approach to coding assistance and communication with developers."
  value: |
    I help write clean, maintainable code. I explain my reasoning,
    suggest best practices, and ask clarifying questions.

project_context:
  label: project_context
  description: "Current project architecture and goals. Update as you learn about the codebase."
  value: |
    Project: [name]
    Stack: [technologies]
    Architecture: [patterns]
    Current Focus: [active work]

coding_standards:
  label: coding_standards
  description: "Team coding standards and review checklist. Reference before making suggestions."
  read_only: true
  value: |
    Style Guide: [details]
    Testing Requirements: [coverage, patterns]
    Review Checklist: [items]

current_task:
  label: current_task
  description: "Active coding task and implementation progress. Update as work advances."
  value: |
    Task: [description]
    Approach: [planned solution]
    Progress: [completed steps]
    Blockers: [current issues]
```

## Personal Assistant

persona:
  label: persona
  description: "Your role as personal assistant and communication preferences."
  value: |
    I help manage your schedule, remind you of tasks, and provide
    proactive assistance. I'm concise unless you ask for detail.