Architecture Review: Dispatcher-Worker Pattern using Archival Memory as a Routing Table

Hi Letta Community, I’m building a high-volume triage system for a Professional Services Firm and moving to a Dispatcher-Worker pattern. I need a check on my memory strategy.

The Architecture:

  1. Receptionist (Dispatcher): Triage only. Identifies Client/Topic, assigns Case ID, routes payload to a Worker.

  2. Workers (Case Agents): Spawned from templates (one per Case) for deep processing.

Memory Strategy (Receptionist):

I use Letta’s native archival_memory as a “Case Registry” (Routing Table) instead of external SQL.

  • Pattern: Append-only log via archival_memory_insert.

  • Format: Raw JSON strings.

  • Entry: {"type": "MAP", "case_id": "SXP-1", "client": "Acme", "worker_id": "uuid-5"}

  • Logic: Input → archival_memory_search (Query: Client) → Route to existing Worker OR Spawn New → Insert JSON mapping.
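
For concreteness, here is a rough sketch of that routing loop in plain Python, with stub functions standing in for archival_memory_search, archival_memory_insert, and template spawning (all names and shapes below are illustrative, not the Letta API):

```python
import json

# Append-only list of JSON strings, mimicking archival memory entries.
registry = []

def archival_search(client_name):
    """Stub for archival_memory_search: exact client match instead of semantic search."""
    hits = []
    for raw in registry:
        entry = json.loads(raw)
        if entry.get("type") == "MAP" and entry.get("client") == client_name:
            hits.append(entry)
    return hits

def spawn_worker(case_id):
    """Stub: in the real system this would create an agent from a template."""
    return f"uuid-for-{case_id}"

def route(client_name, case_id):
    hits = archival_search(client_name)
    if hits:
        return hits[0]["worker_id"]      # route to the existing Worker
    worker_id = spawn_worker(case_id)    # otherwise spawn a new Worker
    registry.append(json.dumps({         # and insert the JSON mapping
        "type": "MAP", "case_id": case_id,
        "client": client_name, "worker_id": worker_id,
    }))
    return worker_id
```

The point of the sketch is just the branch: search first, reuse the mapped Worker on a hit, and only insert a new mapping on a miss.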

Questions:

  1. Scalability: Is archival_memory safe for 1,000+ routing entries? Can semantic search reliably distinguish “Case A” vs “Case B” for the same client (based on JSON fields), or is external SQL/PGVector mandatory here?

  2. Context Pollution: Searching for a client retrieves multiple JSON chunks (history). Any patterns to summarize/clean these indices so the Dispatcher’s context isn’t flooded?

  3. Files vs DB: file_system (.md files) used too many tokens. Is treating archival_memory as a NoSQL dump (JSON strings via insert/search) the recommended “Letta way” for structured data?

  4. Memory Access: Can Worker Agents read the Dispatcher’s Archival Memory (Registry) to avoid data duplication, or is memory strictly isolated?

Thanks!

This seems perfectly reasonable to me. Is the receptionist a single agent? If so, I’d expect this agent to get better over time.

I should note that there are concurrency issues you may have to address. If the receptionist receives a large volume of messages, it may become confused, since Letta agents are stateful and do not handle concurrent requests easily. You did mention high volume, which suggests you may need to template this agent as well, or maintain some kind of “pool” of receptionist agents with shared memory.

If you can provide a little more info there, that’d be useful.

I would actually use external SQL here, since these are structured records with a known schema. They’re also important enough that you probably want something you can easily diagnose outside of the agent.

Answers

  1. Scalability: It’s safe in that it can handle all of these entries, but semantic search is probably not the way to go here. It seems like you’re mostly trying to search by known content like type, case ID, client, etc. These should be in a database. If you have content that is related to those entries, you can put it in archival, but it’s not quite clear to me that this is the optimal approach.

  2. Context Pollution: If you mean archival search, this is expected. You could use tags for archival memories to search only across categories of memories, but in general you should always expect to retrieve multiple entries.

  3. Files vs DB: No, I would actually say that structured data like this should go into a database. Archival memory can still be used for the agent to remember things related to those entries, but make sure that the case ID and other identifiers are included in the memory when the agent inserts it.

  4. Memory Access: Yes! Archives can now be shared: how to share archives between letta agents · GitHub

@ezra feel free to chime in here if there’s anything I missed.

Cameron covered the key points well. A few additional notes:

Archive sharing constraint:
letta_v1 agents currently have a one-archive-per-agent limit (UNIQUE constraint on the join table). If you want Workers to read the Dispatcher’s registry archive, you’d attach that same archive to each Worker. This works, but means Workers can’t have their own separate archive for case-specific memories. Workaround: store case-specific content in the shared archive with case_id metadata for filtering, or use core memory blocks for worker-specific state.

Worker lifecycle:
Ephemeral agents aren’t a first-class feature yet. You’d manage worker creation/deletion client-side:

# Spawn worker from template
worker = client.templates.agents.create(template_id="...", name=f"worker-{case_id}")
# ... process case ...
# Optionally delete when case closes
client.agents.delete(worker.id)

Receptionist pooling pattern:
If you do pool multiple receptionists with shared memory, a shared core memory block (not archival) works well for real-time routing state. Archival has retrieval latency; a shared block like active_cases is always in-context for instant lookups on small datasets (<100 active cases).
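
A minimal sketch of that pattern, assuming the shared active_cases block simply holds a JSON map from client name to case/worker info (the block label and value shape are my assumptions, not a prescribed Letta schema); in practice each pooled Receptionist would attach the same block, and updates would go through a core-memory edit tool:

```python
import json

def add_case(block_value: str, client: str, case_id: str, worker_id: str) -> str:
    """Return the new block value with one case added (block stores a JSON map)."""
    cases = json.loads(block_value or "{}")
    cases[client] = {"case_id": case_id, "worker_id": worker_id}
    return json.dumps(cases)

def close_case(block_value: str, client: str) -> str:
    """Return the new block value with the client's case removed."""
    cases = json.loads(block_value or "{}")
    cases.pop(client, None)
    return json.dumps(cases)
```

Because the block is always in-context, a lookup is just the agent reading its own memory, with no retrieval step at all.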

Tagging for archival search:
If you do keep some data in archival, use tags to partition:

archival_memory_insert(
    content='{"case_id": "SXP-1", "client": "Acme", ...}',
    tags=["routing", "acme"]
)
# Then search with tag filter
archival_memory_search(query="Acme", tags=["routing"])

This reduces noise from case-related memories bleeding into routing lookups.

Hi @Cameron and @Ezra,

Thank you both for the initial feedback. I realized I might have oversimplified my use case in the first post, so I’d like to provide more context to clarify the architectural choices.

Use Case Description

  1. Workflow Lifecycle: The Professional Services Firm (PSF) operates as follows: Client A emails requesting Service A. The PSF employee then asks for facts/documents via email. This exchange usually lasts 2 weeks to 3 months, involving roughly 10-15 emails. Once the PSF has all the facts, the case moves to quoting/execution. The system’s role ends here. The Worker Agent collects the necessary facts, passes the structured data downstream, and can then be deleted.

  2. Volume: The PSF handles about 500 such cases per year. Let’s assume roughly 125 active cases per quarter.

  3. Why Archival Memory as a Routing Table (vs. SQL)?
    This routing table is not the firm’s master ledger/ERP; it exists solely for the Dispatcher-Worker logic.

    • The Problem: After the first email from Client A, the PSF asks for documents. Client A might forward this request to their external legal counsel. A few days later, a lawyer from that law firm writes (often in a new thread, not a reply chain): “Regarding Client A’s request, here are the documents.” This email comes from a completely different domain.

    • Fuzzy Logic: Or someone writes: “On behalf of James Brown…”. Additionally, company names vary (“CompanyA”, “Company A Inc.”, etc.).

    • Conclusion: In my assessment, strict SQL queries would fail here due to the number of exceptions and the need for entity resolution. I assumed Semantic Search would handle these edge cases much better than SQL WHERE clauses.

  4. Data Structure: I plan to insert the following into archival_memory: case_id, company name, short case description, sender details (email/name), and CCs.

  5. Human-in-the-Loop (HITL): The Worker Agent will always receive the email via human intervention. The Receptionist only suggests the routing, but a human makes the final decision on whether the match is correct.

  6. Pre-filtering: The Receptionist Agent does not receive every email from the inbox. An initial “Gatekeeper” workflow (using a cheaper model like DeepSeek) filters the traffic and forwards only relevant emails to the Receptionist.
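
To make the entity-resolution point concrete, here is a toy comparison using difflib as a crude stand-in for embedding similarity (real semantic search is far more robust, but the failure mode of exact matching is the same):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude string similarity after light normalization (lowercase, strip spaces).
    Illustrative only; semantic search uses embeddings rather than edit distance."""
    norm = lambda s: "".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

# An exact SQL WHERE clause treats these as three different clients:
variants = ["CompanyA", "Company A Inc.", "Company A"]

# A similarity measure (or an embedding search) groups them together:
# similarity("CompanyA", "Company A") normalizes to an exact match,
# and "Company A Inc." still scores high against "CompanyA".
```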

Follow-up Questions based on your feedback

1. Concurrency & Reset Strategy
I plan to queue emails for the Receptionist and trigger a “Reset Messages” (context wipe) after every single email. I built a beta version of this yesterday, and it seems to work.

  • Question: Would it be better to use Agent Templates for the Receptionist as well, utilizing a Shared Archival Memory?

  • Clarification: Do agents spawned from templates effectively share the same archival_memory instance (read/write access)? I know @Ezra mentioned this, but I want to confirm if this works dynamically with templates.

2. Viability of Archival Memory
Given the description above (semantic matching of lawyers/different domains), do you think relying on archival_memory is the right call?

  • Cleanup Issue: When a Worker Agent finishes gathering facts and is deleted, I should also remove the routing entry for that case. Is it possible/efficient to delete entries from archival_memory by case_id (or by specific text content)?

3. Risks at Scale
Assuming the Receptionist’s Archival Memory is cleaned of closed cases, we are looking at max ~125 active entries (one quarter’s worth of work). Even if archival_search retrieves multiple chunks, the LLM should be able to select the correct Case ID from the context. Do you see any major risks with this volume?

4. Tags Usage
Cameron mentioned: “You could use tags for archival memories to search only across categories of memories.”
I read the docs, but I’m not 100% sure how to apply it here.

  • Question: Should I tag entries with status:active and then filter the search by that tag? Could you provide a small logical example?

5. General Validation
Is Letta actually the optimal tool for this specific PSF workflow? I got really excited about your project, and it seemed like the perfect fit for long-running, stateful interactions like this.

Thanks again for your help! :rocket:

Thanks for the additional context - this clarifies the architecture significantly. Let me address your questions:

1. Templates with Shared Archival Memory

Yes, agents created from templates can share the same archival memory if you attach the same archive to each. However, there’s a current constraint worth noting: letta_v1 agents support only one archive attachment per agent (UNIQUE constraint on the join table). So your Receptionist template approach works as long as each spawned Receptionist attaches to the same single archive.

For your queue + reset pattern: that should work well. Reset clears the message buffer but preserves archival memory, so the routing table persists across resets.

2. Viability for Semantic Matching

Your reasoning is sound. Semantic search handles exactly the fuzzy cases you described:

  • “CompanyA” vs “Company A Inc.” vs “Company A Limited”
  • Different email domains (client → external counsel)
  • “On behalf of James Brown…” variations

SQL WHERE clauses would require extensive normalization and exception handling. Archival memory’s semantic search gives you approximate entity resolution “for free” at query time, with your HITL step catching the cases it gets wrong.

3. Deleting Entries by Case ID

Yes, you can delete specific passages. The pattern:

# Search for passages matching the case
passages = client.agents.passages.list(
    agent_id=receptionist_id,
    query_text=f"case_id: {case_id}"
)

# Delete matching passages
for passage in passages:
    client.agents.passages.delete(
        agent_id=receptionist_id,
        passage_id=passage.id
    )

Alternatively, structure your archival entries with a consistent prefix like [CASE-123] to make text-based lookup reliable.
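
A small sketch of that prefix convention (helper names are illustrative); the marker makes cleanup an exact string check, independent of semantic ranking:

```python
def registry_entry(case_id: str, payload: str) -> str:
    """Prepend a fixed, greppable case marker to an archival entry."""
    return f"[CASE-{case_id}] {payload}"

def matches_case(passage_text: str, case_id: str) -> bool:
    """Cheap exact-prefix check, usable when deleting a closed case's entries."""
    return passage_text.startswith(f"[CASE-{case_id}]")

entry = registry_entry("123", "client: Acme Corp, worker: uuid-5")
```

At cleanup time you would list the Receptionist’s passages and delete the ones where matches_case(...) is true.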

4. Scale at 125 Entries

125 active entries is trivially small for archival memory - you won’t hit any performance issues. Even with top_k=10 retrieval, the LLM can easily disambiguate from a handful of candidate cases. Your HITL confirmation step adds an extra safety net.

5. Tags Usage Example

Tags filter searches before semantic matching. For your case:

# When inserting a new case
client.archival_memory.insert(
    agent_id=receptionist_id,
    content="case_id: 456, company: Acme Corp, contact: john@acme.com...",
    tags=["active", "q1-2025"]
)

# When searching (only active cases)
results = client.archival_memory.search(
    agent_id=receptionist_id,
    query="email from external counsel regarding Acme",
    tags=["active"]
)

# When closing a case - delete and optionally re-insert with different tag
# Or just delete entirely since the Worker is done

Tags are particularly useful if you want to keep closed cases for audit/reference without polluting active routing searches.

6. Is Letta the Right Tool?

For long-running, stateful email triage with 2-week to 3-month lifecycles - yes, this is exactly Letta’s sweet spot. The combination of:

  • Persistent memory across interactions
  • Semantic search for fuzzy entity resolution
  • Agent-per-case pattern (Workers)
  • Shared routing state (Receptionist’s archival)

…fits your workflow well. The HITL confirmation and pre-filtering with DeepSeek are smart additions that reduce risk.

One suggestion: consider logging the Receptionist’s routing suggestions + human corrections. Over time this gives you data to tune the system or catch systematic mismatches.
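
A minimal sketch of such a log, appending one JSONL record per routing decision (field names are my own, purely illustrative):

```python
import datetime
import json

def log_routing_decision(path, email_subject, suggested_case, human_case):
    """Append one routing decision (agent suggestion vs. human choice) as JSONL."""
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "subject": email_subject,
        "suggested": suggested_case,
        "final": human_case,
        "corrected": suggested_case != human_case,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Filtering the file for "corrected": true records then gives you exactly the systematic mismatches worth investigating.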