letta-development-guide

Installation

npx skills add https://github.com/letta-ai/skills --skill letta-development-guide

Comprehensive guide for designing and building effective Letta agents with appropriate architectures, memory configurations, model selection, and tool setups.

When to Use This Skill

Use this skill when:

  • Starting a new Letta agent project

  • Choosing between agent architectures (letta_v1_agent vs memgpt_v2_agent)

  • Designing memory block structure and architecture

  • Selecting appropriate models for your use case

  • Planning tool configurations

  • Optimizing memory management and performance

  • Implementing shared memory between agents

  • Debugging memory-related issues

Quick Start Guide

Minimal Working Example

from letta_client import Letta

client = Letta()
agent = client.agents.create(
    name="my-assistant",
    model="openai/gpt-4o",
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "persona", "value": "You are a helpful assistant."},
        {"label": "human", "value": "The user's name and preferences."},
    ],
)

# Send a message
response = client.agents.messages.create(
    agent_id=agent.id,
    messages=[{"role": "user", "content": "Hello!"}],
)
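# A response can hold several message types (reasoning, tool calls, replies),
# so print only the assistant's replies rather than assuming the last item.
for message in response.messages:
    if message.message_type == "assistant_message":
        print(message.content)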

1. Architecture Selection

Use letta_v1_agent when:

  • Building new agents (recommended default)

  • Need compatibility with reasoning models (GPT-4o, Claude Sonnet 4)

  • Want simpler system prompts and direct message generation

Use memgpt_v2_agent when:

  • Maintaining legacy agents

  • Require specific tool patterns not yet supported in v1

For detailed comparison, see references/architectures.md.

2. Memory Architecture Design

Memory is the foundation of effective agents. Letta provides three memory types:

Core Memory (in-context):

  • Always accessible in agent's context window

  • Use for: current state, active context, frequently referenced information

  • Limit: Keep total core memory under 80% of context window

Archival Memory (out-of-context):

  • Semantic search over vector database

  • Use for: historical records, large knowledge bases, past interactions

  • Access: Agent must explicitly call archival_memory_search

  • Note: NOT automatically populated from context overflow

Conversation History:

  • Past messages from current conversation

  • Retrieved via conversation_search tool

  • Use for: referencing earlier discussion, tracking conversation flow

See references/memory-architecture.md for detailed guidance.
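
The 80% guideline for core memory can be checked before an agent is created. The following is a rough sketch, not part of the Letta SDK: the helper names are hypothetical, and the figure of about 4 characters per token is an assumption that varies by model and language.

```python
# Rough budget check for core memory.
CHARS_PER_TOKEN = 4  # assumption: coarse average for English text


def core_memory_usage(blocks, context_window_tokens):
    """Return the estimated fraction of the context window used by core memory."""
    total_chars = sum(len(block["value"]) for block in blocks)
    estimated_tokens = total_chars / CHARS_PER_TOKEN
    return estimated_tokens / context_window_tokens


def within_budget(blocks, context_window_tokens, max_fraction=0.8):
    """True when the estimated core memory stays under max_fraction."""
    return core_memory_usage(blocks, context_window_tokens) < max_fraction


blocks = [
    {"label": "persona", "value": "You are a helpful assistant."},
    {"label": "human", "value": "The user's name and preferences."},
]
usage = core_memory_usage(blocks, context_window_tokens=8000)
print(f"Estimated core memory usage: {usage:.2%}")
print("Within budget:", within_budget(blocks, context_window_tokens=8000))
```

For a precise figure, replace the character heuristic with the tokenizer of the model you actually deploy.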

3. Memory Block Design

Core principle: One block per distinct functional unit.

Essential blocks:

  • persona: Agent identity, behavioral guidelines, capabilities

  • human: User information, preferences, context

Add domain-specific blocks based on use case:

  • Customer support: company_policies, product_knowledge, customer

  • Coding assistant: project_context, coding_standards, current_task

  • Personal assistant: schedule, preferences, contacts

Memory block guidelines:

  • Keep blocks focused and purpose-specific

  • Use clear, instructional descriptions

  • Monitor size limits (typically 2000-5000 characters per block)

  • Design for append operations when sharing memory between agents

See references/memory-patterns.md for domain examples and references/description-patterns.md for writing effective descriptions.
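
As an illustration of the customer support layout above, the sketch below builds a block list and rejects duplicate labels or oversized values before anything is sent to the server. `BlockSpec` and `build_memory_blocks` are hypothetical helpers, the block contents are placeholders, and the `description` and `limit` fields are assumed to be accepted by your SDK version.

```python
from dataclasses import dataclass, asdict


@dataclass
class BlockSpec:
    label: str
    value: str
    description: str
    limit: int = 5000  # characters; upper end of the typical range above


def build_memory_blocks(specs):
    """Validate specs and return plain dicts for a memory_blocks list."""
    seen = set()
    for spec in specs:
        if spec.label in seen:
            raise ValueError(f"duplicate block label: {spec.label}")
        if len(spec.value) > spec.limit:
            raise ValueError(f"block {spec.label!r} exceeds {spec.limit} characters")
        seen.add(spec.label)
    return [asdict(spec) for spec in specs]


support_blocks = build_memory_blocks([
    BlockSpec("persona", "You are a patient support agent.",
              "Agent identity and tone. Read every turn; rarely edited."),
    BlockSpec("company_policies", "Refunds are handled by the billing team.",
              "Policy facts. Read before answering policy questions."),
    BlockSpec("customer", "No details recorded yet.",
              "Facts about the current customer. Update when they share new details."),
])
print([block["label"] for block in support_blocks])
```

The resulting list has the same shape as the `memory_blocks` argument in the Quick Start example.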

4. Model Selection

Match model capabilities to agent requirements:

For production agents:

  • GPT-4o or Claude Sonnet 4 for complex reasoning

  • GPT-4o-mini for cost-efficient general tasks

  • Claude Haiku 3.5 for fast, lightweight operations

  • Gemini 2.0 Flash for balanced speed/capability

Avoid for production:

  • Small Ollama models (<7B parameters) - poor tool calling

  • Models without reliable function calling support

See references/model-recommendations.md for detailed guidance.

5. Tool Configuration

Start minimal: Attach only tools the agent will actively use.

Common starting points:

  • Memory tools (memory_insert, memory_replace, memory_rethink): Core for most agents

  • File system tools: Auto-attached when folders are connected

  • Custom tools: For domain-specific operations (databases, APIs, etc.)

Tool Rules: Use these to enforce call sequencing when needed (e.g., "always call search before answer").

Consult references/tool-patterns.md for common configurations.
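
Tool rules themselves are configured on the agent, and that configuration is not shown here. What can be sketched locally is a test-side check that a recorded sequence of tool calls respects a rule such as "always call search before answer". The function name is hypothetical and the tool names `search` and `answer` come from the example above, not from a real agent.

```python
def violates_order(tool_calls, required_first="search", gated="answer"):
    """Return True if `gated` is called before any `required_first` call."""
    seen_required = False
    for name in tool_calls:
        if name == required_first:
            seen_required = True
        elif name == gated and not seen_required:
            return True
    return False


good_trace = ["search", "answer"]
bad_trace = ["answer", "search"]
print("good trace violates rule:", violates_order(good_trace))
print("bad trace violates rule:", violates_order(bad_trace))
```

A check like this is useful in tests to confirm that the configured rules produce the ordering you intended.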

Advanced Topics

Memory Size Management

When approaching character limits:

  • Split by topic: customer_profile → customer_business, customer_preferences

  • Split by time: interaction_history → recent_interactions, with older entries archived to archival memory

  • Archive historical data: Move old information to archival memory

  • Consolidate with memory_rethink: Summarize and rewrite block

See references/size-management.md for strategies.
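
The "split by time" strategy can be sketched as a pure function that keeps the newest entries within a character budget and returns the rest for archiving. This is a minimal sketch: `split_recent` is a hypothetical helper, it assumes one entry per line ordered oldest to newest, and the entries are made up. Writing the overflow to archival memory is left to your own code.

```python
def split_recent(entries, char_limit):
    """Keep the newest entries that fit in char_limit.

    Returns (recent, to_archive); both preserve oldest-to-newest order.
    """
    recent = []
    used = 0
    for entry in reversed(entries):
        cost = len(entry) + 1  # +1 for the newline that joins entries
        if used + cost > char_limit:
            break
        recent.append(entry)
        used += cost
    recent.reverse()
    to_archive = entries[: len(entries) - len(recent)]
    return recent, to_archive


history = [
    "asked about pricing tiers",
    "reported a login problem",
    "requested an invoice copy",
]
recent, to_archive = split_recent(history, char_limit=55)
print("keep in recent_interactions:", recent)
print("move to archival memory:", to_archive)
```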

Concurrency Patterns

When multiple agents share memory blocks or an agent processes concurrent requests:

Safest operations:

  • memory_insert: Append-only, minimal race conditions

  • Database uses PostgreSQL row-level locking

Risk of race conditions:

  • memory_replace: Target string may change before write

  • memory_rethink: Last-writer-wins, no merge

Best practices:

  • Design for append operations when possible

  • Use memory_insert for concurrent writes

  • Reserve memory_rethink for single-agent exclusive access

Consult references/concurrency.md for detailed patterns.
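
The difference between the three operations can be seen in a small in-memory simulation. The `Block` class below is a stand-in written for this illustration, not Letta's implementation. It assumes a replace fails when its target string is missing, and it models interleaved writers sequentially rather than with real threads.

```python
class Block:
    """In-memory stand-in for a shared memory block."""

    def __init__(self, value=""):
        self.value = value

    def insert(self, text):
        # Append-only: independent writers never overwrite each other.
        self.value = f"{self.value}\n{text}" if self.value else text

    def replace(self, old, new):
        # Fails if another writer changed the target string first.
        if old not in self.value:
            raise ValueError(f"target string not found: {old!r}")
        self.value = self.value.replace(old, new, 1)

    def rethink(self, new_value):
        # Whole-block rewrite: the last writer wins, with no merge.
        self.value = new_value


# Two agents append: both notes survive.
shared = Block("status: open")
shared.insert("agent_a: contacted customer")
shared.insert("agent_b: issued refund")

# Two agents rewrite from the same stale snapshot: agent_a's note is lost.
rewritten = Block("status: open")
snapshot = rewritten.value
rewritten.rethink(snapshot + "\nagent_a: contacted customer")
rewritten.rethink(snapshot + "\nagent_b: issued refund")

# agent_b's replace targets text that agent_a has already changed.
replaced = Block("status: open")
replaced.replace("status: open", "status: pending")
try:
    replaced.replace("status: open", "status: closed")
    replace_failed = False
except ValueError:
    replace_failed = True

print(shared.value)
print(rewritten.value)
print("second replace failed:", replace_failed)
```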

Validation Checklist

Before finalizing your agent design:

Architecture:

  • Does the architecture match the model's capabilities?

  • Is the model appropriate for expected workload and latency requirements?

Memory:

  • Is core memory total under 80% of context window?

  • Is each block focused on one functional area?

  • Are descriptions clear about when to read/write?

  • Have you planned for size growth and overflow?

  • If multi-agent, are concurrency patterns considered?

Tools:

  • Are tools necessary and properly configured?

  • Are memory blocks granular enough for effective updates?

Common Antipatterns

Too few memory blocks:

# Bad: Everything in one block
agent_memory: "Agent is helpful. User is John..."

Split into focused blocks instead.

Too many memory blocks: Creating 10+ blocks when 3-4 would suffice. Start minimal, expand as needed.

Poor descriptions:

# Bad
data: "Contains data"

Provide actionable guidance instead. See references/description-patterns.md.

Ignoring size limits: Letting blocks grow indefinitely until they hit limits. Monitor and manage proactively.
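
The antipatterns above lend themselves to a simple lint pass over a block design. The sketch below is illustrative only: `lint_blocks` is a hypothetical helper, and its thresholds (fewer than 2 blocks, 10 or more blocks, descriptions under 5 words, 90% of the size limit) are assumptions chosen for the example.

```python
def lint_blocks(blocks, max_blocks=10, warn_fraction=0.9):
    """Return a list of warnings for common memory block antipatterns."""
    warnings = []
    if len(blocks) < 2:
        warnings.append("too few blocks: split into focused blocks")
    if len(blocks) >= max_blocks:
        warnings.append(f"{len(blocks)} blocks: consider consolidating")
    for block in blocks:
        label = block["label"]
        description = block.get("description", "")
        if len(description.split()) < 5:
            warnings.append(f"{label}: description too vague to guide reads/writes")
        limit = block.get("limit")
        if limit and len(block["value"]) >= warn_fraction * limit:
            warnings.append(f"{label}: at {len(block['value'])}/{limit} characters")
    return warnings


bad_design = [
    {"label": "data", "value": "x" * 1900,
     "description": "Contains data", "limit": 2000},
]
for warning in lint_blocks(bad_design):
    print(warning)
```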

Implementation Steps

1. Design Phase

  • Choose architecture based on requirements

  • Design memory block structure

  • Select appropriate model

  • Plan tool configuration

2. Creation Phase (SDK)

Python:

from letta_client import Letta

client = Letta()  # Uses LETTA_API_KEY env var

# Create agent with custom memory blocks
agent = client.agents.create(
    name="my-agent",
    model="openai/gpt-4o",  # or "anthropic/claude-sonnet-4-20250514"
    embedding="openai/text-embedding-3-small",
    memory_blocks=[
        {"label": "persona", "value": "You are a helpful assistant..."},
        {"label": "human", "value": "User preferences and context..."},
        {"label": "project", "value": "Current project details..."},
    ],
    description="Agent for helping with X",
)
print(f"Created agent: {agent.id}")

TypeScript:

import Letta from "@letta-ai/letta-client";

const client = new Letta();

const agent = await client.agents.create({
  name: "my-agent",
  model: "openai/gpt-4o",
  embedding: "openai/text-embedding-3-small",
  memoryBlocks: [
    { label: "persona", value: "You are a helpful assistant..." },
    { label: "human", value: "User preferences and context..." },
    { label: "project", value: "Current project details..." },
  ],
  description: "Agent for helping with X",
});
console.log(`Created agent: ${agent.id}`);

Note: Letta Code CLI (letta command) creates agents interactively. Use letta --new-agent to start fresh, then /rename and /description to configure.

3. Testing Phase

  • Test with representative queries

  • Monitor memory tool usage patterns

  • Verify tool calling behavior
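
One way to monitor memory tool usage during testing is to tally tool calls by name across a response. The sketch below is hedged: it assumes tool-call messages expose `message_type == "tool_call_message"` and a `tool_call.name` attribute, and it uses stand-in objects so it runs without a server. Confirm the attribute names against your SDK version.

```python
from collections import Counter
from types import SimpleNamespace


def count_tool_calls(messages):
    """Count tool calls by name across a list of response messages."""
    counts = Counter()
    for message in messages:
        if getattr(message, "message_type", None) == "tool_call_message":
            counts[message.tool_call.name] += 1
    return counts


# Stand-in messages; with the SDK, pass response.messages instead.
fake_messages = [
    SimpleNamespace(message_type="tool_call_message",
                    tool_call=SimpleNamespace(name="memory_insert")),
    SimpleNamespace(message_type="tool_return_message"),
    SimpleNamespace(message_type="tool_call_message",
                    tool_call=SimpleNamespace(name="memory_insert")),
    SimpleNamespace(message_type="tool_call_message",
                    tool_call=SimpleNamespace(name="memory_replace")),
    SimpleNamespace(message_type="assistant_message", content="Done."),
]
tool_usage = count_tool_calls(fake_messages)
print(tool_usage.most_common())
```

Tallying a set of representative queries shows whether the agent is leaning on `memory_rethink` where `memory_insert` would be safer, or never touching a block at all.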

4. Iteration Phase

  • Refine memory block structure based on actual usage

  • Optimize system instructions

  • Adjust tool configurations

References

For detailed information on specific topics, consult the reference materials:

  • references/architectures.md - Architecture comparison and selection

  • references/memory-architecture.md - Memory types and when to use them

  • references/memory-patterns.md - Domain-specific memory block examples

  • references/description-patterns.md - Writing effective block descriptions

  • references/size-management.md - Managing memory block size limits

  • references/concurrency.md - Multi-agent memory sharing patterns

  • references/model-recommendations.md - Model selection guidance

  • references/tool-patterns.md - Common tool configurations
