Self-Improving Agent

"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." (Based on 2025 lifelong learning research)

Overview

This is a **universal self-improvement system** that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:
- **Multi-Memory Architecture**: Semantic + Episodic + Working memory
- **Self-Correction**: Detects and fixes skill guidance errors
- **Self-Validation**: Periodically verifies skill accuracy
- **Hooks Integration**: Auto-triggers on skill events (before_start, after_complete, on_error)
- **Evolution Markers**: Traceable changes with source attribution

Research-Based Design

Based on 2025 research:

| Research | Key Insight | Application |
|----------|-------------|-------------|
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |

The Self-Improvement Loop

```
┌─────────────────────────────────────────────────────────────────┐
│                   UNIVERSAL SELF-IMPROVEMENT                    │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Skill Event → Extract Experience → Abstract Pattern → Update   │
│       │                │                   │             │      │
│       ▼                ▼                   ▼             ▼      │
│  ┌─────────────────────────────────────────────────────┐        │
│  │                 MULTI-MEMORY SYSTEM                 │        │
│  ├─────────────────────────────────────────────────────┤        │
│  │ Semantic Memory  │ Episodic Memory │ Working Memory │        │
│  │ (Patterns/Rules) │ (Experiences)   │ (Current)      │        │
│  │ memory/semantic/ │ memory/episodic/│ memory/working/│        │
│  └─────────────────────────────────────────────────────┘        │
│                                                                 │
│  ┌─────────────────────────────────────────────────────┐        │
│  │                    FEEDBACK LOOP                    │        │
│  │  User Feedback → Confidence Update → Pattern Adapt  │        │
│  └─────────────────────────────────────────────────────┘        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

When This Activates

Automatic Triggers (via hooks)

| Event | Trigger | Action |
|-------|---------|--------|
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit | Capture error context, trigger self-correction |

Manual Triggers

- User says "自我进化" (self-evolve), "self-improve", or "从经验中学习" (learn from experience)
- User says "分析今天的经验" (analyze today's experience) or "总结教训" (summarize lessons learned)
- User asks to improve a specific skill

Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

| Trigger | Target Skill | Priority | Action |
|---------|--------------|----------|--------|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |

Multi-Memory Architecture

1. Semantic Memory (`memory/semantic-patterns.json`)

Stores abstract patterns and rules reusable across contexts:

```json
{
  "patterns": {
    "pattern_id": {
      "id": "pat-2025-01-11-001",
      "name": "Pattern Name",
      "source": "user_feedback|implementation_review|retrospective",
      "confidence": 0.95,
      "applications": 5,
      "created": "2025-01-11",
      "category": "prd_structure|react_patterns|async_patterns|...",
      "pattern": "One-line summary",
      "problem": "What problem does this solve?",
      "solution": { ... },
      "quality_rules": [ ... ],
      "target_skills": [ ... ]
    }
  }
}
```

2. Episodic Memory (`memory/episodic/`)

Stores specific experiences and what happened:

```
memory/episodic/
├── 2025/
│   ├── 2025-01-11-prd-creation.json
│   ├── 2025-01-11-debug-session.json
│   └── 2025-01-12-refactoring.json
```

```json
{
  "id": "ep-2025-01-11-001",
  "timestamp": "2025-01-11T10:30:00Z",
  "skill": "debugger",
  "situation": "User reported data not refreshing after form submission",
  "root_cause": "Empty callback in onRefresh prop",
  "solution": "Implement actual refresh logic in callback",
  "lesson": "Always verify callbacks are not empty functions",
  "related_pattern": "callback_verification",
  "user_feedback": {
    "rating": 8,
    "comments": "This was exactly the issue"
  }
}
```

3. Working Memory (`memory/working/`)

Stores current session context:

```
memory/working/
├── current_session.json   # Active session data
├── last_error.json        # Error context for self-correction
└── session_end.json       # Session end marker
```

Self-Improvement Process

Phase 1: Experience Extraction

After any skill completes, extract:

```yaml
What happened:
  skill_used: {which skill}
  task: {what was being done}
  outcome: {success | partial | failure}

Key Insights:
  what_went_well: [what worked]
  what_went_wrong: [what didn't work]
  root_cause: {underlying issue if applicable}

User Feedback:
  rating: {1-10 if provided}
  comments: {specific feedback}
```

Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

| Concrete Experience | Abstract Pattern | Target Skill |
|---------------------|------------------|--------------|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |

Abstraction Rules:

```yaml
If experience_repeats 3+ times:
  pattern_level: critical
  action: Add to skill's "Critical Mistakes" section

If solution_was_effective:
  pattern_level: best_practice
  action: Add to skill's "Best Practices" section

If user_rating >= 7:
  pattern_level: strength
  action: Reinforce this approach

If user_rating <= 4:
  pattern_level: weakness
  action: Add to "What to Avoid" section
```

Phase 3: Skill Updates

Update the appropriate skill files with evolution markers:

Pattern Added (2025-01-12)

- **Pattern**: Always verify callbacks are not empty functions
- **Source**: Episode ep-2025-01-12-001
- **Confidence**: 0.95

Updated Checklist

- [ ] Verify all callbacks have implementations
- [ ] Test callback execution paths

Correction Markers (when fixing wrong guidance):

Corrected Guidance

Use direct state monitoring instead of callback chains:

```typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
```

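
Phases 2 and 3 can be sketched together: classify an experience with the abstraction rules, then render an evolution marker for the target skill file. This is a minimal illustration, not the skill's actual implementation. The `Experience` class, its field names, and both function names are hypothetical, and returning every matching rule (rather than only the first) is an assumption, since the rules do not say whether they are mutually exclusive.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Experience:
    """One extracted experience. Field names are illustrative assumptions."""
    lesson: str
    episode_id: str
    repeats: int = 1                   # times this experience has recurred
    solution_effective: bool = False
    user_rating: Optional[int] = None  # 1-10, if the user provided one


def classify(exp: Experience) -> List[Tuple[str, str]]:
    """Return every (pattern_level, action) pair whose abstraction rule matches."""
    matches = []
    if exp.repeats >= 3:
        matches.append(("critical", 'Add to skill\'s "Critical Mistakes" section'))
    if exp.solution_effective:
        matches.append(("best_practice", 'Add to skill\'s "Best Practices" section'))
    if exp.user_rating is not None:
        if exp.user_rating >= 7:
            matches.append(("strength", "Reinforce this approach"))
        elif exp.user_rating <= 4:
            matches.append(("weakness", 'Add to "What to Avoid" section'))
    return matches


def render_marker(exp: Experience, confidence: float, added: str,
                  checklist: List[str]) -> str:
    """Build the evolution-marker text that is appended to a skill file."""
    lines = [
        f"Pattern Added ({added})",
        f"**Pattern**: {exp.lesson}",
        f"**Source**: Episode {exp.episode_id}",
        f"**Confidence**: {confidence:.2f}",
        "Updated Checklist",
    ]
    lines += [f"- [ ] {item}" for item in checklist]
    return "\n".join(lines)


if __name__ == "__main__":
    exp = Experience(
        lesson="Always verify callbacks are not empty functions",
        episode_id="ep-2025-01-12-001",
        repeats=3,
        solution_effective=True,
        user_rating=8,
    )
    print(classify(exp))
    print(render_marker(exp, 0.95, "2025-01-12",
                        ["Verify all callbacks have implementations"]))
```

Keeping the source episode ID in the marker is what makes each change traceable back to the experience that produced it.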
Phase 4: Memory Consolidation
- Update semantic memory (`memory/semantic-patterns.json`)
- Store episodic memory (`memory/episodic/YYYY-MM-DD-{skill}.json`)
- Update pattern confidence based on applications/feedback
- Prune outdated patterns (low confidence, no recent applications)
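
The confidence update and pruning steps above could look roughly like the sketch below. Everything beyond the `confidence` and `applications` fields is an assumption for illustration: the `last_applied` field, the 0.05 step, the 0.3 confidence floor, and the 90-day idle window are hypothetical values, and the rating thresholds (7 and 4) are borrowed from the abstraction rules rather than stated for consolidation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, Optional


@dataclass
class Pattern:
    id: str
    confidence: float    # 0.0 to 1.0, as in semantic-patterns.json
    applications: int
    last_applied: date   # hypothetical field, not part of the documented schema


def record_application(p: Pattern, today: date,
                       rating: Optional[int] = None, step: float = 0.05) -> None:
    """Count one use of the pattern and nudge confidence by user feedback."""
    p.applications += 1
    p.last_applied = today
    if rating is None:
        return
    if rating >= 7:      # positive feedback raises confidence, capped at 1.0
        p.confidence = min(1.0, p.confidence + step)
    elif rating <= 4:    # negative feedback lowers it, floored at 0.0
        p.confidence = max(0.0, p.confidence - step)


def prune(patterns: Dict[str, Pattern], today: date,
          min_confidence: float = 0.3, max_idle_days: int = 90) -> Dict[str, Pattern]:
    """Drop patterns that are both low-confidence and not recently applied."""
    cutoff = today - timedelta(days=max_idle_days)
    return {
        pid: p for pid, p in patterns.items()
        if not (p.confidence < min_confidence and p.last_applied < cutoff)
    }


if __name__ == "__main__":
    today = date(2025, 1, 12)
    fresh = Pattern("pat-a", 0.50, 5, date(2025, 1, 1))
    stale = Pattern("pat-b", 0.10, 1, date(2024, 6, 1))
    record_application(fresh, today, rating=8)
    print(sorted(prune({"pat-a": fresh, "pat-b": stale}, today)))
```

Requiring both conditions before pruning is a conservative choice: a low-confidence pattern that is still being applied keeps a chance to recover through feedback.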
Self-Correction (on_error hook)
Triggered when:

- Bash command returns non-zero exit code
- Tests fail after following skill guidance
- User reports the guidance produced incorrect results

Process:

```markdown
Self-Correction Workflow

1. Detect Error
   - Capture error context from working/last_error.json
   - Identify which skill guidance was followed
2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?
3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory
4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify
```

Example:

Self-Correction: Click-Time Computation

- **Issue**: Using useMemo for claimable IDs caused stale data
- **Fix**: Compute at click time for always-fresh data
- **Pattern**: click_time_vs_open_time_computation

Self-Validation

Use the validation template in `references/appendix.md` when reviewing updates.

Hooks Integration

Wiring Hooks in Claude Code Settings

Add to Claude Code settings (`~/.claude/settings.json`):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
          }
        ]
      }
    ],
    "Stop": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
          }
        ]
      }
    ]
  }
}
```

Replace `${SKILLS_DIR}` with your actual skills path.

Additional References

See `references/appendix.md` for memory structure, workflow diagrams, metrics, feedback templates, and research links.

Best Practices

DO

- ✅ Learn from EVERY skill interaction
- ✅ Extract patterns at the right abstraction level
- ✅ Update multiple