Self-Improving Agent

"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." — Based on 2025 lifelong learning research

Overview

This is a universal self-improvement system that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:

- Multi-Memory Architecture: Semantic + Episodic + Working memory
- Self-Correction: Detects and fixes skill guidance errors
- Self-Validation: Periodically verifies skill accuracy
- Hooks Integration: Auto-triggers on skill events (before_start, after_complete, on_error)
- Evolution Markers: Traceable changes with source attribution

Research-Based Design

Based on 2025 research:

| Research | Key Insight | Application |
|---|---|---|
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |

The Self-Improvement Loop

    ┌─────────────────────────────────────────────────────────────────┐
    │                   UNIVERSAL SELF-IMPROVEMENT                    │
    ├─────────────────────────────────────────────────────────────────┤
    │                                                                 │
    │  Skill Event → Extract Experience → Abstract Pattern → Update   │
    │       │                │                   │             │      │
    │       ▼                ▼                   ▼             ▼      │
    │  ┌──────────────────────────────────────────────────────┐       │
    │  │                 MULTI-MEMORY SYSTEM                  │       │
    │  ├──────────────────────────────────────────────────────┤       │
    │  │ Semantic Memory  │ Episodic Memory  │ Working Memory │       │
    │  │ (Patterns/Rules) │ (Experiences)    │ (Current)      │       │
    │  │ memory/semantic/ │ memory/episodic/ │ memory/working/│       │
    │  └──────────────────────────────────────────────────────┘       │
    │                                                                 │
    │  ┌──────────────────────────────────────────────────────┐       │
    │  │                    FEEDBACK LOOP                     │       │
    │  │ User Feedback → Confidence Update → Pattern Adapt    │       │
    │  └──────────────────────────────────────────────────────┘       │
    │                                                                 │
    └─────────────────────────────────────────────────────────────────┘

When This Activates

Automatic Triggers (via hooks)

| Event | Trigger | Action |
|---|---|---|
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit | Capture error context, trigger self-correction |

Manual Triggers

- User says "自我进化" (self-evolve), "self-improve", or "从经验中学习" (learn from experience)
- User says "分析今天的经验" (analyze today's experience) or "总结教训" (summarize the lessons learned)
- User asks to improve a specific skill

Evolution Priority Matrix

Trigger evolution when new reusable knowledge appears:

| Trigger | Target Skill | Priority | Action |
|---|---|---|---|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |

Multi-Memory Architecture

1. Semantic Memory (memory/semantic-patterns.json)

Stores abstract patterns and rules reusable across contexts:

    {
      "patterns": {
        "pattern_id": {
          "id": "pat-2025-01-11-001",
          "name": "Pattern Name",
          "source": "user_feedback|implementation_review|retrospective",
          "confidence": 0.95,
          "applications": 5,
          "created": "2025-01-11",
          "category": "prd_structure|react_patterns|async_patterns|...",
          "pattern": "One-line summary",
          "problem": "What problem does this solve?",
          "solution": { ... },
          "quality_rules": [ ... ],
          "target_skills": [ ... ]
        }
      }
    }

2. Episodic Memory (memory/episodic/)

Stores specific experiences and what happened:

    memory/episodic/
    ├── 2025/
    │   ├── 2025-01-11-prd-creation.json
    │   ├── 2025-01-11-debug-session.json
    │   └── 2025-01-12-refactoring.json

    {
      "id": "ep-2025-01-11-001",
      "timestamp": "2025-01-11T10:30:00Z",
      "skill": "debugger",
      "situation": "User reported data not refreshing after form submission",
      "root_cause": "Empty callback in onRefresh prop",
      "solution": "Implement actual refresh logic in callback",
      "lesson": "Always verify callbacks are not empty functions",
      "related_pattern": "callback_verification",
      "user_feedback": {
        "rating": 8,
        "comments": "This was exactly the issue"
      }
    }

3. Working Memory (memory/working/)

Stores current session context:

    memory/working/
    ├── current_session.json   # Active session data
    ├── last_error.json        # Error context for self-correction
    └── session_end.json       # Session end marker

Self-Improvement Process

Phase 1: Experience Extraction

After any skill completes, extract:

    What happened:
      skill_used: {which skill}
      task: {what was being done}
      outcome: {success|partial|failure}

    Key Insights:
      what_went_well: [what worked]
      what_went_wrong: [what didn't work]
      root_cause: {underlying issue if applicable}

    User Feedback:
      rating: {1-10 if provided}
      comments: {specific feedback}

Phase 2: Pattern Abstraction

Convert experiences to reusable patterns:

| Concrete Experience | Abstract Pattern | Target Skill |
|---|---|---|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |

Abstraction Rules:

    If experience_repeats 3+ times:
      pattern_level: critical
      action: Add to skill's "Critical Mistakes" section

    If solution_was_effective:
      pattern_level: best_practice
      action: Add to skill's "Best Practices" section

    If user_rating >= 7:
      pattern_level: strength
      action: Reinforce this approach

    If user_rating <= 4:
      pattern_level: weakness
      action: Add to "What to Avoid" section

Phase 3: Skill Updates

Update the appropriate skill files with evolution markers:

Pattern Added (2025-01-12)

Pattern: Always verify callbacks are not empty functions
Source: Episode ep-2025-01-12-001
Confidence: 0.95

Updated Checklist

- [ ] Verify all callbacks have implementations
- [ ] Test callback execution paths

Correction Markers (when fixing wrong guidance):

Corrected Guidance

Use direct state monitoring instead of callback chains:

```typescript
// ✅ Do: Direct state monitoring
const prevPendingCount = usePrevious(pendingCount);
```
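The abstraction rules from Phase 2 and the marker format from Phase 3 can be sketched together in Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the `Experience` dataclass and the functions `classify_pattern_levels` and `render_evolution_marker` are hypothetical names, and only the thresholds (3+ repeats, rating >= 7, rating <= 4) and the marker fields come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Experience:
    """Hypothetical container for one extracted experience (Phase 1)."""
    episode_id: str
    lesson: str
    repeat_count: int = 1
    solution_was_effective: bool = False
    user_rating: Optional[int] = None  # 1-10, only if the user provided one


def classify_pattern_levels(exp: Experience) -> List[str]:
    """Apply the Phase 2 abstraction rules; more than one level may match."""
    levels = []
    if exp.repeat_count >= 3:
        levels.append("critical")
    if exp.solution_was_effective:
        levels.append("best_practice")
    if exp.user_rating is not None:
        if exp.user_rating >= 7:
            levels.append("strength")
        elif exp.user_rating <= 4:
            levels.append("weakness")
    return levels


def render_evolution_marker(exp: Experience, date: str, confidence: float) -> str:
    """Build a 'Pattern Added' marker with the fields used in the example."""
    return "\n".join([
        f"Pattern Added ({date})",
        f"Pattern: {exp.lesson}",
        f"Source: Episode {exp.episode_id}",
        f"Confidence: {confidence:.2f}",
    ])


exp = Experience(
    episode_id="ep-2025-01-12-001",
    lesson="Always verify callbacks are not empty functions",
    repeat_count=3,
    solution_was_effective=True,
    user_rating=8,
)
levels = classify_pattern_levels(exp)
marker = render_evolution_marker(exp, "2025-01-12", 0.95)
print(levels)
print(marker)
```

Returning a list rather than a single level reflects that the rules are not mutually exclusive: a repeated, effective, well-rated experience matches three of them at once.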

Phase 4: Memory Consolidation

Update semantic memory (memory/semantic-patterns.json)
Store episodic memory (memory/episodic/YYYY-MM-DD-{skill}.json)
Update pattern confidence based on applications/feedback
Prune outdated patterns (low confidence, no recent applications)
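
One way to picture the consolidation step is the Python sketch below. The source does not say how confidence is updated or when a pattern counts as outdated, so everything beyond the `confidence`, `applications`, and `created` fields is an assumption made for illustration: the moving-average update rule, the `weight`, `min_confidence`, and `max_idle_days` parameters, and the optional `last_applied` field are all hypothetical.

```python
from datetime import date, timedelta


def update_confidence(pattern: dict, user_rating: int, weight: float = 0.2) -> dict:
    """Nudge confidence toward rating/10 and count one more application.

    Assumed rule: exponential moving average (the source defines none).
    """
    target = user_rating / 10.0
    updated = dict(pattern)
    updated["confidence"] = round(
        (1 - weight) * pattern["confidence"] + weight * target, 4
    )
    updated["applications"] = pattern.get("applications", 0) + 1
    return updated


def prune_patterns(patterns: dict, today: date,
                   min_confidence: float = 0.3, max_idle_days: int = 90) -> dict:
    """Drop patterns that are both low-confidence and not recently applied."""
    cutoff = today - timedelta(days=max_idle_days)
    kept = {}
    for key, pat in patterns.items():
        # Fall back to the creation date when no application date is recorded.
        last = date.fromisoformat(pat.get("last_applied", pat["created"]))
        outdated = pat["confidence"] < min_confidence and last < cutoff
        if not outdated:
            kept[key] = pat
    return kept


patterns = {
    "callback_verification": {
        "confidence": 0.95, "applications": 5, "created": "2025-01-11",
    },
    "stale_rule": {
        "confidence": 0.10, "applications": 1, "created": "2024-06-01",
    },
}
patterns["callback_verification"] = update_confidence(
    patterns["callback_verification"], user_rating=8
)
remaining = prune_patterns(patterns, today=date(2025, 1, 12))
print(sorted(remaining))
```

Requiring both conditions before pruning is a deliberately conservative choice: a low-confidence pattern that was applied recently stays in memory until it has had a chance to recover or decay further.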

Self-Correction (on_error hook)

Triggered when:

- Bash command returns non-zero exit code
- Tests fail after following skill guidance
- User reports the guidance produced incorrect results

Process:

Self-Correction Workflow

1. Detect Error
   - Capture error context from memory/working/last_error.json
   - Identify which skill guidance was followed
2. Verify Root Cause
   - Was the skill guidance incorrect?
   - Was the guidance misinterpreted?
   - Was the guidance incomplete?
3. Apply Correction
   - Update skill file with corrected guidance
   - Add correction marker with reason
   - Update related patterns in semantic memory
4. Validate Fix
   - Test the corrected guidance
   - Ask user to verify

Example:

Self-Correction: Click-Time Computation

Issue: Using useMemo for claimable IDs caused stale data
Fix: Compute at click time for always-fresh data
Pattern: click_time_vs_open_time_computation

Self-Validation

Use the validation template in references/appendix.md when reviewing updates.

Hooks Integration

Wiring Hooks in Claude Code Settings

Add to Claude Code settings (~/.claude/settings.json):

    {
      "hooks": {
        "PreToolUse": [
          {
            "matcher": "Bash|Write|Edit",
            "hooks": [
              {
                "type": "command",
                "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\""
              }
            ]
          }
        ],
        "PostToolUse": [
          {
            "matcher": "Bash",
            "hooks": [
              {
                "type": "command",
                "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\""
              }
            ]
          }
        ],
        "Stop": [
          {
            "matcher": "",
            "hooks": [
              {
                "type": "command",
                "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh"
              }
            ]
          }
        ]
      }
    }

Replace ${SKILLS_DIR} with your actual skills path.

Additional References

See references/appendix.md for memory structure, workflow diagrams, metrics, feedback templates, and research links.

Best Practices

DO

✅ Learn from EVERY skill interaction
✅ Extract patterns at the right abstraction level
✅ Update multiple
