CM - CASS Memory System

Procedural memory for AI coding agents. Transforms scattered sessions into persistent, cross-agent memory. Uses a three-layer cognitive architecture that mirrors human expertise development.

Why This Exists

AI coding agents accumulate valuable knowledge but it's:

Trapped in sessions - Context lost when session ends
Agent-specific - Claude doesn't know what Cursor learned
Unstructured - Raw logs aren't actionable guidance
Subject to collapse - Naive summarization loses critical details

You've solved auth bugs three times this month across different agents. Each time you started from scratch.

CM solves this with cross-agent learning: a pattern discovered in Cursor is immediately available to Claude Code.

Three-Layer Cognitive Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                    EPISODIC MEMORY (cass)                           │
│   Raw session logs from all agents - the "ground truth"             │
│   Claude Code │ Codex │ Cursor │ Aider │ PI │ Gemini │ ChatGPT │ ...│
└───────────────────────────┬─────────────────────────────────────────┘
                            │ cass search
                            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    WORKING MEMORY (Diary)                           │
│   Structured session summaries: accomplishments, decisions, etc.    │
└───────────────────────────┬─────────────────────────────────────────┘
                            │ reflect + curate (automated)
                            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    PROCEDURAL MEMORY (Playbook)                     │
│   Distilled rules with confidence tracking and decay                │
└─────────────────────────────────────────────────────────────────────┘

Every agent's sessions feed the shared memory. A pattern discovered in Cursor automatically helps Claude Code on the next session.

The One Command You Need

cm context "" --json

Run this before starting any non-trivial task. Returns:

relevantBullets - Rules from playbook scored by task relevance
antiPatterns - Things that have caused problems
historySnippets - Past sessions (yours and other agents')
suggestedCassQueries - Deeper investigation searches

Filtering History by Source

historySnippets[].origin.kind is "local" or "remote". Remote hits include origin.host:

{
  "historySnippets": [
    {
      "source_path": "~/.claude/sessions/session-001.jsonl",
      "origin": { "kind": "local" }
    },
    {
      "source_path": "/home/user/.codex/sessions/session.jsonl",
      "origin": { "kind": "remote", "host": "workstation" }
    }
  ]
}
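To act on this distinction in a script, the snippets can be partitioned by origin.kind. The following is a minimal Python sketch that assumes only the JSON shape shown above; the helper name split_by_origin is hypothetical and not part of cm.

```python
import json

# Sample output, trimmed to the fields shown in the example above.
SAMPLE = """
{
  "historySnippets": [
    {"source_path": "~/.claude/sessions/session-001.jsonl",
     "origin": {"kind": "local"}},
    {"source_path": "/home/user/.codex/sessions/session.jsonl",
     "origin": {"kind": "remote", "host": "workstation"}}
  ]
}
"""


def split_by_origin(snippets):
    """Partition history snippets into (local, remote) by origin.kind."""
    local = [s for s in snippets if s.get("origin", {}).get("kind") == "local"]
    remote = [s for s in snippets if s.get("origin", {}).get("kind") == "remote"]
    return local, remote


local, remote = split_by_origin(json.loads(SAMPLE)["historySnippets"])
# Only remote hits carry origin.host.
remote_hosts = sorted({s["origin"]["host"] for s in remote})
print(f"{len(local)} local, {len(remote)} remote from {remote_hosts}")
```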

Confidence Decay System

Rules aren't immortal. Confidence decays without revalidation:

Mechanism | Effect
90-day half-life | Confidence halves every 90 days without feedback
4x harmful multiplier | One mistake counts 4× as much as one success
Maturity progression | candidate → established → proven

Score Decay Visualization

Initial score: 10.0 (10 helpful marks today)

After 90 days (half-life): 5.0
After 180 days: 2.5
After 270 days: 1.25
After 365 days: 0.60

Effective Score Formula

effectiveScore = decayedHelpful - (4 × decayedHarmful)

// Where decay factor = 0.5 ^ (daysSinceFeedback / 90)
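The formula can be checked numerically with a short Python sketch. It assumes that each feedback mark decays independently according to its own age, which the text above does not state explicitly; the function names are illustrative and not taken from cm.

```python
HALF_LIFE_DAYS = 90       # matches the 90-day half-life above
HARMFUL_MULTIPLIER = 4    # matches the 4x harmful multiplier above


def decay_factor(days_since_feedback, half_life_days=HALF_LIFE_DAYS):
    """0.5 ^ (daysSinceFeedback / half-life)."""
    return 0.5 ** (days_since_feedback / half_life_days)


def effective_score(helpful_ages, harmful_ages):
    """Score from the ages (in days) of each helpful and harmful mark.

    Assumption: every mark decays independently according to its own age.
    """
    decayed_helpful = sum(decay_factor(age) for age in helpful_ages)
    decayed_harmful = sum(decay_factor(age) for age in harmful_ages)
    return decayed_helpful - HARMFUL_MULTIPLIER * decayed_harmful


# Ten helpful marks, re-scored as they age without new feedback.
for days in (0, 90, 180, 270):
    print(days, round(effective_score([days] * 10, []), 2))

# One fresh harmful mark offsets four fresh helpful marks.
print(effective_score([0, 0, 0, 0], [0]))
```

Because the harmful term is multiplied by 4 before subtraction, a rule needs more than four helpful marks of similar age per harmful mark to keep a positive score.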

Maturity State Machine

┌──────────┐       ┌─────────────┐    ┌────────┐
│ candidate│──────▶│ established │───▶│ proven │
└──────────┘       └─────────────┘    └────────┘
     │                    │                 │
     │                    │ (harmful >25%)  │
     │                    ▼                 │
     │             ┌─────────────┐          │
     └────────────▶│ deprecated  │◀─────────┘
                   └─────────────┘

Transition Rules:

Transition | Criteria
candidate → established | 3+ helpful, harmful ratio <25%
established → proven | 10+ helpful, harmful ratio <10%
any → deprecated | Harmful ratio >25% OR explicit deprecation

Anti-Pattern Learning

Bad rules don't just get deleted. They become warnings:

"Cache auth tokens for performance"
↓ (3 harmful marks)
"PITFALL: Don't cache auth tokens without expiry validation"

When a rule is marked harmful multiple times (>50% harmful ratio with 3+ marks), it's automatically inverted into an anti-pattern.
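The transition rules and the inversion threshold can be sketched as two small functions. This is an illustration under stated assumptions, not the tool's implementation: the harmful ratio is taken to be harmful / (helpful + harmful), and "3+ marks" is read as three or more harmful marks, matching the example above. All function names are hypothetical.

```python
def harmful_ratio(helpful, harmful):
    """Assumed definition: share of harmful marks among all marks."""
    total = helpful + harmful
    return harmful / total if total else 0.0


def next_maturity(current, helpful, harmful, explicitly_deprecated=False):
    """Apply the transition table to a rule's current maturity state."""
    ratio = harmful_ratio(helpful, harmful)
    # any -> deprecated: harmful ratio >25% OR explicit deprecation
    if explicitly_deprecated or ratio > 0.25:
        return "deprecated"
    # established -> proven: 10+ helpful, harmful ratio <10%
    if current == "established" and helpful >= 10 and ratio < 0.10:
        return "proven"
    # candidate -> established: 3+ helpful, harmful ratio <25%
    if current == "candidate" and helpful >= 3 and ratio < 0.25:
        return "established"
    return current


def should_invert(helpful, harmful):
    """Anti-pattern inversion: >50% harmful ratio with 3+ harmful marks (assumed)."""
    return harmful >= 3 and harmful_ratio(helpful, harmful) > 0.50


print(next_maturity("candidate", helpful=3, harmful=0))
print(next_maturity("proven", helpful=6, harmful=3))
print(should_invert(helpful=0, harmful=3))
```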

ACE Pipeline (How Rules Are Created)

Generator → Reflector → Validator → Curator

Stage | Role | LLM?
Generator | Pre-task context hydration (cm context) | No
Reflector | Extract patterns from sessions (cm reflect) | Yes
Validator | Evidence gate against cass history | Yes
Curator | Deterministic delta merge | No

Critical: The Curator uses NO LLM. This prevents context collapse from iterative drift: LLMs propose patterns; deterministic logic manages them.

Scientific Validation

Before a rule joins your playbook, it's validated against cass history:

Proposed rule: "Always check token expiry before auth debugging"
↓
Evidence gate: Search cass for sessions where this applied
↓
Result: 5 sessions found, 4 successful outcomes → ACCEPT

Rules without historical evidence are flagged as candidates until proven.
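The gate's decision step might look like the following Python sketch. The thresholds min_sessions and min_success_rate and the REJECT verdict are hypothetical placeholders; the text above specifies only that supporting evidence leads to ACCEPT and that rules without evidence stay candidates.

```python
def evidence_gate(sessions_found, successful, min_sessions=1, min_success_rate=0.5):
    """Return a verdict for a proposed rule from its cass search results.

    min_sessions and min_success_rate are illustrative values, not
    documented cm settings.
    """
    if sessions_found == 0 or sessions_found < min_sessions:
        # No historical evidence: keep the rule as a candidate.
        return "CANDIDATE"
    success_rate = successful / sessions_found
    return "ACCEPT" if success_rate >= min_success_rate else "REJECT"


# The example above: 5 sessions found, 4 successful outcomes.
print(evidence_gate(sessions_found=5, successful=4))
print(evidence_gate(sessions_found=0, successful=0))
```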

Commands Reference

Context Retrieval (Primary Workflow)

THE MAIN COMMAND - run before non-trivial tasks

cm context "implement user authentication" --json

Limit results for token budget

cm context "fix bug" --json --limit 5 --no-history

With workspace filter

cm context "refactor" --json --workspace /path/to/project

Self-documenting explanation

cm quickstart --json

System health

cm doctor --json
cm doctor --fix  # Auto-fix issues

Find similar rules

cm similar "error handling best practices"
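A wrapper script that calls cm context on an agent's behalf could assemble the argument list from the flags shown earlier in this section (--json, --limit, --no-history, --workspace). The sketch below is one possible shape; build_context_args is a hypothetical helper, not part of cm.

```python
def build_context_args(task, limit=None, no_history=False, workspace=None):
    """Return the argv list for a `cm context` call with JSON output."""
    args = ["cm", "context", task, "--json"]
    if limit is not None:
        args += ["--limit", str(limit)]
    if no_history:
        args.append("--no-history")
    if workspace:
        args += ["--workspace", workspace]
    return args


print(build_context_args("fix bug", limit=5, no_history=True))
```

The resulting list can be passed to subprocess.run(args, capture_output=True, text=True), with stdout then parsed by json.loads. Passing a list rather than a shell string avoids quoting problems in the task description.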

Playbook Management

cm playbook list                                                  # All rules
cm playbook get b-8f3a2c                                          # Rule details
cm playbook add "Always run tests first"                          # Add rule
cm playbook add --file rules.json                                 # Batch add from file
cm playbook add --file rules.json --session /path/session.jsonl   # Track source
cm playbook remove b-xyz --reason "Outdated"                      # Remove
cm playbook export > backup.yaml                                  # Export
cm playbook import shared.yaml                                    # Import
cm playbook bootstrap react                                       # Apply starter to existing

cm top 10            # Top effective rules
cm stale --days 60   # Rules without recent feedback
cm why b-8f3a2c      # Rule provenance
cm stats --json      # Playbook health metrics

Learning & Feedback

Manual feedback

cm mark b-8f3a2c --helpful
cm mark b-xyz789 --harmful --reason "Caused regression"
cm undo b-xyz789  # Revert feedback

Session outcomes (positional: status, rules)

cm outcome success b-8f3a2c,b-def456
cm outcome failure b-x7k9p1 --summary "Auth approach failed"
cm outcome-apply  # Apply to playbook

Reflection (usually automated)

cm reflect --days 7 --json
cm reflect --session /path/to/session.jsonl  # Single session
cm reflect --workspace /path/to/project      # Project-specific

Validation

cm validate "Always check null before dereferencing"

Audit sessions against rules

cm audit --days 30

Deprecate permanently

cm forget b-xyz789 --reason "Superseded by better pattern"

Onboarding (Agent-Native)

Zero-cost playbook building using your existing agent:

cm onboard status                                # Check progress
cm onboard gaps                                  # Category gaps
cm onboard sample --fill-gaps                    # Prioritized sessions
cm onboard sample --agent claude --days 14       # Filter by agent/time
cm onboard sample --workspace /path/project      # Filter by workspace
cm onboard sample --include-processed            # Re-analyze sessions
cm onboard read /path/session.jsonl --template   # Rich context
cm onboard mark-done /path/session.jsonl         # Mark processed
cm onboard reset                                 # Start fresh

Trauma Guard (Safety System)

cm trauma list  # Active patterns
cm trauma add "DROP TABLE" --description "Mass deletion" --severity critical
cm trauma heal t-abc --reason "Intentional migration"
cm trauma remove t-abc
cm trauma scan --days 30  # Scan for traumas
cm trauma import shared-traumas.yaml

cm guard --install        # Claude Code hook
cm guard --git            # Git pre-commit hook
cm guard --install --git  # Both
cm guard --status         # Check installation

System Commands

cm init                         # Initialize
cm init --starter typescript    # With template
cm init --force                 # Reinitialize (creates backup)
cm starters                     # List templates
cm serve --port 3001            # MCP server
cm usage                        # LLM cost stats
cm privacy status               # Privacy settings
cm privacy enable               # Enable cross-agent enrichment
cm privacy disable              # Disable enrichment
cm project --format agents.md   # Export for AGENTS.md

Starter Playbooks

Starting with an empty playbook is daunting. Starters provide curated best practices:

cm starters                    # List available
cm init --starter typescript   # Initialize with starter
cm playbook bootstrap react    # Apply to existing playbook

Built-in Starters

Starter | Focus | Rules
general | Universal best practices | 5
typescript | TypeScript/Node.js patterns | 4
react | React/Next.js development | 4
python | Python/FastAPI/Django | 4
node | Node.js/Express services | 4
rust | Rust service patterns | 4

Custom Starters

Create YAML files in ~/.cass-memory/starters/:

~/.cass-memory/starters/django.yaml

name: django
description: Django web framework best practices
bullets:
  - content: "Always use Django's ORM for database operations"
    category: database
    maturity: established
    tags: [django, orm]

Inline Feedback (During Work)

Leave feedback in code comments. Parsed during reflection:

// [cass: helpful b-8f3a2c] - this rule saved me from a rabbit hole

// [cass: harmful b-x7k9p1] - this advice was wrong for our use case
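A parser for these comments could be as small as one regular expression. The sketch below is an assumption about how such parsing might work, not the parser cm uses; in particular, the ID pattern (b- followed by letters and digits) is inferred from the example IDs in this document.

```python
import re

# [cass: helpful|harmful <bullet-id>] optionally followed by "- note".
# The bullet ID shape is inferred from examples such as b-8f3a2c.
FEEDBACK_RE = re.compile(
    r"\[cass:\s*(helpful|harmful)\s+(b-[A-Za-z0-9]+)\]\s*(?:-\s*(.*))?"
)


def parse_inline_feedback(source_text):
    """Collect inline feedback markers from source code text."""
    results = []
    for line in source_text.splitlines():
        match = FEEDBACK_RE.search(line)
        if match:
            kind, bullet_id, note = match.groups()
            results.append(
                {"bulletId": bullet_id, "kind": kind, "note": (note or "").strip()}
            )
    return results


sample = """
// [cass: helpful b-8f3a2c] - this rule saved me from a rabbit hole
const x = 1;
// [cass: harmful b-x7k9p1] - this advice was wrong for our use case
"""
for item in parse_inline_feedback(sample):
    print(item)
```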

Agent Protocol

1. START: cm context "" --json
2. WORK: Reference rule IDs when following them (e.g., "Following b-8f3a2c...")
3. FEEDBACK: Leave inline comments when rules help/hurt
4. END: Just finish. Learning happens automatically.

You do NOT need to:

Run cm reflect (automation handles this)
Run cm mark manually (use inline comments)
Manually add rules to the playbook

Gap Analysis Categories

Category | Keywords
debugging | error, fix, bug, trace, stack
testing | test, mock, assert, expect, jest
architecture | design, pattern, module, abstraction
workflow | task, CI/CD, deployment
documentation | comment, README, API doc
integration | API, HTTP, JSON, endpoint
collaboration | review, PR, team
git | branch, merge, commit
security | auth, token, encrypt, permission
performance | optimize, cache, profile

Category Status Thresholds:

Status | Rule Count | Priority
critical | 0 rules | High
underrepresented | 1-2 rules | Medium
adequate | 3-10 rules | Low
well-covered | 11+ rules | None

Trauma Guard: Safety System

The "hot stove" principle: learn from past incidents and prevent recurrence.

How It Works

  Session History            Trauma Registry             Runtime Guard
┌─────────────────┐         ┌─────────────────┐         ┌─────────────────┐
│ rm -rf /* (oops)│ ──────▶ │ Pattern: rm -rf │ ──────▶ │ BLOCKED: This   │
│ "sorry, I made  │  scan   │ Severity: FATAL │  hook   │ command matches │
│ a mistake..."   │         │ Session: abc123 │         │ a trauma pattern│
└─────────────────┘         └─────────────────┘         └─────────────────┘
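The runtime guard step in the diagram can be approximated in a few lines of Python. This is a sketch under assumptions: patterns are treated as case-insensitive regular expressions, and the Trauma fields are hypothetical rather than the actual traumas.jsonl schema. Only patterns in the active state block commands; healed patterns (see Pattern Lifecycle) are skipped.

```python
import re
from dataclasses import dataclass


@dataclass
class Trauma:
    pattern: str             # assumed to be a regular expression
    severity: str
    status: str = "active"   # "active" or "healed"


def check_command(command, traumas):
    """Return the first active trauma matching the command, or None."""
    for trauma in traumas:
        if trauma.status != "active":
            continue  # healed patterns are bypassed
        if re.search(trauma.pattern, command, flags=re.IGNORECASE):
            return trauma
    return None


registry = [
    Trauma(pattern=r"rm\s+-rf\s+/", severity="critical"),
    Trauma(pattern=r"DROP\s+TABLE", severity="critical", status="healed"),
]

hit = check_command("rm -rf /*", registry)
if hit is not None:
    print(f"BLOCKED: command matches trauma pattern {hit.pattern!r}")
```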

Built-in Doom Patterns (20+)

Category | Examples
Filesystem | rm -rf /, rm -rf ~, recursive deletes
Database | DROP DATABASE, TRUNCATE, DELETE FROM without WHERE
Git | git push --force to main/master, git reset --hard
Infrastructure | terraform destroy -auto-approve, kubectl delete namespace
Cloud | aws s3 rm --recursive, destructive CloudFormation

Pattern Storage

Scope | Location | Purpose
Global | ~/.cass-memory/traumas.jsonl | Personal patterns
Project | .cass/traumas.jsonl | Commit to repo for team

Pattern Lifecycle

Active: Blocks matching commands
Healed: Temporarily bypassed (with reason and timestamp)
Deleted: Removed (can be re-added)

MCP Server

Run as MCP server for agent integration:

Local-only (recommended)

cm serve --port 3001

With auth token (for non-loopback)

MCP_HTTP_TOKEN="" cm serve --host 0.0.0.0 --port 3001

Tools Exposed

Tool | Purpose | Parameters
cm_context | Get rules + history | task, limit?, history?, days?, workspace?
cm_feedback | Record feedback | bulletId, helpful?, harmful?, reason?
cm_outcome | Record session outcome | sessionId, outcome, rulesUsed?
memory_search | Search playbook/cass | query, scope?, limit?, days?
memory_reflect | Trigger reflection | days?, maxSessions?, dryRun?

Resources Exposed

URI | Purpose
cm://playbook | Current playbook state
cm://diary | Recent diary entries
cm://outcomes | Session outcomes
cm://stats | Playbook health metrics

Client Configuration

Claude Code (~/.config/claude/mcp.json):

{
  "mcpServers": {
    "cm": {
      "command": "cm",
      "args": ["serve"]
    }
  }
}

Graceful Degradation

Condition | Behavior
No cass | Playbook-only scoring, no history snippets
No playbook | Empty playbook, commands still work
No LLM | Deterministic reflection, no semantic enhancement
Offline | Cached playbook + local diary

Output Format

All commands support --json for machine-readable output.

Design principle: stdout = JSON only; diagnostics go to stderr.

Success Response

{
  "success": true,
  "task": "fix the auth timeout bug",
  "relevantBullets": [
    {
      "id": "b-8f3a2c",
      "content": "Always check token expiry before auth debugging",
      "effectiveScore": 8.5,
      "maturity": "proven",
      "relevanceScore": 0.92,
      "reasoning": "Extracted from 5 successful sessions"
    }
  ],
  "antiPatterns": [...],
  "historySnippets": [...],
  "suggestedCassQueries": [...],
  "degraded": null
}

Error Response

{
  "success": false,
  "code": "PLAYBOOK_NOT_FOUND",
  "error": "Playbook file not found",
  "hint": "Run 'cm init' to create a new playbook",
  "retryable": false,
  "recovery": ["cm init", "cm doctor --fix"],
  "docs": "README.md#-troubleshooting"
}
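Because stdout carries only JSON, a caller can branch on the success field and surface the recovery commands on failure. The following Python sketch relies only on the fields shown in the two responses above; CmError and parse_cm_output are hypothetical names, not part of cm.

```python
import json


class CmError(Exception):
    """Raised when a cm JSON response reports success: false."""

    def __init__(self, code, message, hint=None, recovery=(), retryable=False):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.hint = hint
        self.recovery = list(recovery)
        self.retryable = retryable


def parse_cm_output(stdout_text):
    """Return the parsed response, or raise CmError for an error response."""
    response = json.loads(stdout_text)
    if response.get("success"):
        return response
    raise CmError(
        code=response.get("code", "UNKNOWN"),
        message=response.get("error", ""),
        hint=response.get("hint"),
        recovery=response.get("recovery", []),
        retryable=response.get("retryable", False),
    )


error_json = """{"success": false, "code": "PLAYBOOK_NOT_FOUND",
 "error": "Playbook file not found",
 "hint": "Run 'cm init' to create a new playbook",
 "retryable": false, "recovery": ["cm init", "cm doctor --fix"]}"""

try:
    parse_cm_output(error_json)
except CmError as exc:
    print(exc, "| try:", ", ".join(exc.recovery))
```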

Exit Codes

Code | Meaning
1 | Internal error
2 | User input/usage
3 | Configuration
4 | Filesystem
5 | Network
6 | cass error
7 | LLM/provider error

Token Budget Management

Flag | Effect
--limit N | Cap number of rules
--min-score N | Only rules above threshold
--no-history | Skip historical snippets (faster)
--json | Structured output

Configuration

Config lives at ~/.cass-memory/config.json (global) and .cass/config.json (repo).

Precedence: CLI flags > Repo config > Global config > Defaults

Security: Repo config cannot override sensitive paths or user-level consent settings.
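The precedence chain and the repo-config restriction can be illustrated with a layered merge. The sketch below is not the tool's implementation: it assumes nested objects are merged key by key, and PROTECTED_KEYS is a hypothetical stand-in for the sensitive paths and consent settings that a repo config may not override.

```python
from copy import deepcopy

# Hypothetical: top-level keys a repo config is not allowed to override.
PROTECTED_KEYS = {"crossAgent"}


def deep_merge(base, override):
    """Return a copy of base with override applied, merging nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(defaults, global_cfg, repo_cfg, cli_flags):
    """Apply Defaults < Global config < Repo config < CLI flags."""
    safe_repo = {k: v for k, v in repo_cfg.items() if k not in PROTECTED_KEYS}
    config = deep_merge(defaults, global_cfg)
    config = deep_merge(config, safe_repo)
    return deep_merge(config, cli_flags)


defaults = {
    "scoring": {"decayHalfLifeDays": 90, "harmfulMultiplier": 4},
    "crossAgent": {"enabled": False},
    "maxBulletsInContext": 50,
}
resolved = resolve_config(
    defaults,
    global_cfg={"maxBulletsInContext": 40},
    repo_cfg={"scoring": {"decayHalfLifeDays": 60}, "crossAgent": {"enabled": True}},
    cli_flags={"maxBulletsInContext": 5},
)
print(resolved)
```

In this run the repo's attempt to enable crossAgent is dropped, its half-life override is kept, and the CLI value for maxBulletsInContext wins over both config files.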

Key Options

{
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "budget": {
    "dailyLimit": 0.10,
    "monthlyLimit": 2.00
  },
  "scoring": {
    "decayHalfLifeDays": 90,
    "harmfulMultiplier": 4
  },
  "maxBulletsInContext": 50,
  "maxHistoryInContext": 10,
  "sessionLookbackDays": 7,
  "crossAgent": {
    "enabled": false,
    "consentGiven": false,
    "auditLog": true
  },
  "remoteCass": {
    "enabled": false,
    "hosts": [{"host": "workstation", "label": "work"}]
  },
  "semanticSearchEnabled": false,
  "embeddingModel": "Xenova/all-MiniLM-L6-v2",
  "dedupSimilarityThreshold": 0.85
}

Environment Variables

Variable | Purpose
ANTHROPIC_API_KEY | API key for Anthropic (Claude)
OPENAI_API_KEY | API key for OpenAI
GOOGLE_GENERATIVE_AI_API_KEY | API key for Google Gemini
CASS_PATH | Path to cass binary
CASS_MEMORY_LLM | Set to none for LLM-free mode
MCP_HTTP_TOKEN | Auth token for non-loopback MCP server

Data Locations

~/.cass-memory/                  # Global (user-level)
├── config.json                  # Configuration
├── playbook.yaml                # Personal playbook
├── diary/                       # Session summaries
├── outcomes/                    # Session outcomes
├── traumas.jsonl                # Trauma patterns
├── starters/                    # Custom starter playbooks
├── onboarding-state.json        # Onboarding progress
├── privacy-audit.jsonl          # Cross-agent audit trail
├── processed-sessions.jsonl     # Reflection progress
└── usage.jsonl                  # LLM cost tracking

.cass/                           # Project-level (in repo)
├── config.json                  # Project-specific overrides
├── playbook.yaml                # Project-specific rules
├── traumas.jsonl                # Project-specific patterns
└── blocked.yaml                 # Anti-patterns to block

Automating Reflection

Cron Job

Daily at 2am

0 2 * * * /usr/local/bin/cm reflect --days 7 >> ~/.cass-memory/reflect.log 2>&1

Claude Code Hook

.claude/hooks.json:

{ "post-session": ["cm reflect --days 1"] }

Privacy & Security

Local-First Design

All data stays on your machine
No cloud sync, no telemetry
Cross-agent enrichment is opt-in with explicit consent
Audit log for enrichment events

Secret Sanitization

Before processing, content is sanitized:

OpenAI/Anthropic/AWS/Google API keys
GitHub tokens
JWTs
Passwords and secrets in config patterns

Privacy Controls

cm privacy status    # Check settings
cm privacy enable    # Enable cross-agent enrichment
cm privacy disable   # Disable enrichment

Performance Characteristics

Operation | Typical Latency
cm context (cached) | 50-150ms
cm context (cold) | 200-500ms
cm context (no cass) | 30-80ms
cm reflect (1 session) | 5-15s
cm reflect (5 sessions) | 20-60s
cm playbook list | <50ms
cm similar (keyword) | 20-50ms
cm similar (semantic) | 100-300ms

LLM Cost Estimates

Operation | Typical Cost
Reflect (1 session) | $0.01-0.05
Reflect (7 days) | $0.05-0.20
Validate (1 rule) | $0.005-0.01

With default budget ($0.10/day, $2.00/month): ~5-10 sessions/day.

Batch Rule Addition

After analyzing a session, add multiple rules at once:

Create JSON file

cat > rules.json << 'EOF'
[
  {"content": "Always run tests before committing", "category": "testing"},
  {"content": "Check token expiry before auth debugging", "category": "debugging"},
  {"content": "AVOID: Mocking entire modules in tests", "category": "testing"}
]
EOF

Add all rules

cm playbook add --file rules.json

Track which session they came from

cm playbook add --file rules.json --session /path/to/session.jsonl

Or pipe from stdin

echo '[{"content": "Rule", "category": "workflow"}]' | cm playbook add --file -

Template Output for Onboarding

--template provides rich context for rule extraction:

cm onboard read /path/to/session.jsonl --template --json

Returns:

metadata: path, workspace, message count, topic hints
context: related rules, playbook gaps, suggested focus
extractionFormat: schema, categories, examples
sessionContent: actual session data

Integration with CASS

CASS provides episodic memory (raw sessions). CM extracts procedural memory (rules and playbooks).

CASS: Search raw sessions

cass search "authentication timeout" --robot

CM: Get distilled rules for a task

cm context "authentication timeout" --json

Troubleshooting

Error | Solution
cass not found | Install from cass repo
cass search failed | Run cass index --full
API key missing | Set ANTHROPIC_API_KEY, OPENAI_API_KEY, or GOOGLE_GENERATIVE_AI_API_KEY
Playbook corrupt | Run cm doctor --fix
Budget exceeded | Check cm usage, adjust limits

Diagnostic Commands

cm doctor --json   # System health
cm doctor --fix    # Auto-fix issues
cm usage           # LLM budget status
cm stats --json    # Playbook health
cm why             # Rule provenance

LLM-Free Mode

CASS_MEMORY_LLM=none cm context "task" --json

Installation

One-liner (recommended)

curl -fsSL https://raw.githubusercontent.com/Dicklesworthstone/cass_memory_system/main/install.sh \
  | bash -s -- --easy-mode --verify

Specific version

install.sh --version v0.2.2 --verify

System-wide

install.sh --system --verify

From source

git clone https://github.com/Dicklesworthstone/cass_memory_system.git
cd cass_memory_system
bun install && bun run build
sudo mv ./dist/cass-memory /usr/local/bin/cm

Integration with Flywheel

Tool | Integration
CASS | CM reads from cass episodic memory, writes procedural memory
NTM | Robot mode integrates with cm for context before agent work
Agent Mail | Rules can reference mail threads as provenance
BV | Task context enriched with relevant playbook rules

Installation