# MetaClaw Evolving Agent Skill

by ara.so — Daily 2026 Skills collection

MetaClaw is an OpenAI-compatible proxy agent that intercepts conversations, injects learned skills, and continuously improves itself through real-world interactions. It supports three modes: lightweight skills injection, immediate RL training, and a smart "madmax" scheduler that defers weight updates to idle/sleep windows.

## Installation
```shell
# Minimal — skills injection only, no GPU required
pip install -e .

# Full RL training support (torch, transformers, tinker)
pip install -e ".[rl]"

# Skill evolution via LLM summarization
pip install -e ".[evolve]"

# Google Calendar scheduler for madmax mode
pip install -e ".[scheduler]"

# Recommended: everything
pip install -e ".[rl,evolve,scheduler]"
```

## Quick Start
```shell
# One-time interactive config wizard
metaclaw setup

# Start in default madmax mode (skills + RL + smart scheduler)
metaclaw start

# Skills only — no GPU, no Tinker needed
metaclaw start --mode skills_only

# RL mode — trains immediately when batch is full
metaclaw start --mode rl

# RL without scheduler (same as above, explicit)
metaclaw start --mode rl
```
After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:8080/v1`.
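Because the proxy is OpenAI-compatible, any HTTP client works. A quick sketch of the request body a client would POST to `/v1/chat/completions` (payload shape only, no server call here; the model name matches the config example below):

```python
import json

# Chat-completions request body as any OpenAI-compatible client would send it.
# The proxy injects relevant skills before forwarding it upstream.
payload = {
    "model": "moonshot-v1-8k",  # passed through to the upstream provider
    "messages": [
        {"role": "user", "content": "Review my pull request strategy."},
    ],
}

body = json.dumps(payload)
print(body)
```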
## Configuration

`~/.metaclaw/config.yaml`:

```yaml
proxy:
  host: 0.0.0.0
  port: 8080

llm:
  provider: kimi                # kimi | qwen | claude | minimax | openai | gemini
  base_url: https://api.moonshot.cn/v1
  model: moonshot-v1-8k
  # api_key loaded from env: METACLAW_LLM_API_KEY

skills:
  enabled: true
  max_injected: 5               # max skills injected per turn
  summarize_after_session: true

rl:
  enabled: true
  backend: auto                 # auto | tinker | mint
  batch_size: 32
  algorithm: grpo
  opd_teacher: false            # optional teacher distillation

scheduler:                      # madmax mode only
  enabled: true
  sleep_hours: [22, 7]          # local 22:00–07:00
  idle_timeout_minutes: 15
  google_calendar: false        # set true + configure OAuth for meeting detection

logging:
  level: info
  log_dir: ~/.metaclaw/logs
```

## Environment Variables

```shell
export METACLAW_LLM_API_KEY="your-llm-api-key"
export METACLAW_TINKER_API_KEY="your-tinker-api-key"            # rl mode
export METACLAW_MINT_API_KEY="your-mint-api-key"                # if backend=mint
export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"    # scheduler
```
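The config file never stores secrets; keys are resolved from the environment at startup. A minimal sketch of that lookup (the helper name `resolve_api_key` is illustrative, not MetaClaw's actual internals):

```python
import os

def resolve_api_key(var_name: str) -> str:
    """Read an API key from the environment, failing loudly if unset."""
    value = os.environ.get(var_name, "").strip()
    if not value:
        raise RuntimeError(
            f"{var_name} is not set; export it before running `metaclaw start`."
        )
    return value

os.environ["METACLAW_LLM_API_KEY"] = "demo-key"  # normally set in your shell
print(resolve_api_key("METACLAW_LLM_API_KEY"))   # → demo-key
```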
## Operating Modes

| Mode | Command | GPU Required | Description |
|------|---------|--------------|-------------|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |

## Python API

Programmatic startup:

```python
import asyncio

from metaclaw import MetaClawAgent, AgentConfig, Mode

async def main():
    config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
    agent = MetaClawAgent(config, mode=Mode.MADMAX)
    await agent.start()

asyncio.run(main())
```

Manual skill injection:

```python
from metaclaw.skills import SkillStore, SkillInjector

store = SkillStore(path="~/.metaclaw/skills")

# Add a skill manually
store.add(
    name="code-review-checklist",
    content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
    tags=["code", "review"],
)

# Retrieve top-k relevant skills for a query
injector = SkillInjector(store)
relevant = injector.retrieve(query="review my Python function", top_k=3)
for skill in relevant:
    print(skill.name, skill.score)
```

Intercepting and recording conversations:

```python
from metaclaw.proxy import ConversationInterceptor
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(max_size=1000)

interceptor = ConversationInterceptor(
    upstream_url="https://api.moonshot.cn/v1",
    on_complete=buffer.record,  # called after each turn with (messages, response)
)
```
`buffer.record` has the signature:

```python
async def on_complete(messages: list[dict], response: dict) -> None: ...
```

Triggering RL training manually:

```python
from metaclaw.training import RLTrainer, TrainingConfig

trainer = RLTrainer(
    config=TrainingConfig(
        backend="tinker",  # or "mint"
        algorithm="grpo",
        batch_size=32,
        lora_rank=16,
    )
)
```
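GRPO, the configured algorithm, scores a group of sampled responses to the same prompt and normalizes each reward against the group's mean and standard deviation, so no separate value network is needed. A minimal sketch of that advantage computation (illustrative only, not MetaClaw's training internals):

```python
import statistics

def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Group-relative advantages: (r - mean) / (std + eps) within one prompt group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled responses to one prompt, as scored by the reward model:
print(grpo_advantages([1.0, 0.5, -0.5, -1.0]))
```

Responses above the group mean get positive advantages, those below get negative ones, and the advantages of a group always sum to zero.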
```python
# Collect a batch from the experience buffer and train
async def run_training(buffer):
    batch = buffer.sample(n=32, split="support")  # support/query separation
    result = await trainer.train(batch)
    print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
```

Reward modeling:

```python
from metaclaw.rewards import RewardModel

reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

async def score_turn(prompt: str, response: str) -> float:
    score = await reward_model.score(prompt=prompt, response=response)
    return score  # float in [-1.0, 1.0]
```
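When an LLM acts as the judge, its raw text still has to become the bounded score above. A sketch of defensive parsing and clamping to [-1.0, 1.0] (the helper is illustrative; MetaClaw's actual parsing may differ):

```python
def parse_score(raw: str, lo: float = -1.0, hi: float = 1.0) -> float:
    """Extract the first float from judge output and clamp it to [lo, hi]."""
    for token in raw.replace(",", " ").split():
        try:
            value = float(token)
        except ValueError:
            continue
        return max(lo, min(hi, value))
    return 0.0  # unparseable judge output → neutral score

print(parse_score("Score: 0.8 (concise and correct)"))  # → 0.8
print(parse_score("I'd rate this 3 out of 5"))          # → 1.0 (clamped)
```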
## Skills Lifecycle

```
Conversation turn
      │
      ▼
SkillInjector.retrieve()    ← vector search over SkillStore
      │  injects top-k skills into system prompt
      ▼
LLM responds
      │
      ▼
ExperienceBuffer.record()   ← stores (context, response, metadata)
      │
      ▼
(end of session)
SkillSummarizer.run()       ← LLM extracts reusable patterns
      │
      ▼
SkillStore.upsert()         ← new/updated skills persisted to disk
```

## Integration: OpenAI SDK as Client

Point any OpenAI SDK client at the MetaClaw proxy:

```python
from openai import OpenAI

# MetaClaw proxy is running on localhost:8080
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-used-but-required-by-sdk",
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",  # passed through to upstream
    messages=[{"role": "user", "content": "Review my pull request strategy."}],
)
print(response.choices[0].message.content)
```

Skills are injected transparently — the client code does not change.

## Scheduler (MadMax Mode)

The scheduler ensures RL weight updates never interrupt active use:

```python
from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

scheduler = MadMaxScheduler(
    config=SchedulerConfig(
        sleep_hours=(22, 7),      # train between 22:00–07:00 local time
        idle_timeout_minutes=15,  # train after 15 min of no conversations
        google_calendar=True,     # also train during calendar meetings
        credentials_path="creds.json",
    )
)

# Check if it's safe to train right now
if await scheduler.is_training_window():
    await trainer.train(batch)
```
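Note that `sleep_hours: [22, 7]` wraps past midnight, so the window test cannot be a plain `start <= hour < end`. A sketch of the wrap-around logic (illustrative helper, not MetaClaw's scheduler code):

```python
def in_sleep_window(hour: int, start: int = 22, end: int = 7) -> bool:
    """True if `hour` falls in [start, end), handling windows that wrap midnight."""
    if start <= end:
        return start <= hour < end      # e.g. (9, 17): plain daytime window
    return hour >= start or hour < end  # e.g. (22, 7): wraps past midnight

print([h for h in range(24) if in_sleep_window(h)])  # → [0, 1, 2, 3, 4, 5, 6, 22, 23]
```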
## Google Calendar Setup

1. Enable Google Calendar API in Google Cloud Console
2. Download OAuth2 credentials as `creds.json`
3. Set path in config or env:

   ```shell
   export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"
   ```

4. First run will open a browser for OAuth consent:

   ```shell
   metaclaw start
   ```

## Support/Query Set Separation

MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:

```python
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(
    max_size=2000,
    support_ratio=0.5,  # 50% support, 50% query
)
```
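A minimal sketch of what a ratio-based support/query split can look like: each recorded experience is routed to one pool, and the pools are sampled separately during training (illustrative only; this is not MetaClaw's buffer implementation):

```python
import random

class SplitBuffer:
    """Toy experience buffer that routes records to support/query pools by ratio."""

    def __init__(self, support_ratio: float = 0.5, seed: int = 0):
        self.support: list[dict] = []
        self.query: list[dict] = []
        self.support_ratio = support_ratio
        self._rng = random.Random(seed)

    def record(self, experience: dict) -> None:
        pool = self.support if self._rng.random() < self.support_ratio else self.query
        pool.append(experience)

    def sample(self, n: int, split: str) -> list[dict]:
        pool = self.support if split == "support" else self.query
        return self._rng.sample(pool, min(n, len(pool)))

buf = SplitBuffer(support_ratio=0.5)
for i in range(100):
    buf.record({"turn": i})
print(len(buf.support), len(buf.query))  # roughly 50/50
```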
During training:

```python
support_batch = buffer.sample(n=16, split="support")  # used to compute reward signal
query_batch = buffer.sample(n=16, split="query")      # used for gradient update

await trainer.train_meta(support=support_batch, query=query_batch)
```

## RL Backends

### Tinker (default)

```yaml
rl:
  backend: tinker
  tinker_project: my-metaclaw-project
  lora_rank: 16
  learning_rate: 1e-4
```

### MinT
Install the MinT compatibility layer separately:

```shell
pip install metaclaw-mint
```

```yaml
rl:
  backend: mint
  mint_endpoint: https://your-mint-endpoint
```

### Auto-detection

```yaml
rl:
  backend: auto  # tries tinker first, falls back to mint, errors if neither available
```
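`backend: auto` resolution can be pictured as a simple availability probe. A sketch of that logic using import checks (the module names `tinker` and `metaclaw_mint` are assumptions for illustration; the real detection may differ):

```python
import importlib.util

def detect_backend(preferred_order=("tinker", "metaclaw_mint")) -> str:
    """Return the first installed backend module, mirroring `backend: auto`."""
    for module_name in preferred_order:
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    raise RuntimeError(
        "No training backend available: install metaclaw with the [rl] extra."
    )

# Demo with a stdlib stand-in so the sketch runs anywhere:
print(detect_backend(preferred_order=("json", "metaclaw_mint")))  # → json
```

With neither backend importable the helper raises, which matches the "No training backend available" error covered in Troubleshooting.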
## Troubleshooting

**Proxy not reachable after `metaclaw start`**

- Check port conflicts: `lsof -i :8080`
- Change `proxy.port` in config and restart

**rl mode: "No training backend available"**

- Ensure `pip install -e ".[rl]"` completed successfully
- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set
- Try `rl.backend: tinker` explicitly instead of `auto`

**Skills not persisting between sessions**

- Confirm `skills.summarize_after_session: true` in config
- Check write permissions on `~/.metaclaw/skills/`
- Run `metaclaw skills list` to inspect stored skills

**Madmax mode never trains**

- Verify `scheduler.sleep_hours` covers your timezone's night
- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)
- Check scheduler logs: `~/.metaclaw/logs/scheduler.log`

**Google Calendar integration fails**

- Re-run OAuth flow: delete `~/.metaclaw/token.json` and restart
- Ensure Calendar API is enabled in your Google Cloud project

**OPD teacher distillation errors**

- Only supported with `rl.backend: tinker`
- Requires a separate teacher model endpoint in config:

  ```yaml
  rl:
    opd_teacher: true
    teacher_base_url: https://api.openai.com/v1
    teacher_model: gpt-4o
  ```

## CLI Reference

```shell
metaclaw setup                        # interactive config wizard
metaclaw start                        # start in madmax mode
metaclaw start --mode skills_only
metaclaw start --mode rl
metaclaw start --config path/to/config.yaml

metaclaw skills list                  # show all stored skills
metaclaw skills delete <name>         # remove a skill
metaclaw skills export skills.json

metaclaw status                       # show proxy, scheduler, training status
metaclaw logs                         # tail all logs
metaclaw logs --component scheduler
```