# MetaClaw Evolving Agent Skill

by ara.so — Daily 2026 Skills collection

MetaClaw is an OpenAI-compatible proxy agent that intercepts conversations, injects learned skills, and continuously improves itself through real-world interactions. It supports three modes: lightweight skills injection, immediate RL training, and a smart "madmax" scheduler that defers weight updates to idle/sleep windows.

## Installation
```shell
# Minimal — skills injection only, no GPU required
pip install -e .

# Full RL training support (torch, transformers, tinker)
pip install -e ".[rl]"

# Skill evolution via LLM summarization
pip install -e ".[evolve]"

# Google Calendar scheduler for madmax mode
pip install -e ".[scheduler]"

# Recommended: everything
pip install -e ".[rl,evolve,scheduler]"
```

## Quick Start
```shell
# One-time interactive config wizard
metaclaw setup

# Start in default madmax mode (skills + RL + smart scheduler)
metaclaw start

# Skills only — no GPU, no Tinker needed
metaclaw start --mode skills_only

# RL mode — trains immediately when batch is full
metaclaw start --mode rl

# RL without scheduler (same as above, explicit)
metaclaw start --mode rl
```
After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:8080/v1`.
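Because the proxy is OpenAI-compatible, any HTTP client works. A quick sketch of the request body a client would POST to `/v1/chat/completions` (payload shape only, no server call here; the model name matches the config example below):

```python
import json

# Chat-completions request body as any OpenAI-compatible client would send it.
# The proxy injects relevant skills before forwarding it upstream.
payload = {
    "model": "moonshot-v1-8k",  # passed through to the upstream provider
    "messages": [
        {"role": "user", "content": "Review my pull request strategy."},
    ],
}

body = json.dumps(payload)
print(body)
```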
## Configuration

`~/.metaclaw/config.yaml`:

```yaml
proxy:
  host: 0.0.0.0
  port: 8080

llm:
  provider: kimi                # kimi | qwen | claude | minimax | openai | gemini
  base_url: https://api.moonshot.cn/v1
  model: moonshot-v1-8k
  # api_key loaded from env: METACLAW_LLM_API_KEY

skills:
  enabled: true
  max_injected: 5               # max skills injected per turn
  summarize_after_session: true

rl:
  enabled: true
  backend: auto                 # auto | tinker | mint
  batch_size: 32
  algorithm: grpo
  opd_teacher: false            # optional teacher distillation

scheduler:                      # madmax mode only
  enabled: true
  sleep_hours: [22, 7]          # local 22:00–07:00
  idle_timeout_minutes: 15
  google_calendar: false        # set true + configure OAuth for meeting detection

logging:
  level: info
  log_dir: ~/.metaclaw/logs
```

## Environment Variables

```shell
export METACLAW_LLM_API_KEY="your-llm-api-key"
export METACLAW_TINKER_API_KEY="your-tinker-api-key"            # rl mode
export METACLAW_MINT_API_KEY="your-mint-api-key"                # if backend=mint
export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"    # scheduler
```
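The config file never stores secrets; keys are resolved from the environment at startup. A minimal sketch of that lookup (the helper name `resolve_api_key` is illustrative, not MetaClaw's actual internals):

```python
import os

def resolve_api_key(var_name: str) -> str:
    """Read an API key from the environment, failing loudly if unset."""
    value = os.environ.get(var_name, "").strip()
    if not value:
        raise RuntimeError(
            f"{var_name} is not set; export it before running `metaclaw start`."
        )
    return value

os.environ["METACLAW_LLM_API_KEY"] = "demo-key"  # normally set in your shell
print(resolve_api_key("METACLAW_LLM_API_KEY"))   # → demo-key
```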
## Operating Modes

| Mode | Command | GPU Required | Description |
|------|---------|--------------|-------------|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |

## Python API

Programmatic startup:

```python
import asyncio

from metaclaw import MetaClawAgent, AgentConfig, Mode

async def main():
    config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
    agent = MetaClawAgent(config, mode=Mode.MADMAX)
    await agent.start()

asyncio.run(main())
```

Manual skill injection:

```python
from metaclaw.skills import SkillStore, SkillInjector

store = SkillStore(path="~/.metaclaw/skills")

# Add a skill manually
store.add(
    name="code-review-checklist",
    content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
    tags=["code", "review"],
)

# Retrieve top-k relevant skills for a query
injector = SkillInjector(store)
relevant = injector.retrieve(query="review my Python function", top_k=3)
for skill in relevant:
    print(skill.name, skill.score)
```

Intercepting and recording conversations:

```python
from metaclaw.proxy import ConversationInterceptor
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(max_size=1000)

interceptor = ConversationInterceptor(
    upstream_url="https://api.moonshot.cn/v1",
    on_complete=buffer.record,  # called after each turn with (messages, response)
)
```
`buffer.record` has the signature:

```python
async def on_complete(messages: list[dict], response: dict) -> None: ...
```

Triggering RL training manually:

```python
from metaclaw.training import RLTrainer, TrainingConfig

trainer = RLTrainer(
    config=TrainingConfig(
        backend="tinker",  # or "mint"
        algorithm="grpo",
        batch_size=32,
        lora_rank=16,
    )
)
```
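GRPO, the configured algorithm, scores a group of sampled responses to the same prompt and normalizes each reward against the group's mean and standard deviation, so no separate value network is needed. A minimal sketch of that advantage computation (illustrative only, not MetaClaw's training internals):

```python
import statistics

def grpo_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Group-relative advantages: (r - mean) / (std + eps) within one prompt group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled responses to one prompt, as scored by the reward model:
print(grpo_advantages([1.0, 0.5, -0.5, -1.0]))
```

Responses above the group mean get positive advantages, those below get negative ones, and the advantages of a group always sum to zero.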
```python
# Collect a batch from the experience buffer and train
async def run_training(buffer):
    batch = buffer.sample(n=32, split="support")  # support/query separation
    result = await trainer.train(batch)
    print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
```

Reward modeling:

```python
from metaclaw.rewards import RewardModel

reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

async def score_turn(prompt: str, response: str) -> float:
    score = await reward_model.score(prompt=prompt, response=response)
    return score  # float in [-1.0, 1.0]
```
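When an LLM acts as the judge, its raw text still has to become the bounded score above. A sketch of defensive parsing and clamping to [-1.0, 1.0] (the helper is illustrative; MetaClaw's actual parsing may differ):

```python
def parse_score(raw: str, lo: float = -1.0, hi: float = 1.0) -> float:
    """Extract the first float from judge output and clamp it to [lo, hi]."""
    for token in raw.replace(",", " ").split():
        try:
            value = float(token)
        except ValueError:
            continue
        return max(lo, min(hi, value))
    return 0.0  # unparseable judge output → neutral score

print(parse_score("Score: 0.8 (concise and correct)"))  # → 0.8
print(parse_score("I'd rate this 3 out of 5"))          # → 1.0 (clamped)
```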
## Skills Lifecycle

```
Conversation turn
      │
      ▼
SkillInjector.retrieve()    ← vector search over SkillStore
      │  injects top-k skills into system prompt
      ▼
LLM responds
      │
      ▼
ExperienceBuffer.record()   ← stores (context, response, metadata)
      │
      ▼
(end of session)
SkillSummarizer.run()       ← LLM extracts reusable patterns
      │
      ▼
SkillStore.upsert()         ← new/updated skills persisted to disk
```

## Integration: OpenAI SDK as Client

Point any OpenAI SDK client at the MetaClaw proxy:

```python
from openai import OpenAI

# MetaClaw proxy is running on localhost:8080
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-used-but-required-by-sdk",
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",  # passed through to upstream
    messages=[{"role": "user", "content": "Review my pull request strategy."}],
)
print(response.choices[0].message.content)
```

Skills are injected transparently — the client code does not change.

## Scheduler (MadMax Mode)

The scheduler ensures RL weight updates never interrupt active use:

```python
from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

scheduler = MadMaxScheduler(
    config=SchedulerConfig(
        sleep_hours=(22, 7),      # train between 22:00–07:00 local time
        idle_timeout_minutes=15,  # train after 15 min of no conversations
        google_calendar=True,     # also train during calendar meetings
        credentials_path="creds.json",
    )
)

# Check if it's safe to train right now
if await scheduler.is_training_window():
    await trainer.train(batch)
```
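Note that `sleep_hours: [22, 7]` wraps past midnight, so the window test cannot be a plain `start <= hour < end`. A sketch of the wrap-around logic (illustrative helper, not MetaClaw's scheduler code):

```python
def in_sleep_window(hour: int, start: int = 22, end: int = 7) -> bool:
    """True if `hour` falls in [start, end), handling windows that wrap midnight."""
    if start <= end:
        return start <= hour < end      # e.g. (9, 17): plain daytime window
    return hour >= start or hour < end  # e.g. (22, 7): wraps past midnight

print([h for h in range(24) if in_sleep_window(h)])  # → [0, 1, 2, 3, 4, 5, 6, 22, 23]
```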
## Google Calendar Setup

1. Enable Google Calendar API in Google Cloud Console
2. Download OAuth2 credentials as `creds.json`
3. Set path in config or env:

   ```shell
   export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"
   ```

4. First run will open a browser for OAuth consent:

   ```shell
   metaclaw start
   ```

## Support/Query Set Separation

MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:

```python
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(
    max_size=2000,
    support_ratio=0.5,  # 50% support, 50% query
)
```
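A minimal sketch of what a ratio-based support/query split can look like: each recorded experience is routed to one pool, and the pools are sampled separately during training (illustrative only; this is not MetaClaw's buffer implementation):

```python
import random

class SplitBuffer:
    """Toy experience buffer that routes records to support/query pools by ratio."""

    def __init__(self, support_ratio: float = 0.5, seed: int = 0):
        self.support: list[dict] = []
        self.query: list[dict] = []
        self.support_ratio = support_ratio
        self._rng = random.Random(seed)

    def record(self, experience: dict) -> None:
        pool = self.support if self._rng.random() < self.support_ratio else self.query
        pool.append(experience)

    def sample(self, n: int, split: str) -> list[dict]:
        pool = self.support if split == "support" else self.query
        return self._rng.sample(pool, min(n, len(pool)))

buf = SplitBuffer(support_ratio=0.5)
for i in range(100):
    buf.record({"turn": i})
print(len(buf.support), len(buf.query))  # roughly 50/50
```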
During training:

```python
support_batch = buffer.sample(n=16, split="support")  # used to compute reward signal
query_batch = buffer.sample(n=16, split="query")      # used for gradient update

await trainer.train_meta(support=support_batch, query=query_batch)
```

## RL Backends

### Tinker (default)

```yaml
rl:
  backend: tinker
  tinker_project: my-metaclaw-project
  lora_rank: 16
  learning_rate: 1e-4
```

### MinT
Install the MinT compatibility layer separately:

```shell
pip install metaclaw-mint
```

```yaml
rl:
  backend: mint
  mint_endpoint: https://your-mint-endpoint
```

### Auto-detection

```yaml
rl:
  backend: auto  # tries tinker first, falls back to mint, errors if neither available
```
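`backend: auto` resolution can be pictured as a simple availability probe. A sketch of that logic using import checks (the module names `tinker` and `metaclaw_mint` are assumptions for illustration; the real detection may differ):

```python
import importlib.util

def detect_backend(preferred_order=("tinker", "metaclaw_mint")) -> str:
    """Return the first installed backend module, mirroring `backend: auto`."""
    for module_name in preferred_order:
        if importlib.util.find_spec(module_name) is not None:
            return module_name
    raise RuntimeError(
        "No training backend available: install metaclaw with the [rl] extra."
    )

# Demo with a stdlib stand-in so the sketch runs anywhere:
print(detect_backend(preferred_order=("json", "metaclaw_mint")))  # → json
```

With neither backend importable the helper raises, which matches the "No training backend available" error covered in Troubleshooting.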
## Troubleshooting

**Proxy not reachable after `metaclaw start`**

- Check port conflicts: `lsof -i :8080`
- Change `proxy.port` in config and restart

**rl mode: "No training backend available"**

- Ensure `pip install -e ".[rl]"` completed successfully
- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set
- Try `rl.backend: tinker` explicitly instead of `auto`

**Skills not persisting between sessions**

- Confirm `skills.summarize_after_session: true` in config
- Check write permissions on `~/.metaclaw/skills/`
- Run `metaclaw skills list` to inspect stored skills

**Madmax mode never trains**

- Verify `scheduler.sleep_hours` covers your timezone's night
- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)
- Check scheduler logs: `~/.metaclaw/logs/scheduler.log`

**Google Calendar integration fails**

- Re-run OAuth flow: delete `~/.metaclaw/token.json` and restart
- Ensure Calendar API is enabled in your Google Cloud project

**OPD teacher distillation errors**

- Only supported with `rl.backend: tinker`
- Requires a separate teacher model endpoint in config:

  ```yaml
  rl:
    opd_teacher: true
    teacher_base_url: https://api.openai.com/v1
    teacher_model: gpt-4o
  ```

## CLI Reference

```shell
metaclaw setup                        # interactive config wizard
metaclaw start                        # start in madmax mode
metaclaw start --mode skills_only
metaclaw start --mode rl
metaclaw start --config path/to/config.yaml

metaclaw skills list                  # show all stored skills
metaclaw skills delete <name>         # remove a skill
metaclaw skills export skills.json

metaclaw status                       # show proxy, scheduler, training status
metaclaw logs                         # tail all logs
metaclaw logs --component scheduler
```