harness-engineering-playbook

安装量: 48
排名: #15501

安装

npx skills add https://github.com/broomva/harness-engineering-skill --skill harness-engineering-playbook
Harness Engineering Playbook
Use this skill to operationalize the practices from OpenAI's Harness Engineering guide in a repo that agents can run against repeatedly and safely.
What To Load
Use
references/openai-harness-practices.md
for the full practice-to-artifact mapping.
Use
references/rollout-checklist.md
for phased adoption in active repos.
Use
references/wizard-cli.md
for Typer wizard command flows.
Use
assets/templates/
when creating or updating harness files.
Inputs
Target repository path.
Existing command surface (
make
,
npm
,
cargo
,
pytest
, etc.).
Existing CI workflows and branch protections.
Workflow
Baseline the repo and detect existing workflows.
Bootstrap harness artifacts and templates.
Apply all nine Harness Engineering practices.
Run harness audit checks and repair gaps.
Iterate after real agent runs.
Step 1: Baseline The Repo
Identify language/toolchain and canonical entrypoints.
Inventory existing checks, scripts, and CI jobs.
Record current pain points for agent runs: setup drift, unclear docs, flaky tests, missing trace IDs, slow loops.
Use a short baseline note inside
PLANS.md
so decisions remain durable.
Step 2: Bootstrap Harness Artifacts
Preferred entrypoint:
python3 scripts/harness_wizard.py init
<
repo-path
>
--profile
control
Profiles:
baseline
only core harness artifacts.
control
baseline + control-system primitives.
full
control + entropy controls (nightly audit + entropy checks). Direct shell fallback: Run: ./scripts/bootstrap_harness.sh < repo-path

This script installs safe defaults from assets/templates/ : AGENTS.md PLANS.md docs/ARCHITECTURE.md docs/OBSERVABILITY.md Makefile.harness (+ -include Makefile.harness in Makefile ) scripts/audit_harness.sh scripts/harness/{smoke,test,lint,typecheck}.sh .github/workflows/harness.yml By default, existing files are not overwritten. Pass --force to replace template-managed files. Step 3: Apply The Nine Practices Implement each practice directly in repo artifacts. 1. Make Easy To Do Hard Thing Ensure hard, high-value tasks are one command away ( make smoke , make check , make ci ). Keep setup and cleanup scripted. Make smoke checks cheap enough for frequent use. 2. Communicate Actionable Constraints With Compact Docs Keep AGENTS.md short, concrete, and command-first. Document non-obvious constraints and guardrails. Keep docs close to code and update with behavior changes. 3. Structure Codebase With Strict Boundaries And Flow Define module boundaries in docs/ARCHITECTURE.md . Parse and validate data at boundaries; use typed contracts for internal flow. Prefer one abstraction per module and one clear ownership path. 4. Build Observability In From Day 1 Emit structured logs/events with correlation IDs. Capture key transitions in long-running workflows. Define minimum observable fields in docs/OBSERVABILITY.md . 5. Optimize For Agent Flow, Not Human Flow Treat context as a first-class system dependency. Use PLANS.md for multi-step/multi-hour tasks. Front-load durable context (scope, constraints, checkpoints) so restarts stay cheap. 6. Bring Your Own Harness Standardize repo-local wrappers ( Makefile.harness , scripts/harness/ ). Wrap local infra actions in deterministic scripts. Make agent behavior reproducible across machines and runs. 7. Prototype In Natural Language First Draft logic and tests in prose before coding. Review edge cases in prose and lock acceptance criteria. Translate approved prose into code and tests. 8. Invest In Static Analysis And Linting Pin formatter/linter/typechecker versions where practical. Enforce checks in both local workflow and CI. Run static checks before long tests to shorten failure loops. 9. Manage Entropy Add periodic audits for docs drift, flaky checks, and dead scripts. Keep templates synchronized with real workflows. Remove stale abstractions quickly to keep agent context clean. For a detailed artifact matrix, load references/openai-harness-practices.md . Step 4: Validate Run: python3 scripts/harness_wizard.py audit < repo-path

Treat any MISSING or FAIL result as blocking before calling harness setup complete. Step 5: Iterate On Real Runs Observe one full agent run from clean checkout to merged change. Patch harness gaps immediately. Re-run audit. Keep AGENTS.md , PLANS.md , and architecture docs aligned with current behavior. Adaptation Rules Preserve existing project conventions and replace templates incrementally. Do not overwrite user-authored files without explicit approval. Keep command names stable; change internals behind wrappers. Favor deterministic, scriptable workflows over ad-hoc interactive steps.

返回排行榜