# Beaver Build 🦫

*The beaver doesn't build blindly. First, it surveys the stream, understanding the flow. Then it gathers only the best materials — not every twig belongs in the dam. It builds with purpose, each piece placed carefully. It reinforces with mud and care, creating something that withstands the current. When the dam holds, the forest is safe.*
## When to Activate

- User asks to "write tests" or "add tests"
- User says "test this" or "make sure this works"
- User calls `/beaver-build` or mentions beaver/building dams
- Deciding what deserves testing (not everything does)
- Reviewing existing tests for effectiveness
- A bug needs to become a regression test
- Asked to "add tests" without specific guidance
- Evaluating whether tests are providing real value
- Refactoring causes many tests to break (a symptom of bad tests)

Pair with: `javascript-testing` for Vitest syntax, `python-testing` for pytest patterns.
## The Dam

```
SURVEY  →  GATHER  →  BUILD  →  REINFORCE  →  FORTIFY
   ↓          ↓          ↓           ↓            ↓
Understand  Collect   Construct   Harden      Ship with
   Flow     Materials   Tests     Coverage    Confidence
```
### Phase 1: SURVEY

*The beaver surveys the stream, understanding the flow before placing a single twig...*

Before gathering materials, understand what you're building for.

- What does this feature DO for users? (Not how it works — what value it provides)
- What would break if this failed? (Critical paths)
- What confidence level is needed? (Prototype vs. production)
- The Testing Trophy: mostly integration, some unit, few E2E, static analysis always on

**Reference:** Load `references/testing-patterns.md` for the full Testing Trophy explanation, what to test vs. skip, the guiding questions, and what makes a test valuable.

**Reference:** Load `references/grove-test-infrastructure.md` to see what test utilities, factories, and mocks already exist in the codebase — don't reinvent what's already built.

**Output:** Brief summary of what needs testing and at what layer.
### Phase 2: GATHER

*Paws select only the best branches. Not everything belongs in the dam...*

Decide what to test using the Confidence Test.

- Skip: trivial getters/setters, framework behavior, implementation details, one-off scripts, volatile prototypes
- Test lightly: configuration (smoke test), third-party integrations (mock at boundary), visual design (snapshots)
- Test thoroughly: business logic, user-facing flows, edge cases, bug fixes
- Ask: "Would I notice if this broke in production?" If yes, test it.

**Reference:** Load `references/testing-patterns.md` for the full skip/test-lightly/test-thoroughly tables and the guiding questions.

**Output:** List of test cases to write, organized by layer (unit/integration/E2E); one way to write that list down is sketched below.
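A sketch of that output as executable intent: Vitest's `it.todo` entries group the planned cases by layer before any test bodies exist. The suite and case names here are illustrative, reused from the example build later in this skill.

```ts
import { describe, it } from 'vitest';

// GATHER output captured as a plan; bodies get written during BUILD.
describe('login form (integration)', () => {
  it.todo('should reject registration with invalid email');
  it.todo('should show loading indicator while logging in');
  it.todo('should redirect to dashboard after successful login');
});

describe('email validation (unit)', () => {
  it.todo('should accept plus-addressed emails'); // hypothetical unit-layer case
});
```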
### Phase 3: BUILD

*Twig by twig, the dam takes shape. Each piece has purpose...*

Write tests following Arrange-Act-Assert; a sketch follows this list.

- The Act section should be one line — if it's not, the test does too much
- Test user behavior, not implementation details
- Use accessible queries: `getByRole`, `getByLabelText`, `getByText` — never `getByTestId` first
- Name tests so they explain what breaks: "should reject registration with invalid email"
- One test, one reason to fail
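A minimal AAA sketch, assuming Vitest with `@testing-library/svelte` and a jest-dom matcher setup; the `LoginForm` component, its labels, and its error copy are hypothetical, not this codebase's actual form.

```ts
import { describe, it, expect } from 'vitest';
import { render, screen } from '@testing-library/svelte';
import userEvent from '@testing-library/user-event';
import LoginForm from './LoginForm.svelte'; // hypothetical component path

describe('LoginForm', () => {
  it('should reject registration with invalid email', async () => {
    // Arrange: render the form and type an invalid address
    const user = userEvent.setup();
    render(LoginForm);
    await user.type(screen.getByLabelText(/email/i), 'not-an-email');

    // Act: one line, or the test does too much
    await user.click(screen.getByRole('button', { name: /sign up/i }));

    // Assert: what the user sees, not internal state
    expect(screen.getByText(/invalid email/i)).toBeInTheDocument();
  });
});
```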
**Script:** Run `scripts/scaffold-test.sh` to generate test boilerplate. Types: `service`, `api`, `component`, `worker`. The scaffolded file uses the right imports, factories, and patterns for each test type.

**Reference:** Load `references/test-templates.md` for complete SvelteKit test templates: service unit tests, API route tests, component tests with Testing Library, and integration tests for full flows.

**Reference:** Load `references/grove-test-infrastructure.md` for the exact factory functions, mock utilities, and import paths to use — includes `createMockRequestEvent`, `createAuthenticatedTenantEvent`, `createMockD1`, `createMockKV`, `createMockR2`, and more.

**Output:** Working tests that follow the AAA pattern and test behavior, not implementation.
### Phase 4: REINFORCE

*The beaver packs mud between twigs, hardening the structure...*

Strengthen tests.

- Mock only at external boundaries — if you're mocking something you wrote, reconsider
- Turn every bug into a regression test: reproduce → write failing test → fix → test passes → bug can't return (sketch at the end of this phase)
- Keep tests co-located with the code they test (`login.test.ts` next to `login.ts`)
- Verify the Signpost error format in API tests: `error_code`, `error`, `error_description` (sketch below)
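A sketch of that Signpost check, assuming a SvelteKit route handler and the `createMockRequestEvent` factory named in BUILD; the factory's options, the import paths, and the route itself are assumptions to verify against `references/grove-test-infrastructure.md`.

```ts
import { describe, it, expect } from 'vitest';
import { createMockRequestEvent } from '$lib/test-utils'; // assumed import path
import { POST } from './+server'; // hypothetical route under test

describe('POST /api/register', () => {
  it('should return a Signpost error for a malformed email', async () => {
    // Arrange: a request event with an invalid payload (factory options assumed)
    const event = createMockRequestEvent({ body: { email: 'not-an-email' } });

    // Act
    const response = await POST(event);

    // Assert: all three Signpost fields are present on failures
    expect(response.status).toBe(400);
    expect(await response.json()).toMatchObject({
      error_code: expect.any(String),
      error: expect.any(String),
      error_description: expect.any(String),
    });
  });
});
```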
**Reference:** Load `references/testing-patterns.md` for the minimal mocking guide, bug-to-test pipeline, and Signpost error code coverage patterns.

**Output:** Hardened tests with proper mocking boundaries and clear failure messages.
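And the bug-to-test pipeline as a sketch: the regression test is written first so it fails against the buggy code, then the fix makes it pass. The validator module and the plus-addressing bug are hypothetical stand-ins.

```ts
import { describe, it, expect } from 'vitest';
import { isValidEmail } from './validation'; // hypothetical module that had the bug

describe('isValidEmail (regression)', () => {
  // Reproduce: production rejected plus-addressed emails.
  // This test fails against the buggy validator, passes once fixed,
  // and keeps the bug from ever returning.
  it('should accept plus-addressed emails', () => {
    expect(isValidEmail('beaver+dam@example.com')).toBe(true);
  });
});
```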
### Phase 5: FORTIFY

*The dam holds. Water flows as intended. The beaver rests...*

**MANDATORY:** Verify the dam holds before shipping:

```sh
pnpm install
gw ci --affected --fail-fast --diagnose
```

If verification fails: the dam has a leak. Read the diagnostics, patch the weakness, re-run verification.

Additional coverage check (optional, after CI passes):

```sh
npx vitest run --coverage
```

Run the self-review checklist before considering tests "done".

**Reference:** Load `references/test-templates.md` for the test self-review checklist.

**Output:** Clean test suite ready for CI.
## Reference Routing Table

| Phase | Reference | Load When |
| --- | --- | --- |
| SURVEY | `references/testing-patterns.md` | Always (understand the Trophy and what to test) |
| SURVEY | `references/grove-test-infrastructure.md` | Always (know what utilities already exist) |
| GATHER | `references/testing-patterns.md` | Deciding what to skip vs. test thoroughly |
| BUILD | `scripts/scaffold-test.sh` | Run to generate test file boilerplate |
| BUILD | `references/test-templates.md` | Writing actual tests (service, API, component) |
| BUILD | `references/grove-test-infrastructure.md` | Import paths for factories, mocks, and helpers |
| REINFORCE | `references/testing-patterns.md` | Mocking strategy and bug-to-test pipeline |
| FORTIFY | `references/test-templates.md` | Running the self-review checklist |
## Beaver Rules

- **Energy:** Build with purpose. The beaver doesn't add twigs just to add them. Each test must earn its place by providing confidence.
- **Precision:** Test behavior, not structure. If refactoring breaks your tests, they were testing the wrong things.
- **Wisdom:** Remember the trophy: mostly integration, some unit, few E2E. Static analysis is your first line of defense.
- **Patience:** Good tests let you ship with confidence. That's the whole point.
- **Communication:** Use building metaphors:
  - "Surveying the stream..." (understanding what to test)
  - "Gathering materials..." (deciding what to test)
  - "The dam takes shape..." (writing tests)
  - "Packing the mud..." (adding coverage)
  - "The structure holds..." (tests passing)
## Anti-Patterns

The beaver does NOT:

- Chase 100% coverage theater (high coverage with bad tests is worse than moderate coverage with good tests)
- Test implementation details (internal state, private methods)
- Mock everything (removes confidence)
- Write tests that break on safe refactors
- Use snapshots for volatile UI
- Build the Ice Cream Cone (many E2E, few integration, few unit)
## Example Build

**User:** "Add tests for the login form"

**Beaver flow:**

1. 🦫 **SURVEY** — "Login form handles user authentication. Critical path: registration → dashboard flow. Integration tests are where confidence lives."
2. 🦫 **GATHER** — "Test: invalid email rejection, API error handling, successful redirect, loading states. Skip: internal state changes."
3. 🦫 **BUILD** — Write integration tests using the AAA pattern: `should reject registration with invalid email`, `should show loading indicator while logging in`, `should redirect to dashboard after successful login`
4. 🦫 **REINFORCE** — Add a regression test for the previous password reset bug. Mock only the external API, not internal validation.
5. 🦫 **FORTIFY** — All tests pass, lint and typecheck clean, coverage at 78% (good enough), ready for CI.
## Quick Decision Guide

| Situation | Action |
| --- | --- |
| New feature | Write integration tests for user-facing behavior |
| Bug fix | Write test that reproduces bug first, then fix |
| Refactoring | Run existing tests; if they break on safe changes, they're bad tests |
| "Need more coverage" | Add tests for uncovered *behavior*, not uncovered lines |
| Pure function/algorithm | Unit test it |
| API endpoint | Integration test with mocked external services |
| UI component | Component test with Testing Library |
| Critical user flow | E2E test with Playwright (sketch below) |
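For that last row, a hedged Playwright sketch; the route, field labels, and redirect target are assumptions about this app, not confirmed flows.

```ts
import { test, expect } from '@playwright/test';

test('registration flows through to the dashboard', async ({ page }) => {
  // Walk the critical path end to end: register, then land on the dashboard.
  await page.goto('/register');
  await page.getByLabel(/email/i).fill('beaver@example.com');
  await page.getByLabel(/password/i).fill('s3cure-dam!');
  await page.getByRole('button', { name: /sign up/i }).click();
  await expect(page).toHaveURL(/\/dashboard/);
});
```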
Good tests let you ship with confidence. That's the whole point.
🦫