## E2E Testing Principles (Always Active)

These apply whenever working with e2e tests, test failures, or test flakiness.

### Failure Taxonomy

Every e2e failure is exactly one of:

**A. Flaky (test infrastructure issue)**

- Race conditions, timing-dependent assertions
- Stale selectors after UI changes
- Missing waits, incorrect wait targets
- Network timing, mock setup ordering
- Symptom: passes on retry, fails intermittently

**B. Outdated (test no longer matches implementation)**

- Test asserts old behavior that was intentionally changed
- Selectors reference removed/renamed elements
- API contract changed, test wasn't updated
- Symptom: consistent failure, app works correctly

**C. Bug (implementation doesn't match spec)**

- Test correctly asserts spec'd behavior, code is wrong
- Only classify as bug when a spec exists to validate against
- If no spec exists, classify as "unverified failure" and report to the user

### Fix Rules by Category

**Flaky fixes:**

- Replace `waitForTimeout` with auto-waiting locators
- Replace brittle CSS selectors with `getByRole` / `getByLabel` / `getByTestId`
- Fix race conditions with `expect()` web-first assertions
- Fix mock/route setup ordering (before navigation)
- Never add arbitrary delays - fix the underlying wait
- Never weaken assertions to make flaky tests pass
- Never add retry loops around assertions - use the framework's built-in retry

**Outdated fixes:**

- Update test assertions to match current (correct) behavior
- Update selectors to match current DOM/API
- Never change source code - the implementation is correct, the test is stale

**Bug fixes:**

- Quote the spec section that defines expected behavior
- Fix the source code to match the spec
- Unit tests MUST exist before the fix is complete
  - If unit tests exist, run them to confirm
  - If unit tests don't exist, write them first (TDD)
- Never change e2e assertions to match buggy code
- Never change API contracts or interfaces without spec backing
- If no spec exists, ask the user: bug or outdated test?
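The taxonomy above can be read as a small decision procedure. The sketch below is one possible encoding, not part of any test framework: the `FailureEvidence` shape, its field names, and `classifyFailure` are all hypothetical names introduced for illustration, and the ordering of the checks is an assumption drawn from the rule that a spec is required before calling something a bug.

```typescript
// Hypothetical types for illustration; none of these names come from Playwright
// or any other library.
type Category = "flaky" | "outdated" | "bug" | "unverified";

interface FailureEvidence {
  // The test fails intermittently and passes when re-run.
  passesOnRetry: boolean;
  // The behavior the test asserts was intentionally changed in the app.
  changeWasIntentional: boolean;
  // true  = a spec exists and agrees with the test's assertion
  // false = a spec exists and agrees with the current implementation
  // null  = no spec exists
  specSupportsTest: boolean | null;
}

function classifyFailure(e: FailureEvidence): Category {
  // A. Flaky: symptom is "passes on retry, fails intermittently".
  if (e.passesOnRetry) return "flaky";

  // C. Bug: only when a spec exists and backs the test's expectation.
  if (e.specSupportsTest === true) return "bug";

  // B. Outdated: consistent failure while the app behaves as intended.
  if (e.specSupportsTest === false || e.changeWasIntentional) return "outdated";

  // No spec and no known intentional change: report to the user.
  return "unverified";
}

// Example: a consistent failure with no spec and no known intentional change.
const example = classifyFailure({
  passesOnRetry: false,
  changeWasIntentional: false,
  specSupportsTest: null,
});
console.log(example);
```

Checking the spec before the "intentional change" flag reflects the assumption that the spec wins when the two disagree; a project with different conventions may want to surface that conflict to the user instead.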
### Source Code Boundary

E2e test fixes must not change:

- Application logic or business rules
- API contracts, request/response shapes
- Database schemas or migrations
- Configuration defaults

The only exception: bug fixes where a spec explicitly defines the correct behavior and unit tests cover the fix.

## Workflow (When Explicitly Running E2E)

### Step 1: Discover Test Infrastructure

1. Find e2e config: `playwright.config.ts`, `vitest.config.ts`, or project-specific setup
2. Read `package.json` for the canonical e2e command
3. Check if dev server or Tilt environment is required and running
4. Find spec files: `*.spec.md`, `docs/*.spec.md` - source of truth for bug decisions

### Step 2: Run Tests

Run with minimal reporter to avoid context overflow:
```bash
# Playwright
yarn playwright test --reporter=line

# Or project-specific
yarn test:e2e
```

If a filter is specified, apply it:

```bash
yarn playwright test --reporter=line -g "transfer"
yarn test:e2e -- --grep "transfer"
```

Parse failures into:

| Test | File | Error | Category |
|------|------|-------|----------|
| login flow | auth.spec.ts:42 | timeout waiting for selector | TBD |

### Step 3: Categorize

For each failure:

1. Read the test file
2. Read the source code it exercises
3. Check for a corresponding spec file
4. Assign category: flaky, outdated, bug, or unverified

### Step 4: Fix by Category

Apply fixes following the Principles above, in order:

1. Flaky - fix test infrastructure issues first (unblocks other tests)
2. Outdated - update stale assertions
3. Bug - fix with spec + unit test gate

### Step 5: Re-run and Report

After all fixes, re-run the suite and report the results in this format:
```markdown
## E2E Results

Run: `yarn test:e2e` on

### Fixed
- FLAKY: `auth.spec.ts:42` - replaced waitForTimeout with getByRole wait
- OUTDATED: `profile.spec.ts:88` - updated selector after header redesign
- BUG: `transfer.spec.ts:120` - fixed amount validation per SPEC.md#transfers

### Remaining Failures
- UNVERIFIED: `settings.spec.ts:55` - no spec, needs user decision

### Unit Tests Added
- `src/transfer.test.ts` - amount validation edge cases (covers BUG fix)
```
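If the report is assembled programmatically, a typed record per entry keeps the three sections consistent. The following is a minimal sketch under stated assumptions: `E2EReport`, `FixedEntry`, `ReportEntry`, and `renderReport` are hypothetical names, and the Markdown heading levels and backticks are a guess at the intended layout of the report shown above. Whatever follows "on" in the `Run:` line is not specified in the source, so the sketch leaves it out.

```typescript
// Hypothetical types for illustration only.
type FixedCategory = "FLAKY" | "OUTDATED" | "BUG";

interface ReportEntry {
  location: string; // e.g. "auth.spec.ts:42"
  note: string;     // short description of the fix or the open question
}

interface FixedEntry extends ReportEntry {
  category: FixedCategory;
}

interface E2EReport {
  command: string;               // the command used for the re-run
  fixed: FixedEntry[];
  unverified: ReportEntry[];     // failures that need a user decision
  unitTestsAdded: ReportEntry[]; // unit tests written to cover BUG fixes
}

function renderReport(r: E2EReport): string {
  const lines: string[] = [
    "## E2E Results",
    "",
    `Run: \`${r.command}\``,
    "",
    "### Fixed",
    ...r.fixed.map((f) => `- ${f.category}: \`${f.location}\` - ${f.note}`),
    "",
    "### Remaining Failures",
    ...r.unverified.map((u) => `- UNVERIFIED: \`${u.location}\` - ${u.note}`),
    "",
    "### Unit Tests Added",
    ...r.unitTestsAdded.map((t) => `- \`${t.location}\` - ${t.note}`),
  ];
  return lines.join("\n");
}

// Usage with the entries from the example report.
const rendered = renderReport({
  command: "yarn test:e2e",
  fixed: [
    { category: "FLAKY", location: "auth.spec.ts:42", note: "replaced waitForTimeout with getByRole wait" },
    { category: "BUG", location: "transfer.spec.ts:120", note: "fixed amount validation per SPEC.md#transfers" },
  ],
  unverified: [{ location: "settings.spec.ts:55", note: "no spec, needs user decision" }],
  unitTestsAdded: [{ location: "src/transfer.test.ts", note: "amount validation edge cases (covers BUG fix)" }],
});
console.log(rendered);
```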