Optimizes test execution by intelligently selecting which tests to run based on code changes. Instead of running the full test suite every time, this skill:
- Maps code changes to affected test files using import dependency analysis
- Provides tiered testing strategies for different feedback loop needs
- Tracks test reliability to prioritize stable tests in fast runs
When I Activate
I automatically load when you mention:
- "run affected tests" or "run impacted tests"
- "smart test" or "intelligent testing"
- "which tests to run" or "test selection"
- "fast tests" or "quick tests"
- "tests for changes" or "tests for this PR"
Core Concepts
Test Tiers
Tier 1: Fast Tests (< 1 minute)
- Directly affected unit tests (imports changed file)
- High-reliability tests only (no flaky tests)
- Run on every save or pre-commit
- Command: pytest -m "not slow and not integration" [selected_tests]
Tier 2: Impacted Tests (< 5 minutes)
- All tests affected by changes (direct + transitive dependencies)
- Includes integration tests for changed modules
- Run before commit or on PR draft
- Command: pytest [selected_tests]
Tier 3: Full Suite
- Complete test suite
- Run on PR ready-for-review or CI
- Command: pytest
Import Dependency Analysis
The skill builds a dependency graph by analyzing Python imports:
source_file.py
|
+-- Imported by: module_a.py, module_b.py
| |
| +-- Tested by: test_module_a.py, test_module_b.py
|
+-- Tested by: test_source_file.py (direct test)
Direct Tests: Files matching pattern test_{module}.py or {module}_test.py
Indirect Tests: Tests that import modules which import the changed file
Reliability Tracking
Tests are scored on reliability (0.0 to 1.0):
- 1.0: Always passes (stable)
- 0.5 to < 1.0: Occasional failures (investigate)
- < 0.5: Frequently fails
Tests scoring below 0.8 are treated as flaky and excluded from Tier 1 (see Step 4).
Reliability is tracked in ~/.amplihack/.claude/data/test-mapping/reliability.yaml
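The sample entries in reliability.yaml later in this document are consistent with the score being the plain pass fraction (98 passes and 2 failures give 0.98). A minimal sketch under that assumption; the function name is hypothetical and the skill may compute or weight the score differently:

```python
from typing import Optional


def reliability_score(passes: int, failures: int) -> Optional[float]:
    """Return the fraction of recorded runs that passed.

    Assumption: reliability = passes / (passes + failures), rounded to
    two decimals. Returns None when no runs have been recorded yet.
    """
    total = passes + failures
    if total == 0:
        return None
    return round(passes / total, 2)
```

Returning None for an unrecorded test keeps "no data" distinct from "always fails", which matters when the data still needs initial seeding.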
Usage
Analyze Changes and Get Test Commands
User: What tests should I run for my changes?
Claude (using smart-test):
1. Analyzes git diff or staged changes
2. Maps changed files to test dependencies
3. Returns tiered test commands
Example Output:
------------------------------------------
Smart Test Analysis
------------------------------------------
Changed Files:
- src/amplihack/core/processor.py
- src/amplihack/utils/helpers.py
Tier 1 (Fast - 45s estimated):
pytest tests/unit/test_processor.py tests/unit/test_helpers.py -v
Tier 2 (Impacted - 3m estimated):
pytest tests/unit/test_processor.py tests/unit/test_helpers.py \
tests/integration/test_pipeline.py -v
Tier 3 (Full - 12m estimated):
pytest
Recommendation: Start with Tier 1 for quick feedback.
Build or Refresh Mapping Cache
User: Build the test mapping for this project
Claude:
1. Scans all Python files
2. Builds import dependency graph
3. Maps source files to test files
4. Saves to .claude/data/test-mapping/code_to_tests.yaml
Check Test Reliability
User: Show flaky tests
Claude:
1. Reads reliability.yaml
2. Lists tests with reliability < 0.8
3. Suggests investigation or quarantine
Process
Step 1: Identify Changed Files
# For staged changes
git diff --cached --name-only --diff-filter=ACMR
# For all uncommitted changes
git diff --name-only --diff-filter=ACMR
# For PR changes (vs main)
git diff main...HEAD --name-only --diff-filter=ACMR
Filter to only Python source files (exclude tests themselves for mapping).
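One way this filtering might look in Python. This is a sketch, not the skill's actual code: the helper names are hypothetical, and test files are recognized purely by the test_{module}.py / {module}_test.py naming patterns used elsewhere in this document.

```python
import subprocess
from pathlib import PurePosixPath


def is_test_file(path: str) -> bool:
    """True for files named test_{module}.py or {module}_test.py."""
    name = PurePosixPath(path).name
    return (name.startswith("test_") and name.endswith(".py")) or name.endswith("_test.py")


def python_source_files(changed: list[str]) -> list[str]:
    """Keep only Python source files, dropping tests and non-Python files."""
    return [p for p in changed if p.endswith(".py") and not is_test_file(p)]


def staged_files() -> list[str]:
    """Run the staged-changes git command shown above and return its paths."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACMR"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Typical use would be python_source_files(staged_files()). Keeping the git call separate from the filter makes the filter easy to check without a repository.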
Step 2: Build Import Graph
For each Python file, extract imports:
# Patterns to detect:
import module
from module import item
from package.module import item
from . import relative
from ..parent import item
Build bidirectional mapping:
- Forward: file -> what it imports
- Reverse: file -> what imports it
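The extraction and the bidirectional mapping can be sketched with the standard-library ast module. This is an illustration under assumptions, not the skill's implementation: modules are identified by dotted name, the function names are hypothetical, and a name in `from . import name` is treated as a sibling module.

```python
import ast
from collections import defaultdict


def imported_modules(source: str, package: str = "") -> set[str]:
    """Return the dotted module names imported by `source`.

    `package` is the dotted package that contains the file; it is only
    used to resolve relative imports such as `from ..parent import item`.
    """
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            base: list[str] = []
            if node.level:
                # level 1 is the current package, level 2 its parent, etc.
                parts = package.split(".") if package else []
                keep = max(len(parts) - (node.level - 1), 0)
                base = parts[:keep]
            if node.module:
                found.add(".".join(base + [node.module]))
            else:
                # `from . import name`: assume each name is a sibling module
                found.update(".".join(base + [a.name]) for a in node.names)
    return found


def build_import_graph(sources: dict[str, str]):
    """Build forward (module -> imports) and reverse (module -> importers) maps.

    `sources` maps a dotted module name to its source text.
    """
    forward: dict[str, set[str]] = {}
    reverse: dict[str, set[str]] = defaultdict(set)
    for module, text in sources.items():
        deps = imported_modules(text, package=module.rpartition(".")[0])
        forward[module] = deps
        for dep in deps:
            reverse[dep].add(module)
    return forward, dict(reverse)
```

Because ast.parse raises SyntaxError on invalid files, a real implementation would need to catch that error per file. Dynamic and string-based imports are not visible to this approach.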
Step 3: Map to Tests
For each changed file, find tests via:
- Direct test match: test_{filename}.py or {filename}_test.py
- Import-based: Tests that import the changed module
- Transitive: Tests that import modules that import changed module (1 level)
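The three lookups can be combined roughly as follows. This is a sketch that assumes a reverse import graph keyed by file path (file -> files that import it) and a known set of test files; the function and parameter names are hypothetical.

```python
from pathlib import PurePosixPath


def select_tests(changed: str, reverse: dict[str, set[str]], all_tests: set[str]) -> dict[str, list[str]]:
    """Classify the tests affected by one changed file."""
    stem = PurePosixPath(changed).stem
    direct_names = {f"test_{stem}.py", f"{stem}_test.py"}
    direct = {t for t in all_tests if PurePosixPath(t).name in direct_names}

    importers = reverse.get(changed, set())
    import_based = (importers & all_tests) - direct

    # One level only: tests that import a non-test module importing `changed`.
    transitive: set[str] = set()
    for module in importers - all_tests:
        transitive |= reverse.get(module, set()) & all_tests
    transitive -= direct | import_based

    return {
        "direct": sorted(direct),
        "import_based": sorted(import_based),
        "transitive": sorted(transitive),
    }
```

Each test is reported in the closest category only, so a test that both matches by name and imports the module is listed once, under direct.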
Step 4: Apply Reliability Filter
For Tier 1 only, exclude tests with reliability < 0.8.
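A possible shape for this filter, assuming the reliability data is keyed by pytest node ID (file::test) as in the reliability.yaml sample in this document. Two choices here are assumptions rather than documented behavior: a file is dropped if any of its recorded tests is below the threshold, and files with no recorded data are kept.

```python
TIER1_MIN_RELIABILITY = 0.8


def tier1_candidates(test_files, reliability, threshold=TIER1_MIN_RELIABILITY):
    """Return the test files that are stable enough for Tier 1.

    `reliability` maps node IDs such as "tests/a.py::test_x" to dicts
    containing a "reliability" score.
    """
    kept = []
    for path in test_files:
        scores = [
            entry["reliability"]
            for node_id, entry in reliability.items()
            if node_id.split("::", 1)[0] == path
        ]
        # all() over an empty list is True, so unscored files are kept.
        if all(score >= threshold for score in scores):
            kept.append(path)
    return kept
```

Dropping a whole file for one flaky test is coarse. A finer-grained variant could keep the file and exclude only the individual flaky node IDs.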
Step 5: Generate Commands
Output pytest commands with appropriate markers:
# Tier 1
pytest -m "not slow and not integration" tests/a.py tests/b.py
# Tier 2
pytest tests/a.py tests/b.py tests/c.py
# Tier 3
pytest
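Building these command strings could be sketched as below. The function name is hypothetical. Returning None when a tier has no selected tests is a design assumption, made because pytest given no paths would run everything it collects, defeating the selection.

```python
import shlex


def tier_commands(fast_tests, impacted_tests):
    """Return a copy-paste-ready pytest command per tier.

    A tier maps to None when no tests were selected for it.
    """
    tier1 = None
    if fast_tests:
        tier1 = shlex.join(["pytest", "-m", "not slow and not integration", *fast_tests])
    tier2 = shlex.join(["pytest", *impacted_tests]) if impacted_tests else None
    return {1: tier1, 2: tier2, 3: "pytest"}
```

shlex.join quotes the marker expression for the shell, so the output uses single quotes around it rather than the double quotes shown above; both forms are equivalent in a POSIX shell.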
Data Storage
code_to_tests.yaml
# .claude/data/test-mapping/code_to_tests.yaml
version: 1
last_updated: "2025-11-25T10:00:00Z"
mappings:
src/amplihack/core/processor.py:
direct_tests:
- tests/unit/test_processor.py
indirect_tests:
- tests/integration/test_pipeline.py
transitive_tests:
- tests/e2e/test_full_workflow.py
src/amplihack/utils/helpers.py:
direct_tests:
- tests/unit/test_helpers.py
indirect_tests:
- tests/unit/test_processor.py # processor imports helpers
reliability.yaml
# .claude/data/test-mapping/reliability.yaml
version: 1
last_updated: "2025-11-25T10:00:00Z"
tests:
tests/unit/test_processor.py::test_basic:
passes: 98
failures: 2
reliability: 0.98
last_failure: "2025-11-20"
tests/integration/test_api.py::test_timeout:
passes: 45
failures: 15
reliability: 0.75
last_failure: "2025-11-24"
flaky_reason: "Network dependent"
Integration with Workflow
This skill integrates with DEFAULT_WORKFLOW.md:
Step 12: Run Tests and Pre-commit Hooks
- Use Tier 1 (fast) for pre-commit
- Quick feedback on changed code
Step 13: Mandatory Local Testing
- Use Tier 2 (impacted) before commit
- Ensures affected code paths are tested
CI Pipeline
- Use Tier 2 on draft PRs
- Use Tier 3 (full) on ready-for-review PRs
Markers Integration
Works with existing pytest markers from pyproject.toml:
- slow - Excluded from Tier 1
- integration - Excluded from Tier 1
- e2e - Excluded from Tier 1 and 2
- neo4j - Requires special environment
- requires_docker - Requires Docker daemon
Quick Reference
| Scenario | Tier | Time | Command |
|---|---|---|---|
| Pre-commit | 1 | < 1 min | pytest -m "not slow and not integration" [affected] |
| Pre-push | 2 | < 5 min | pytest [affected + transitive] |
| Draft PR | 2 | < 5 min | pytest [affected + transitive] |
| Ready PR | 3 | Full | pytest |
| CI main | 3 | Full | pytest |
Philosophy Alignment
Ruthless Simplicity
- Simple tier system (1, 2, 3)
- YAML storage over database
- Import analysis over complex AST parsing
Zero-BS Implementation
- Real pytest commands (copy-paste ready)
- Actual time estimates based on test count
- No placeholder data or mock reliability scores
Testing Pyramid
- Tier 1 prioritizes unit tests (60%)
- Tier 2 adds integration tests (30%)
- Tier 3 includes E2E tests (10%)
Complementary Skills
- test-gap-analyzer: Identifies missing tests
- outside-in-testing: Creates E2E test scenarios
- tester agent: Writes new tests for gaps
- pre-commit-diagnostic: Fixes pre-commit failures
Common Patterns
Pattern 1: Quick Iteration
[Developer makes small change]
Claude: Run affected tests (Tier 1)
[45 seconds later]
Claude: 3/3 tests passed. Ready for commit.
Pattern 2: Pre-Push Validation
[Developer about to push]
Claude: Run impacted tests (Tier 2)
[3 minutes later]
Claude: 12/12 tests passed including integrations.
Pattern 3: Flaky Test Investigation
User: Tests keep failing randomly
Claude: Checking reliability data...
Found 2 flaky tests (< 0.8 reliability):
- test_api_timeout (0.75) - Network dependent
- test_concurrent_write (0.68) - Race condition
Recommend: Quarantine these tests or fix root cause.
Limitations
- Python-only import analysis
- Single-level transitive analysis (deeper chains excluded)
- Reliability data requires initial seeding from test runs
- Does not detect dynamic imports or string-based imports
When to Avoid
Do NOT use smart-test when:
- First time setting up tests - No mapping cache exists yet; run full suite first
- Major refactoring - When module structure changes significantly, mappings become stale
- Configuration changes - Changes to pytest.ini, conftest.py, or fixtures affect all tests
- CI environment variables changed - Environment-dependent tests may all need re-running
- Database schema migrations - All database-touching tests should run
- Flaky test investigation - Run full suite to get accurate reliability data
- Pre-merge final check - Always run Tier 3 (full suite) before merging to main
Rule of thumb: When in doubt, run the full suite. Smart-test optimizes iteration speed, not correctness.
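Some of these cases can be detected mechanically from the list of changed files. The sketch below is an illustration only: the function name is hypothetical, and including pyproject.toml as a trigger is an assumption based on the pytest markers being defined there. It must see the unfiltered change list, since pytest.ini would be dropped by a Python-only filter.

```python
from pathlib import PurePosixPath

# File names whose change is assumed to affect every test.
FULL_SUITE_TRIGGERS = {"pytest.ini", "conftest.py", "pyproject.toml"}


def needs_full_suite(changed_files, mapping_exists=True):
    """Return True when smart selection should be skipped in favor of Tier 3."""
    if not mapping_exists:
        # First-time setup: there is no mapping cache to select from.
        return True
    return any(PurePosixPath(p).name in FULL_SUITE_TRIGGERS for p in changed_files)
```

Cases such as schema migrations or changed CI environment variables are project-specific and would still need a manual decision.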
Error Handling and Troubleshooting
Common Issues
Issue: "No tests found for changed file"
Cause: File is new or not yet mapped
Fix: Rebuild the mapping cache
User: "Rebuild test mapping cache"
Issue: "Import analysis failed"
Cause: Syntax error in Python file or circular imports
Fix: 1. Check file for syntax errors: python -m py_compile file.py
2. Resolve circular imports
3. Rebuild mapping cache
Issue: "Reliability data missing"
Cause: No test runs have been recorded yet
Fix: Run full test suite once, then:
User: "Update test reliability with these results"
Issue: "Tier 1 tests taking too long"
Cause: Too many tests marked as "fast" or slow tests not marked
Fix: 1. Add @pytest.mark.slow to tests > 1 second
2. Add @pytest.mark.integration to integration tests
3. Review test granularity
Issue: "Cache is stale / wrong tests selected"
Cause: Module structure changed since last cache build
Fix: Delete cache and rebuild:
rm -rf .claude/data/test-mapping/*.yaml
User: "Rebuild test mapping cache"
Recovery Commands
# Verify test mapping is valid
python -c "import yaml; yaml.safe_load(open('.claude/data/test-mapping/code_to_tests.yaml'))"
# Check reliability data
python -c "import yaml; print(yaml.safe_load(open('.claude/data/test-mapping/reliability.yaml')))"
# Force full suite (bypass smart-test)
pytest
# Same, but skip files named test_slow_* when time is short
pytest --ignore-glob='**/test_slow_*'
# List all test files as a first step toward finding orphaned tests
# (tests with no source mapping); compare the list against code_to_tests.yaml
find tests -name "test_*.py" | sort > /tmp/tests.txt
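The find command only produces a list of test files. Comparing that list against the mapping could be sketched as follows, assuming the mappings section of code_to_tests.yaml has already been loaded into a dict and uses the direct_tests, indirect_tests, and transitive_tests keys shown in the Data Storage section; the function name is hypothetical.

```python
def orphaned_tests(all_test_files, mappings):
    """Return test files that no source file maps to."""
    mapped = set()
    for entry in mappings.values():
        for key in ("direct_tests", "indirect_tests", "transitive_tests"):
            mapped.update(entry.get(key, []))
    return sorted(set(all_test_files) - mapped)
```

An orphaned test is not necessarily dead. It may exercise code only through dynamic imports, which the import analysis does not detect.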
Cache Maintenance
The mapping cache should be rebuilt when:
- New test files are added
- Module structure changes significantly
- Cache is older than 7 days
Trigger manually: "Rebuild test mapping cache"
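The 7-day rule could be checked against the last_updated field stored in the mapping file. A minimal sketch, assuming that field holds an ISO 8601 UTC timestamp such as "2025-11-25T10:00:00Z" as in the sample data; the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def cache_is_stale(last_updated, now=None, max_age=timedelta(days=7)):
    """Return True when the cache timestamp is older than `max_age`."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    updated = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return now - updated > max_age
```

Age is only one of the three rebuild conditions. New test files and structural changes would have to be detected separately, for example by comparing the file list against the mapping.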
Note: Start with Tier 1 for rapid feedback. If those tests pass, the change has most likely not broken the directly affected code. Escalate to higher tiers only when approaching commit/push milestones.