AI UT Code Review

Overview

Review AI-generated unit tests for effectiveness, coverage, assertions, negative cases, determinism, and maintainability. Output a 0-10 score, a risk level, and a must-fix checklist. Overall line coverage must be >= 80%; otherwise risk is at least High.

When to Use

AI-generated UT/test code review or quality evaluation

Need scoring, risk level, or must-fix checklist

Questions about coverage or assertion validity

Workflow

Confirm tests target the intended business code and key paths.

Check overall line coverage (>= 80% required).

Inspect assertions for behavioral validity; flag missing/ineffective assertions.

Verify negative/edge cases and determinism (no env/time dependency).

Score by rubric, assign risk, list must-fix items with evidence.
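The determinism check in the workflow can be illustrated with a small sketch. It assumes a hypothetical function `is_expired` that accepts its clock as a parameter, so a test can pin the time instead of reading the real one; the names here are illustrative and not part of any reviewed codebase.

```python
from datetime import datetime, timezone
from typing import Callable


def _utc_now() -> datetime:
    """Default clock: the real current time in UTC."""
    return datetime.now(timezone.utc)


def is_expired(expires_at: datetime, now: Callable[[], datetime] = _utc_now) -> bool:
    """Hypothetical code under test: True once the clock reaches expires_at."""
    return now() >= expires_at


def test_is_expired_uses_fixed_clock() -> None:
    # The clock is injected, so the result never depends on when the test runs.
    fixed_now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

    def clock() -> datetime:
        return fixed_now

    past = datetime(2024, 1, 1, 11, 59, tzinfo=timezone.utc)
    future = datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc)
    assert is_expired(past, now=clock) is True
    assert is_expired(future, now=clock) is False
```

A test that called `datetime.now()` directly inside its assertions would be the kind of time dependency this step is meant to flag.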

Scoring (0-10)

Each dimension 0-2 points. Sum = total score.

Dimension | 0 points | 1 point | 2 points

Coverage | < 80% | 80%+ but shallow | 80%+ and meaningful

Assertion Quality | No/invalid assertions | Some weak assertions | Behavior-anchored assertions

Negative & Edge | Missing | Partial | Comprehensive

Data & Isolation | Flaky/env-dependent | Mixed | Deterministic, isolated

Maintainability | Hard to read/modify | Mixed quality | Clear structure & naming
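The arithmetic of the rubric can be sketched as follows. This is a minimal illustration, assuming a hypothetical `DimensionScores` container and a `coverage_points` helper; whether coverage counts as "meaningful" remains a reviewer judgment passed in as a flag.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DimensionScores:
    """One 0-2 score per rubric dimension (names are illustrative)."""
    coverage: int
    assertion_quality: int
    negative_edge: int
    data_isolation: int
    maintainability: int

    def __post_init__(self) -> None:
        # Each dimension is worth 0, 1, or 2 points.
        for field in fields(self):
            value = getattr(self, field.name)
            if value not in (0, 1, 2):
                raise ValueError(f"{field.name} must be 0, 1, or 2, got {value}")

    @property
    def total(self) -> int:
        """Sum of the five dimensions, so the range is 0-10."""
        return sum(getattr(self, field.name) for field in fields(self))


def coverage_points(line_coverage_pct: float, meaningful: bool) -> int:
    """Map the Coverage row: < 80% scores 0, otherwise 1 (shallow) or 2 (meaningful)."""
    if line_coverage_pct < 80.0:
        return 0
    return 2 if meaningful else 1
```

For instance, `DimensionScores(coverage_points(62.0, True), 0, 1, 2, 1).total` evaluates to 4, because coverage below 80% scores 0 regardless of how meaningful the covered tests are.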

Risk Levels

Blocker: Coverage < 80% AND key paths untested, or tests have no meaningful assertions

High: Coverage < 80% OR assertions largely ineffective

Medium: Coverage OK but weak edge cases or fragile design

Low: Minor improvements
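The risk levels above can be read as a top-down decision: check the Blocker condition first, then High, then Medium. The sketch below is one possible encoding, assuming hypothetical inputs (`key_paths_untested`, an `assertions` label of `"none"`, `"largely_ineffective"`, or `"ok"`, and a `weak_edges_or_fragile` flag) that the reviewer fills in by hand.

```python
from enum import IntEnum
from typing import Literal


class Risk(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    BLOCKER = 3


AssertionState = Literal["none", "largely_ineffective", "ok"]


def assign_risk(
    line_coverage_pct: float,
    key_paths_untested: bool,
    assertions: AssertionState,
    weak_edges_or_fragile: bool,
) -> Risk:
    """Return the most severe level whose condition holds (illustrative encoding)."""
    low_coverage = line_coverage_pct < 80.0
    # Blocker: low coverage with untested key paths, or no meaningful assertions.
    if (low_coverage and key_paths_untested) or assertions == "none":
        return Risk.BLOCKER
    # High: low coverage alone, or assertions that are largely ineffective.
    if low_coverage or assertions == "largely_ineffective":
        return Risk.HIGH
    # Medium: coverage is fine, but edge cases are weak or the design is fragile.
    if weak_edges_or_fragile:
        return Risk.MEDIUM
    return Risk.LOW
```

Checking from most to least severe guarantees that coverage below 80% can never produce a level lower than High.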

Must-Fix Checklist

Overall line coverage >= 80%

Each test has at least one behavior-relevant assertion

Negative/exception cases exist for core logic

Tests are deterministic and repeatable

AI-Generated Test Pitfalls (Check Explicitly)

No assertions or assertions unrelated to behavior (e.g., only not-null)

Over-mocking hides real behavior

Only happy-path coverage

Tests depend on time/network/env

Missing verification of side effects
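To make the first and third pitfalls concrete, the sketch below contrasts a not-null assertion with behavior-anchored ones. `parse_config` here is a hypothetical Python stand-in written only for illustration (a `key=value` line parser); it is not the implementation of any real project.

```python
def parse_config(text: str) -> dict[str, str]:
    """Hypothetical code under test: parse 'key=value' lines into a dict."""
    config: dict[str, str] = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {line_number}: expected key=value, got {line!r}")
        config[key.strip()] = value.strip()
    return config


def test_parse_config_weak() -> None:
    # Pitfall: passes for nearly any implementation, including one that returns {}.
    assert parse_config("port=8080") is not None


def test_parse_config_returns_parsed_pairs() -> None:
    # Behavior-anchored: pins the exact output, including whitespace handling.
    text = "host = example.org\nport=8080\n"
    assert parse_config(text) == {"host": "example.org", "port": "8080"}


def test_parse_config_rejects_malformed_line() -> None:
    # Negative case: malformed input must fail and report the offending line.
    try:
        parse_config("port 8080")
    except ValueError as exc:
        assert "line 1" in str(exc)
    else:
        raise AssertionError("expected ValueError for malformed input")
```

All three tests pass, yet only the last two would fail if the parsing logic were broken, which is the distinction the Assertion Quality dimension is scoring.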

Output Format (Required, Semi-fixed)

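Score: x/10 (Coverage x, Assertion Quality x, Negative & Edge x, Data & Isolation x, Maintainability x)

Risk: Low/Medium/High/Blocker, followed by a one-line reason

Must-fix:

[action + evidence]

[action + evidence]

Key Evidence: cite specific test case names or a coverage report summary (1-2 items)

Notes: minimal fix suggestion or alternative (1-2 lines)

Rules

Coverage < 80%: risk is at least High, and the coverage gap must be listed under Must-fix

Missing or ineffective assertions raise the risk level directly and must be listed under Must-fix

Provide at least 2 pieces of evidence; if evidence is insufficient, say so and lower the score

Common Mistakes

Reporting only coverage without evaluating whether the assertions are effective

Treating log output as an assertion

Ignoring failure paths and exception paths

Example (Concise)

Score: 4/10 (Coverage 0, Assertion 0, Negative 1, Data 2, Maintainability 1)

Risk: High

Must-fix:

Tests for parseConfig() contain no behavior assertions (only logs)

No negative cases for malformed input

Line coverage is 62%, below the 80% requirement

Key Evidence:

parseConfig() tests only assert no crash

Coverage report shows 62% lines

Notes: Add assertions on outputs and side effects; add invalid input tests.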
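The output fields can also be assembled programmatically. The sketch below is one possible rendering, assuming a hypothetical `render_report` helper; the `- ` bullet prefix and the ` - ` separator before the risk reason are layout choices made here, not requirements of the format.

```python
def render_report(
    scores: dict[str, int],
    risk: str,
    reason: str,
    must_fix: list[str],
    key_evidence: list[str],
    notes: str,
) -> str:
    """Render the review fields as plain text (illustrative layout)."""
    total = sum(scores.values())
    breakdown = ", ".join(f"{name} {points}" for name, points in scores.items())
    lines = [
        f"Score: {total}/10 ({breakdown})",
        f"Risk: {risk} - {reason}",
        "Must-fix:",
        *[f"- {item}" for item in must_fix],
        "Key Evidence:",
        *[f"- {item}" for item in key_evidence],
        f"Notes: {notes}",
    ]
    return "\n".join(lines)


report = render_report(
    scores={
        "Coverage": 0,
        "Assertion Quality": 0,
        "Negative & Edge": 1,
        "Data & Isolation": 2,
        "Maintainability": 1,
    },
    risk="High",
    reason="coverage below 80% and assertions only check for no crash",
    must_fix=[
        "Add behavior assertions to parseConfig() tests (currently only logs)",
        "Add negative cases for malformed input",
    ],
    key_evidence=[
        "parseConfig() tests only assert no crash",
        "Coverage report shows 62% lines",
    ],
    notes="Add assertions on outputs and side effects; add invalid input tests.",
)
print(report)
```

Deriving the total from the per-dimension scores, rather than typing it by hand, keeps the headline number consistent with the breakdown.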
