Reviews tests for authenticity and coverage. Validates tests are real validations, not stubs or mocks. Ensures TDD principles are followed. Used by the Orchestrator after the Coder completes tasks.
This skill inherits all available tools. When active, it can use any tool Claude has access to.
Review tests to ensure they are authentic validations, not superficial stubs or mocks. This skill validates that TDD principles are followed and tests provide real confidence in the implementation.
Purpose: Ensure tests actually test something meaningful and provide real validation.
Input:
Output:
Announce at start: "I'm using the test-reviewer skill to validate test authenticity and coverage."
The fundamental question: Does this test give us confidence the code works?
Real tests:
// REAL: Tests actual behavior
it('should hash password and verify correctly', async () => {
const password = 'securePassword123';
const hash = await hashPassword(password);
// Multiple meaningful assertions
expect(hash).not.toBe(password); // Hash is different
expect(hash.length).toBeGreaterThan(50); // Hash has expected length
expect(await verifyPassword(password, hash)).toBe(true); // Verification works
expect(await verifyPassword('wrong', hash)).toBe(false); // Wrong password fails
});
Fake tests:
// FAKE: Only checks it doesn't throw
it('should hash password', () => {
expect(() => hashPassword('test')).not.toThrow();
});
// FAKE: Mocks the thing being tested
it('should hash password', () => {
const mockHash = jest.fn().mockReturnValue('hashed');
expect(mockHash('password')).toBe('hashed'); // Tests the mock, not the code
});
// FAKE: Tautological
it('should return user', () => {
const user = { id: 1, name: 'Test' };
const result = user; // No actual code called
expect(result).toBe(user);
});
Verify tests are real validations:
Key question: If the implementation was completely wrong, would this test catch it?
Verify tests cover what matters:
Key question: What scenarios are NOT tested that should be?
Verify tests are well-organized:
Verify mocks are used appropriately:
Acceptable mocking:
Unacceptable mocking:
The code being tested
Core logic that should be validated
Everything (over-mocking)
Mocks are only for external dependencies
Core logic is tested with real implementations
Mocks verify interactions, not just return values
Integration tests use real implementations
Verify TDD principles were followed:
Signs of tests-after (bad):
Signs of TDD (good):
List all tests and what they claim to test:
For each test:
Compare tests to requirements:
Check for edge case coverage:
Deliver the review result.
## Test Review: APPROVED
**Task:** [Task name]
**Test files reviewed:** [List]
**Test count:** [N] tests
### Coverage Summary
| Requirement | Test(s) |
|-------------|---------|
| [Requirement 1] | `should do X`, `should handle Y error` |
| [Requirement 2] | `should validate Z` |
### Highlights
- [Good testing pattern worth noting]
- [Thorough edge case coverage in specific area]
### Minor Suggestions (optional, non-blocking)
- [Could add test for edge case X, but not blocking]
**Verdict:** Tests provide real validation. Ready for code review.
## Test Review: CHANGES_REQUESTED
**Task:** [Task name]
**Test files reviewed:** [List]
**Test count:** [N] tests
### Issues (must fix)
#### Issue 1: [Category] - [Brief description]
**Location:** `path/to/test.ts:15-25`
**Problem:**
[Description of why this test is insufficient]
**Current test:**
```typescript
[The problematic test]
Required change: [What the test should do instead]
Example fix:
[Corrected test]
[Same structure]
[N] test issues must be addressed. [M] coverage gaps must be filled.
Verdict: Return to Coder for test improvements.
## Issue Categories
| Category | Examples |
|----------|----------|
| **FAKE_TEST** | Only checks no-throw, tests mock not code, tautological |
| **MISSING_COVERAGE** | Requirement not tested, edge case not covered |
| **OVER_MOCKED** | Mocks core logic that should be tested |
| **POOR_ASSERTIONS** | Weak assertions that don't verify behavior |
| **STRUCTURE** | Unclear names, missing describe blocks, shared state |
## Common Fake Test Patterns
### 1. The No-Throw Test
```typescript
// BAD
it('should process data', () => {
expect(() => processData(input)).not.toThrow();
});
Problem: Only verifies code runs, not what it does.
Fix: Assert on the output.
// BAD
it('should save user', async () => {
const mockSave = jest.fn().mockResolvedValue({ id: 1 });
const result = await mockSave(user);
expect(mockSave).toHaveBeenCalledWith(user);
});
Problem: Tests the mock, not the actual save logic.
Fix: Test with real implementation or integration test.
// BAD
it('should return the input', () => {
const input = { foo: 'bar' };
const output = input; // No function called!
expect(output).toBe(input);
});
Problem: Proves nothing about the code.
Fix: Actually call the code being tested.
// BAD - mirrors implementation instead of testing behavior
it('should call helper then format then save', () => {
// This test will break if we refactor, even if behavior is correct
});
Problem: Tests how, not what. Fragile to refactoring.
Fix: Test outcomes and behavior, not internal steps.
// BAD
it('should update user', async () => {
await updateUser(1, { name: 'New' });
// No assertions!
});
Problem: No verification at all.
Fix: Assert on the result or side effects.
These always require changes: