Test what the code does (outputs, side effects), not how it does it (internal method calls, private state). Implementation-coupled tests break every time you refactor, even when behavior is unchanged — making tests a liability instead of a safety net.
Why this matters
Implementation-coupled tests assert on internal details: which private methods were called, what internal state was set, how many times a dependency was invoked. When you refactor the internals without changing behavior, these tests fail — even though the code is correct. This creates a perverse incentive to avoid refactoring because "the tests will break." Tests that verify behavior (given input X, expect output Y) survive refactoring and catch real bugs.
The purpose of tests is to give you confidence that your code works correctly. Implementation-coupled tests undermine this purpose:
Refactoring breaks tests: you rename an internal function or change how a result is computed, and 20 tests fail — even though the public API and behavior are identical. Developers spend hours updating tests that didn't catch any bugs.
False confidence: tests pass because the implementation matches the test's expectations, not because the behavior is correct. If you assert "mock was called 3 times" and the bug is in what happens with the result, the test passes while the bug ships.
Refactoring fear: teams stop improving code structure because the cost of updating implementation-coupled tests exceeds the benefit of the refactoring.
Behavior-focused tests verify inputs and outputs. They survive refactoring, catch real regressions, and give genuine confidence.
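The "false confidence" failure mode can be made concrete with a small sketch. Everything in it is hypothetical and invented for illustration (`convertBuggy`, `convertFixed`, `makeRecordingRates`, and the two check functions); it uses a hand-rolled recording fake rather than a test framework so it runs on its own. The buggy and the fixed implementation talk to their dependency in exactly the same way, so a check that only inspects the interaction cannot tell them apart, while a check on the returned value can.

```typescript
// Hypothetical example: all names here are invented for illustration.
type RateSource = { getRate(currency: string): number };
type Convert = (amount: number, currency: string, rates: RateSource) => number;

// Buggy: asks the dependency for the right rate, then divides instead of multiplying.
const convertBuggy: Convert = (amount, currency, rates) =>
  amount / rates.getRate(currency);

// Correct: same interaction with the dependency, correct arithmetic.
const convertFixed: Convert = (amount, currency, rates) =>
  amount * rates.getRate(currency);

// Hand-rolled recording fake, so the sketch needs no test framework.
function makeRecordingRates(rate: number) {
  const calls: string[] = [];
  return {
    calls,
    getRate(currency: string): number {
      calls.push(currency);
      return rate;
    },
  };
}

// Implementation-coupled check: only inspects HOW the dependency was used.
function interactionCheck(convert: Convert): boolean {
  const rates = makeRecordingRates(2);
  convert(10, "EUR", rates);
  return rates.calls.length === 1 && rates.calls[0] === "EUR";
}

// Behavior check: inspects WHAT comes out for a given input.
function behaviorCheck(convert: Convert): boolean {
  return convert(10, "EUR", makeRecordingRates(2)) === 20;
}

console.log(interactionCheck(convertBuggy)); // true: the check passes while the bug ships
console.log(behaviorCheck(convertBuggy)); // false: the wrong result is caught
console.log(behaviorCheck(convertFixed)); // true
```

The interaction check would also start failing if a later refactor fetched the rate differently (for example, in a batch), even though the converted amount stayed the same.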
The rule
Test the public API of your code: given specific inputs, assert on specific outputs or observable side effects. Don't assert on internal state, private method calls, or the number of times a dependency was invoked — unless the invocation count is the behavior being tested.
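One case where the exception applies is caching: "the expensive loader runs at most once per key" is the contract, so counting invocations is a behavior assertion rather than an implementation detail. The sketch below assumes a hypothetical synchronous `cached` helper and a counter in place of a mock library; a real cache would likely be asynchronous and handle eviction.

```typescript
// Hypothetical helper: memoizes a loader so each key is loaded at most once.
function cached<T>(load: (key: string) => T): (key: string) => T {
  const cache = new Map<string, T>();
  return (key: string): T => {
    if (!cache.has(key)) {
      cache.set(key, load(key));
    }
    return cache.get(key) as T;
  };
}

// Count loader invocations: here the count is the observable contract of `cached`.
let loadCount = 0;
const getProfile = cached((id: string) => {
  loadCount += 1;
  return { id, name: `user-${id}` };
});

getProfile("1");
getProfile("1"); // served from the cache
getProfile("2");

console.log(loadCount); // 2: one load per distinct key
```

The assertion is still about what callers can observe (how often the expensive work happens), not about which private methods `cached` uses to get there.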
Bad example
// BAD: testing implementation details
describe("UserService", () => {
  it("creates a user", async () => {
    const mockRepo = {
      save: vi.fn().mockResolvedValue({ id: "1", name: "Alice" }),
      findByEmail: vi.fn().mockResolvedValue(null),
    };
    const mockHasher = {
      hash: vi.fn().mockResolvedValue("hashed-password"),
    };
    const service = new UserService(mockRepo, mockHasher);

    await service.createUser({ name: "Alice", email: "a@b.com", password: "pass" });

    // Testing HOW it works, not WHAT it does
    expect(mockRepo.findByEmail).toHaveBeenCalledWith("a@b.com");
    expect(mockHasher.hash).toHaveBeenCalledWith("pass");
    expect(mockRepo.save).toHaveBeenCalledTimes(1);
    expect(mockRepo.save).toHaveBeenCalledWith({
      name: "Alice",
      email: "a@b.com",
      passwordHash: "hashed-password",
    });
  });
});
Good example
// GOOD: testing behavior (what goes in and what comes out)
describe("UserService", () => {
  it("creates a user and returns their profile", async () => {
    const service = createTestUserService(); // uses real or in-memory implementations

    const user = await service.createUser({
      name: "Alice",
      email: "a@b.com",
      password: "pass",
    });

    // Testing WHAT it does
    expect(user.name).toBe("Alice");
    expect(user.email).toBe("a@b.com");
    expect(user.id).toBeDefined();
  });

  it("rejects duplicate email addresses", async () => {
    const service = createTestUserService();
    await service.createUser({ name: "Alice", email: "a@b.com", password: "pass" });

    // Testing BEHAVIOR: the observable outcome
    await expect(
      service.createUser({ name: "Bob", email: "a@b.com", password: "pass" })
    ).rejects.toThrow("Email already registered");
  });

  it("does not store the password in plain text", async () => {
    const service = createTestUserService();
    const user = await service.createUser({
      name: "Alice",
      email: "a@b.com",
      password: "pass",
    });

    // This is behavior, not implementation: we care about the security property
    const stored = await testDb.users.findById(user.id);
    expect(stored.passwordHash).not.toBe("pass");
  });
});
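The good example relies on `createTestUserService()` and `testDb`, neither of which it defines. One way they might look is sketched below, using an in-memory fake repository instead of mocks. All of it is an assumption: the interfaces are inferred from the calls in the examples, `InMemoryUserRepo` and `fakeHasher` are invented names, and the `UserService` class is only a stand-in so the sketch runs on its own (in a real suite you would import the actual class).

```typescript
// Hypothetical shapes, inferred from the examples; the real ones may differ.
interface StoredUser { id: string; name: string; email: string; passwordHash: string }
type NewUser = Omit<StoredUser, "id">;
interface UserRepo {
  save(user: NewUser): Promise<StoredUser>;
  findByEmail(email: string): Promise<StoredUser | null>;
  findById(id: string): Promise<StoredUser | null>;
}
interface Hasher { hash(plain: string): Promise<string> }

// In-memory fake: real storage logic, no database and no mock expectations.
class InMemoryUserRepo implements UserRepo {
  private users: StoredUser[] = [];
  async save(user: NewUser): Promise<StoredUser> {
    const stored: StoredUser = { id: String(this.users.length + 1), ...user };
    this.users.push(stored);
    return stored;
  }
  async findByEmail(email: string): Promise<StoredUser | null> {
    return this.users.find((u) => u.email === email) ?? null;
  }
  async findById(id: string): Promise<StoredUser | null> {
    return this.users.find((u) => u.id === id) ?? null;
  }
}

// Deterministic stand-in for a real password hasher. NOT secure; tests only.
const fakeHasher: Hasher = {
  hash: async (plain: string): Promise<string> => `hashed:${plain}`,
};

// Stand-in for the service under test (assumption, so the sketch is runnable).
class UserService {
  private repo: UserRepo;
  private hasher: Hasher;
  constructor(repo: UserRepo, hasher: Hasher) {
    this.repo = repo;
    this.hasher = hasher;
  }
  async createUser(input: { name: string; email: string; password: string }) {
    if (await this.repo.findByEmail(input.email)) {
      throw new Error("Email already registered");
    }
    const passwordHash = await this.hasher.hash(input.password);
    const saved = await this.repo.save({ name: input.name, email: input.email, passwordHash });
    return { id: saved.id, name: saved.name, email: saved.email };
  }
}

// Shared handle so tests can inspect stored state through the repo's public API.
const testDb = { users: new InMemoryUserRepo() };

function createTestUserService(): UserService {
  testDb.users = new InMemoryUserRepo(); // fresh store per call keeps tests independent
  return new UserService(testDb.users, fakeHasher);
}
```

Because the fake implements the same interface as a real repository, the service can later change how it queries or saves users without any of the tests in the good example needing to change.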