
Test-Driven Development with AI

Write failing tests first, then let Claude Code, Copilot, or Cursor implement — the AI TDD feedback loop

Read time: 12 min



TDD and AI agents are a natural pair. You write the spec (the test); the agent writes the implementation. The test suite becomes the ground truth that keeps the agent from hallucinating.

Why TDD Works Better with AI

Traditional AI code generation has a key failure mode: the agent produces code that looks right but doesn't actually work. Tests expose this immediately. When you adopt a TDD workflow:

  • The test file is the unambiguous specification — no prose misinterpretation
  • Each red→green cycle gives the agent a tight feedback loop
  • The agent can run the tests itself and self-correct without your involvement
  • Regressions are caught before you ever review the output
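As a minimal illustration of the test-as-spec idea, consider a hypothetical `slugify` function (not from the sections below): the assertion is written first and acts as the contract, and the body is one implementation an agent might produce to turn it green.

```typescript
// Hypothetical spec, written before any implementation exists:
//   slugify('Hello, World!') === 'hello-world'
// The body below is one implementation an agent might write to satisfy it.
function slugify(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of non-alphanumerics into '-'
    .replace(/^-|-$/g, '');      // strip leading/trailing dashes
}
```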

The Core Loop

You write: failing test
     ↓
Agent runs: npm test → sees RED
     ↓
Agent writes: minimal implementation
     ↓
Agent runs: npm test → sees GREEN
     ↓
Agent refactors (optional)
     ↓
Repeat for next test
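The loop above can be sketched as a capped retry script. The `run_tests` function here is a stand-in that simulates a suite failing twice and then passing; in a real session it would be `npm test`, and the "agent edits code" step is where the model works.

```shell
# Stand-in for `npm test`: fails twice, then passes (simulates red -> green).
fails_left=2
run_tests() {
  if [ "$fails_left" -gt 0 ]; then
    fails_left=$((fails_left - 1))
    return 1   # RED
  fi
  return 0     # GREEN
}

attempt=0
until run_tests; do
  attempt=$((attempt + 1))
  echo "RED (attempt $attempt): agent edits code, re-runs tests"
done
echo "GREEN after $attempt red iterations"
```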

Claude Code TDD

Setup: Give Claude the Test Harness

# Describe the loop explicitly in your prompt
claude "I want to use TDD. I'll write failing tests in __tests__/. 
You should run npm test after each change and keep iterating until all tests pass. 
Never skip a test or mark it as skip/todo to make it pass."

YOLO Mode for Autonomous TDD

Enable non-interactive mode so Claude runs tests without asking permission:

claude --dangerously-skip-permissions "Implement the UserService class to pass all tests in __tests__/user.test.ts"

Or allowlist the commands in .claude/settings.json:

{
  "permissions": {
    "allow": ["Bash(npm test)", "Bash(npm run test:*)", "Edit(**/src/**)"]
  }
}

Example: Feature Implementation via TDD

# 1. Write your test first
cat > __tests__/auth.test.ts << 'EOF'
describe('hashPassword', () => {
  it('returns a bcrypt hash', async () => {
    const hash = await hashPassword('mypassword');
    expect(hash).toMatch(/^\$2[ab]\$/);
  });
 
  it('different calls produce different hashes', async () => {
    const h1 = await hashPassword('same');
    const h2 = await hashPassword('same');
    expect(h1).not.toBe(h2);
  });
});
EOF
 
# 2. Hand off to Claude
claude "Implement hashPassword in src/lib/auth.ts to pass all tests in __tests__/auth.test.ts. 
Run npm test to verify. Use bcryptjs."

Handling Test Failures Mid-Session

If Claude gets stuck in a loop, give it a constraint:

The test has been failing for 3 attempts. Stop and explain what's blocking you 
before trying again.

GitHub Copilot TDD

Inline Suggestions from Tests

Copilot draws context from your open editor tabs: write the test file first, then open the implementation file side by side. The ghost-text suggestions will be shaped by what Copilot sees in the tests.

// auth.test.ts — write this first
import { hashPassword, verifyPassword } from './auth';
 
test('hashPassword produces bcrypt hash', async () => {
  const hash = await hashPassword('secret');
  expect(hash).toMatch(/^\$2[ab]\$/);
});
 
// auth.ts — open this file, Copilot suggests the implementation
export async function hashPassword(password: string): Promise<string> {
  // Copilot auto-completes here based on test context
}

Agent Mode TDD

In Copilot Chat's agent mode (Ctrl+Shift+I):

@workspace I've written failing tests in src/__tests__/cart.test.ts.
Implement the CartService class in src/services/cart.ts to make all tests pass.
Run the tests to verify before finishing.

Copilot agent will use the #runCommand tool to execute npm test and iterate.

Cursor TDD

YOLO + Hooks Pattern

Cursor's YOLO mode (auto-run terminal commands) combined with a hook makes TDD fully autonomous:

// .cursor/hooks/post-edit.json
{
  "event": "PostEdit",
  "command": "npm test -- --passWithNoTests 2>&1 | tail -20"
}

After every file edit, the test output is injected into Cursor's context automatically.

Cursor Composer TDD

Write a failing test for a `parseCSV(input: string): Row[]` function, 
then implement it. Use the test output to guide your implementation. 
All edge cases (empty string, quoted commas, newlines in fields) 
must be covered.
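For reference, here is a sketch of what a green implementation of that prompt might look like, assuming `Row` is `string[]` (the prompt leaves the type open). It is one possible hand-rolled state-machine parser, not Cursor's actual output, covering the edge cases the prompt names: empty string, quoted commas, escaped quotes (`""`), and newlines inside quoted fields.

```typescript
type Row = string[]; // assumed shape; the prompt does not pin this down

function parseCSV(input: string): Row[] {
  if (input === '') return []; // edge case: empty string means no rows
  const rows: Row[] = [];
  let row: Row = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < input.length; i++) {
    const ch = input[i];
    if (inQuotes) {
      if (ch === '"') {
        if (input[i + 1] === '"') { field += '"'; i++; } // escaped quote ""
        else inQuotes = false;                           // closing quote
      } else field += ch;                                // commas/newlines kept
    } else if (ch === '"') inQuotes = true;
    else if (ch === ',') { row.push(field); field = ''; }
    else if (ch === '\n') { row.push(field); rows.push(row); row = []; field = ''; }
    else if (ch !== '\r') field += ch;                   // tolerate CRLF input
  }
  // Flush the final field/row unless the input ended on a newline.
  if (field !== '' || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}
```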

Jest Configuration for AI TDD

// jest.config.js — optimized for AI feedback loops
export default {
  testEnvironment: 'node',
  // Short output — AI reads terminal output, keep it concise
  verbose: false,
  // Fail fast so agent doesn't spin through 50 failing tests
  bail: 3,
  // Clear mocks between tests to avoid state bleed
  clearMocks: true,
  // Coverage thresholds to enforce quality
  coverageThreshold: {
    global: { branches: 80, functions: 90, lines: 90 },
  },
};

Antipatterns to Avoid

| Antipattern | Why it breaks AI TDD |
| --- | --- |
| Writing tests after implementation | Removes the specification value; the agent writes tests to match its own code |
| Letting the agent mark tests as `.skip` | The agent "cheats" to make the suite pass |
| Huge test files | The agent loses context; split into focused files per module |
| No `bail` config | The agent runs 200 failing tests on every iteration and burns tokens |
| Testing implementation details | Tests break on refactor; prefer behavior-based assertions |
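The last row deserves a concrete example. Assuming a hypothetical `formatPrice` helper (not from this article), a behavior-based assertion checks only the observable output, so the internals can be rewritten freely:

```typescript
// Hypothetical helper under test.
function formatPrice(cents: number): string {
  // The internals (string template + toFixed) could later be swapped for
  // Intl.NumberFormat without changing observable behavior.
  return `$${(cents / 100).toFixed(2)}`;
}

// Behavior-based assertion: still passes after such a rewrite.
const behaviorOk = formatPrice(1999) === '$19.99';

// Implementation-detail assertion (avoid): spying on toFixed, or asserting
// that it was called, couples the test to one particular internal. A refactor
// to Intl.NumberFormat would fail that test even though the output is
// identical, which sends the agent off fixing a "regression" that isn't one.
```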

Checklist