Testing Philosophy
Test-driven development is non-negotiable in Carbon Connect. Every requirement must be captured as an automated test before writing production code.
TDD Requirement
The project enforces TDD through both process and automation:
- Write tests first -- Capture every requirement as a failing test
- Demonstrate red-green-refactor -- Show the failing test, make it pass, then refactor
- PR enforcement -- The `PR TDD Gate` GitHub Action blocks merge if TDD evidence is missing
- PR template -- Every PR description must include a `## TDD Proof` section with checked items
PR TDD Checklist
Every pull request must confirm:
- Tests written before production code
- Evidence of failing test run included
- All tests passing locally
- Docs/configs updated
- No secrets introduced
Real Integration Over Mocking
Core Principle
Prefer real database connections over mocked database operations. Only mock external services.
What to Test Against Real Infrastructure
- Database operations -- Use `pytest-asyncio` fixtures with actual PostgreSQL connections
- SQLAlchemy queries -- Use `async_sessionmaker` for proper async transaction management
- Business logic -- Test complete service methods with real database records
- API endpoints -- Use `httpx.AsyncClient` with the real FastAPI app (see the sketch after this list)
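A minimal sketch of the endpoint style, assuming the FastAPI app is importable as `backend.main.app` and exposes a `/health` route (both names are assumptions, not the project's actual ones):

import httpx
import pytest

from backend.main import app  # hypothetical import path

@pytest.mark.asyncio
async def test_health_endpoint_should_return_ok():
    # Route requests straight to the ASGI app in-process; no real network involved
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        response = await client.get("/health")
    assert response.status_code == 200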
What to Mock
- External APIs -- Claude, Climatiq, CORDIS, EU Portal, Innovate UK, etc.
- Network calls -- Any HTTP request to external services
- Time/datetime -- For deterministic deadline and recency tests (see the sketch after this list)
- File system -- Only when testing file I/O error paths
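A minimal sketch of deterministic time, assuming the code under test reads the clock through a `utcnow()` helper in a hypothetical `backend.services.deadlines` module (the module and its `is_open()`/`utcnow()` helpers are assumptions); patch the name where it is looked up, not where it is defined:

from datetime import datetime, timezone
from unittest.mock import patch

from backend.services import deadlines  # hypothetical module

def test_deadline_should_be_open_before_cutoff():
    fixed_now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    # Every call to deadlines.utcnow() inside the code under test now returns
    # fixed_now, so the assertion cannot drift with wall-clock time
    with patch("backend.services.deadlines.utcnow", return_value=fixed_now):
        assert deadlines.is_open(closes_at=datetime(2025, 6, 1, tzinfo=timezone.utc))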
Mock Rules
Rule 1: Never Mock Internal Services
# WRONG: Mocking internal database operations
mock_db = MagicMock()
mock_db.execute = AsyncMock(return_value=mock_result)

# CORRECT: Use a real database session from fixtures
async def test_create_company(db_session):
    service = CompanyService(db_session)
    company = await service.create(company_data)
    assert company.name == "Test Corp"
Rule 2: Mock External Clients Completely
When mocking external APIs, set explicit values for every attribute that will be checked:
# CORRECT: All attributes explicitly set
mock_client = AsyncMock()
mock_response = MagicMock()
mock_response.text = "Generated content"
mock_response.prompt_feedback = None # Prevent MagicMock truthy values
mock_response.total_tokens = 500 # Explicit int, not MagicMock
mock_response.model = "claude-sonnet-4-20250514" # Explicit string
mock_client.generate = AsyncMock(return_value=mock_response)
# WRONG: Partial mocking (attributes default to MagicMock objects)
mock_response = MagicMock() # All attributes are MagicMock!
mock_response.text = "Content" # Only text is set
# mock_response.total_tokens is MagicMock (truthy, not an int)
# mock_response.prompt_feedback is MagicMock (truthy, triggers errors)
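A stricter variant of this rule, if it suits the codebase, is to build mocks with `spec=` or `unittest.mock.create_autospec`: attribute access that the real response class does not define then raises AttributeError instead of silently returning another truthy MagicMock.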
Rule 3: Use `side_effect` for Error Simulation
# Simulate API errors
mock_client.generate = AsyncMock(
    side_effect=anthropic.APIError("Rate limited")
)

# Verify error wrapping
with pytest.raises(ServiceError) as exc_info:
    await service.generate_content(prompt)
assert "Rate limited" in str(exc_info.value)
The MagicMock Truthy Trap
Common Pitfall
MagicMock attributes are always truthy. This causes subtle bugs in code that checks for None or falsy values.
Problem
mock = MagicMock()
# These are all True, even though you might expect False/None
bool(mock.prompt_feedback) # True (it's a MagicMock object)
bool(mock.prompt_feedback.block_reason) # True
mock.total_tokens > 0 # Unpredictable comparison
Solution
Always set explicit values for attributes that will be evaluated:
mock = MagicMock()
mock.prompt_feedback = None # Explicitly None
mock.total_tokens = 500 # Explicit integer
mock.block_reason = None # Explicitly None
Test Isolation
Each test gets its own database transaction that is automatically rolled back:
@pytest_asyncio.fixture(scope="function")
async def db_session():
    """Provide a database session with automatic rollback."""
    async with async_engine.connect() as connection:
        async with connection.begin() as transaction:
            session = AsyncSession(bind=connection)
            yield session
            await transaction.rollback()
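Because the session is bound to a connection-level transaction that is rolled back in teardown, everything a test writes is discarded when the test finishes, so no test can observe another test's data.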
Rules for Isolation
- Use `pytest_asyncio.fixture(scope="function")` for database fixtures
- Clean up test data in fixture teardown, not in individual tests (see the sketch after this list)
- Never share mutable state between tests
- Each test should create its own test data
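A minimal sketch of teardown-based cleanup, assuming a hypothetical `Company` model; with the rollback fixture above the explicit delete is usually redundant, but the pattern is what matters for any data that outlives the test transaction:

import pytest_asyncio

from backend.models import Company  # hypothetical import path

@pytest_asyncio.fixture(scope="function")
async def seeded_company(db_session):
    company = Company(name="Test Corp")
    db_session.add(company)
    await db_session.flush()
    yield company
    # Teardown: cleanup lives here, never in the individual tests
    await db_session.delete(company)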
Naming Convention
Test names follow the pattern `test_<component>_should_<expected_behavior>`.
Examples:
def test_matching_engine_should_return_zero_for_country_mismatch():
    ...

def test_grant_search_should_filter_by_carbon_categories():
    ...

def test_auth_login_should_reject_invalid_password():
    ...
Running Tests
# Run all backend tests
poetry run pytest tests/ -v
# Run specific test file
poetry run pytest tests/unit/services/test_matching_engine.py -v
# Run with coverage
poetry run pytest --cov=backend --cov-report=html
# Run only unit tests (exclude e2e)
poetry run pytest -m "not e2e"
# Stop after the first failure
poetry run pytest --maxfail=1
# Run tests matching a pattern
poetry run pytest -k "test_carbon" -v