Testing Philosophy
Test-driven development is non-negotiable in Carbon Connect. Every requirement must be captured as an automated test before writing production code.
TDD Requirement
The project enforces TDD through both process and automation:
- Write tests first -- Capture every requirement as a failing test
- Demonstrate red-green-refactor -- Show the failing test, make it pass, then refactor
- PR enforcement -- The `PR TDD Gate` GitHub Action blocks merge if TDD evidence is missing
- PR template -- Every PR description must include a `## TDD Proof` section with checked items
PR TDD Checklist
Every pull request must confirm:
- Tests written before production code
- Evidence of failing test run included
- All tests passing locally
- Docs/configs updated
- No secrets introduced
Real Integration Over Mocking
Core Principle
Prefer real database connections over mocked database operations. Only mock external services.
What to Test Against Real Infrastructure
- Database operations -- Use `pytest-asyncio` fixtures with actual PostgreSQL connections
- SQLAlchemy queries -- Use `async_sessionmaker` for proper async transaction management
- Business logic -- Test complete service methods with real database records
- API endpoints -- Use `httpx.AsyncClient` with the real FastAPI app (see the sketch after this list)
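A minimal sketch of the endpoint style, assuming the FastAPI app is importable as `backend.main.app` and exposes a `/health` route (both names are assumptions, not the project's actual ones):

import httpx
import pytest

from backend.main import app  # hypothetical import path

@pytest.mark.asyncio
async def test_health_endpoint_should_return_ok():
    # Route requests straight to the ASGI app in-process; no real network involved
    transport = httpx.ASGITransport(app=app)
    async with httpx.AsyncClient(transport=transport, base_url="http://test") as client:
        response = await client.get("/health")
    assert response.status_code == 200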
What to Mock
- External APIs -- Claude, Climatiq, CORDIS, EU Portal, Innovate UK, etc.
- Network calls -- Any HTTP request to external services
- Time/datetime -- For deterministic deadline and recency tests (see the sketch after this list)
- File system -- Only when testing file I/O error paths
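A minimal sketch of deterministic time, assuming the code under test reads the clock through a `utcnow()` helper in a hypothetical `backend.services.deadlines` module (the module and its `is_open()`/`utcnow()` helpers are assumptions); patch the name where it is looked up, not where it is defined:

from datetime import datetime, timezone
from unittest.mock import patch

from backend.services import deadlines  # hypothetical module

def test_deadline_should_be_open_before_cutoff():
    fixed_now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    # Every call to deadlines.utcnow() inside the code under test now returns
    # fixed_now, so the assertion cannot drift with wall-clock time
    with patch("backend.services.deadlines.utcnow", return_value=fixed_now):
        assert deadlines.is_open(closes_at=datetime(2025, 6, 1, tzinfo=timezone.utc))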
Mock Rules
Rule 1: Never Mock Internal Services
# WRONG: Mocking internal database operations
mock_db = MagicMock()
mock_db.execute = AsyncMock(return_value=mock_result)

# CORRECT: Use a real database session from fixtures
async def test_create_company(db_session):
    service = CompanyService(db_session)
    company = await service.create(company_data)
    assert company.name == "Test Corp"
Rule 2: Mock External Clients Completely
When mocking external APIs, set explicit values for every attribute that will be checked:
# CORRECT: All attributes explicitly set
mock_client = AsyncMock()
mock_response = MagicMock()
mock_response.text = "Generated content"
mock_response.prompt_feedback = None # Prevent MagicMock truthy values
mock_response.total_tokens = 500 # Explicit int, not MagicMock
mock_response.model = "claude-sonnet-4-20250514" # Explicit string
mock_client.generate = AsyncMock(return_value=mock_response)
# WRONG: Partial mocking (attributes default to MagicMock objects)
mock_response = MagicMock() # All attributes are MagicMock!
mock_response.text = "Content" # Only text is set
# mock_response.total_tokens is MagicMock (truthy, not an int)
# mock_response.prompt_feedback is MagicMock (truthy, triggers errors)
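A stricter variant of this rule, if it suits the codebase, is to build mocks with `spec=` or `unittest.mock.create_autospec`: attribute access that the real response class does not define then raises AttributeError instead of silently returning another truthy MagicMock.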
Rule 3: Use `side_effect` for Error Simulation
# Simulate API errors
mock_client.generate = AsyncMock(
    side_effect=anthropic.APIError("Rate limited")
)

# Verify error wrapping
with pytest.raises(ServiceError) as exc_info:
    await service.generate_content(prompt)
assert "Rate limited" in str(exc_info.value)
The MagicMock Truthy Trap
Common Pitfall
MagicMock attributes are always truthy. This causes subtle bugs in code that checks for None or falsy values.
Problem
mock = MagicMock()
# These are all True, even though you might expect False/None
bool(mock.prompt_feedback) # True (it's a MagicMock object)
bool(mock.prompt_feedback.block_reason) # True
mock.total_tokens > 0 # Unpredictable comparison
Solution
Always set explicit values for attributes that will be evaluated:
mock = MagicMock()
mock.prompt_feedback = None # Explicitly None
mock.total_tokens = 500 # Explicit integer
mock.block_reason = None # Explicitly None
Test Isolation
Each test gets its own database transaction that is automatically rolled back:
@pytest_asyncio.fixture(scope="function")
async def db_session():
    """Provide a database session with automatic rollback."""
    async with async_engine.connect() as connection:
        async with connection.begin() as transaction:
            session = AsyncSession(bind=connection)
            yield session
            await transaction.rollback()
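Because the session is bound to a connection-level transaction that is rolled back in teardown, everything a test writes is discarded when the test finishes, so no test can observe another test's data.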
Rules for Isolation
- Use `pytest_asyncio.fixture(scope="function")` for database fixtures
- Clean up test data in fixture teardown, not in individual tests (see the sketch after this list)
- Never share mutable state between tests
- Each test should create its own test data
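A minimal sketch of teardown-based cleanup, assuming a hypothetical `Company` model; with the rollback fixture above the explicit delete is usually redundant, but the pattern is what matters for any data that outlives the test transaction:

import pytest_asyncio

from backend.models import Company  # hypothetical import path

@pytest_asyncio.fixture(scope="function")
async def seeded_company(db_session):
    company = Company(name="Test Corp")
    db_session.add(company)
    await db_session.flush()
    yield company
    # Teardown: cleanup lives here, never in the individual tests
    await db_session.delete(company)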
Naming Convention
Test names follow the pattern `test_<component>_should_<expected_behavior>`.
Examples:
def test_matching_engine_should_return_zero_for_country_mismatch():
    ...

def test_grant_search_should_filter_by_carbon_categories():
    ...

def test_auth_login_should_reject_invalid_password():
    ...
Running Tests
# Run all backend tests
poetry run pytest tests/ -v
# Run specific test file
poetry run pytest tests/unit/services/test_matching_engine.py -v
# Run with coverage
poetry run pytest --cov=backend --cov-report=html
# Run only unit tests (exclude e2e)
poetry run pytest -m "not e2e"
# Stop after the first failure
poetry run pytest --maxfail=1
# Run tests matching a pattern
poetry run pytest -k "test_carbon" -v