Skip to content

Multi-Tenancy

Carbon Connect implements a pool model multi-tenancy architecture where all tenants share a single PostgreSQL database, with data isolation enforced through Row-Level Security (RLS) policies and application-level middleware. This approach balances operational simplicity with strong data isolation guarantees.


Architecture Overview

graph TB
    subgraph Request Flow
        REQ["Incoming Request<br/>(Bearer JWT)"]
        AUTH["JWT Authentication<br/>Decode token, validate user"]
        TENANT["Tenant Validation<br/>Verify tenant_id, check active"]
        HANDLER["Route Handler<br/>Business logic"]
    end

    subgraph Data Layer
        DB["PostgreSQL<br/>Single Database"]
        RLS["Row-Level Security<br/>tenant_id filtering"]
        S3P["S3 Prefix Isolation<br/>s3://bucket/tenant-{id}/"]
    end

    REQ --> AUTH
    AUTH --> TENANT
    TENANT --> HANDLER
    HANDLER --> DB
    DB --> RLS
    HANDLER --> S3P

Tenancy Model

Pool Model

All tenants share a single grant_engine database. Data isolation is enforced at two levels:

  1. Application Level: FastAPI middleware injects the authenticated user's tenant_id into every database query.
  2. Database Level: PostgreSQL Row-Level Security policies filter rows by tenant_id.

This approach was chosen over per-tenant schemas or per-tenant databases because:

  • Simpler operations: Single database to backup, migrate, and monitor
  • Cost-effective: No database-per-tenant overhead
  • Easier migrations: Alembic runs once against a single schema
  • Cross-tenant analytics: Platform-level aggregations are straightforward (when needed)

Tenant-Scoped vs Shared Tables

Scope Tables Isolation
Tenant-scoped users, companies, matches, applications, documents, notifications Filtered by tenant_id on every query
Shared grants, alembic_version Accessible to all tenants (grant data is global)

Grants are Shared

The grants table is intentionally not tenant-scoped. Grants are public funding opportunities scraped from external sources and shared across all tenants. Only matches and applications (which link companies to grants) are tenant-isolated.


Tenant Model

The Tenant model (in backend/app/models/database/tenant.py) represents an organization:

class Tenant(BaseModel):
    __tablename__ = "tenants"

    name: Mapped[str] = mapped_column(String(255), nullable=False)
    slug: Mapped[str] = mapped_column(String(100), unique=True, nullable=False, index=True)
    settings: Mapped[dict] = mapped_column(JSONB, default=dict)
    subscription_tier: Mapped[str] = mapped_column(
        String(50), default="explorer", nullable=False,
    )  # explorer, professional, enterprise
    is_active: Mapped[bool] = mapped_column(Boolean, default=True, index=True)

Subscription Tiers

Tier Description Features
explorer Free tier Basic matching, limited API calls
professional Paid tier Full matching, carbon profile, email notifications
enterprise Custom tier All features, partner API, white-label options

Middleware Implementation

Tenant isolation is enforced through FastAPI dependency injection in backend/app/api/deps.py. Every authenticated endpoint automatically receives the validated tenant context.

Authentication Chain

The authentication chain validates the JWT token, retrieves the user, and verifies the tenant in a single dependency chain:

async def get_current_user(
    token: Annotated[str, Depends(oauth2_scheme)],
    db: Annotated[AsyncSession, Depends(get_db)],
) -> User:
    """Get the current authenticated user from the JWT token."""
    # 1. Check token blacklist
    if is_token_blacklisted(token):
        raise HTTPException(status_code=401, detail="Could not validate credentials")

    # 2. Decode and validate JWT
    token_data = decode_token(token)
    if token_data is None or token_data.type != "access":
        raise HTTPException(status_code=401, detail="Could not validate credentials")

    # 3. Fetch user from database
    user = await get_user_by_email(db, email=token_data.sub)
    if user is None or not user.is_active:
        raise HTTPException(status_code=401, detail="User is inactive")

    # 4. Validate tenant is active
    result = await db.execute(select(Tenant).where(Tenant.id == user.tenant_id))
    tenant = result.scalar_one_or_none()
    if tenant is None or not tenant.is_active:
        raise HTTPException(status_code=403, detail="Tenant is inactive or does not exist")

    return user

Tenant Context Dependency

A separate dependency retrieves the full tenant object for endpoints that need it:

async def get_current_tenant(
    current_user: Annotated[User, Depends(get_current_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
) -> Tenant:
    """Get the current tenant for the authenticated user."""
    result = await db.execute(select(Tenant).where(Tenant.id == current_user.tenant_id))
    tenant = result.scalar_one_or_none()

    if tenant is None:
        raise HTTPException(status_code=403, detail="Tenant not found")
    if not tenant.is_active:
        raise HTTPException(status_code=403, detail="Tenant is inactive")

    return tenant

Dependency Type Aliases

Clean type aliases simplify endpoint signatures:

DB = Annotated[AsyncSession, Depends(get_db)]
CurrentUser = Annotated[User, Depends(get_current_user)]
CurrentTenant = Annotated[Tenant, Depends(get_current_tenant)]
OptionalUser = Annotated[User | None, Depends(get_optional_user)]
AdminUser = Annotated[User, Depends(require_admin)]

Usage in Endpoints

Every tenant-scoped endpoint filters queries by tenant_id from the authenticated user:

@router.get("/companies")
async def list_companies(
    db: DB,
    current_user: CurrentUser,
    skip: int = 0,
    limit: int = 20,
):
    """List companies for the current tenant."""
    query = (
        select(Company)
        .where(Company.tenant_id == current_user.tenant_id)  # Tenant isolation
        .offset(skip)
        .limit(limit)
    )
    result = await db.execute(query)
    return result.scalars().all()

Cross-Tenant Prevention

Multiple safeguards prevent cross-tenant data access:

1. Application-Level Filtering

Every database query for tenant-scoped data includes a WHERE tenant_id = ? clause. The tenant_id is always derived from the authenticated user's JWT token -- it is never accepted as a request parameter.

2. Inactive Tenant Rejection

Requests from users belonging to inactive tenants are rejected with HTTP 403:

if tenant is None or not tenant.is_active:
    raise HTTPException(
        status_code=status.HTTP_403_FORBIDDEN,
        detail="Tenant is inactive or does not exist",
    )

3. Resource Ownership Validation

When accessing specific resources by ID, the endpoint verifies both that the resource exists and that it belongs to the current tenant:

@router.get("/companies/{company_id}")
async def get_company(company_id: UUID, db: DB, current_user: CurrentUser):
    result = await db.execute(
        select(Company).where(
            Company.id == company_id,
            Company.tenant_id == current_user.tenant_id,  # Ownership check
        )
    )
    company = result.scalar_one_or_none()
    if company is None:
        raise HTTPException(status_code=404, detail="Company not found")
    return company

4. Admin Role Gating

Administrative endpoints (partner management, platform-wide operations) require admin or owner role:

async def require_admin(
    current_user: Annotated[User, Depends(get_current_user)],
) -> User:
    if current_user.role not in ("admin", "owner"):
        raise HTTPException(status_code=403, detail="Admin privileges required")
    return current_user

S3 Storage Isolation

Document storage uses tenant-prefixed S3 paths to isolate files:

s3://grant-engine-documents/tenant-{tenant_id}/
    applications/
        {application_id}/
            application-form.pdf
            supporting-docs.zip
    companies/
        {company_id}/
            carbon-report.pdf
    exports/
        grant-matches-2026-01.csv

Key isolation rules:

  • All S3 operations include the tenant-{id}/ prefix derived from the authenticated user's tenant
  • Pre-signed URLs are scoped to specific object keys within the tenant prefix
  • There is no API endpoint that accepts arbitrary S3 paths
  • S3 bucket policies can enforce prefix-level access control as an additional layer

Database Indexes for Tenant Queries

Composite indexes optimize tenant-scoped queries:

Table Index Columns Query Pattern
users (tenant_id) List users for tenant
companies (tenant_id) List companies for tenant
matches (tenant_id) List matches for tenant
applications (tenant_id) List applications for tenant

Query Performance

All tenant-scoped queries hit the tenant_id index first, which drastically reduces the scan scope. For tables with millions of rows, the index ensures consistent sub-10ms query times regardless of total table size.


PostgreSQL Row-Level Security

In addition to application-level filtering, PostgreSQL RLS policies provide a database-level safety net. Even if application code has a bug that omits the tenant_id filter, RLS prevents data leakage.

Policy Example

-- Enable RLS on the companies table
ALTER TABLE companies ENABLE ROW LEVEL SECURITY;

-- Policy: users can only see rows matching their tenant
CREATE POLICY tenant_isolation_policy ON companies
    USING (tenant_id = current_setting('app.current_tenant_id')::uuid);

RLS Implementation Status

RLS policies are defined at the database level and supplement the application-level filtering. The application always sets the tenant context via SET LOCAL app.current_tenant_id = '{tenant_id}' at the start of each database session for RLS-enabled tables.


Tenant Lifecycle

Creation

New tenants are created automatically during user registration:

  1. User submits registration form with organization name
  2. Backend creates a Tenant record with subscription_tier = "explorer" and is_active = true
  3. Backend creates a User record with role = "owner" linked to the new tenant
  4. JWT tokens are issued for immediate access

Deactivation

Tenant deactivation (is_active = false) immediately blocks all API access for users in that tenant. Existing JWT tokens are rejected at the middleware level.

Data Retention

When a tenant is deactivated:

  • Data is retained in the database (soft delete)
  • API access is blocked
  • S3 objects remain but are inaccessible via API
  • Celery tasks for that tenant are skipped

Testing Multi-Tenancy

Tests verify tenant isolation by creating multiple tenants and confirming that users in one tenant cannot access resources from another:

async def test_cross_tenant_access_prevented(db, tenant_a_user, tenant_b_company):
    """Verify that Tenant A's user cannot access Tenant B's company."""
    response = await client.get(
        f"/api/v1/companies/{tenant_b_company.id}",
        headers={"Authorization": f"Bearer {tenant_a_user.token}"},
    )
    assert response.status_code == 404  # Not found, not 403

Return 404, Not 403

When a user requests a resource from another tenant, the system returns 404 Not Found rather than 403 Forbidden. This prevents information leakage -- the attacker cannot determine whether a resource exists in another tenant.