Multi-Tenancy¶
Carbon Connect implements a pool model multi-tenancy architecture where all tenants share a single PostgreSQL database, with data isolation enforced through Row-Level Security (RLS) policies and application-level middleware. This approach balances operational simplicity with strong data isolation guarantees.
Architecture Overview¶
graph TB
subgraph Request Flow
REQ["Incoming Request<br/>(Bearer JWT)"]
AUTH["JWT Authentication<br/>Decode token, validate user"]
TENANT["Tenant Validation<br/>Verify tenant_id, check active"]
HANDLER["Route Handler<br/>Business logic"]
end
subgraph Data Layer
DB["PostgreSQL<br/>Single Database"]
RLS["Row-Level Security<br/>tenant_id filtering"]
S3P["S3 Prefix Isolation<br/>s3://bucket/tenant-{id}/"]
end
REQ --> AUTH
AUTH --> TENANT
TENANT --> HANDLER
HANDLER --> DB
DB --> RLS
HANDLER --> S3P Tenancy Model¶
Pool Model¶
All tenants share a single grant_engine database. Data isolation is enforced at two levels:
- Application Level: FastAPI middleware injects the authenticated user's
tenant_idinto every database query. - Database Level: PostgreSQL Row-Level Security policies filter rows by
tenant_id.
This approach was chosen over per-tenant schemas or per-tenant databases because:
- Simpler operations: Single database to backup, migrate, and monitor
- Cost-effective: No database-per-tenant overhead
- Easier migrations: Alembic runs once against a single schema
- Cross-tenant analytics: Platform-level aggregations are straightforward (when needed)
Tenant-Scoped vs Shared Tables¶
| Scope | Tables | Isolation |
|---|---|---|
| Tenant-scoped | users, companies, matches, applications, documents, notifications | Filtered by tenant_id on every query |
| Shared | grants, alembic_version | Accessible to all tenants (grant data is global) |
Grants are Shared
The grants table is intentionally not tenant-scoped. Grants are public funding opportunities scraped from external sources and shared across all tenants. Only matches and applications (which link companies to grants) are tenant-isolated.
Tenant Model¶
The Tenant model (in backend/app/models/database/tenant.py) represents an organization:
class Tenant(BaseModel):
__tablename__ = "tenants"
name: Mapped[str] = mapped_column(String(255), nullable=False)
slug: Mapped[str] = mapped_column(String(100), unique=True, nullable=False, index=True)
settings: Mapped[dict] = mapped_column(JSONB, default=dict)
subscription_tier: Mapped[str] = mapped_column(
String(50), default="explorer", nullable=False,
) # explorer, professional, enterprise
is_active: Mapped[bool] = mapped_column(Boolean, default=True, index=True)
Subscription Tiers¶
| Tier | Description | Features |
|---|---|---|
explorer | Free tier | Basic matching, limited API calls |
professional | Paid tier | Full matching, carbon profile, email notifications |
enterprise | Custom tier | All features, partner API, white-label options |
Middleware Implementation¶
Tenant isolation is enforced through FastAPI dependency injection in backend/app/api/deps.py. Every authenticated endpoint automatically receives the validated tenant context.
Authentication Chain¶
The authentication chain validates the JWT token, retrieves the user, and verifies the tenant in a single dependency chain:
async def get_current_user(
token: Annotated[str, Depends(oauth2_scheme)],
db: Annotated[AsyncSession, Depends(get_db)],
) -> User:
"""Get the current authenticated user from the JWT token."""
# 1. Check token blacklist
if is_token_blacklisted(token):
raise HTTPException(status_code=401, detail="Could not validate credentials")
# 2. Decode and validate JWT
token_data = decode_token(token)
if token_data is None or token_data.type != "access":
raise HTTPException(status_code=401, detail="Could not validate credentials")
# 3. Fetch user from database
user = await get_user_by_email(db, email=token_data.sub)
if user is None or not user.is_active:
raise HTTPException(status_code=401, detail="User is inactive")
# 4. Validate tenant is active
result = await db.execute(select(Tenant).where(Tenant.id == user.tenant_id))
tenant = result.scalar_one_or_none()
if tenant is None or not tenant.is_active:
raise HTTPException(status_code=403, detail="Tenant is inactive or does not exist")
return user
Tenant Context Dependency¶
A separate dependency retrieves the full tenant object for endpoints that need it:
async def get_current_tenant(
current_user: Annotated[User, Depends(get_current_user)],
db: Annotated[AsyncSession, Depends(get_db)],
) -> Tenant:
"""Get the current tenant for the authenticated user."""
result = await db.execute(select(Tenant).where(Tenant.id == current_user.tenant_id))
tenant = result.scalar_one_or_none()
if tenant is None:
raise HTTPException(status_code=403, detail="Tenant not found")
if not tenant.is_active:
raise HTTPException(status_code=403, detail="Tenant is inactive")
return tenant
Dependency Type Aliases¶
Clean type aliases simplify endpoint signatures:
DB = Annotated[AsyncSession, Depends(get_db)]
CurrentUser = Annotated[User, Depends(get_current_user)]
CurrentTenant = Annotated[Tenant, Depends(get_current_tenant)]
OptionalUser = Annotated[User | None, Depends(get_optional_user)]
AdminUser = Annotated[User, Depends(require_admin)]
Usage in Endpoints¶
Every tenant-scoped endpoint filters queries by tenant_id from the authenticated user:
@router.get("/companies")
async def list_companies(
db: DB,
current_user: CurrentUser,
skip: int = 0,
limit: int = 20,
):
"""List companies for the current tenant."""
query = (
select(Company)
.where(Company.tenant_id == current_user.tenant_id) # Tenant isolation
.offset(skip)
.limit(limit)
)
result = await db.execute(query)
return result.scalars().all()
Cross-Tenant Prevention¶
Multiple safeguards prevent cross-tenant data access:
1. Application-Level Filtering¶
Every database query for tenant-scoped data includes a WHERE tenant_id = ? clause. The tenant_id is always derived from the authenticated user's JWT token -- it is never accepted as a request parameter.
2. Inactive Tenant Rejection¶
Requests from users belonging to inactive tenants are rejected with HTTP 403:
if tenant is None or not tenant.is_active:
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
detail="Tenant is inactive or does not exist",
)
3. Resource Ownership Validation¶
When accessing specific resources by ID, the endpoint verifies both that the resource exists and that it belongs to the current tenant:
@router.get("/companies/{company_id}")
async def get_company(company_id: UUID, db: DB, current_user: CurrentUser):
result = await db.execute(
select(Company).where(
Company.id == company_id,
Company.tenant_id == current_user.tenant_id, # Ownership check
)
)
company = result.scalar_one_or_none()
if company is None:
raise HTTPException(status_code=404, detail="Company not found")
return company
4. Admin Role Gating¶
Administrative endpoints (partner management, platform-wide operations) require admin or owner role:
async def require_admin(
current_user: Annotated[User, Depends(get_current_user)],
) -> User:
if current_user.role not in ("admin", "owner"):
raise HTTPException(status_code=403, detail="Admin privileges required")
return current_user
S3 Storage Isolation¶
Document storage uses tenant-prefixed S3 paths to isolate files:
s3://grant-engine-documents/tenant-{tenant_id}/
applications/
{application_id}/
application-form.pdf
supporting-docs.zip
companies/
{company_id}/
carbon-report.pdf
exports/
grant-matches-2026-01.csv
Key isolation rules:
- All S3 operations include the
tenant-{id}/prefix derived from the authenticated user's tenant - Pre-signed URLs are scoped to specific object keys within the tenant prefix
- There is no API endpoint that accepts arbitrary S3 paths
- S3 bucket policies can enforce prefix-level access control as an additional layer
Database Indexes for Tenant Queries¶
Composite indexes optimize tenant-scoped queries:
| Table | Index Columns | Query Pattern |
|---|---|---|
users | (tenant_id) | List users for tenant |
companies | (tenant_id) | List companies for tenant |
matches | (tenant_id) | List matches for tenant |
applications | (tenant_id) | List applications for tenant |
Query Performance
All tenant-scoped queries hit the tenant_id index first, which drastically reduces the scan scope. For tables with millions of rows, the index ensures consistent sub-10ms query times regardless of total table size.
PostgreSQL Row-Level Security¶
In addition to application-level filtering, PostgreSQL RLS policies provide a database-level safety net. Even if application code has a bug that omits the tenant_id filter, RLS prevents data leakage.
Policy Example¶
-- Enable RLS on the companies table
ALTER TABLE companies ENABLE ROW LEVEL SECURITY;
-- Policy: users can only see rows matching their tenant
CREATE POLICY tenant_isolation_policy ON companies
USING (tenant_id = current_setting('app.current_tenant_id')::uuid);
RLS Implementation Status
RLS policies are defined at the database level and supplement the application-level filtering. The application always sets the tenant context via SET LOCAL app.current_tenant_id = '{tenant_id}' at the start of each database session for RLS-enabled tables.
Tenant Lifecycle¶
Creation¶
New tenants are created automatically during user registration:
- User submits registration form with organization name
- Backend creates a
Tenantrecord withsubscription_tier = "explorer"andis_active = true - Backend creates a
Userrecord withrole = "owner"linked to the new tenant - JWT tokens are issued for immediate access
Deactivation¶
Tenant deactivation (is_active = false) immediately blocks all API access for users in that tenant. Existing JWT tokens are rejected at the middleware level.
Data Retention¶
When a tenant is deactivated:
- Data is retained in the database (soft delete)
- API access is blocked
- S3 objects remain but are inaccessible via API
- Celery tasks for that tenant are skipped
Testing Multi-Tenancy¶
Tests verify tenant isolation by creating multiple tenants and confirming that users in one tenant cannot access resources from another:
async def test_cross_tenant_access_prevented(db, tenant_a_user, tenant_b_company):
"""Verify that Tenant A's user cannot access Tenant B's company."""
response = await client.get(
f"/api/v1/companies/{tenant_b_company.id}",
headers={"Authorization": f"Bearer {tenant_a_user.token}"},
)
assert response.status_code == 404 # Not found, not 403
Return 404, Not 403
When a user requests a resource from another tenant, the system returns 404 Not Found rather than 403 Forbidden. This prevents information leakage -- the attacker cannot determine whether a resource exists in another tenant.