Scaling FastAPI for Millions of Records
When building a real-time trading analytics platform, performance isn't just a nice-to-have; it's essential. Traders need instant access to market data, and any delay can mean missed opportunities. Here's how we scaled our FastAPI backend to handle millions of records efficiently.
The Problem
Our trading platform needed to ingest real-time market data from multiple exchanges, process it, store it, and serve analytics queries, all with minimal latency. Initial benchmarks showed response times of 2-3 seconds for complex queries, which was unacceptable.
The Tech Stack
- FastAPI with async SQLAlchemy for the API layer
- PostgreSQL with TimescaleDB for time-series storage
- Redis for caching
- Celery for background processing
- Kubernetes for horizontal scaling, with Prometheus and Grafana for monitoring
Optimization Strategies
1. Async All The Way
FastAPI's async support is powerful, but only if you use it correctly. We ensured all I/O operations were truly async, from database queries to external API calls.
from datetime import datetime

from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

async def get_trading_data(
    session: AsyncSession,
    symbol: str,
    start_date: datetime,
    end_date: datetime,
) -> list[TradingRecord]:
    query = select(TradingRecord).where(
        TradingRecord.symbol == symbol,
        TradingRecord.timestamp.between(start_date, end_date),
    )
    result = await session.execute(query)
    return result.scalars().all()
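Being truly async also means independent I/O calls don't have to run one after another. Below is a minimal, self-contained sketch of fanning out concurrent calls with asyncio.gather; the fetcher functions are stubs standing in for real database and exchange calls:

```python
import asyncio

async def fetch_quotes(symbol: str) -> dict:
    # Stub standing in for an async exchange API call
    await asyncio.sleep(0.01)
    return {"symbol": symbol, "price": 100.0}

async def fetch_order_book(symbol: str) -> dict:
    # Stub standing in for an async database query
    await asyncio.sleep(0.01)
    return {"symbol": symbol, "depth": 50}

async def get_snapshot(symbol: str) -> dict:
    # Run both I/O operations concurrently instead of awaiting them in sequence
    quotes, book = await asyncio.gather(
        fetch_quotes(symbol), fetch_order_book(symbol)
    )
    return {**quotes, **book}

snapshot = asyncio.run(get_snapshot("AAPL"))
```

With two independent calls the wall-clock time is roughly that of the slowest one, not the sum of both.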
2. Database Optimization with TimescaleDB
For time-series data, PostgreSQL alone wasn't enough. We integrated TimescaleDB, which automatically partitions data by time intervals and provides optimized queries for time-based analytics.
-- Convert regular table to hypertable
SELECT create_hypertable('trading_records', 'timestamp');

-- Create continuous aggregates for common queries
CREATE MATERIALIZED VIEW daily_ohlcv
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', timestamp) AS bucket,
    symbol,
    first(open, timestamp) AS open,
    max(high) AS high,
    min(low) AS low,
    last(close, timestamp) AS close,
    sum(volume) AS volume
FROM trading_records
GROUP BY bucket, symbol;
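Analytics queries can then read the pre-computed aggregate instead of scanning raw ticks. A sketch, assuming the daily_ohlcv view defined above:

```sql
-- Last 30 daily candles for one symbol, served from the continuous aggregate
SELECT bucket, open, high, low, close, volume
FROM daily_ohlcv
WHERE symbol = 'AAPL'
ORDER BY bucket DESC
LIMIT 30;
```

Because the aggregate is maintained incrementally, this query touches 30 rows rather than every trade for the symbol.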
3. Strategic Caching with Redis
Not all data needs to be fetched from the database every time. We implemented a multi-layer caching strategy:
- Hot data cache: Frequently accessed symbols and recent data
- Computed aggregates: Pre-calculated statistics and metrics
- Query result cache: Results of expensive analytical queries
import redis.asyncio as redis
from fastapi import FastAPI
from fastapi_cache import FastAPICache
from fastapi_cache.backends.redis import RedisBackend
from fastapi_cache.decorator import cache

app = FastAPI()

@app.on_event("startup")
async def startup():
    # The cache must be initialized once before the decorator can be used
    FastAPICache.init(RedisBackend(redis.from_url("redis://localhost")), prefix="cache")

@app.get("/api/v1/analytics/{symbol}")
@cache(expire=60)  # Cache for 60 seconds
async def get_analytics(symbol: str):
    # This result will be cached
    return await compute_analytics(symbol)
4. Connection Pooling
Database connections are expensive. We configured connection pools carefully to balance between connection reuse and avoiding contention:
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    DATABASE_URL,
    pool_size=20,        # Persistent connections kept open in the pool
    max_overflow=10,     # Extra connections allowed under burst load
    pool_pre_ping=True,  # Validate a connection before handing it out
    pool_recycle=3600,   # Recycle connections after an hour
)
5. Background Processing with Celery
Heavy computations don't belong in request handlers. We offloaded complex analytics to Celery workers, allowing the API to respond quickly while processing happens in the background.
"The fastest API call is the one that doesn't wait for heavy processing to complete."
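The request-side pattern is enqueue-and-return: the handler schedules the work and responds immediately with a task id the client can poll. A minimal sketch of that flow, using asyncio tasks in place of a real Celery broker and worker for brevity:

```python
import asyncio
import uuid

# Stand-in for a result backend: task_id -> result
results: dict[str, dict] = {}

async def heavy_analytics(task_id: str, symbol: str) -> None:
    # Stand-in for a long-running computation done by a worker
    await asyncio.sleep(0.01)
    results[task_id] = {"symbol": symbol, "status": "done"}

async def submit(symbol: str) -> str:
    # The handler returns a task id immediately; work continues in the background
    task_id = str(uuid.uuid4())
    asyncio.create_task(heavy_analytics(task_id, symbol))
    return task_id

async def main() -> str:
    task_id = await submit("AAPL")
    await asyncio.sleep(0.05)  # In practice the client polls a status endpoint
    return task_id

tid = asyncio.run(main())
```

With Celery the shape is identical: `task.delay(...)` replaces `create_task`, and the broker and result backend replace the in-process dict.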
6. Efficient Serialization
JSON serialization can be a bottleneck when dealing with large datasets. We switched to orjson for a significant performance boost:
from fastapi.responses import ORJSONResponse
app = FastAPI(default_response_class=ORJSONResponse)
Results
After implementing these optimizations, we achieved:
- Response times dropped from 2-3 seconds to under 50ms for most queries
- Throughput increased by 10x, handling 1000+ requests per second
- Database load reduced by 70% thanks to caching
- 99.9% uptime with automatic failover
Lessons Learned
- Measure before optimizing: Profile your code to find actual bottlenecks
- Choose the right database: TimescaleDB was a game-changer for our use case
- Cache strategically: Not everything needs caching, but the right things do
- Scale horizontally: Kubernetes made it easy to add more instances as needed
- Monitor everything: Prometheus and Grafana helped us catch issues early
Conclusion
Scaling FastAPI for millions of records is achievable with the right architecture and tools. The key is to understand your data access patterns and optimize accordingly. Don't optimize prematurely: measure, identify bottlenecks, and address them systematically.
If you're building high-performance APIs with Python, FastAPI is an excellent choice. Combined with async programming, proper database optimization, and strategic caching, you can build systems that rival those written in traditionally "faster" languages.