Scaling FastAPI for Millions of Records
When building a real-time trading analytics platform, performance isn't just a nice-to-have; it's essential. Traders need instant access to market data, and any delay can mean missed opportunities. Here's how we scaled our FastAPI backend to handle millions of records efficiently.
The Problem
Our trading platform needed to ingest real-time market data from multiple exchanges, process it, store it, and serve analytics queries, all with minimal latency. Initial benchmarks showed response times of 2-3 seconds for complex queries, which was unacceptable.
The Tech Stack
- FastAPI with async SQLAlchemy for the API layer
- PostgreSQL with TimescaleDB for time-series storage
- Redis for caching
- Celery for background processing
- Kubernetes for horizontal scaling, with Prometheus and Grafana for monitoring
Optimization Strategies
1. Async All The Way
FastAPI's async support is powerful, but only if you use it correctly. We ensured all I/O operations were truly async, from database queries to external API calls.
from datetime import datetime

from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

async def get_trading_data(
    session: AsyncSession,
    symbol: str,
    start_date: datetime,
    end_date: datetime,
) -> list[TradingRecord]:
    query = select(TradingRecord).where(
        TradingRecord.symbol == symbol,
        TradingRecord.timestamp.between(start_date, end_date),
    )
    result = await session.execute(query)
    return result.scalars().all()
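Being truly async also means independent I/O calls don't have to run one after another. Below is a minimal, self-contained sketch of fanning out concurrent calls with asyncio.gather; the fetcher functions are stubs standing in for real database and exchange calls:

```python
import asyncio

async def fetch_quotes(symbol: str) -> dict:
    # Stub standing in for an async exchange API call
    await asyncio.sleep(0.01)
    return {"symbol": symbol, "price": 100.0}

async def fetch_order_book(symbol: str) -> dict:
    # Stub standing in for an async database query
    await asyncio.sleep(0.01)
    return {"symbol": symbol, "depth": 50}

async def get_snapshot(symbol: str) -> dict:
    # Run both I/O operations concurrently instead of awaiting them in sequence
    quotes, book = await asyncio.gather(
        fetch_quotes(symbol), fetch_order_book(symbol)
    )
    return {**quotes, **book}

snapshot = asyncio.run(get_snapshot("AAPL"))
```

With two independent calls the wall-clock time is roughly that of the slowest one, not the sum of both.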
2. Database Optimization with TimescaleDB
For time-series data, PostgreSQL alone wasn't enough. We integrated TimescaleDB, which automatically partitions data by time intervals and provides optimized queries for time-based analytics.
-- Convert regular table to hypertable
SELECT create_hypertable('trading_records', 'timestamp');

-- Create continuous aggregates for common queries
CREATE MATERIALIZED VIEW daily_ohlcv
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', timestamp) AS bucket,
    symbol,
    first(open, timestamp) AS open,
    max(high) AS high,
    min(low) AS low,
    last(close, timestamp) AS close,
    sum(volume) AS volume
FROM trading_records
GROUP BY bucket, symbol;
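Analytics queries can then read the pre-computed aggregate instead of scanning raw ticks. A sketch, assuming the daily_ohlcv view defined above:

```sql
-- Last 30 daily candles for one symbol, served from the continuous aggregate
SELECT bucket, open, high, low, close, volume
FROM daily_ohlcv
WHERE symbol = 'AAPL'
ORDER BY bucket DESC
LIMIT 30;
```

Because the aggregate is maintained incrementally, this query touches 30 rows rather than every trade for the symbol.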
3. Strategic Caching with Redis
Not all data needs to be fetched from the database every time. We implemented a multi-layer caching strategy:
- Hot data cache: Frequently accessed symbols and recent data
- Computed aggregates: Pre-calculated statistics and metrics
- Query result cache: Results of expensive analytical queries
import redis.asyncio as redis
from fastapi import FastAPI
from fastapi_cache import FastAPICache
from fastapi_cache.backends.redis import RedisBackend
from fastapi_cache.decorator import cache

app = FastAPI()

@app.on_event("startup")
async def startup():
    # The cache must be initialized once before the decorator can be used
    FastAPICache.init(RedisBackend(redis.from_url("redis://localhost")), prefix="cache")

@app.get("/api/v1/analytics/{symbol}")
@cache(expire=60)  # Cache for 60 seconds
async def get_analytics(symbol: str):
    # This result will be cached
    return await compute_analytics(symbol)
4. Connection Pooling
Database connections are expensive. We configured connection pools carefully to balance between connection reuse and avoiding contention:
from sqlalchemy.ext.asyncio import create_async_engine

engine = create_async_engine(
    DATABASE_URL,
    pool_size=20,        # Persistent connections kept open in the pool
    max_overflow=10,     # Extra connections allowed under burst load
    pool_pre_ping=True,  # Validate a connection before handing it out
    pool_recycle=3600,   # Recycle connections after an hour
)
5. Background Processing with Celery
Heavy computations don't belong in request handlers. We offloaded complex analytics to Celery workers, allowing the API to respond quickly while processing happens in the background.
"The fastest API call is the one that doesn't wait for heavy processing to complete."
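The request-side pattern is enqueue-and-return: the handler schedules the work and responds immediately with a task id the client can poll. A minimal sketch of that flow, using asyncio tasks in place of a real Celery broker and worker for brevity:

```python
import asyncio
import uuid

# Stand-in for a result backend: task_id -> result
results: dict[str, dict] = {}

async def heavy_analytics(task_id: str, symbol: str) -> None:
    # Stand-in for a long-running computation done by a worker
    await asyncio.sleep(0.01)
    results[task_id] = {"symbol": symbol, "status": "done"}

async def submit(symbol: str) -> str:
    # The handler returns a task id immediately; work continues in the background
    task_id = str(uuid.uuid4())
    asyncio.create_task(heavy_analytics(task_id, symbol))
    return task_id

async def main() -> str:
    task_id = await submit("AAPL")
    await asyncio.sleep(0.05)  # In practice the client polls a status endpoint
    return task_id

tid = asyncio.run(main())
```

With Celery the shape is identical: `task.delay(...)` replaces `create_task`, and the broker and result backend replace the in-process dict.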
6. Efficient Serialization
JSON serialization can be a bottleneck when dealing with large datasets. We switched to orjson for a significant performance boost:
from fastapi.responses import ORJSONResponse
app = FastAPI(default_response_class=ORJSONResponse)
Results
After implementing these optimizations, we achieved:
- Response times dropped from 2-3 seconds to under 50ms for most queries
- Throughput increased by 10x, handling 1000+ requests per second
- Database load reduced by 70% thanks to caching
- 99.9% uptime with automatic failover
Lessons Learned
- Measure before optimizing: Profile your code to find actual bottlenecks
- Choose the right database: TimescaleDB was a game-changer for our use case
- Cache strategically: Not everything needs caching, but the right things do
- Scale horizontally: Kubernetes made it easy to add more instances as needed
- Monitor everything: Prometheus and Grafana helped us catch issues early
Conclusion
Scaling FastAPI for millions of records is achievable with the right architecture and tools. The key is to understand your data access patterns and optimize accordingly. Don't optimize prematurely: measure, identify bottlenecks, and address them systematically.
If you're building high-performance APIs with Python, FastAPI is an excellent choice. Combined with async programming, proper database optimization, and strategic caching, you can build systems that rival those written in traditionally "faster" languages.