System Design Interview: Design Netflix

"Design Netflix" is a read-heavy streaming problem dominated by one fact: serving video at global scale is a content-delivery and storage problem far more than a compute one. The interviewer wants to see you separate the control plane (browsing, search, recommendations) from the data plane (actually streaming bytes) and reason about delivering petabytes close to users.

Work it in order: clarify requirements, estimate scale, design the upload/transcode pipeline, then the playback path through a CDN, and finish on adaptive bitrate and trade-offs. The walk-through below follows that structure so you can practice the reasoning, not memorize an architecture.

1. Clarify requirements

Separate functional from non-functional and scope tightly. The core is: users browse a catalog and stream video smoothly across devices and network conditions.

Functional: browse/search the catalog, start playback, adaptive quality, resume where you left off, multi-device.
Non-functional: very high availability, low startup latency, smooth playback (minimal rebuffering), global reach.
Out of scope (say so): the recommendation ML internals, billing, DRM specifics — name them, then defer.

2. Estimate scale

Numbers justify the CDN-first design. Assume hundreds of millions of subscribers and many millions of concurrent streams at peak. A single stream is several Mbps, so aggregate egress is tens of terabits per second. That is impractical to serve from a few central data centers, which immediately motivates edge caching.
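The estimate above can be written out as a few lines of arithmetic. This is only a back-of-envelope sketch: the subscriber count, the peak concurrency percentage, and the average bitrate are illustrative assumptions, not published figures.

```python
# Back-of-envelope egress estimate. Every input is an illustrative assumption.
SUBSCRIBERS = 250_000_000       # assumed: "hundreds of millions"
PEAK_CONCURRENT_PERCENT = 4     # assumed: share of subscribers streaming at peak
AVG_STREAM_MBPS = 5             # assumed: average bitrate of one stream

concurrent_streams = SUBSCRIBERS * PEAK_CONCURRENT_PERCENT // 100
egress_tbps = concurrent_streams * AVG_STREAM_MBPS / 1_000_000  # Mbps -> Tbps

print(f"{concurrent_streams:,} concurrent streams")   # 10,000,000
print(f"{egress_tbps:.0f} Tbps aggregate egress")     # 50
```

In the interview, state the inputs out loud and round aggressively; the order of magnitude is what drives the design.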

3. Ingestion and transcoding pipeline

Uploaded masters are processed offline, not on the playback path. Each title is transcoded into many renditions — multiple resolutions and bitrates — and segmented into small chunks (a few seconds each) for adaptive streaming.

Transcoding farm: convert the master into H.264/H.265/AV1 renditions at several bitrates. Each rendition (and each chunk, if the master is split first) encodes independently, so this is embarrassingly parallel batch work.
Segmentation + packaging: split each rendition into short segments with a manifest (HLS/DASH) describing them.
Store outputs in object storage (e.g. S3-like) as the source of truth and CDN origin, then push popular content to the edge.
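To make the packaging step concrete, here is a minimal sketch that builds an HLS master playlist for a rendition ladder and counts the segments per rendition. The ladder values, the `Rendition` type, and the `{height}p/index.m3u8` path layout are hypothetical; only the `#EXTM3U` and `#EXT-X-STREAM-INF` tags come from the HLS format itself.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    width: int
    height: int
    bitrate_kbps: int


# Hypothetical ladder; real ladders are tuned per title and codec.
LADDER = [
    Rendition(640, 360, 800),
    Rendition(1280, 720, 3000),
    Rendition(1920, 1080, 6000),
]


def master_playlist(ladder):
    """Build an HLS master playlist with one variant stream per rendition."""
    lines = ["#EXTM3U"]
    for r in ladder:
        # BANDWIDTH is expressed in bits per second.
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bitrate_kbps * 1000},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        lines.append(f"{r.height}p/index.m3u8")  # assumed path layout
    return "\n".join(lines) + "\n"


def segment_count(duration_s, segment_s):
    """Number of segments needed to cover a title of the given duration."""
    return math.ceil(duration_s / segment_s)


print(master_playlist(LADDER))
print(segment_count(5400, 4), "segments per rendition for a 90-minute title")
```

A 90-minute title cut into 4-second segments yields 1350 segments per rendition, which is why the outputs live in object storage rather than a database.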

4. CDN and the playback path

Playback is served from the edge, not the origin. Video segments are cached on CDN nodes physically close to users (Netflix runs its own Open Connect appliances inside ISPs). The client fetches the manifest, then pulls segments from the nearest healthy edge node.

The control plane (catalog, search, user profiles, 'continue watching') is a separate set of stateless API services backed by their own databases and caches — it returns metadata and a manifest URL, and is completely decoupled from the heavy byte delivery.
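A rough sketch of that separation: the control-plane call below returns only a resume position and a manifest URL, never video bytes. The `PlaybackStart` type, the in-memory `BOOKMARKS` dict standing in for a bookmark store, and the `edge.example.com` host are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackStart:
    title_id: str
    resume_position_s: int
    manifest_url: str


# Stand-in for the bookmark database; a real service would query a store.
BOOKMARKS = {("user-1", "title-42"): 1310}

EDGE_HOST = "edge.example.com"  # hypothetical edge hostname


def start_playback(user_id, title_id):
    """Return metadata and a manifest URL; the bytes come from the edge later."""
    resume = BOOKMARKS.get((user_id, title_id), 0)
    url = f"https://{EDGE_HOST}/{title_id}/master.m3u8"
    return PlaybackStart(title_id, resume, url)


print(start_playback("user-1", "title-42"))
```

The response is a few hundred bytes, so this tier scales like any other stateless API, independent of how many terabits the edge is serving.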

Edge cache (CDN/Open Connect): serves the vast majority of bytes; pre-positions popular titles by region ahead of demand.
Origin storage: object store that backs cache misses.
Control-plane APIs: catalog, playback authorization, bookmarks — small payloads, horizontally scaled, cache-fronted.
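The edge-then-origin lookup can be sketched as a small pull-through cache. This is a generic LRU illustration, not a description of how Open Connect manages its contents; the `EdgeCache` class and its `fetch_from_origin` callback are hypothetical names.

```python
from collections import OrderedDict


class EdgeCache:
    """Pull-through LRU cache: serve from the edge, fall back to origin on a miss."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin  # callable: key -> bytes
        self.store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.store[key]
        self.misses += 1
        data = self.fetch_from_origin(key)
        self.store[key] = data
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return data

    def prefill(self, keys):
        """Pre-position segments ahead of demand without counting hits or misses."""
        for key in keys:
            self.store[key] = self.fetch_from_origin(key)
            self.store.move_to_end(key)
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)


origin = {"a": b"A", "b": b"B", "c": b"C"}
cache = EdgeCache(capacity=2, fetch_from_origin=origin.__getitem__)
for segment in ["a", "b", "a", "c", "b"]:
    cache.get(segment)
print(cache.hits, cache.misses)  # 1 4
```

Pre-positioning (`prefill`) is what turns the first viewer's request in a region into a hit instead of an origin fetch.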

5. Adaptive bitrate streaming

Smooth playback under variable networks is decided on the client, not the server: the server only publishes the renditions. The player monitors throughput and buffer level and switches quality up or down at segment boundaries using the manifest's rendition list, choosing a higher bitrate on fast connections and a lower one to avoid rebuffering on slow ones.
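A minimal sketch of that decision, assuming a simple throughput rule with a low-buffer guard. Real players use more elaborate logic; the bitrate list, the 0.8 safety factor, and the 5-second threshold are illustrative assumptions.

```python
BITRATES_KBPS = [800, 3000, 6000]  # hypothetical renditions, ascending


def choose_bitrate(throughput_kbps, buffer_s, bitrates=BITRATES_KBPS,
                   safety=0.8, low_buffer_s=5.0):
    """Pick the bitrate for the next segment from measured throughput and buffer."""
    # Buffer nearly empty: drop to the lowest rendition to avoid a stall.
    if buffer_s < low_buffer_s:
        return bitrates[0]
    # Otherwise take the highest rendition that fits within a safety margin.
    budget = throughput_kbps * safety
    affordable = [b for b in bitrates if b <= budget]
    return affordable[-1] if affordable else bitrates[0]


print(choose_bitrate(10_000, 20))  # 6000: fast link, healthy buffer
print(choose_bitrate(4_000, 20))   # 3000: 6000 exceeds the 3200 kbps budget
print(choose_bitrate(10_000, 2))   # 800: buffer too low, play it safe
```

The safety factor keeps the player from selecting a rendition that consumes all measured bandwidth, since throughput estimates are noisy.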

6. Trade-offs to verbalize

Close on the tensions an interviewer is listening for:

Precomputed renditions vs on-the-fly transcoding: precomputing costs storage, transcoding on demand costs startup latency. Netflix precomputes, because extra storage is cheaper than making viewers wait.
Cache everything vs cache popular titles at the edge: a long-tail catalog cannot all live on every edge node, so the edge holds what is popular in its region and the rest falls back to origin.
Strong vs eventual consistency for control-plane data like 'continue watching': eventual is fine, since a bookmark that lags by a few seconds does no harm.

Naming these is what makes the answer senior.
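The cache-popular trade-off can be put in rough numbers. The sketch below assumes title popularity follows a Zipf distribution, which is a common modeling assumption rather than a measured property of any real catalog; the catalog size and exponent are illustrative.

```python
def zipf_hit_rate(catalog_size, cached_titles, s=1.0):
    """Fraction of requests served from the edge when the top titles are cached.

    Assumes request popularity of the title at rank r is proportional to 1 / r**s.
    """
    weights = [1 / rank**s for rank in range(1, catalog_size + 1)]
    return sum(weights[:cached_titles]) / sum(weights)


for cached in (100, 1_000, 10_000):
    rate = zipf_hit_rate(10_000, cached)
    print(f"cache top {cached:>6,} of 10,000 titles -> {rate:.0%} of requests")
```

Under these assumptions, caching the top 10% of titles serves roughly three quarters of requests, which is the argument for caching popular content at the edge and letting the long tail fall back to origin.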

Practice Netflix system design with real-time help

Natively's system design interview assistant can keep you structured during a live round — suggesting the next dimension to cover and surfacing trade-offs in real time, all on your own device.
