System Design Interview Playbook: 10-20min Structure and TypeScript
22 Oct 2025
I've both interviewed at and conducted system design interviews for large engineering teams. I also run workshops through PersiaJS and my newsletter. Over the past decade I've learned one thing: doing well in these interviews isn't about memorizing buzzwords. It's about a repeatable, clear thought process and being able to justify trade-offs under time pressure.
Here's the playbook I use and teach.
The One Principle That Matters
Clarify first. Sketch quickly. Iterate. Be explicit about trade-offs.
I used to jump straight into diagrams and then realize I hadn't clarified a crucial constraint (e.g., "Is eventual consistency acceptable?"). That wasted time and sometimes lost the interviewer's trust. Now I always spend the first 2-3 minutes asking clarifying questions.
The Structure (Repeatable, 10-20 Minutes)
1. Clarify Requirements (2-3 min)
Who are the users? QPS? P95 latency targets? Data retention? Consistency requirements? Read/write ratio? Object sizes? Traffic patterns?
This is where you show the interviewer you think before you build.
2. High-Level Design (3-5 min)
Draw the main components: client, API gateway, application services, caches, database, async pipelines, storage. Keep it simple. Don't over-detail yet.
3. Define APIs and Data Model (2-3 min)
Show the external API shape and the minimal data schema. This grounds the discussion in concrete terms.
4. Capacity, Scaling, and Bottlenecks (3-5 min)
Estimate traffic and storage. Identify bottlenecks. Explain how to scale them. This is where napkin math matters.
5. Deep Dive Into One Component (5-10 min)
Pick the hardest or most interesting part (caching, sharding, consistency, search) and go deep. This is where you differentiate yourself.
6. Operational Concerns and Trade-Offs (2-3 min)
Monitoring, deployments, SLOs, rate limiting, security, cost. Show you think about running the system, not just building it.
7. Wrap Up (1-2 min)
Summarize decisions, acknowledge alternatives, invite follow-up questions.
This maps to how interviewers actually score: clarity, trade-offs, architecture, scalability, and operational awareness.
Concrete Example: URL Shortener
I use this in workshops because it touches hashing, collision handling, data models, caching, and analytics.
Clarify
- QPS: 10k read, 100 write.
- Latency target: <100ms for redirects.
- Support custom aliases.
- Data retention: indefinite (soft delete allowed).
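Once the numbers are on the board, I like to turn them into napkin math right away. Here is a minimal sketch of that arithmetic. The QPS figures come from the list above; `BYTES_PER_RECORD` and `RETENTION_YEARS` are assumptions I picked purely for illustration, so swap in whatever the interviewer gives you.

```typescript
// Back-of-envelope capacity estimate for the URL shortener.
// QPS figures come from the requirements; the other two constants are assumed.
const WRITE_QPS = 100;
const READ_QPS = 10_000;
const BYTES_PER_RECORD = 500; // assumed: code + URL + timestamps + index overhead
const RETENTION_YEARS = 5;    // assumed planning horizon

const SECONDS_PER_DAY = 86_400;

const writesPerDay = WRITE_QPS * SECONDS_PER_DAY;
const writesPerYear = writesPerDay * 365;
const totalRecords = writesPerYear * RETENTION_YEARS;
const storageBytes = totalRecords * BYTES_PER_RECORD;
const readWriteRatio = READ_QPS / WRITE_QPS;

// Smallest Base62 code length whose keyspace (62^length) covers `count` codes.
function minCodeLength(count: number): number {
  let length = 1;
  let capacity = 62;
  while (capacity < count) {
    capacity *= 62;
    length += 1;
  }
  return length;
}

console.log({
  writesPerDay,
  writesPerYear,
  totalRecords,
  storageTB: storageBytes / 1e12,
  readWriteRatio,
  codeLength: minCodeLength(totalRecords),
});
```

With these assumed inputs that works out to roughly 8.6 million writes per day, about 15.8 billion records over five years, a little under 8 TB of storage, a 100:1 read/write ratio, and 6-character codes.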
High-Level Components
Clients → API Gateway → Shortener Service → Storage (primary DB) + Cache (Redis) → Analytics pipeline (Kafka → batch store)
APIs and Data Model
CreateShort(url, optional alias) → code
GET /r/{code} → 302 redirect to original URL
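To make the API and data model concrete, here is a rough sketch of how I might write them down in TypeScript. The field names, the `ShortenerService` class, and the in-memory `Map` are all hypothetical stand-ins for illustration; a real service would sit on the primary DB, and the placeholder code generator here is not the Base62 scheme.

```typescript
// Hypothetical record shape; field names are illustrative, not a real schema.
interface ShortLink {
  code: string;           // generated code or custom alias (primary key)
  url: string;            // original long URL
  createdAt: Date;
  deletedAt: Date | null; // soft-delete marker, since retention is indefinite
}

type RedirectResult =
  | { status: 302; location: string }
  | { status: 404 };

// In-memory stand-in for the primary DB, used only to make the sketch runnable.
class ShortenerService {
  private readonly links = new Map<string, ShortLink>();
  private nextId = 1;

  // Placeholder generator that skips codes already taken by custom aliases.
  private generateCode(): string {
    let candidate = `id${this.nextId++}`;
    while (this.links.has(candidate)) candidate = `id${this.nextId++}`;
    return candidate;
  }

  // CreateShort(url, optional alias) -> code
  createShort(url: string, alias?: string): string {
    if (alias !== undefined && this.links.has(alias)) {
      throw new Error(`alias already taken: ${alias}`);
    }
    const code = alias ?? this.generateCode();
    this.links.set(code, { code, url, createdAt: new Date(), deletedAt: null });
    return code;
  }

  // GET /r/{code} -> 302 to the original URL, or 404 if unknown or soft-deleted
  resolve(code: string): RedirectResult {
    const link = this.links.get(code);
    if (!link || link.deletedAt !== null) return { status: 404 };
    return { status: 302, location: link.url };
  }

  softDelete(code: string): void {
    const link = this.links.get(code);
    if (link) link.deletedAt = new Date();
  }
}
```

Keeping the code or alias as the primary key means the redirect path is a single key lookup, which is what the read-heavy traffic profile wants.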
Generating Codes
Use Base62 encoding of an auto-increment ID or a hash (with collision handling). For global scale, prefer a partitioned keyspace.
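const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const BASE = ALPHABET.length; // 62

// Encodes a non-negative safe integer (e.g. an auto-increment ID) as Base62.
export function encodeBase62(num: number): string {
  // Reject negatives, fractions, NaN, and Infinity, which would otherwise
  // produce an empty or garbled code, or loop forever.
  if (!Number.isSafeInteger(num) || num < 0) {
    throw new RangeError("encodeBase62 expects a non-negative safe integer");
  }
  if (num === 0) return ALPHABET[0];
  let s = "";
  while (num > 0) {
    s = ALPHABET[num % BASE] + s; // prepend the least significant digit
    num = Math.floor(num / BASE);
  }
  return s;
}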
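One way to get a partitioned keyspace without a counter round-trip on every write is to lease ID blocks: a small allocator hands out disjoint ranges, and each app node issues IDs from its own range. The sketch below is my assumption of how that could look, not the only way to do it; `BlockAllocator` and `NodeIdGenerator` are hypothetical names, and the allocator's counter would need to live in a durable store in practice.

```typescript
// Hands out disjoint ID blocks. The counter is an in-memory stand-in here.
class BlockAllocator {
  private nextStart = 1;
  private readonly blockSize: number;

  constructor(blockSize: number) {
    this.blockSize = blockSize;
  }

  leaseBlock(): { start: number; end: number } {
    const start = this.nextStart;
    this.nextStart += this.blockSize;
    return { start, end: start + this.blockSize - 1 }; // inclusive range
  }
}

// Issues IDs from its current block and leases a new one when it runs out.
class NodeIdGenerator {
  private current = 0;
  private end = -1; // forces a lease on first use
  private readonly allocator: BlockAllocator;

  constructor(allocator: BlockAllocator) {
    this.allocator = allocator;
  }

  nextId(): number {
    if (this.current > this.end) {
      const block = this.allocator.leaseBlock();
      this.current = block.start;
      this.end = block.end;
    }
    return this.current++;
  }
}

// Usage: two nodes sharing one allocator never hand out the same ID.
const allocator = new BlockAllocator(1000);
const nodeA = new NodeIdGenerator(allocator);
const nodeB = new NodeIdGenerator(allocator);
console.log(nodeA.nextId(), nodeB.nextId()); // 1 1001
```

The trade-off to call out: IDs stay unique across nodes, but they are no longer globally ordered by creation time, and a node that crashes mid-block leaves a gap in the keyspace.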
Scaling Notes
- Reads dominate. Cache redirects in Redis (hot keys).
- Writes are low QPS — an RDBMS handles them fine.
- Partition by code (or a fixed-length code prefix) using consistent hashing, so adding a shard remaps only a fraction of the keys.
- For custom aliases, enforce uniqueness with DB transactions or a dedicated index.
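The read path implied by the first bullet is a cache-aside lookup. Here is a minimal sketch, assuming a `Map`-based `RedirectCache` as a stand-in for Redis and a hypothetical `dbLookup` function for the primary DB; the TTL value and the injected clock exist only to keep the example self-contained and testable.

```typescript
type Lookup = (code: string) => Promise<string | null>;

interface CacheEntry {
  url: string;
  expiresAt: number; // epoch milliseconds
}

// Map-based stand-in for Redis; a real client would replace get/set.
class RedirectCache {
  private readonly entries = new Map<string, CacheEntry>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(code: string): string | null {
    const entry = this.entries.get(code);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(code); // lazily evict expired entries
      return null;
    }
    return entry.url;
  }

  set(code: string, url: string): void {
    this.entries.set(code, { url, expiresAt: this.now() + this.ttlMs });
  }
}

// Cache-aside: try the cache, fall back to the DB, then populate the cache.
async function resolveRedirect(
  code: string,
  cache: RedirectCache,
  dbLookup: Lookup,
): Promise<string | null> {
  const cached = cache.get(code);
  if (cached !== null) return cached;
  const url = await dbLookup(code);
  if (url !== null) cache.set(code, url); // misses are not cached in this sketch
  return url;
}
```

Worth saying out loud in the interview: this sketch does not cache misses, so a flood of requests for unknown codes still reaches the DB. Negative caching is a reasonable follow-up to propose.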
Real-World Trade-Offs I've Seen
I once designed a system using only auto-increment IDs and a single DB master. It worked at low scale but became painful when we needed geo-replication. We should have planned sharding earlier and separated the "assign ID" service.
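UUIDs avoid central counters but make short codes longer: a 128-bit UUID needs about 22 Base62 characters, while 7 characters already cover roughly 3.5 trillion sequential IDs. Choose based on product constraints.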
Example Deep-Dive: Rate Limiting
Interviewers love this because it combines algorithms and distributed state.
Common approaches: fixed window, sliding window, token bucket, leaky bucket.
For distributed systems, use a centralized store (Redis) or embed tokens in clients with consistent coordination.
- Use Redis (INCR + EXPIRE) or a token bucket per user stored in Redis.
- For large scale, consider approximate algorithms (leaky bucket via local counters + periodic sync) to reduce write pressure.
- Call out fairness, burst handling, and what happens if Redis fails.
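If the interviewer asks for the algorithm itself, a token bucket is short enough to write on the spot. Below is a minimal single-process sketch; `TokenBucketLimiter` is a hypothetical name, the local `Map` stands in for Redis, and the clock is injected only so the behavior is easy to test. With several app servers the read-modify-write in `allow` would have to be atomic on the shared store, which this sketch does not attempt.

```typescript
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

// Per-user token bucket. `capacity` is the maximum burst size and
// `refillPerSec` is the sustained request rate.
class TokenBucketLimiter {
  private readonly buckets = new Map<string, Bucket>();
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSec: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
  }

  allow(userId: string): boolean {
    const nowMs = this.now();
    // A new user starts with a full bucket.
    const bucket = this.buckets.get(userId) ?? { tokens: this.capacity, lastRefillMs: nowMs };

    // Refill in proportion to elapsed time, capped at capacity.
    const elapsedSec = (nowMs - bucket.lastRefillMs) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSec * this.refillPerSec);
    bucket.lastRefillMs = nowMs;

    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1; // spend one token per accepted request
    this.buckets.set(userId, bucket);
    return allowed;
  }
}
```

This covers burst handling directly (the bucket size is the burst) and gives you a natural place to discuss failure: if the shared store is down, you have to decide whether `allow` fails open or fails closed.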
What Interviewers Actually Listen For
- Do you ask clarifying questions? Good.
- Can you draw a coherent high-level design and zoom in? Good.
- Can you estimate capacity and identify bottlenecks? Good.
- Can you articulate trade-offs and operational concerns? Great.
- Can you implement a small algorithm and reason about correctness? Excellent.
Common Mistakes I've Made
Over-optimizing too early. I used to design distributed queues and leader election for features that never reached scale. Now I ask: "What scale do you actually need?" and propose a simpler solution with a clear migration path.
Ignoring operational concerns. A pretty architecture on a whiteboard that's impossible to operate (opaque failure modes, no metrics) is a non-starter.
Not justifying "why." Don't just list components — explain why you chose them. If you pick Dynamo-style storage, say whether you want availability over consistency and why.
Building Blocks Cheat Sheet
- Load Balancer — distributes requests; use health checks.
- API Gateway — authentication, rate limiting, routing.
- Cache (Redis, Memcached) — reduce read latency; think expiration and cache stampede.
- RDBMS — strong consistency, transactions.
- NoSQL (Cassandra, DynamoDB) — high write throughput, partition-tolerant.
- Queue (Kafka, RabbitMQ) — async workloads, reliability, decoupling.
- Object Store (S3) — large binary storage, cheap and durable.
- CDN — global caching for static content.
- Monitoring (Prometheus + Grafana) — latency, errors, QPS dashboards.
- Tracing (Jaeger) — distributed request tracing for debugging.
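On the cache stampede point from the cheat sheet: one common mitigation is request coalescing (often called single-flight), where concurrent misses for the same key share one load instead of each hitting the database. Here is a small sketch of the idea; the `SingleFlight` class is a hypothetical helper I wrote for illustration, and it only coalesces within one process.

```typescript
// Coalesces concurrent loads for the same key into one in-flight promise.
class SingleFlight<T> {
  private readonly inFlight = new Map<string, Promise<T>>();

  run(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing; // join the load that is already running

    const promise = load().finally(() => {
      this.inFlight.delete(key); // allow a fresh load once this one settles
    });
    this.inFlight.set(key, promise);
    return promise;
  }
}
```

Wrapping the database fallback of a cache lookup in `run(key, ...)` means a hot key that expires triggers one reload per process rather than one per waiting request. Across many processes you would still want something extra, such as jittered TTLs.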
How to Practice
- Start small. Design a URL shortener, file store, or rate limiter. Timebox yourself: 20-30 minutes.
- Do mock interviews. I learned more from five live mocks than from reading fifty articles.
- Learn key algorithms deeply. Consistent hashing, leader election basics, token bucket.
- Read real postmortems. SRE postmortems from major companies teach failure modes better than any textbook.
- Keep a capacity cheat sheet. Average object sizes, network costs, disk IOPS. Approximate numbers help make realistic trade-offs.
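Consistent hashing, from the algorithms bullet above, is a good one to implement by hand at least once. The following is a practice-sized sketch, not a production ring: `HashRing` is a hypothetical class, the virtual-node count is arbitrary, and I use a 32-bit FNV-1a hash over UTF-16 code units only because it needs no imports.

```typescript
// 32-bit FNV-1a over UTF-16 code units; chosen only to avoid imports.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

interface RingPoint {
  point: number;
  node: string;
}

class HashRing {
  private ring: RingPoint[] = [];
  private readonly vnodes: number;

  constructor(vnodes = 100) {
    this.vnodes = vnodes;
  }

  // Each node is placed at `vnodes` points to smooth out the distribution.
  addNode(node: string): void {
    for (let i = 0; i < this.vnodes; i++) {
      this.ring.push({ point: fnv1a(`${node}#${i}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point || (a.node < b.node ? -1 : 1));
  }

  removeNode(node: string): void {
    this.ring = this.ring.filter((p) => p.node !== node);
  }

  // Owner is the first ring point at or after the key's hash, wrapping around.
  getNode(key: string): string | undefined {
    if (this.ring.length === 0) return undefined;
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.ring.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.ring[mid].point < h) lo = mid + 1;
      else hi = mid;
    }
    return this.ring[lo % this.ring.length].node;
  }
}
```

The property worth verifying yourself: after `removeNode`, only keys that lived on the removed node change owner. Everything else stays put, which is the whole reason to prefer this over `hash(key) % nodeCount`.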
Resources I Actually Recommend
- Designing Data-Intensive Applications by Martin Kleppmann — read it, re-read it, annotate it.
- High Scalability blog — real case studies from production systems.
- Engineering blog posts from major infra teams — gold for real-world trade-offs.
Final Advice
- Talk out loud. Don't go silent while thinking.
- Be explicit about assumptions. If you assume 10k QPS, say it.
- Prioritize: solve the core problem first, add improvements later.
- If time is running low, pick a single component to go deep on and show you can think end-to-end.