System Design Interview Playbook: 10-20min Structure and TypeScript

I get asked about system design interviews a lot — partly because I’ve both interviewed at and conducted interviews for big teams, and partly because I run workshops through PersiaJS and on my newsletter, Monday by Gazar. Over the last decade I’ve learned that doing well in these interviews isn’t about memorizing buzzwords; it’s about a repeatable, clear thought process and being able to justify trade-offs under time pressure.

Below is the playbook I use and teach. I’ll walk through the structure I follow in interviews, give concrete examples (including TypeScript snippets), and share real mistakes I’ve made so you can avoid them.

High-level principle I live by

  • Clarify first. Sketch quickly. Iterate. Be explicit about trade-offs.

I used to jump straight into diagrams and then realize I hadn’t clarified a crucial constraint (e.g., “Is eventual consistency acceptable?”). That wasted time and sometimes lost the interviewer’s trust. Now I always spend the first 2–3 minutes asking clarifying questions.

Interview structure I use (repeatable, with rough per-step timings)

1. Clarify requirements (2–3 min)

  • Who are the users? QPS? P95 latency targets? Data retention? Consistency requirements?
  • Ask about traffic patterns, read/write ratio, size of objects.

2. High-level design (3–5 min)

  • Draw the main components: client, API gateway, application services, caches, DB, async pipelines, storage.

3. Define APIs and data model (2–3 min)

  • Show the external API shape and the minimal data schema.

4. Capacity, scaling & bottlenecks (3–5 min)

  • Estimate traffic and sizing, identify bottlenecks, and explain how you'd scale past them.

5. Deep dive into one component (5–10 min)

  • Pick the most interesting/hard part (caching, sharding, consistency, search) and go deep.

6. Operational concerns & trade-offs (2–3 min)

  • Monitoring, deployments, SLOs, rate limiting, security, cost.

7. Wrap up, alternatives, and follow-up questions (1–2 min)

This structure maps nicely to how interviewers score: clarity, trade-offs, architecture, scalability, and operational awareness.

Concrete example: URL shortener (short walkthrough)

I use this example in workshops because it touches on hashing, collision handling, data model, caching, and analytics.

Clarify

  • QPS: let’s assume 10k read QPS, 100 write QPS.
  • Latency target: <100ms for redirects.
  • We should support custom aliases.
  • Data retention: indefinite (but soft delete allowed).
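Those assumptions make the capacity math quick to sanity-check. A back-of-envelope sketch (the 500-byte record size and 90% cache hit ratio are my assumptions here, not givens):

```typescript
// Back-of-envelope sizing from the assumptions above.
const writeQps = 100;
const readQps = 10_000;
const bytesPerRecord = 500; // assumed: code + URL + metadata
const cacheHitRatio = 0.9;  // assumed: redirects are heavily cached

const writesPerDay = writeQps * 86_400;                            // 8,640,000
const recordsPerYear = writesPerDay * 365;                         // ~3.15 billion
const storagePerYearTB = (recordsPerYear * bytesPerRecord) / 1e12; // ~1.58 TB
const dbReadQps = readQps * (1 - cacheHitRatio);                   // ~1,000 hits the DB

console.log({ writesPerDay, recordsPerYear, storagePerYearTB, dbReadQps });
```

Numbers like "~1.6 TB per year" and "~1k DB reads per second" are exactly what interviewers want to hear: they justify a single sharded store plus a cache, rather than something exotic.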

High-level components

  • Clients -> API Gateway -> Shortener Service -> Storage (primary DB) + Cache (Redis) -> Analytics pipeline (Kafka -> batch store)

APIs and data model

  • API: CreateShort(url, optional alias) -> code
  • Redirect endpoint: GET /r/{code} -> 302 to original URL
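A minimal sketch of the request/response and record shapes (field names are hypothetical, not a fixed schema):

```typescript
// Hypothetical shapes for the two endpoints above.
interface CreateShortRequest {
  url: string;
  alias?: string; // optional custom alias
}

interface CreateShortResponse {
  code: string;     // e.g. "aZ3k9"
  shortUrl: string; // e.g. "https://sho.rt/r/aZ3k9"
}
// GET /r/{code} returns a 302 with a Location header; no JSON body.

// Minimal data model, matching the retention note above.
interface UrlRecord {
  code: string;        // primary key
  originalUrl: string;
  createdAt: Date;
  deletedAt?: Date;    // soft delete instead of hard delete
}
```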

Generating codes

  • Use a base62 encoding of an auto-increment ID OR a hash (with collision handling).
  • For global scale, prefer a partitioned keyspace: e.g., use a prefix for region/cluster or a shard id.

Simple base62 encoder example in TypeScript:

const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const BASE = ALPHABET.length;

export function encodeBase62(num: number): string {
  if (num === 0) return ALPHABET[0];
  let s = "";
  while (num > 0) {
    s = ALPHABET[num % BASE] + s;
    num = Math.floor(num / BASE);
  }
  return s;
}
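For completeness, the inverse mapping, which is handy when the code directly encodes a numeric ID (the alphabet is repeated here so the snippet stands alone):

```typescript
// Same alphabet as the encoder; repeated so this snippet is self-contained.
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const BASE = ALPHABET.length; // 62

// Inverse of encodeBase62: recover the numeric ID from a short code.
export function decodeBase62(code: string): number {
  let num = 0;
  for (const ch of code) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new Error(`invalid character: ${ch}`);
    num = num * BASE + digit;
  }
  return num;
}
```

Round-tripping (encode then decode returns the original ID) is a nice one-line correctness argument to make out loud in the interview.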

Scaling notes

  • Reads are dominant. Cache redirects in Redis (hot keys).
  • Writes are small QPS — can be handled with an RDBMS (for consistency) or a write-sharded NoSQL.
  • Partition by code prefix using consistent hashing or range shards.
  • For custom aliases, check uniqueness with DB transactions or a dedicated index.
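The read path above is the classic cache-aside pattern. A minimal sketch, with an in-memory Map standing in for Redis (a real client would add a TTL to bound staleness and guard against stampedes):

```typescript
// Cache-aside lookup for redirects. The Map stands in for Redis; swap in a
// real client (e.g. GET/SET with a TTL) in production.
type Fetcher = (code: string) => Promise<string | undefined>;

const cache = new Map<string, string>();

export async function resolveUrl(
  code: string,
  fetchFromDb: Fetcher
): Promise<string | undefined> {
  const hit = cache.get(code);
  if (hit !== undefined) return hit;           // hot key: no DB round trip
  const url = await fetchFromDb(code);         // cache miss: go to primary store
  if (url !== undefined) cache.set(code, url); // populate for the next reader
  return url;
}
```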

Real-world trade-offs I’ve seen

  • I once designed a system that used only auto-increment IDs and a single DB master. It worked at low scale but became a pain when we needed geo-replication. We should’ve planned sharding earlier and separated the “assign id” service.
  • Using UUIDs avoids central counters but makes short codes longer. You must choose based on product constraints.

Example deep-dive I often pick: rate limiting (since interviewers like algorithms & distributed state)

Common approaches: fixed window, sliding window, token bucket, leaky bucket. In a distributed system, you either centralize the counters in a shared store (Redis) or keep per-node counters and accept approximate limits, syncing them periodically.

In interviews, mention the distributed design:

  • Use Redis (INCR + expire) or use a token bucket per user stored in Redis (hash per user).
  • For large scale consider approximate algorithms (e.g., leaky bucket via local counters + periodic sync) to reduce write pressure.
  • Call out fairness, bursts allowed, and what happens if Redis fails.
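A token bucket is easy to sketch in memory; a Redis version would keep the same two fields per user in a hash and do the refill arithmetic atomically (e.g., in a Lua script). A minimal in-process version, as a sketch only:

```typescript
// Token bucket: each user gets `capacity` tokens, refilled at `refillPerSec`.
// Bursts up to `capacity` are allowed; sustained rate is `refillPerSec`.
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

export class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(private capacity: number, private refillPerSec: number) {}

  allow(userId: string, nowMs: number = Date.now()): boolean {
    const b =
      this.buckets.get(userId) ??
      { tokens: this.capacity, lastRefillMs: nowMs };
    // Refill proportionally to elapsed time, capped at capacity.
    const elapsedSec = (nowMs - b.lastRefillMs) / 1000;
    b.tokens = Math.min(this.capacity, b.tokens + elapsedSec * this.refillPerSec);
    b.lastRefillMs = nowMs;
    if (b.tokens < 1) {
      this.buckets.set(userId, b);
      return false; // over limit: caller should return 429
    }
    b.tokens -= 1;
    this.buckets.set(userId, b);
    return true;
  }
}
```

In the interview, the interesting follow-up is the failure mode: if the shared store is down, do you fail open (no limiting) or fail closed (reject everything)? Say which, and why.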

What interviewers are actually listening for

  • Do you ask clarifying questions? Good.
  • Can you draw a coherent high-level design and then zoom in? Good.
  • Can you estimate capacity and identify bottlenecks? Good.
  • Can you articulate trade-offs and operational concerns (monitoring, backups, SLOs, failure modes)? Great.
  • Can you implement a small algorithm and reason about correctness/performance? Excellent.

Common mistakes I've made and seen

  • Over-optimizing too early. I used to design distributed queues and leader election for features that never reached scale. I wasted time and complicated the design. Now I ask: “What scale do you actually need?” and often propose a simpler solution with a clear migration path.
  • Ignoring operational concerns. You can make a pretty architecture on a whiteboard, but if it’s impossible to operate (opaque failure modes, no metrics), it’s a non-starter.
  • Not justifying “why”. Don’t just list components — explain why you chose them. If you pick Dynamo-style storage, say that you’re choosing availability over consistency and explain why the product can tolerate that.

A short cheat-sheet of common building blocks (one-line descriptions)

  • Load Balancer: distributes requests; use health checks.
  • API Gateway: authentication, rate limiting, routing.
  • Cache (Redis, Memcached): reduce read latency; think expiration & cache stampede.
  • RDBMS: strong consistency, transactions.
  • NoSQL (Cassandra, Dynamo): high write throughput, partition-tolerant.
  • Queue (Kafka, RabbitMQ): async workloads and reliability.
  • Object Store (S3): large binary storage, cheap.
  • CDNs: global caching for static content.
  • Monitoring (Prometheus + Grafana): latency, errors, QPS.
  • Tracing (Jaeger): distributed request tracing.

How I recommend practicing (actionable plan)

  • Start small: design a URL shortener, file store, or rate limiter. Timebox yourself to 20–30 minute sketches.
  • Do mock interviews with peers or use platforms like Interviewing.io. I learned more from doing five live mocks than from reading 50 articles.
  • Learn a couple of common algorithms deeply: consistent hashing, leader election basics, token bucket.
  • Read real postmortems (SRE postmortems from major companies). They teach failure modes.
  • Keep a cheatsheet of capacities: average object sizes, network costs, disk IOPS — approximate numbers help make realistic trade-offs.
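Consistent hashing, first on that list, is worth being able to sketch from scratch. A minimal ring with virtual nodes (FNV-1a stands in for a production hash like murmur3; the point is the ring lookup, not the hash):

```typescript
// FNV-1a 32-bit: a simple deterministic hash for illustration only.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

export class HashRing {
  // Each physical node appears `vnodes` times on the ring for smoother balance.
  private ring: Array<{ point: number; node: string }> = [];

  constructor(nodes: string[], private vnodes = 100) {
    for (const n of nodes) this.add(n);
  }

  add(node: string): void {
    for (let v = 0; v < this.vnodes; v++) {
      this.ring.push({ point: fnv1a(`${node}#${v}`), node });
    }
    this.ring.sort((a, b) => a.point - b.point);
  }

  // The first virtual node clockwise from the key's hash owns the key.
  get(key: string): string {
    const h = fnv1a(key);
    const entry = this.ring.find((e) => e.point >= h) ?? this.ring[0];
    return entry.node;
  }
}
```

The selling point to articulate: adding or removing one node only remaps the keys adjacent to its virtual nodes, instead of reshuffling everything the way `hash(key) % N` does.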

Resources I actually read or send people

  • High Scalability (case studies)
  • “Designing Data-Intensive Applications” by Martin Kleppmann — read, re-read, and annotate.
  • Blog posts from major infra teams — they’re gold for real-world trade-offs.
  • My newsletter, Monday by Gazar, often covers architecture patterns and practical lessons (shameless plug — I send stuff I wish someone told me years ago).

Final advice — what I tell candidates before an interview

  • Talk out loud. Don’t be silent while thinking.
  • Be explicit about assumptions. If you assume 10k QPS, say it.
  • Prioritize: solve the core problem first, add improvements later.
  • If time is low, pick a single component to go deep on (caching, sharding, consistency), and make sure you show you can think end-to-end.

If you want, we can do a mock system design now: pick a prompt (news feed, chat, ecommerce search, or video streaming) and I’ll walk you through how I’d tackle it in a 30-minute interview. I’ll even provide follow-up feedback and a short TypeScript prototype for key pieces.
