How would you design a messaging app?
9 Mar 2024
I built a chat feature once that worked great in testing. Two users, messages flying back and forth, instant delivery. Then we launched. 10,000 concurrent users. Messages arriving out of order. Duplicates everywhere. Read receipts broken. The "simple" chat feature became a distributed systems nightmare.
Messaging apps look simple. They're not. The hard problems are ordering, delivery guarantees, offline support, and real-time sync across devices.
Business Requirements
- Real-Time Messaging — messages must arrive instantly. Users expect sub-second delivery. Anything slower feels broken.
- Authentication and Authorization — secure login, session management, and access control. Users must only see conversations they belong to.
- Multimedia Support — images, videos, voice messages, file attachments. Text-only messaging is table stakes.
- Message Sync — conversation history must be available on every device. Switch from phone to laptop, and all messages are there.
- Push Notifications — users need to know about new messages even when the app is closed. This is critical for engagement.
Technical Requirements
- Scalability — the user base will grow. Message volume will grow. The system must scale horizontally without re-architecture.
- High Availability — messaging is a primary communication channel. Downtime is unacceptable. Redundancy, failover, and load balancing are essential.
- End-to-End Encryption — messages should be readable only by the sender and recipient. The server should never see plaintext content.
- Efficient Storage — message history grows forever. You need efficient storage, retrieval, and pagination. Old messages can be moved to cold storage.
High-Level Design
Client-Server Architecture
Clients (mobile apps, web clients) maintain persistent connections to the server. WebSockets are the standard choice for real-time messaging. They give you full-duplex communication over a single TCP connection, so the server can push messages as they arrive, with no polling and no per-message HTTP request overhead.
Backend Services
- Auth Service — handles login, token management, session validation.
- Message Router — receives messages, determines recipients, routes to the right connections or queues.
- Storage Service — persists messages to the database. Handles retrieval and pagination.
- Notification Service — sends push notifications to offline users via APNs/FCM.
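To make the router's job concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the `MessageRouter` class, its field names, and the use of plain lists as stand-ins for live WebSocket connections. A real router would write to sockets and publish to the message queue instead.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MessageRouter:
    # conversation_id -> set of member user_ids (hypothetical membership table)
    members: dict = field(default_factory=dict)
    # user_id -> list standing in for that user's live connection
    online: dict = field(default_factory=dict)
    # (user_id, message) pairs waiting for the notification service
    offline_queue: deque = field(default_factory=deque)

    def route(self, sender, conversation_id, text):
        """Fan a message out to every member except the sender.

        Returns the user_ids that received it over a live connection.
        """
        recipients = self.members.get(conversation_id, set())
        # Access control: only members may post to a conversation.
        if sender not in recipients:
            raise PermissionError("sender is not a member of this conversation")

        message = {"from": sender, "conversation": conversation_id, "text": text}
        delivered_live = []
        for user in sorted(recipients - {sender}):
            if user in self.online:
                self.online[user].append(message)  # push over the live connection
                delivered_live.append(user)
            else:
                self.offline_queue.append((user, message))  # hand off for a push notification
        return delivered_live
```

With Alice, Bob, and Carol in one conversation and only Bob connected, `route("alice", ...)` delivers to Bob immediately and queues a notification for Carol.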
Message Queue
This is the critical piece. A message queue (RabbitMQ, Kafka) sits between the message router and downstream services. It decouples message receipt from processing. If the storage service is slow or the notification service is down, messages are buffered, not lost.
The queue also enables ordering guarantees within a conversation. Messages for the same chat are routed to the same partition (or queue), keyed by conversation ID, so consumers see them in the order they were enqueued.
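A rough sketch of that partitioning idea, assuming a fixed number of partitions and a stable hash of the conversation ID. The function names `partition_for` and `enqueue` are made up for illustration, and the lists stand in for real queue partitions.

```python
import hashlib


def partition_for(conversation_id: str, num_partitions: int) -> int:
    """Map a conversation to a partition with a hash that is stable across processes.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used here instead.
    """
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def enqueue(partitions, conversation_id, payload):
    """Append a message to the partition that owns its conversation."""
    index = partition_for(conversation_id, len(partitions))
    partitions[index].append((conversation_id, payload))
    return index
```

Because every message for a given conversation lands in the same append-only list, a single consumer reading that partition sees them in send order. Messages from different conversations may interleave, which is fine: no ordering is promised across conversations.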
Database
Persistent storage for user profiles, conversation metadata, and message history. Options:
- PostgreSQL — strong consistency, good for user data and conversation metadata.
- Cassandra — excellent write throughput, good for message storage at scale. Partition by conversation ID for efficient retrieval.
- MongoDB — flexible schema, good for varying message types and metadata.
Replication provides fault tolerance. Sharding spreads data and load across nodes for scalability.
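Whichever database you pick, history retrieval usually comes down to "give me the N messages before this point in this conversation". Here is a small in-memory sketch of cursor-based pagination over messages grouped by conversation ID. The `MessageStore` class and its per-conversation sequence numbers are assumptions for illustration, not any particular database's API.

```python
from bisect import bisect_left


class MessageStore:
    def __init__(self):
        # conversation_id -> list of (seq, text), ascending by seq
        self._by_conversation = {}

    def append(self, conversation_id, text):
        """Store a message and return its per-conversation sequence number."""
        messages = self._by_conversation.setdefault(conversation_id, [])
        seq = messages[-1][0] + 1 if messages else 1
        messages.append((seq, text))
        return seq

    def fetch_page(self, conversation_id, before_seq=None, limit=50):
        """Return up to `limit` messages older than `before_seq`, newest first.

        The second return value is the cursor for the next (older) page,
        or None when there is nothing older.
        """
        messages = self._by_conversation.get(conversation_id, [])
        if before_seq is None:
            end = len(messages)
        else:
            # Index of the first message with seq >= before_seq.
            end = bisect_left(messages, (before_seq,))
        start = max(0, end - limit)
        page = messages[start:end][::-1]
        next_cursor = page[-1][0] if start > 0 else None
        return page, next_cursor
```

A client opens a chat with `fetch_page(conversation_id)` and passes the returned cursor back as `before_seq` when the user scrolls up. Unlike offset-based paging, the cursor stays valid when new messages arrive in the meantime.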

Real-time delivery between clients happens over the WebSocket connections. The message queue handles the heavy lifting afterward, feeding the storage and notification services. The storage service keeps history for later retrieval.
The Trade-Offs
WebSockets vs. long polling. WebSockets are faster and more efficient for real-time messaging. But they require persistent connections, which means more server memory and more complex load balancing (you need sticky sessions or a connection registry). Long polling is simpler but wastes bandwidth and adds latency.
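A connection registry can be as simple as a map from each user's devices to the server holding the socket. The sketch below is an in-memory stand-in. In a real deployment this state would live in a shared store that every WebSocket server can reach, and the class and method names here are hypothetical.

```python
class ConnectionRegistry:
    def __init__(self):
        # user_id -> {device_id: server_id}
        self._devices = {}

    def connect(self, user_id, device_id, server_id):
        """Record (or overwrite) which server holds this device's connection."""
        self._devices.setdefault(user_id, {})[device_id] = server_id

    def disconnect(self, user_id, device_id, server_id):
        """Remove the entry only if it still points at the server reporting the close.

        This guards against a stale disconnect wiping out a newer
        connection the device has already opened on another server.
        """
        devices = self._devices.get(user_id, {})
        if devices.get(device_id) == server_id:
            del devices[device_id]
            if not devices:
                del self._devices[user_id]

    def servers_for(self, user_id):
        """Return the set of servers that must receive a message for this user."""
        return set(self._devices.get(user_id, {}).values())
```

The message router asks `servers_for(recipient)` and forwards the message to each server in the set. An empty set means the user is offline and the notification path takes over.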
Message ordering. Guaranteeing global message order is expensive. Guaranteeing order within a conversation is much cheaper (partition by conversation ID). For most messaging apps, per-conversation ordering is sufficient.
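On the receiving side, per-conversation ordering can be enforced with a small reorder buffer. This sketch assumes the server stamps each message with a per-conversation sequence number starting at 1. That numbering scheme and the `OrderedInbox` class are illustrative assumptions.

```python
class OrderedInbox:
    """Releases messages for one conversation strictly in sequence order."""

    def __init__(self):
        self.next_seq = 1      # next sequence number we are allowed to display
        self._pending = {}     # seq -> text, for messages that arrived early

    def receive(self, seq, text):
        """Accept a message and return the texts that are now ready to display."""
        if seq < self.next_seq:
            return []  # already displayed; ignore the late copy
        self._pending[seq] = text
        released = []
        # Drain every consecutive message starting from next_seq.
        while self.next_seq in self._pending:
            released.append(self._pending.pop(self.next_seq))
            self.next_seq += 1
        return released
```

If message 2 arrives before message 1, it is held back. When message 1 shows up, both are released together in the right order. A production client would also need a timeout or a gap-fill request so one lost message cannot stall the conversation forever.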
Delivery guarantees. At-least-once delivery means duplicates are possible. At-most-once means messages can be lost. Exactly-once is extremely hard in distributed systems. Most messaging apps use at-least-once delivery with client-side deduplication (using unique message IDs).
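Client-side deduplication can be sketched as a bounded set of recently seen message IDs. The assumptions here: the sender generates a unique ID per message (a UUID in this example) and reuses it on every retry, and the `Deduplicator` class and its `max_ids` bound are hypothetical.

```python
import uuid
from collections import OrderedDict


def new_message_id() -> str:
    """Generated once by the sender and reused unchanged on every retry."""
    return str(uuid.uuid4())


class Deduplicator:
    def __init__(self, max_ids=10_000):
        self._seen = OrderedDict()  # message_id -> True, oldest first
        self._max_ids = max_ids

    def accept(self, message_id) -> bool:
        """Return True the first time an ID is seen, False for a duplicate."""
        if message_id in self._seen:
            return False
        self._seen[message_id] = True
        # Bound memory by forgetting the oldest ID once the cap is exceeded.
        if len(self._seen) > self._max_ids:
            self._seen.popitem(last=False)
        return True
```

The client calls `accept(message_id)` before displaying anything and drops the message when it returns False. The cap trades memory for a small risk: a duplicate that arrives after its ID has been evicted would be shown again, so the window should comfortably exceed the retry horizon.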
Encryption vs. searchability. End-to-end encryption means the server can't read messages. But it also means the server can't index or search them: server-side search requires access to plaintext, so any search has to run on the device against locally stored history. You have to choose: security or server-side searchability. Some apps (like WhatsApp) choose security. Some (like Slack) choose searchability.
The right design depends on your priorities. For a consumer chat app, prioritize E2E encryption and real-time delivery. For an enterprise collaboration tool, prioritize search, integrations, and compliance.