Design a Chat Application (WhatsApp) 💬

Designing a real-time chat application requires handling millions of persistent connections, ensuring sub-second message delivery, and maintaining data consistency across multiple devices, all while keeping conversations private with end-to-end encryption.

🌍
References & Disclaimer

This content is adapted from Mastering System Design from Basics to Cracking Interviews (Udemy). It has been curated and organized for educational purposes on this portfolio. No copyright infringement is intended.


🚀 Introduction

A modern chat system like WhatsApp must handle massive scale while feeling instant and personal. The architecture centers around persistent WebSocket connections and a distributed presence management system.

[Figure: Chat Events]


📋 Requirements

Functional Requirements

  • Messaging: 1-to-1 and group chat support.
  • Statuses: Typing indicators and real-time online status.
  • Media: Secure upload and delivery of images, videos, and documents.
  • Receipts: Delivery and read acknowledgments.
  • Sync: Multi-device support with message history synchronization.

Non-Functional Requirements

  • Low Latency: Sub-second message delivery globally.
  • High Availability: 99.99% uptime; users expect the service to always be "on."
  • Encryption: End-to-end encryption (E2EE) where the server has zero visibility into content.
  • Scalability: Support for 100M+ DAU and billions of messages daily.

📊 Scale Estimation

  • DAU: 100 Million users.
  • Messages/day: ~5 Billion.
  • Peak Load: 3x multiplier during major global events.
  • Concurrent Connections: 20–30 million persistent connections at peak.
  • Storage: ~5 billion new messages per day, which calls for a partitioned, write-optimized NoSQL store.
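
The figures above imply some useful back-of-the-envelope rates. The sketch below works through the arithmetic in Python. The 200-byte average message size is an assumption added for illustration; it does not come from the estimates above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Figures taken from the scale estimation above.
DAU = 100_000_000
MESSAGES_PER_DAY = 5_000_000_000
PEAK_MULTIPLIER = 3

# Assumption for illustration only: average stored size of one message.
ASSUMED_AVG_MESSAGE_BYTES = 200


def messages_per_second(messages_per_day: float, multiplier: float = 1.0) -> float:
    """Convert a daily message volume into a per-second rate."""
    return messages_per_day / SECONDS_PER_DAY * multiplier


avg_rate = messages_per_second(MESSAGES_PER_DAY)
peak_rate = messages_per_second(MESSAGES_PER_DAY, PEAK_MULTIPLIER)
per_user = MESSAGES_PER_DAY / DAU
daily_bytes = MESSAGES_PER_DAY * ASSUMED_AVG_MESSAGE_BYTES

print(f"average write rate: {avg_rate:,.0f} msg/s")    # 57,870 msg/s
print(f"peak write rate:    {peak_rate:,.0f} msg/s")   # 173,611 msg/s
print(f"messages per user:  {per_user:.0f} per day")   # 50 per day
print(f"new data per day:   {daily_bytes / 1e12:.1f} TB (at the assumed size)")  # 1.0 TB
```

Even the average rate of roughly 58K writes per second is far beyond what a single database node is usually asked to absorb, which is why the storage bullet points toward a partitioned store.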

📐 High-Level Architecture

The core of the system is the Connection Manager, which maintains long-lived WebSocket connections for every active device.

[Figure: Chat High-Level Architecture]


⚡ Message Flow & Sequence

This section traces how a message travels from sender to receiver in both 1-to-1 and group scenarios.

[Figure: Chat Message Flow]

The WebSocket Advantage

Chat applications rely on WebSockets (rather than standard HTTP) because:

  1. Bi-directional: The server can push messages to the client instantly.
  2. Lower Overhead: No need to resend headers for every message after the initial handshake.
  3. Real-time: Essential for typing indicators and presence updates.
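
To put rough numbers on the second point, the sketch below computes the size of a WebSocket frame header using the framing rules in RFC 6455: a 2-byte base header, an extended length field for payloads over 125 bytes, and a 4-byte masking key on client-to-server frames. The 500-byte figure for HTTP request headers is an illustrative assumption, not a measurement; real header sizes vary with cookies, HTTP version, and header compression.

```python
# Assumption for illustration only: size of the headers on one plain HTTP request.
ASSUMED_HTTP_HEADER_BYTES = 500


def ws_frame_header_bytes(payload_len: int, masked: bool) -> int:
    """Header size of a single WebSocket frame, following RFC 6455 framing."""
    if payload_len < 0:
        raise ValueError("payload_len must be non-negative")
    header = 2                      # FIN/opcode byte + mask bit/7-bit length byte
    if payload_len > 0xFFFF:
        header += 8                 # 64-bit extended payload length
    elif payload_len > 125:
        header += 2                 # 16-bit extended payload length
    if masked:
        header += 4                 # masking key (required client -> server)
    return header


chat_message_len = 100  # bytes; a short text message
upstream = ws_frame_header_bytes(chat_message_len, masked=True)
downstream = ws_frame_header_bytes(chat_message_len, masked=False)

print(upstream, downstream)  # 6 2
print(f"assumed HTTP header cost is ~{ASSUMED_HTTP_HEADER_BYTES // upstream}x the upstream frame header")
```

For short chat messages the per-message framing cost is a handful of bytes, whereas a request/response exchange pays the header cost on every message.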

🏗️ The Final Design

A comprehensive view of the entire chat ecosystem, including media handling and push fallback.

[Figure: Chat Final Design]


🛠️ Bottlenecks & Strategic Decisions

  1. Presence Updates: "Last seen" and typing indicators are high-frequency events. Use a Pub/Sub pattern with Redis to broadcast these efficiently without hammering the main database.
  2. Storage Write Pressure: 5 billion messages/day is a heavy, sustained write load. Use Cassandra or DynamoDB, which are optimized for high-volume append writes and can partition data by conversation_id.
  3. Connection Management: Handling 30M concurrent WebSockets is hard. Deploy a dedicated Connection Registry using Redis to track which user ID is connected to which specific WebSocket server instance.
  4. Encryption (E2EE): Implement key exchange (e.g., Signal Protocol) on the client-side. The server only stores encrypted blobs and metadata, never the raw message.
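
The Connection Registry from point 3 can be sketched as a small lookup table. The version below is a minimal in-memory stand-in, not a Redis client: the class name, method names, server IDs, and the 60-second TTL are all hypothetical, and the caller passes `now` explicitly so the expiry logic is easy to follow. In a Redis-backed version the TTL would typically be handled by key expiry and refreshed by client heartbeats.

```python
class ConnectionRegistry:
    """In-memory stand-in for a Redis-backed registry.

    Maps user_id -> device_id -> (server_id, expires_at), so a user with
    several devices can be connected to several WebSocket servers at once.
    """

    def __init__(self, ttl_seconds: float = 60.0):
        self._ttl = ttl_seconds
        self._entries: dict[str, dict[str, tuple[str, float]]] = {}

    def register(self, user_id: str, device_id: str, server_id: str, now: float) -> None:
        """Record (or refresh, on heartbeat) where a device is connected."""
        self._entries.setdefault(user_id, {})[device_id] = (server_id, now + self._ttl)

    def unregister(self, user_id: str, device_id: str) -> None:
        """Remove a device when its socket closes cleanly."""
        devices = self._entries.get(user_id, {})
        devices.pop(device_id, None)
        if not devices:
            self._entries.pop(user_id, None)

    def lookup(self, user_id: str, now: float) -> dict[str, str]:
        """Return device_id -> server_id for entries that have not expired."""
        devices = self._entries.get(user_id, {})
        return {dev: srv for dev, (srv, expires) in devices.items() if expires > now}


registry = ConnectionRegistry(ttl_seconds=60)
registry.register("alice", "phone", "ws-server-7", now=0)
registry.register("alice", "laptop", "ws-server-2", now=10)

print(registry.lookup("alice", now=30))  # {'phone': 'ws-server-7', 'laptop': 'ws-server-2'}
print(registry.lookup("alice", now=65))  # {'laptop': 'ws-server-2'}  (phone entry expired)
print(registry.lookup("bob", now=30))    # {}  (no active connection)
```

The expiry matters because a server crash never sends a clean disconnect; without a TTL the registry would keep routing messages to sockets that no longer exist.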

💡 Top Interview Questions

Q: How do you handle group messages with 500+ members? Instead of the Chat Service fanning out 500 messages, it places a single message in a queue. A pool of Fan-out Workers then iterates through the member list and pushes the message to each member's active WebSocket connection.
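
The queue-plus-workers idea can be sketched with Python's standard `queue` and `threading` modules. Everything named here is hypothetical: `GROUP_MEMBERS` stands in for the membership store, `ONLINE` for the connection lookup, and `push_to_socket` for a real WebSocket write. A production system would more likely use a durable message broker than an in-process queue.

```python
import queue
import threading

# Hypothetical stand-ins for the membership store and the connection lookup.
GROUP_MEMBERS = {"g1": [f"user{i}" for i in range(500)]}
ONLINE = {f"user{i}" for i in range(0, 500, 2)}  # even-numbered users are connected

inbox: dict[str, list[str]] = {}   # records what each "socket" received
inbox_lock = threading.Lock()


def push_to_socket(user_id: str, payload: str) -> None:
    """Stand-in for writing to the member's active WebSocket connection."""
    with inbox_lock:
        inbox.setdefault(user_id, []).append(payload)


def fan_out_worker(jobs: queue.Queue) -> None:
    """Pull group messages off the queue and push them to online members."""
    while True:
        job = jobs.get()
        try:
            if job is None:          # sentinel: shut this worker down
                return
            group_id, sender_id, payload = job
            for member in GROUP_MEMBERS[group_id]:
                if member != sender_id and member in ONLINE:
                    push_to_socket(member, payload)
        finally:
            jobs.task_done()


jobs: queue.Queue = queue.Queue()
workers = [threading.Thread(target=fan_out_worker, args=(jobs,)) for _ in range(4)]
for w in workers:
    w.start()

# The Chat Service does ONE enqueue, regardless of group size.
jobs.put(("g1", "user0", "hello group"))

for _ in workers:
    jobs.put(None)   # one sentinel per worker
jobs.join()
for w in workers:
    w.join()

print(len(inbox))  # 249: the 250 online members minus the sender
```

The Chat Service's cost stays constant per message, and fan-out throughput can be scaled by adding workers. Members who are not online are skipped here; handling them is a separate concern.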

Q: What happens if the receiver is offline? If the Connection Manager detects no active WebSocket for the recipient, the message is stored in the DB as "Pending" and a Push Notification is triggered via FCM/APNs to alert the user.
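
A minimal sketch of that store-then-notify flow is shown below. The class and field names are hypothetical, the dictionaries stand in for the database and the live sockets, and `push_log` merely records who would be notified via FCM/APNs. Flushing pending messages on reconnect and marking a message delivered as soon as it is written to a socket are simplifying assumptions; a real system would normally wait for the client's acknowledgment.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DELIVERED = "delivered"


@dataclass
class Message:
    message_id: str
    recipient_id: str
    ciphertext: bytes            # server only ever sees the encrypted blob
    status: Status = Status.PENDING


class DeliveryService:
    def __init__(self) -> None:
        self.sockets: dict[str, list[Message]] = {}  # user_id -> stand-in for a live socket
        self.store: dict[str, Message] = {}          # stand-in for the message DB
        self.push_log: list[str] = []                # recipients we would notify via FCM/APNs

    def send(self, msg: Message) -> None:
        self.store[msg.message_id] = msg             # persist first, as Pending
        socket = self.sockets.get(msg.recipient_id)
        if socket is None:
            self.push_log.append(msg.recipient_id)   # offline: trigger a push notification
            return
        socket.append(msg)
        msg.status = Status.DELIVERED

    def on_connect(self, user_id: str) -> list[Message]:
        """Open a 'socket' for the user and flush anything still pending."""
        socket = self.sockets.setdefault(user_id, [])
        for msg in self.store.values():
            if msg.recipient_id == user_id and msg.status is Status.PENDING:
                socket.append(msg)
                msg.status = Status.DELIVERED
        return socket


svc = DeliveryService()
svc.send(Message("m1", "bob", b"\x8a\x11..."))       # bob is offline
print(svc.store["m1"].status.name, svc.push_log)     # PENDING ['bob']

bob_socket = svc.on_connect("bob")                   # bob comes back online
print([m.message_id for m in bob_socket], svc.store["m1"].status.name)  # ['m1'] DELIVERED
```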

© 2026 Driptanil Datta. All rights reserved.

Software Developer & Engineer

Disclaimer: The content provided on this blog is for educational and informational purposes only. While I strive for accuracy, all information is provided "as is" without any warranties of completeness, reliability, or accuracy. Any action you take upon the information found on this website is strictly at your own risk.

Copyright & IP: Certain technical content, interview questions, and datasets are curated from external educational sources to provide a centralized learning resource. Respect for original authorship is maintained; no copyright infringement is intended. All trademarks, logos, and brand names are the property of their respective owners.

Built with Love ❤️ | Last updated: Mar 16 2026