Design WhatsApp: Messages, Delivery Status, and Group Chats

The two grey ticks turning blue is one of the most recognizable UI signals in software. Everyone who uses WhatsApp understands exactly what it means. But when I reason through how to implement it, the deceptively simple question is: where does that state live and who is responsible for changing it?

Your first instinct might be to put a status field on each Message and update it as the message progresses. That works for a one-to-one chat. Then you think through group messages. A message sent to a group with 10 other members needs to be delivered to all 10 before you can show double ticks. Read receipts require that all 10 have opened the chat. If you store a single status on the message, you lose the per-recipient detail you need to compute the aggregate. You need a different model: one delivery record per recipient per message. That one insight reshapes most of the design.

The second interesting problem is offline delivery. A user who has not opened WhatsApp in three days still gets your messages the moment they reconnect. The server has to hold messages until the recipient is available, then push them in order, and then update the sender’s UI with the delivery confirmation. This is the store-and-forward pattern, and building it correctly requires you to think about what “delivered” actually means: accepted by the server, or received by the device? WhatsApp distinguishes the two. You see a single grey tick the moment the server accepts your message, and the second grey tick only once the message reaches the recipient’s device.

Requirements

Functional

  • Send text and media messages between users in one-to-one chats
  • Support group chats with multiple recipients
  • Track delivery state per message: SENT (server received), DELIVERED (device received), READ (user opened it)
  • Deliver messages to offline users when they reconnect
  • Notify the sender when delivery or read state changes

Non-functional

  • Delivery state updates must propagate in near-real-time when both parties are online
  • The design must handle the group message case where different recipients reach READ at different times
  • The design must preserve message ordering per chat

Core Entities

User holds identity and online status. Whether a user is currently connected determines whether messages can be pushed directly or queue for later delivery.

Message is an immutable record once created. It carries content, sender, timestamp, and a reference to the chat it belongs to. It does not carry a single status field. That job belongs to MessageDeliveryRecord.

MessageDeliveryRecord is the key entity most people miss in early designs. One record per (message, recipient) pair. It holds the individual delivery state for that recipient. Aggregating across all records for a message gives you the sender’s visible tick state.

TextMessage and MediaMessage extend Message through a class hierarchy. A TextMessage holds a string body. A MediaMessage holds a media URL and MIME type. The Factory pattern creates the right type based on what the sender submitted.

Chat represents a one-to-one conversation between two users. It owns an ordered list of messages and knows both participants.

Group represents a multi-party conversation. It owns its member list and a list of messages. When a sender sends a message to a group, the system creates one MessageDeliveryRecord for each member except the sender.

MessageQueue stores outbound messages for offline users. When a user comes online, the queue drains in order.

NotificationService pushes delivery state changes back to the original sender. It observes state changes on MessageDeliveryRecord and triggers sender notifications.

Class Design

+-------------------+       +-------------------------+
|      User         |       |        Message          |
|-------------------|       |-------------------------|
| user_id: str      |       | message_id: str         |
| display_name: str |       | sender: User            |
| is_online: bool   |       | chat_id: str            |
|                   |       | timestamp: datetime     |
+-------------------+       | content_type: str       |
                            +-------------------------+
                                      ^
                         +-----------+-----------+
                         |                       |
              +------------------+   +---------------------+
              |   TextMessage    |   |    MediaMessage     |
              |------------------|   |---------------------|
              | body: str        |   | media_url: str      |
              +------------------+   | mime_type: str      |
                                     +---------------------+

+-----------------------------+
|   MessageDeliveryRecord     |
|-----------------------------|
| record_id: str              |
| message: Message            |
| recipient: User             |
| state: DeliveryState        |
| delivered_at: datetime|None |
| read_at: datetime|None      |
|-----------------------------|
| mark_delivered()            |
| mark_read()                 |
+-----------------------------+

+-------------------+    +-------------------+
|      Chat         |    |       Group       |
|-------------------|    |-------------------|
| chat_id: str      |    | group_id: str     |
| participants[2]   |    | name: str         |
| messages: list    |    | members: list     |
|                   |    | messages: list    |
+-------------------+    +-------------------+

+----------------------+    +------------------------+
|    MessageQueue      |    |  NotificationService   |
|----------------------|    |------------------------|
| _queue: dict         |    | + notify_sender(...)   |
| enqueue(user, msg)   |    | + on_state_change(...) |
| drain(user)          |    +------------------------+
+----------------------+

The relationship to understand: MessageDeliveryRecord is the bridge between Message and User. A message to a group of 5 produces 4 records, one for each member other than the sender. The sender-visible state comes from aggregating all records for that message and picking the minimum state (you only show blue ticks when all recipients have READ).
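
As a rough illustration of that aggregation, the sketch below reduces a list of per-recipient states to the single state the sender sees. It assumes an IntEnum so that states compare by progression (the implementation further down uses a plain Enum with auto()), and the names State, TICKS, and sender_visible_state are made up for this example.

```python
from enum import IntEnum


class State(IntEnum):
    # Hypothetical stand-in for DeliveryState; IntEnum makes states orderable.
    SENT = 1
    DELIVERED = 2
    READ = 3


# Illustrative labels only, not WhatsApp's actual UI strings.
TICKS = {
    State.SENT: "one grey tick",
    State.DELIVERED: "two grey ticks",
    State.READ: "two blue ticks",
}


def sender_visible_state(recipient_states: list[State]) -> State:
    """Return the least-advanced state across all recipients."""
    if not recipient_states:
        # Assumption: with no recipient records, only server acceptance is known.
        return State.SENT
    return min(recipient_states)


# Three of four recipients have read the message; one has only received it.
group_states = [State.READ, State.READ, State.DELIVERED, State.READ]
print(TICKS[sender_visible_state(group_states)])
```

The one recipient still at DELIVERED holds the whole message at two grey ticks, which is exactly the "minimum state" rule.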

Key Implementation

from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum, auto
from typing import Callable, Optional


class DeliveryState(Enum):
    SENT = auto()       # server accepted the message
    DELIVERED = auto()  # device received it
    READ = auto()       # user opened the chat


class ContentType(Enum):
    TEXT = "text"
    MEDIA = "media"


@dataclass
class User:
    user_id: str
    display_name: str
    is_online: bool = False


@dataclass
class Message:
    message_id: str
    sender: User
    chat_id: str
    timestamp: datetime
    content_type: ContentType

    @staticmethod
    def new_id() -> str:
        return str(uuid.uuid4())


@dataclass
class TextMessage(Message):
    body: str = ""


@dataclass
class MediaMessage(Message):
    media_url: str = ""
    mime_type: str = ""


class MessageFactory:
    @staticmethod
    def create_text(sender: User, chat_id: str, body: str) -> TextMessage:
        return TextMessage(
            message_id=Message.new_id(),
            sender=sender,
            chat_id=chat_id,
            timestamp=datetime.utcnow(),
            content_type=ContentType.TEXT,
            body=body,
        )

    @staticmethod
    def create_media(
        sender: User, chat_id: str, media_url: str, mime_type: str
    ) -> MediaMessage:
        return MediaMessage(
            message_id=Message.new_id(),
            sender=sender,
            chat_id=chat_id,
            timestamp=datetime.utcnow(),
            content_type=ContentType.MEDIA,
            media_url=media_url,
            mime_type=mime_type,
        )


@dataclass
class MessageDeliveryRecord:
    record_id: str
    message: Message
    recipient: User
    state: DeliveryState = DeliveryState.SENT
    delivered_at: Optional[datetime] = None
    read_at: Optional[datetime] = None
    _on_change: list[Callable[["MessageDeliveryRecord"], None]] = field(
        default_factory=list, repr=False
    )

    def register_observer(
        self, callback: Callable[["MessageDeliveryRecord"], None]
    ) -> None:
        self._on_change.append(callback)

    def mark_delivered(self) -> None:
        if self.state == DeliveryState.SENT:
            self.state = DeliveryState.DELIVERED
            self.delivered_at = datetime.utcnow()
            self._notify()

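    def mark_read(self) -> None:
        if self.state == DeliveryState.READ:
            return
        now = datetime.utcnow()
        # A read implies delivery: backfill delivered_at if the
        # DELIVERED step was skipped (SENT -> READ).
        if self.delivered_at is None:
            self.delivered_at = now
        self.state = DeliveryState.READ
        self.read_at = now
        self._notify()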

    def _notify(self) -> None:
        for callback in self._on_change:
            callback(self)

    @staticmethod
    def new(message: Message, recipient: User) -> "MessageDeliveryRecord":
        return MessageDeliveryRecord(
            record_id=str(uuid.uuid4()),
            message=message,
            recipient=recipient,
        )


class NotificationService:
    """
    Observes delivery record state changes and pushes
    updates back to the original sender.
    In a real system this fires a WebSocket push or APNS/FCM notification.
    """

    def on_state_change(self, record: MessageDeliveryRecord) -> None:
        sender = record.message.sender
        aggregate = self._aggregate_state(record.message)
        print(
            f"[notify] {sender.display_name}'s message {record.message.message_id[:8]} "
            f"is now {aggregate.name} for all recipients"
        )

    def _aggregate_state(self, message: Message) -> DeliveryState:
        # Stub: always reports DELIVERED. A real implementation would load
        # every delivery record for this message from a repository and
        # return the least-advanced state among them.
        return DeliveryState.DELIVERED


class MessageQueue:
    """Store-and-forward queue for offline users."""

    def __init__(self) -> None:
        self._queue: dict[str, list[Message]] = {}

    def enqueue(self, recipient: User, message: Message) -> None:
        self._queue.setdefault(recipient.user_id, []).append(message)

    def drain(self, recipient: User) -> list[Message]:
        return self._queue.pop(recipient.user_id, [])


@dataclass
class Chat:
    chat_id: str
    participants: list[User]
    _messages: list[Message] = field(default_factory=list, repr=False)

    def add_message(self, message: Message) -> None:
        self._messages.append(message)

    def messages(self) -> list[Message]:
        return list(self._messages)

    def other_participant(self, sender: User) -> User:
        for p in self.participants:
            if p.user_id != sender.user_id:
                return p
        raise ValueError("Sender is not a participant in this chat")


@dataclass
class Group:
    group_id: str
    name: str
    members: list[User]
    _messages: list[Message] = field(default_factory=list, repr=False)

    def add_message(self, message: Message) -> None:
        self._messages.append(message)

    def recipients_for(self, sender: User) -> list[User]:
        return [m for m in self.members if m.user_id != sender.user_id]


class MessagingService:
    """
    Orchestrates sending a message: creates delivery records,
    handles offline queuing, and wires up observer notifications.
    """

    def __init__(
        self,
        queue: MessageQueue,
        notifications: NotificationService,
    ) -> None:
        self._queue = queue
        self._notifications = notifications
        # In a real system, records are persisted to a database.
        self._records: list[MessageDeliveryRecord] = []

    def send_to_chat(self, chat: Chat, message: Message) -> list[MessageDeliveryRecord]:
        recipient = chat.other_participant(message.sender)
        chat.add_message(message)
        return self._deliver([recipient], message)

    def send_to_group(
        self, group: Group, message: Message
    ) -> list[MessageDeliveryRecord]:
        recipients = group.recipients_for(message.sender)
        group.add_message(message)
        return self._deliver(recipients, message)

    def _deliver(
        self, recipients: list[User], message: Message
    ) -> list[MessageDeliveryRecord]:
        records = []
        for recipient in recipients:
            record = MessageDeliveryRecord.new(message, recipient)
            record.register_observer(self._notifications.on_state_change)
            self._records.append(record)
            records.append(record)

            if recipient.is_online:
                record.mark_delivered()
            else:
                self._queue.enqueue(recipient, message)

        return records

    def user_came_online(self, user: User) -> None:
        """Called when a user reconnects. Drains the queue and marks messages delivered."""
        pending = self._queue.drain(user)
        for message in pending:
            for record in self._records:
                if record.message.message_id == message.message_id:
                    if record.recipient.user_id == user.user_id:
                        record.mark_delivered()

Design Decisions and Trade-offs

State as a progression, not a flag. DeliveryState is an enum that moves in one direction: SENT to DELIVERED to READ. There is no going backwards. Modeling it as a free-form string field that any code can overwrite would allow bugs like a message moving from READ back to DELIVERED. The enum, combined with the guard checks in mark_delivered and mark_read, keeps every transition made through those methods moving forward. The state attribute itself is still a public field, so the guarantee only holds as long as callers go through those two methods.
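
One way to make the forward-only rule explicit is to centralize it in a single transition check. The sketch below is a simplified stand-in for MessageDeliveryRecord, not the class from the implementation above. The names Record and advance are hypothetical, and it assumes a duplicate or late update should be ignored rather than raise an error.

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    # Hypothetical orderable version of DeliveryState.
    SENT = 1
    DELIVERED = 2
    READ = 3


@dataclass
class Record:
    state: State = State.SENT

    def advance(self, target: State) -> bool:
        """Move to target only if it is strictly ahead of the current state."""
        if target <= self.state:
            return False  # duplicate or backward update: ignore it
        self.state = target
        return True


record = Record()
record.advance(State.READ)       # SENT -> READ is a forward move
record.advance(State.DELIVERED)  # a late DELIVERED ack after READ is ignored
print(record.state.name)
```

Returning a bool instead of raising is a design choice for this sketch: acknowledgements can arrive out of order over a network, so a stale update is treated as normal rather than exceptional.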

Per-recipient records for group messages. The non-obvious insight here is that a group message has no single delivery state. It has a collection of per-recipient states. You compute the aggregate state you show the sender (one tick, two ticks, blue ticks) from that collection. For one-to-one chats, there is exactly one record per message, which simplifies to the same interface. This uniformity is a nice property: send_to_chat and send_to_group both produce MessageDeliveryRecord lists and nothing else in the system needs to know which kind of chat it was.

Observer for delivery notifications. Rather than having MessagingService directly call the sender after each state change, delivery records fire callbacks when their state changes. This decouples MessageDeliveryRecord from the notification mechanism entirely. You can add logging, analytics, or push notification adapters by registering additional observers without touching the record’s logic. The cost is that the callback wiring happens at record creation time, which requires some care to get right.
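
To make the "additional observers" point concrete, here is a minimal sketch with two independent callbacks attached to one record. Record is a cut-down, hypothetical stand-in for MessageDeliveryRecord, and log_change and count_change are invented for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Record:
    # Simplified, hypothetical stand-in for MessageDeliveryRecord.
    message_id: str
    state: str = "SENT"
    _observers: list[Callable[["Record"], None]] = field(
        default_factory=list, repr=False
    )

    def register_observer(self, callback: Callable[["Record"], None]) -> None:
        self._observers.append(callback)

    def mark_delivered(self) -> None:
        # Observers fire only when the state actually changes.
        if self.state == "SENT":
            self.state = "DELIVERED"
            for callback in self._observers:
                callback(self)


audit_log: list[str] = []
state_counts: Counter = Counter()


def log_change(record: Record) -> None:
    audit_log.append(f"{record.message_id}:{record.state}")


def count_change(record: Record) -> None:
    state_counts[record.state] += 1


record = Record("m1")
record.register_observer(log_change)
record.register_observer(count_change)
record.mark_delivered()
record.mark_delivered()  # no state change, so no second notification
print(audit_log, dict(state_counts))
```

Neither callback knows about the other, and the record knows about neither beyond the callable signature.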

Store-and-forward at the service layer. The MessageQueue is intentionally simple: a dictionary keyed by user ID, holding an ordered list of pending messages. In a real system this would be a durable queue (think Redis lists with persistence, or a dedicated message broker). The interface is the same though: enqueue and drain. Keeping it behind an interface means you can swap the backing store without changing MessagingService.
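
A sketch of what "behind an interface" could look like in Python, using typing.Protocol. PendingStore, InMemoryStore, and deliver_pending are hypothetical names, and the store works with plain string IDs to keep the example short. A durable implementation would satisfy the same two methods.

```python
from collections import deque
from typing import Protocol


class PendingStore(Protocol):
    """The two operations the messaging service relies on."""

    def enqueue(self, user_id: str, message_id: str) -> None: ...

    def drain(self, user_id: str) -> list[str]: ...


class InMemoryStore:
    """Dictionary-backed store; satisfies PendingStore structurally."""

    def __init__(self) -> None:
        self._pending: dict[str, deque[str]] = {}

    def enqueue(self, user_id: str, message_id: str) -> None:
        self._pending.setdefault(user_id, deque()).append(message_id)

    def drain(self, user_id: str) -> list[str]:
        # Remove and return everything queued for this user, oldest first.
        return list(self._pending.pop(user_id, ()))


def deliver_pending(store: PendingStore, user_id: str) -> list[str]:
    # Depends only on the protocol, not on the concrete backing store.
    return store.drain(user_id)


store = InMemoryStore()
store.enqueue("bob", "m1")
store.enqueue("bob", "m2")
print(deliver_pending(store, "bob"))
```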

Factory for message types. The alternative is letting callers construct TextMessage and MediaMessage directly. That works, but it scatters the ID generation and timestamp logic across every call site. The factory centralizes that so message creation is always consistent.

End-to-end encryption as an extension point. WhatsApp’s actual encryption happens at the message content layer before the message object is ever serialized. In this design, the right place to add it is in MessageFactory.create_text and create_media, where you encrypt the content before setting it on the object, and in a corresponding decrypt step when the recipient reads it. The rest of the design does not need to know encryption exists. That isolation is deliberate.
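
The extension point might look roughly like the sketch below, where the factory accepts an injected transform. Everything here is an assumption for illustration: the Cipher alias, create_text taking an encrypt argument, and read_text are not part of the implementation above. The toy transform just reverses the string and is not encryption. A real system would plug in a vetted end-to-end encryption protocol.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical alias: any function that maps text to text.
Cipher = Callable[[str], str]


@dataclass(frozen=True)
class TextMessage:
    sender_id: str
    body: str  # holds the transformed content once created via the factory


def create_text(sender_id: str, body: str, encrypt: Cipher) -> TextMessage:
    # The content is transformed before it is ever set on the object.
    return TextMessage(sender_id=sender_id, body=encrypt(body))


def read_text(message: TextMessage, decrypt: Cipher) -> str:
    # The matching step on the recipient side.
    return decrypt(message.body)


# Placeholder transforms for demonstration only. This is NOT encryption.
def toy_encrypt(plaintext: str) -> str:
    return plaintext[::-1]


def toy_decrypt(ciphertext: str) -> str:
    return ciphertext[::-1]


message = create_text("alice", "hello", toy_encrypt)
print(message.body, "->", read_text(message, toy_decrypt))
```

Delivery records, queues, and notifications never look at body, which is why they would not need to change.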

The group tick problem. When should the sender see blue ticks for a group message? The strictest definition is “when all recipients have read it.” That is what WhatsApp does. The aggregate logic, which I only stubbed in NotificationService._aggregate_state, should take the minimum state across all records: if nine of ten people have READ the message but one has only DELIVERED, the sender sees two grey ticks. The moment the last recipient hits READ, the ticks turn blue. The engineering implication is that you need an efficient query: given a message ID, what is the minimum delivery state across all its records? Indexing the records store by message ID makes this a fast lookup.
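
Here is one possible shape for that lookup, as an in-memory sketch. RecordIndex, upsert, and aggregate are hypothetical names, and a real system would back this with a database index on message ID rather than a dictionary.

```python
from collections import defaultdict
from enum import IntEnum


class State(IntEnum):
    # Hypothetical orderable version of DeliveryState.
    SENT = 1
    DELIVERED = 2
    READ = 3


class RecordIndex:
    """In-memory records store keyed by message ID."""

    def __init__(self) -> None:
        # message_id -> {recipient_id -> state}
        self._by_message: dict[str, dict[str, State]] = defaultdict(dict)

    def upsert(self, message_id: str, recipient_id: str, state: State) -> None:
        current = self._by_message[message_id].get(recipient_id, State.SENT)
        # Keep the more advanced state so a late update never moves backwards.
        self._by_message[message_id][recipient_id] = max(current, state)

    def aggregate(self, message_id: str) -> State:
        states = self._by_message.get(message_id)
        if not states:
            return State.SENT  # assumption: unknown message means SENT only
        return min(states.values())


index = RecordIndex()
for n in range(9):
    index.upsert("m1", f"user{n}", State.READ)
index.upsert("m1", "user9", State.DELIVERED)
print(index.aggregate("m1").name)  # nine READ, one DELIVERED

index.upsert("m1", "user9", State.READ)
print(index.aggregate("m1").name)  # last recipient has now read it
```

The aggregate only touches the records for one message, so its cost grows with group size, not with the total number of records.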

The Offline Delivery Gap

The scenario worth reasoning through carefully: Alice sends a message to Bob at 2pm. Bob’s phone is off. The message lands in the queue. Bob turns his phone on at 6pm and reconnects. user_came_online drains the queue, marks the record delivered, and fires the observer. Alice’s UI updates from one grey tick to two. Then Bob opens the chat, mark_read fires, and Alice’s ticks turn blue.

The tricky part is ordering. If Alice sent five messages while Bob was offline, they need to drain and be marked delivered in timestamp order, not in whatever order the queue happened to store them. The _queue structure here preserves insertion order because each user’s pending messages are kept in a list, and insertion order matches send order as long as messages are enqueued as they arrive. You still want to be explicit about sort order if you ever switch to an unordered backing store.
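
Being explicit about the order can be as small as sorting on drain. The sketch below assumes each pending item carries a timestamp. Pending and drain_in_order are hypothetical names, and the dates are arbitrary sample values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Pending:
    # Hypothetical minimal view of a queued message.
    message_id: str
    timestamp: datetime


def drain_in_order(pending: list[Pending]) -> list[Pending]:
    # sorted() is stable, so messages with equal timestamps keep arrival order.
    return sorted(pending, key=lambda m: m.timestamp)


base = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)
unordered = [
    Pending("m3", base + timedelta(minutes=2)),
    Pending("m1", base),
    Pending("m2", base + timedelta(minutes=1)),
]
print([m.message_id for m in drain_in_order(unordered)])
```

Sorting by timestamp is only as good as the clock that produced it. If timestamps come from client devices, a server-assigned sequence number per chat may be a safer sort key, though that is beyond what this design specifies.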


If you found this useful or want to argue about where delivery state should live, I’d genuinely enjoy the conversation. Reach out on Twitter or LinkedIn.
