Design a Logging Library Like log4j: Filters, Appenders, and Formatters

Logging seems like a solved problem. You call logger.info("something happened") and text appears somewhere. The design question is interesting precisely because it looks so simple on the surface, and then you start pulling on the threads.

Why can a single log statement simultaneously write to a file and print to the console? How does a DEBUG log in a library you depend on stay silent while your own DEBUG logs show up? Why can you attach a JSON formatter to one appender and a human-readable formatter to another, both receiving the same log record? Each of these questions points at a design decision. Get the separation of concerns wrong and you end up with a system where changing the output format requires modifying the appender, or adding a new destination requires modifying the formatter. The libraries that got this right (log4j, Python’s logging module, Logback) all converge on the same core insight: the thing that routes log records, the thing that writes them, and the thing that formats them are three separate concerns that should be composed, not merged.

Let me reason through how I’d design this from scratch.

Requirements

Functional:

  • Log messages at five levels: DEBUG, INFO, WARN, ERROR, CRITICAL
  • Support multiple appenders per logger (Console, File, Database)
  • Each appender uses a pluggable formatter (text, JSON, structured)
  • Filter chains per logger or appender: drop records below a minimum level or matching certain patterns
  • Logger hierarchy: a child logger inherits appenders from its parent unless explicitly configured otherwise

Non-functional:

  • There should be exactly one root logger, owned by a Singleton LogManager, so all loggers share a common ancestor
  • Adding a new appender type should not require touching the logger or formatter code
  • Adding a new formatter should not require touching any appender
  • The filter chain should be extensible without modifying logger or appender code

Core Entities

LogRecord is the value object that flows through the system. It carries the level, message, logger name, timestamp, and any extra context. Every other component receives and acts on a LogRecord. It is immutable once created.

LogLevel is an enum with a numeric ordering so we can compare levels. DEBUG < INFO < WARN < ERROR < CRITICAL.

Filter is an abstract base class with a single method: should_log(record) -> bool. Filters compose into chains using the Chain of Responsibility pattern.

Formatter is an abstract base class with a single method: format(record) -> str. Concrete formatters produce text, JSON, or any other string representation.

Appender is an abstract base class with a write(record) method. It holds a formatter and a list of filters. It applies its filter chain before calling the formatter. An appender knows nothing about where log records come from.

Logger is the user-facing entry point. It has a name, a minimum level, a list of appenders, a filter chain, and an optional parent logger. When a log statement is called, the logger creates a LogRecord, runs its own filter chain, then passes the record to its appenders and (if propagation is enabled) up to its parent.

LogManager is the Singleton that owns the root logger and the registry of named loggers. You get loggers from it, and it ensures the hierarchy is wired correctly.

Class Design

+-------------------------+
|      LogManager         |   (Singleton)
|-------------------------|
| - root_logger: Logger   |
| - loggers: dict         |
|-------------------------|
| + get_logger(name)      |
+-------------------------+
            |
            | creates/owns
            v
+-------------------------+     0..* +------------------+
|         Logger          |-------->|    Appender      |
|-------------------------|         |------------------|
| - name: str             |         | - formatter      |
| - level: LogLevel       |         | - filters: list  |
| - appenders: list       |         |------------------|
| - filters: list         |         | + write(record)  |
| - parent: Logger        |         | + add_filter(f)  |
| - propagate: bool       |         +------------------+
|-------------------------|              |       |
| + debug/info/..(msg)    |        Console   File    Database
| + _log(level, msg)      |
+-------------------------+         +------------------+
                                    |    Formatter     |
+------------------+                |------------------|
|    LogRecord     |                | + format(record) |
|------------------|                +------------------+
| level: LogLevel  |                    |        |
| message: str     |              TextFormatter  JsonFormatter
| logger_name: str |
| timestamp: float |         +------------------+
| extra: dict      |         |     Filter       |
+------------------+         |------------------|
                             | + should_log(r)  |
                             +------------------+
                                  |        |
                           LevelFilter  PatternFilter

The key composition point: Appender holds a Formatter and a list of Filter objects. The appender delegates formatting to the formatter and filtering to the filter chain. Neither the formatter nor the filters know about each other.

Key Implementation

from __future__ import annotations

import json
import time
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class LogLevel(IntEnum):
    DEBUG = 10
    INFO = 20
    WARN = 30
    ERROR = 40
    CRITICAL = 50


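# frozen=True enforces the "immutable once created" contract on the fields
# (the extra dict itself can still be mutated in place).
@dataclass(frozen=True)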
class LogRecord:
    level: LogLevel
    message: str
    logger_name: str
    timestamp: float = field(default_factory=time.time)
    extra: dict = field(default_factory=dict)


# --- Filter hierarchy (Chain of Responsibility) ---

class Filter(ABC):
    def __init__(self, next_filter: Optional[Filter] = None) -> None:
        self._next = next_filter

    @abstractmethod
    def _passes(self, record: LogRecord) -> bool:
        pass

    def should_log(self, record: LogRecord) -> bool:
        if not self._passes(record):
            return False
        # Pass to next filter in the chain if one exists
        if self._next:
            return self._next.should_log(record)
        return True


class LevelFilter(Filter):
    """Rejects records below a minimum level."""

    def __init__(self, min_level: LogLevel, next_filter: Optional[Filter] = None) -> None:
        super().__init__(next_filter)
        self._min_level = min_level

    def _passes(self, record: LogRecord) -> bool:
        return record.level >= self._min_level


class PatternFilter(Filter):
    """Rejects records whose message contains a banned substring."""

    def __init__(self, banned_pattern: str, next_filter: Optional[Filter] = None) -> None:
        super().__init__(next_filter)
        self._pattern = banned_pattern

    def _passes(self, record: LogRecord) -> bool:
        return self._pattern not in record.message


# --- Formatter hierarchy (Strategy pattern) ---

class Formatter(ABC):
    @abstractmethod
    def format(self, record: LogRecord) -> str:
        pass


class TextFormatter(Formatter):
    """Human-readable single-line format."""

    def format(self, record: LogRecord) -> str:
        ts = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(record.timestamp))
        return f"[{ts}] [{record.level.name}] {record.logger_name}: {record.message}"


class JsonFormatter(Formatter):
    """Machine-parseable JSON format, useful for log aggregation pipelines."""

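    def format(self, record: LogRecord) -> str:
        # Spread extra first so caller-supplied keys cannot overwrite the
        # core fields; default=str keeps non-serializable extras from raising.
        return json.dumps({
            **record.extra,
            "timestamp": record.timestamp,
            "level": record.level.name,
            "logger": record.logger_name,
            "message": record.message,
        }, default=str)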


# --- Appender hierarchy ---

class Appender(ABC):
    def __init__(self, formatter: Formatter) -> None:
        self._formatter = formatter
        self._filters: list[Filter] = []

    def add_filter(self, f: Filter) -> None:
        self._filters.append(f)

    def _passes_filters(self, record: LogRecord) -> bool:
        return all(f.should_log(record) for f in self._filters)

    def write(self, record: LogRecord) -> None:
        if not self._passes_filters(record):
            return
        self._emit(self._formatter.format(record))

    @abstractmethod
    def _emit(self, formatted: str) -> None:
        pass


class ConsoleAppender(Appender):
    def _emit(self, formatted: str) -> None:
        print(formatted)


class FileAppender(Appender):
    def __init__(self, formatter: Formatter, file_path: str) -> None:
        super().__init__(formatter)
        self._file_path = file_path

    def _emit(self, formatted: str) -> None:
        with open(self._file_path, "a") as f:
            f.write(formatted + "\n")


class DatabaseAppender(Appender):
    """
    Writes to a database. In a real implementation this would batch
    inserts and use a connection pool. The interface stays the same.
    """

    def __init__(self, formatter: Formatter, connection_string: str) -> None:
        super().__init__(formatter)
        self._conn_str = connection_string

    def _emit(self, formatted: str) -> None:
        # Placeholder: would execute INSERT INTO logs (entry) VALUES (?)
        pass


# --- Logger and LogManager ---

class Logger:
    def __init__(self, name: str, level: LogLevel = LogLevel.DEBUG) -> None:
        self.name = name
        self.level = level
        self._appenders: list[Appender] = []
        self._filters: list[Filter] = []
        self.parent: Optional[Logger] = None
        # Propagate to parent by default, matching log4j behavior
        self.propagate: bool = True

    def add_appender(self, appender: Appender) -> None:
        self._appenders.append(appender)

    def add_filter(self, f: Filter) -> None:
        self._filters.append(f)

    def debug(self, message: str, **extra: object) -> None:
        self._log(LogLevel.DEBUG, message, extra)

    def info(self, message: str, **extra: object) -> None:
        self._log(LogLevel.INFO, message, extra)

    def warn(self, message: str, **extra: object) -> None:
        self._log(LogLevel.WARN, message, extra)

    def error(self, message: str, **extra: object) -> None:
        self._log(LogLevel.ERROR, message, extra)

    def critical(self, message: str, **extra: object) -> None:
        self._log(LogLevel.CRITICAL, message, extra)

    def _log(self, level: LogLevel, message: str, extra: dict) -> None:
        if level < self.level:
            return
        record = LogRecord(level=level, message=message,
                           logger_name=self.name, extra=extra)
        if not all(f.should_log(record) for f in self._filters):
            return
        self._dispatch(record)

    def _dispatch(self, record: LogRecord) -> None:
        for appender in self._appenders:
            appender.write(record)
        # Walk up the hierarchy if propagation is enabled
        if self.propagate and self.parent:
            self.parent._dispatch(record)


class LogManager:
    """
    Singleton. Single entry point for obtaining named loggers.
    Child loggers get their parent wired automatically based on name hierarchy.
    'com.app.service' is a child of 'com.app' is a child of the root logger.
    """

    _instance: Optional[LogManager] = None

    def __new__(cls) -> LogManager:
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._root = Logger("root")
            cls._instance._loggers: dict[str, Logger] = {}
        return cls._instance

    @property
    def root_logger(self) -> Logger:
        return self._root

    def get_logger(self, name: str) -> Logger:
        if name in self._loggers:
            return self._loggers[name]
        logger = Logger(name)
        logger.parent = self._find_parent(name)
        self._loggers[name] = logger
        return logger

    def _find_parent(self, name: str) -> Logger:
        """
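        Walk up the dotted name hierarchy to find the nearest existing ancestor.
        Falls back to the root logger if no intermediate ancestor exists.
        Note: the lookup runs once, at creation time. A logger created before
        its ancestor is not re-parented when that ancestor is created later.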
        """
        parts = name.split(".")
        for i in range(len(parts) - 1, 0, -1):
            ancestor_name = ".".join(parts[:i])
            if ancestor_name in self._loggers:
                return self._loggers[ancestor_name]
        return self._root

Why Appenders Must Be Independent of Formatters

This is the central design question. The tempting shortcut is to have each appender decide its own format. FileAppender writes text. DatabaseAppender writes JSON. The problem is that you cannot then configure them differently. You might want JSON files for production (so your log aggregator can parse them) and human-readable text for local development, both going to file appenders. Or you might want a console appender that writes JSON for a containerized environment where stdout is scraped by a collector.

When the formatter is injected into the appender at construction time, you get a clean cross product. Any formatter works with any appender. Adding a new structured formatter (say, a logfmt formatter) does not touch any appender. Adding a new appender (say, a Kafka appender) does not touch any formatter. They compose because they are genuinely independent concerns.
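To make the cross product concrete, here is a minimal sketch of adding a logfmt formatter without touching any appender. Record and MemoryAppender are hypothetical stand-ins I trimmed down for this example (they are not the LogRecord and Appender classes above), and the logfmt quoting rule is deliberately simplistic.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    # Hypothetical, trimmed-down stand-in for LogRecord.
    level: str
    message: str
    extra: dict = field(default_factory=dict)


class JsonFormatter:
    def format(self, record: Record) -> str:
        return json.dumps({"level": record.level, "message": record.message, **record.extra})


def _quote(value: object) -> str:
    # Simplified rule (an assumption): quote only values that contain a space.
    text = str(value)
    return json.dumps(text) if " " in text else text


class LogfmtFormatter:
    """New formatter producing key=value pairs. No appender is modified."""

    def format(self, record: Record) -> str:
        pairs = {"level": record.level, "msg": record.message, **record.extra}
        return " ".join(f"{key}={_quote(value)}" for key, value in pairs.items())


class MemoryAppender:
    """Hypothetical appender that collects formatted lines in a list."""

    def __init__(self, formatter) -> None:
        self._formatter = formatter
        self.lines: list = []

    def write(self, record: Record) -> None:
        self.lines.append(self._formatter.format(record))


record = Record("INFO", "user logged in", {"user_id": 42})
as_json = MemoryAppender(JsonFormatter())
as_logfmt = MemoryAppender(LogfmtFormatter())
for appender in (as_json, as_logfmt):
    appender.write(record)  # same record, two independent renderings

print(as_json.lines[0])
print(as_logfmt.lines[0])
```

The appender class is identical in both cases; only the injected formatter differs.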

The Filter Chain and Chain of Responsibility

Each Filter can hold a reference to the next filter, forming a linked chain. A record propagates through the chain until some filter rejects it or the chain ends. This means you can compose filters without any single filter knowing about the others.

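The alternative, a flat list of filters that the appender iterates over, also works and is simpler for the common case. The implementation above allows both: each entry in a logger's or appender's filter list can itself be the head of a linked chain. The chain approach becomes useful when individual filters need to do more than accept or reject, such as handing an augmented copy of the record to the next link. That would mean widening should_log beyond a bool return, since LogRecord is meant to be immutable. For pure pass/fail filtering, the flat list in _passes_filters above is more readable. Both approaches follow the same principle: filters are composable and the appender delegates to them.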
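As a rough comparison of the two styles, the sketch below wires the same two checks once as a linked chain and once as a flat list. To keep it short, this Filter takes a plain predicate and filters message strings rather than full records; both simplifications are assumptions for this example only.

```python
from typing import Callable, Optional


class Filter:
    """Minimal chain link: a predicate plus an optional next link."""

    def __init__(self, predicate: Callable[[str], bool],
                 next_filter: Optional["Filter"] = None) -> None:
        self._predicate = predicate
        self._next = next_filter

    def should_log(self, message: str) -> bool:
        if not self._predicate(message):
            return False  # reject and stop walking the chain
        return self._next.should_log(message) if self._next else True


def not_empty(message: str) -> bool:
    return bool(message.strip())


def no_secrets(message: str) -> bool:
    return "password" not in message


# Style 1: one linked chain, each link pointing at the next.
chain = Filter(not_empty, Filter(no_secrets))

# Style 2: a flat list of independent filters, combined with all().
flat = [Filter(not_empty), Filter(no_secrets)]


def passes_flat(message: str) -> bool:
    return all(f.should_log(message) for f in flat)


messages = ["user logged in", "password=hunter2", "   "]
chain_results = [chain.should_log(m) for m in messages]
flat_results = [passes_flat(m) for m in messages]
print(chain_results, flat_results)
```

For pass/fail checks the two produce the same decisions; the difference is only in who owns the iteration.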

The Singleton Root Logger

LogManager.__new__ implements the Singleton by storing the instance on the class. The root logger is created once and shared. Every named logger propagates up to the root unless propagation is switched off somewhere along the way, so configuring the root logger is sufficient for basic setups. Python’s built-in logging module behaves the same way: logging.getLogger() with no arguments returns one shared root logger, which is why you can call logging.basicConfig() once at the top of your application and have records from library loggers reach the handlers it installs without each library doing any setup.
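The same shared-root behavior can be observed in Python's standard logging module. This is a small check rather than part of the design above; the logger name gr_demo.billing.invoices is made up for the example, and the ancestors helper is my own.

```python
import logging

# Asking for the same name twice returns the same object.
first = logging.getLogger("gr_demo.billing.invoices")
second = logging.getLogger("gr_demo.billing.invoices")
same_instance = first is second

# getLogger() with no name returns the shared root logger.
root = logging.getLogger()


def ancestors(logger: logging.Logger) -> list:
    """Collect ancestor names by following .parent until it runs out."""
    names = []
    while logger.parent is not None:
        logger = logger.parent
        names.append(logger.name)
    return names


chain = ancestors(first)
print(same_instance, chain)
```

Whatever intermediate loggers exist, the walk always ends at the root, which is what makes a single root-level configuration sufficient.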

Design Decisions and Trade-offs

Logger hierarchy through dotted names vs. explicit parent references. The dotted name convention (com.app.service) comes from log4j, and Python’s logging module follows it. It makes the hierarchy implicit and automatic, at the cost of being a naming convention rather than a type-checked relationship. An explicit parent= argument at construction time would make the relationship visible in code, but it is more verbose and easier to misconfigure.
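The dotted-name lookup is easy to isolate. The sketch below reimplements the same walk as _find_parent, but over plain name strings, so the behavior can be checked without the rest of the library. The nearest_ancestor function and the registered set are hypothetical names for this example.

```python
from __future__ import annotations


def nearest_ancestor(name: str, known: set[str]) -> str:
    """Return the closest existing dotted-name ancestor, or 'root'."""
    parts = name.split(".")
    # Try the longest prefix first: a.b.c -> "a.b", then "a".
    for i in range(len(parts) - 1, 0, -1):
        candidate = ".".join(parts[:i])
        if candidate in known:
            return candidate
    return "root"


known = {"com.app"}
deep = nearest_ancestor("com.app.service.db", known)   # skips the missing com.app.service
unrelated = nearest_ancestor("org.lib", known)

# Creation order matters when the lookup only runs once per logger.
parents: dict[str, str] = {}
registered: set[str] = set()
for name in ["com.app.service", "com.app"]:
    parents[name] = nearest_ancestor(name, registered)
    registered.add(name)

print(deep, unrelated, parents)
```

In the last part, com.app.service is created before com.app, so it ends up parented to the root and stays there. Python's logging module avoids this by re-linking existing children when an ancestor is created later; the simple registry in this post does not.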

propagate=True by default. This means a record logged at com.app.service will be handled by the com.app.service appenders, then by the com.app appenders, then by the root logger’s appenders. This is usually what you want. If you add appenders at multiple levels without setting propagate=False, you’ll see duplicate log entries. It’s a common footgun. The design inherits this from log4j and Python’s logging module, so at least the behavior is consistent with what engineers expect.
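The duplicate-entry footgun is easy to reproduce with Python's standard logging module, which has the same default. ListHandler is a hypothetical helper I wrote for this sketch, and the logger names are made up.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that appends each message to a shared list."""

    def __init__(self, sink: list) -> None:
        super().__init__()
        self.sink = sink

    def emit(self, record: logging.LogRecord) -> None:
        self.sink.append(record.getMessage())


seen: list = []
parent = logging.getLogger("gr_dupdemo")
child = logging.getLogger("gr_dupdemo.service")
parent.setLevel(logging.INFO)

# Handlers attached at two levels of the hierarchy.
parent.addHandler(ListHandler(seen))
child.addHandler(ListHandler(seen))

child.info("hello")          # handled by the child, then again by the parent
after_first = list(seen)

child.propagate = False      # stop the walk up the hierarchy
child.info("again")          # handled by the child only
after_second = list(seen)

print(after_first, after_second)
```

The fix is either to attach handlers at one level only or to set propagate to False on the logger that has its own.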

FileAppender opens and closes the file on every write. In a real implementation, you’d keep the file handle open. I used the with open(...) pattern here to keep the example simple: there is no handle to manage and no close() to remember. Production file appenders buffer writes and rotate files by size or date.
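For comparison, here is a minimal sketch of the keep-the-handle-open variant. PersistentFileAppender is a hypothetical name, it skips the formatter and filters to stay short, and it does no rotation. Line buffering is one reasonable choice here, not the only one.

```python
import os
import tempfile


class PersistentFileAppender:
    """Hypothetical FileAppender variant that opens its file once."""

    def __init__(self, file_path: str) -> None:
        # buffering=1 selects line buffering in text mode, so each
        # newline-terminated entry is flushed as it is written.
        self._file = open(file_path, "a", encoding="utf-8", buffering=1)

    def emit(self, formatted: str) -> None:
        self._file.write(formatted + "\n")

    def close(self) -> None:
        self._file.close()

    # Context-manager support so the handle is closed even on errors.
    def __enter__(self) -> "PersistentFileAppender":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()


path = os.path.join(tempfile.mkdtemp(), "app.log")
with PersistentFileAppender(path) as appender:
    appender.emit("first")
    appender.emit("second")

with open(path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(lines)
```

The trade-off is that the appender now has a lifecycle: something has to call close() at shutdown, which the open-per-write version never had to worry about.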


If you have a question about the filter chain, or a different take on where formatters should live, I’d enjoy the conversation. Reach out on Twitter or LinkedIn.

Tags:

#lld #machine-coding #object-oriented-design #python #logging
