Simplifying Architecture: Refactoring a Python Data Validator CLI
Overview
Most developers fall into the trap of over-engineering early in a project. We often reach for complex design patterns like

We are looking at an interactive shell designed to load
Prerequisites
To follow this tutorial, you should have a solid grasp of
Key Libraries & Tools
- Python: The core programming language used for the entire application.
- Pandas: Used for high-performance data manipulation and loading CSV files into memory.
- Pydantic: Originally used for argument validation (later refactored for simplicity).
- Pytest: Our primary testing framework for ensuring refactored logic remains sound.
- Typing: Python's standard typing module, used for adding type hints, Protocol, and Callable definitions to improve code clarity.
Code Walkthrough: From Classes to Functions
The original code used a classic Command pattern: every command (exit, import, merge) was a separate class with an execute method. This created a large amount of file-system noise. Here is how we simplify it.
1. Decoupling the Event System
The project uses an event system to handle updates. Instead of nesting this inside a controller, we move it to a standalone module and simplify the logic. We add support for a "star" (*) listener, allowing one function to catch all events—perfect for a shell that just needs to print messages to the user.
# events.py
from typing import Any, Callable

# Maps an event name to its listeners; "*" holds the catch-all listeners.
_event_listeners: dict[str, set[Callable[..., None]]] = {}


def register_event(event_name: str, listener: Callable[..., None]) -> None:
    if event_name not in _event_listeners:
        _event_listeners[event_name] = set()
    _event_listeners[event_name].add(listener)


def raise_event(event_name: str, *args: Any, **kwargs: Any) -> None:
    # Call the "*" listeners plus those registered for this specific event.
    listeners = _event_listeners.get("*", set()).union(_event_listeners.get(event_name, set()))
    for listener in listeners:
        listener(*args, **kwargs)
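As a minimal sketch of how the star listener behaves, the snippet below repeats the two functions from events.py so that it runs on its own, then registers one catch-all listener and one listener for a single event. The listener names, the event names, and the messages are illustrative assumptions, not part of the original project.

```python
from typing import Any, Callable

_event_listeners: dict[str, set[Callable[..., None]]] = {}


def register_event(event_name: str, listener: Callable[..., None]) -> None:
    _event_listeners.setdefault(event_name, set()).add(listener)


def raise_event(event_name: str, *args: Any, **kwargs: Any) -> None:
    listeners = _event_listeners.get("*", set()) | _event_listeners.get(event_name, set())
    for listener in listeners:
        listener(*args, **kwargs)


received: list[str] = []  # filled by the catch-all listener
errors: list[str] = []    # filled only by the "error" listener


def record_everything(message: str) -> None:
    received.append(message)


def record_error(message: str) -> None:
    errors.append(message)


register_event("*", record_everything)   # sees every event
register_event("error", record_error)    # sees only "error" events

raise_event("display_message", "Files present: sales, customers")
raise_event("error", "Could not read orders.csv")
```

After these two calls, `received` holds both messages and `errors` holds only the second. One consequence of this design is that a listener receives the event arguments but not the event name, so a catch-all listener has to accept whatever arguments any event passes.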
2. Refactoring Commands to Functions
There is no need for a ShowFilesCommand class when a simple function will do. By using a dictionary to map strings to functions, we eliminate the need for a complex Factory pattern. We also replace
# commands/show_files.py
from .model import Model
from ..events import raise_event


def show_files(model: Model) -> None:
    # List the loaded tables and report them through the event system.
    table_names = list(model.data_frames.keys())
    message = f"Files present: {', '.join(table_names)}"
    raise_event("display_message", message)
3. Implementing the Command Factory
With commands now being functions, the factory becomes a simple registry. This is much easier to read and extend than a series of class registrations.
# commands/factory.py
from typing import Any, Callable

from .exit import exit_app
from .show_files import show_files

CommandFunc = Callable[..., None]

# Registry mapping the command name typed in the shell to its function.
COMMANDS: dict[str, CommandFunc] = {
    "exit": exit_app,
    "files": show_files,
}


def execute_command(name: str, *args: Any) -> None:
    # Unknown names are ignored silently.
    if name in COMMANDS:
        COMMANDS[name](*args)
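To illustrate how such a registry can be extended, here is a self-contained sketch with stand-in commands that append to a list instead of touching real data. The `count` command, the `log` list, and the boolean return value of `execute_command` are assumptions added for illustration; the version above returns None and ignores unknown names.

```python
from typing import Any, Callable

CommandFunc = Callable[..., None]

log: list[str] = []  # stands in for real side effects so the sketch is easy to check


def exit_app() -> None:
    log.append("exit requested")


def show_files(names: list[str]) -> None:
    log.append("Files present: " + ", ".join(names))


COMMANDS: dict[str, CommandFunc] = {
    "exit": exit_app,
    "files": show_files,
}


def execute_command(name: str, *args: Any) -> bool:
    """Run the named command; return False if the name is not registered."""
    command = COMMANDS.get(name)
    if command is None:
        return False
    command(*args)
    return True


def count_files(names: list[str]) -> None:
    log.append(f"{len(names)} file(s) loaded")


# Adding a command is a single dictionary entry; no new class is required.
COMMANDS["count"] = count_files

execute_command("files", ["sales", "customers"])
execute_command("count", ["sales", "customers"])
```

Returning a flag for unknown names is one possible way to let the shell print a "command not found" message; raising an event from inside `execute_command` would work equally well.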
Syntax Notes: Protocols vs. ABCs
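One major change in this refactor is the move from abstract base classes (ABCs) to Protocols. A Protocol describes the methods a model must provide, and any class with matching methods satisfies it without inheriting from it.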
from typing import Any, Protocol


class Model(Protocol):
    def get_data(self, alias: str) -> Any: ...
    def delete_data(self, alias: str) -> None: ...
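The following sketch shows the structural typing that a Protocol provides: a class that never mentions Model is still accepted wherever a Model is expected, because its methods match. `DictModel`, `add_data`, and `drop_table` are hypothetical names introduced only for this example.

```python
from typing import Any, Protocol


class Model(Protocol):
    def get_data(self, alias: str) -> Any: ...
    def delete_data(self, alias: str) -> None: ...


class DictModel:
    """Hypothetical in-memory model. It does not inherit from Model."""

    def __init__(self) -> None:
        self._tables: dict[str, Any] = {}

    def add_data(self, alias: str, data: Any) -> None:
        self._tables[alias] = data

    def get_data(self, alias: str) -> Any:
        return self._tables[alias]

    def delete_data(self, alias: str) -> None:
        del self._tables[alias]


def drop_table(model: Model, alias: str) -> None:
    # A type checker accepts DictModel here because its methods match the Protocol.
    model.delete_data(alias)


model = DictModel()
model.add_data("sales", [1, 2, 3])
model.add_data("customers", ["a", "b"])
drop_table(model, "sales")
```

This is also what makes test doubles cheap: a small fake class with the same two methods can replace the real model in a test without any base class or registration step.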
Practical Examples
This refactored architecture is ideal for any
In a real-world scenario, you might extend this by:
- Adding a Logger: Instead of just printing, have the event system send data to a logging service.
- Configuration Files: Use TOML or JSON to define a list of files that should automatically load when the shell starts.
- Advanced Querying: Integrate DuckDB to allow SQL-like queries directly on the loaded Pandas DataFrames.
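As one possible shape for the configuration-file idea, the sketch below reads a list of paths from a JSON document using only the standard library. The `autoload` key, the function name, and the example paths are assumptions made for illustration; the original project does not define a config format.

```python
import json
from pathlib import Path


def read_autoload_list(config_text: str) -> list[Path]:
    """Return the file paths listed under the (hypothetical) "autoload" key."""
    config = json.loads(config_text)
    # A missing key simply means nothing is loaded at startup.
    return [Path(entry) for entry in config.get("autoload", [])]


example_config = '{"autoload": ["data/sales.csv", "data/customers.csv"]}'
paths = read_autoload_list(example_config)
```

At startup, each returned path could be handed to whichever import command the shell already provides, so the config file adds no new loading logic of its own.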
Tips & Gotchas
- Avoid Global Namespace Pollution: Always wrap your startup code in an if __name__ == "__main__": block and a main() function. This prevents variables from leaking into the global scope and makes your code easier to import for testing.
- Relative vs. Absolute Imports: When working within a package, use relative imports (from . import module). This allows you to rename folders or move the package without breaking every internal reference.
- The YAGNI Principle: "You Ain't Gonna Need It." Don't build an MVC structure just because you might add a GUI later. Build the simplest version that works today. If you need a GUI tomorrow, the clean, functional code you wrote will be easy to adapt.
- Testing Output: Use the capsys fixture in Pytest to capture stdout. This is the most reliable way to test that your shell is actually displaying the correct messages to the user.
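The first and last tips can be sketched together in one small module. `display_message`, the `argv` parameter of `main()`, and the message text are illustrative assumptions rather than the project's actual code; `capsys` and its `readouterr()` method are provided by Pytest.

```python
import sys
from typing import Optional


def display_message(message: str) -> None:
    """Print a message for the user (stands in for the shell's output listener)."""
    print(message)


def main(argv: Optional[list[str]] = None) -> int:
    # Everything lives inside main(), so importing this module has no side effects.
    args = sys.argv[1:] if argv is None else argv
    display_message(f"Started with {len(args)} argument(s)")
    return 0


def test_display_message(capsys) -> None:
    # Pytest injects capsys; readouterr() returns the captured stdout and stderr.
    display_message("Files present: sales")
    captured = capsys.readouterr()
    assert captured.out == "Files present: sales\n"


if __name__ == "__main__":
    main()
```

Because `main()` accepts an explicit argument list, a test can call it directly without patching `sys.argv`.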
