The God object trap Most developers start with a sensible class. It begins as a simple container for a data path and a few settings. However, Python classes frequently suffer from "feature creep." You add a data loading method, then a cleaning step, then model training logic. Suddenly, you have created a **God object**: a single class that knows too much and does too much. This anti-pattern makes testing difficult and modification dangerous, as every local change ripples through a massive, interconnected mess. Moving validation close to the data The first step in refactoring involves separating configuration from execution. By utilizing a Data Class with the `frozen=True` parameter, you create an immutable contract for your settings. Instead of keeping validation logic inside a bloated run method, you should place it within a `__post_init__` method. ```python @dataclass(frozen=True) class TrainingConfig: data_path: Path output_dir: Path test_size: float def __post_init__(self): if not self.data_path.exists(): raise FileNotFoundError(f"{self.data_path} not found") if not 0 < self.test_size < 1: raise ValueError("test_size must be between 0 and 1") ``` This shift ensures that once a `TrainingConfig` object exists, it is guaranteed to be valid. You no longer need to sprinkle `if` statements throughout your processing logic to check if paths exist or parameters are within range. Decoupling workflows from containers A critical rule of thumb emerges: if behavior relates to the data itself—validating it or deriving small values—keep it in the class. But if the behavior involves external libraries like Pandas or complex orchestration, it belongs in a standalone function. We break down the monolithic `run` method into granular, pure functions. Loading data becomes distinct from cleaning data. Feature engineering is stripped of its `self` dependency. This transformation turns a rigid class hierarchy into a flexible pipeline where functions only receive the specific data they need to operate. The final orchestration then happens in a clean, high-level `run_experiment` function that simply coordinates these smaller, testable units. By separating the "what" (data) from the "how" (workflow), you build software that survives long-term maintenance.
Pandas
Libraries
Jul 2021 • 1 videos
High activity month for Pandas. ArjanCodes among the most active voices, with 1 videos across 1 sources.
Sep 2021 • 1 videos
High activity month for Pandas. ArjanCodes among the most active voices, with 1 videos across 1 sources.
Aug 2022 • 2 videos
High activity month for Pandas. ArjanCodes among the most active voices, with 2 videos across 1 sources.
Sep 2022 • 1 videos
High activity month for Pandas. ArjanCodes among the most active voices, with 1 videos across 1 sources.
Dec 2025 • 1 videos
High activity month for Pandas. ArjanCodes among the most active voices, with 1 videos across 1 sources.
May 2026 • 1 videos
High activity month for Pandas. ArjanCodes among the most active voices, with 1 videos across 1 sources.
- May 15, 2026
- Dec 5, 2025
- Sep 23, 2022
- Aug 26, 2022
- Aug 5, 2022
Overview A Plugin Architecture allows you to extend an application's functionality without modifying its core source code. This pattern is essential for shipping software that remains open to extension but closed to modification. By decoupling the main logic from specific implementations, you can add features like new game characters or data processing modules simply by adding new files and updating a configuration. This tutorial demonstrates how to use Python to build a system where modules register themselves into a factory dynamically. Prerequisites To follow this guide, you should be comfortable with: * Basic Python syntax and Object-Oriented Programming (OOP). * The concept of JSON for data storage. * Familiarity with Python's `typing` module, specifically Protocols. Key Libraries & Tools * **importlib**: A built-in Python library used to import modules programmatically. * **typing.Protocol**: Used for structural typing to define an interface that plugins must adhere to. * **dataclasses**: Simplifies the creation of classes that primarily store data. Code Walkthrough 1. Defining the Interface We start by defining what a "character" looks like using a Protocol. This ensures that any plugin we load has the necessary methods, such as `make_noise`. ```python from typing import Protocol class GameCharacter(Protocol): def make_noise(self) -> None: ... ``` 2. Building the Factory The factory acts as a registry. It maintains a dictionary mapping string keys (from our JSON level definition) to creation functions. ```python from typing import Callable, Any character_creation_funcs: dict[str, Callable[..., GameCharacter]] = {} def register(character_type: str, creation_func: Callable[..., GameCharacter]): character_creation_funcs[character_type] = creation_func def create(arguments: dict[str, Any]) -> GameCharacter: args_copy = arguments.copy() char_type = args_copy.pop("type") try: creation_func = character_creation_funcs[char_type] return creation_func(**args_copy) except KeyError: raise ValueError(f"Unknown character type {char_type}") ``` 3. The Dynamic Loader This is the heart of the plugin system. It uses importlib to find and execute a plugin's initialization code. Each plugin must expose an `initialize` function. ```python import importlib def load_plugins(plugins: list[str]) -> None: for plugin_name in plugins: module = importlib.import_module(plugin_name) module.initialize() ``` 4. Creating a Plugin A plugin is just a separate Python file. For example, `plugins/bard.py` defines a new class and registers it back to the core factory. ```python from dataclasses import dataclass from game import factory @dataclass class Bard: name: str instrument: str = "flute" def make_noise(self) -> None: print(f"{self.name} plays the {self.instrument}!") def initialize(): factory.register("bard", Bard) ``` Syntax Notes We use **structural typing** via `typing.Protocol`. Unlike traditional inheritance, a class doesn't need to explicitly inherit from `GameCharacter`. As long as it implements `make_noise`, Python treats it as a valid implementation. We also utilize `**kwargs` unpacking in the factory's `create` method to pass JSON data directly into class constructors. Practical Examples * **Game Modding**: Allow players to drop a `.py` file into a folder to add custom items. * **Data Pipelines**: Add support for new file formats (CSV, Parquet, Avro) by creating reader plugins. * **CLI Tools**: Let users add custom commands to a central utility script without changing the core binary. Tips & Gotchas Always use a **try-except** block when accessing the factory dictionary to provide clear error messages for missing types. When popping the `type` key from arguments, make a copy of the dictionary first to avoid side effects that might break other parts of your application.
Sep 17, 2021The Hidden Cost of Code Smells A code smell isn't a bug. Your program might run perfectly, pass every test, and deliver the correct output while still being a disaster waiting to happen. In Python, smells are early warning signs that your software design is decaying. They make code harder to read, difficult to maintain, and nearly impossible to test. Identifying these patterns early is the difference between a project that scales and one that collapses under its own technical debt. Parameter Bloat and Feature Envy Passing too many parameters into a function is one of the most common smells. When a method requires six or seven arguments, it's often a sign of **Feature Envy**. This happens when a method in one class needs to know too many implementation details about another. Instead of passing every individual piece of data, pass the object itself. If your `add_vehicle` function needs a brand, model, price, and year, just pass a `VehicleModelInfo` instance. This simplifies the function signature and keeps the logic where it belongs. Adding sensible default values to your data classes further reduces the noise, letting you focus on what truly changes between calls. Flattening the Nest Deeply nested code is a cognitive burden. When you see multiple `if` statements tucked inside a `for` loop, your eyes have to track too many levels of logic. We call this the **Arrow Anti-pattern**. You can flatten this structure by using guard clauses. Instead of wrapping the main logic in a massive `if` block, check for the error condition first and exit early. If a vehicle model isn't found, raise an exception or return immediately. This keeps the "happy path" of your function at the lowest indentation level, making the primary intent of the code clear at a glance. Choosing the Right Data Structure Iterating through a list to find a specific item is a common bottleneck. If you find yourself frequently looping over a collection to match a key, you are using the wrong data structure. Switching from a list to a dictionary (or `dict`) can transform an $O(n)$ search into an $O(1)$ lookup. In a vehicle registry, for example, using a tuple of `(brand, model)` as a dictionary key allows for instant retrieval. It's cleaner, faster, and more idiomatic Python. Namespace Pollution and Tooling Wildcard imports—like `from random import *`—are a recipe for chaos. They pollute your namespace and make it impossible for other developers (or even your IDE) to know where a function originated. Be explicit. Import only what you need, or import the module itself and use the dot notation. Beyond manual fixes, use the ecosystem. Tools like Pylint catch these smells automatically. Combine them with an opinionated formatter like Black to ensure your code remains consistent. When you stop fighting over formatting, you can spend your energy on the architectural decisions that actually matter.
Jul 30, 2021