Mastering Python Refactoring: Decoupling Logic and Configuration for Scale

Overview

Writing code that works is only half the battle. In software engineering, the real challenge lies in making that code maintainable, testable, and flexible. When dealing with complex tasks like web scraping or PDF analysis, scripts often start as a single, monolithic file where configuration, logic, and external dependencies are tightly coupled.

This tutorial focuses on high-level Python refactoring techniques. We will dismantle "god classes" that instantiate their own subclasses (a major anti-pattern) and replace them with clean functions and Python Protocols. Furthermore, we will explore how to move hardcoded strings and settings into external JSON configuration files, allowing the application to change behavior without a single line of code being rewritten.

Prerequisites

Refactoring A PDF And Web Scraper Part 2 // CODE ROAST

To get the most out of this guide, you should be comfortable with:

  • Intermediate Python syntax (classes, functions, and decorators).
  • The concepts of OOP and composition.
  • Type hinting and why it matters for modern development.
  • A basic understanding of Python data classes.

Key Libraries & Tools

  • Python Protocols: part of the typing module, used for structural subtyping (static duck typing).
  • Pandas: used for data manipulation, specifically handling data frames in the scraper.
  • tqdm: a library for displaying smart progress bars during long-running loops.
  • JSON: the standard format for our external configuration files.
  • Hydra (mentioned): a framework for elegantly configuring complex applications.

Code Walkthrough: From Classes to Functions

One of the biggest issues in the original code was a ScrapeRequest class that was responsible for creating its own subclasses. This creates a circular dependency and makes the code difficult to extend. We solve this by using plain functions and Python Protocols.

1. Defining the Scraper Protocol

Instead of a rigid class hierarchy, we define what a "scraper" looks like using a Protocol. Any class that has a scrape method matching this signature is now a valid scraper.

from typing import Protocol
from dataclasses import dataclass

@dataclass
class ScrapeResult:
    keywords: list[str]
    word_frequencies: dict[str, int]

class Scraper(Protocol):
    def scrape(self, search_text: str) -> ScrapeResult:
        ...
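To see structural subtyping in action, here is a toy class that satisfies the protocol without ever inheriting from it. WordCountScraper is a hypothetical stand-in for a real web or PDF scraper, and ScrapeResult and Scraper are repeated so the snippet runs on its own:

```python
from collections import Counter
from dataclasses import dataclass
from typing import Protocol

@dataclass
class ScrapeResult:  # repeated from above
    keywords: list[str]
    word_frequencies: dict[str, int]

class Scraper(Protocol):  # repeated from above
    def scrape(self, search_text: str) -> ScrapeResult: ...

class WordCountScraper:
    """Satisfies the Scraper protocol structurally -- it never inherits from it."""

    def scrape(self, search_text: str) -> ScrapeResult:
        # Hypothetical stand-in: a real scraper would fetch and parse a document.
        words = search_text.lower().split()
        return ScrapeResult(
            keywords=sorted(set(words)),
            word_frequencies=dict(Counter(words)),
        )

scraper: Scraper = WordCountScraper()  # the type checker accepts this assignment
print(scraper.scrape("python refactoring python").word_frequencies)
# {'python': 2, 'refactoring': 1}
```

Because the check is structural, WordCountScraper can live in a module that never imports Scraper at all.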

2. Refactoring Requests into Functions

We don't need a class for every type of request. By converting them into functions, we simplify the flow. These functions now accept a Scraper instance as a dependency.

def fetch_terms_from_doi(target: str, scraper: Scraper) -> ScrapeResult:
    # Logic to process target and call the scraper
    result = scraper.scrape(target)
    return result

3. Centralizing Logging

Duplicate logging logic is a maintenance nightmare. We create a dedicated log.py to handle both file logging and console printing in one place.

import logging

# Configure file logging once; without this, logging.info() is dropped by default.
logging.basicConfig(filename="scraper.log", level=logging.INFO)

def log_message(message: str) -> None:
    logging.info(message)  # written to the log file
    print(message)         # echoed to the console

The Power of External Configuration

Hardcoding paths, URLs, and word lists directly into your logic makes your script brittle. If you want to share your tool with a non-programmer, they shouldn't have to touch Python code to change the input folder. We use Python data classes to map JSON data into a typed object.

import json
from dataclasses import dataclass

@dataclass
class ScrapeConfig:
    export_dir: str
    paper_folder: str
    target_words_file: str

def read_config(config_file: str) -> ScrapeConfig:
    with open(config_file, "r") as f:
        data = json.load(f)
    return ScrapeConfig(**data)

By passing this ScrapeConfig object down the call stack, we ensure that every component has access to the settings it needs without relying on global variables.
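A matching config file might look like the JSON below (the directory and file names are illustrative). Parsing it with json.loads shows how the keys map onto the fields; ScrapeConfig is repeated so the snippet runs on its own:

```python
import json
from dataclasses import dataclass

@dataclass
class ScrapeConfig:  # repeated from above
    export_dir: str
    paper_folder: str
    target_words_file: str

# An illustrative config.json -- the keys must match the field names exactly.
config_text = """
{
    "export_dir": "output",
    "paper_folder": "papers",
    "target_words_file": "keywords.txt"
}
"""

config = ScrapeConfig(**json.loads(config_text))
print(config.paper_folder)
# papers
```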

Syntax Notes

  • Protocol: This is a powerful feature of Python's typing system. Unlike traditional inheritance, a class doesn't need to explicitly inherit from Scraper to be considered a Scraper. It just needs the right method.
  • Unpacking Operator (**data): We use the double asterisk to unpack a dictionary directly into the initializer of a Python data class. This only works if the keys in the JSON exactly match the field names in the class.
  • Context Managers: Always use with open(...) for file operations and directory changes to ensure resources are cleaned up even if an error occurs.
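The key-matching rule behind ScrapeConfig(**data) is easy to demonstrate with a minimal stand-in dataclass (Point here is purely illustrative):

```python
from dataclasses import dataclass

@dataclass
class Point:  # a minimal stand-in for ScrapeConfig
    x: int
    y: int

Point(**{"x": 1, "y": 2})       # works: every key matches a field name

try:
    Point(**{"x": 1, "z": 2})   # 'z' is not a field
except TypeError as err:
    print(err)  # TypeError: ... got an unexpected keyword argument 'z'
```

This is why a stray or renamed key in config.json fails loudly at load time instead of silently producing a half-filled object.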

Practical Examples

This refactoring approach is essential for:

  • Data Science Pipelines: Where file paths and filtering parameters change with every experiment.
  • CI/CD Environments: Where different configurations are needed for testing, staging, and production.
  • User-Facing Tools: Allowing users to modify a simple config.json instead of editing source code.

Tips & Gotchas

  • Avoid Instance Variable Bloat: Don't store temporary data as self.variable in a class if it's only used within a single method. Use local variables to keep the object state clean.
  • Type Checking Gaps: Libraries like tqdm and Pandas don't always have perfect type hints. You might encounter "Unknown" types; use typing.Any or # type: ignore sparingly when these external tools fail the linter.
  • Configuration Trickle: High-level objects should receive the whole ScrapeConfig, but low-level helpers should only receive the specific strings or sets they need. This keeps the low-level code reusable in other projects that don't use your specific config structure.
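The configuration trickle can be sketched as follows. The function names run_pipeline and load_target_words are illustrative, and ScrapeConfig is repeated so the snippet stands alone:

```python
from dataclasses import dataclass

@dataclass
class ScrapeConfig:  # repeated from above
    export_dir: str
    paper_folder: str
    target_words_file: str

def load_target_words(path: str) -> list[str]:
    # Low level: depends only on a plain path, so it is reusable
    # in projects that have no ScrapeConfig at all.
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def run_pipeline(config: ScrapeConfig) -> list[str]:
    # High level: receives the whole config and passes down only
    # the specific value each helper needs.
    return load_target_words(config.target_words_file)
```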
