Refactoring Python: Eliminating 7 Common Code Smells for Cleaner Architecture
The Hidden Cost of Code Smells
A code smell is not a bug. Your program might run perfectly, passing every test case and delivering the expected output. However, code smells are subtle indicators of deeper design flaws that make projects difficult to maintain, test, or extend. Identifying these patterns early prevents the technical debt that often turns a simple feature request into a week-long debugging session. By focusing on how we structure data and define relationships between objects, we can move from code that just works to code that is truly professional.
Prerequisites and Toolkit
To follow this guide, you should have a solid grasp of fundamentals, specifically . We will work extensively with , a feature introduced in that simplifies class creation by automatically generating methods like __init__ and __repr__.
During this walkthrough, I utilize , an AI-assisted code completion tool that suggests snippets based on your team's patterns. It is particularly useful for reducing boilerplate when refactoring large classes. We also rely on the typing module, specifically , to implement structural subtyping and decouple our components.
Structural Refactoring: Data and Naming
One of the most frequent smells involves using "too elaborate" data structures. In our initial point-of-sale example, an Order class maintained three separate lists for item names, quantities, and prices. This is a maintenance nightmare; if you update one list but forget the others, your data goes out of sync.
Grouping Related Data
Instead of parallel arrays, we encapsulate these related fields into a single data class.
from dataclasses import dataclass
@dataclass
class LineItem:
item: str
quantity: int
price: int
@property
def total_price(self) -> int:
return self.quantity * self.price
Precision in Naming
We also addressed misleading names. A method titled create_line_item that merely appends an object to a list is better named add_line_item. Precise naming establishes clear responsibility and prevents developers from making incorrect assumptions about what a function does under the hood.
Decoupling Classes and Responsibilities
A class with too many instance variables often indicates it has taken on too many responsibilities. Our Order class originally held every detail of a , from their email to their postal code. This violates the Single Responsibility Principle. By extracting these into a separate class, we make the customer data reusable across other parts of the system, such as a CRM or mailing service.
The "Ask, Don't Tell" Principle
We also fixed the "verb-subject" smell. Instead of having a system pull data out of an order to calculate a total, we moved that logic into the Order class itself. Using 's sum function with a generator expression creates a clean, functional approach to this calculation:
@property
def total_price(self) -> int:
return sum(item.total_price for item in self.items)
Advanced Dependency Management
Hard-wired sequences and object creation within initializers are the most dangerous smells because they create rigid coupling. If the POS system creates its own StripePaymentProcessor, you can never test that system without hitting the real API.
Dependency Injection and Protocols
By using dependency injection, we pass the processor into the initializer. To keep things flexible, we define a . This acts as an interface; the POS system doesn't need to know it's using ; it only needs to know it has an object with a process_payment method.
from typing import Protocol
class PaymentProcessor(Protocol):
def process_payment(self, reference: str, price: int) -> None:
...
class POS:
def __init__(self, payment_processor: PaymentProcessor):
self.payment_processor = payment_processor
Syntax Notes and Best Practices
When refactoring, pay attention to 's evolving syntax. We used from __future__ import annotations to handle forward references in type hints, allowing a class method to return an instance of its own class before the class is fully defined.
As a final bonus, always prefer keyword arguments. Passing a string of raw integers like (102, 1, 500) to a constructor is ambiguous. Using id=102, quantity=1, price=500 makes the code self-documenting and resilient to changes in argument order. Refactoring isn't just about making code shorter; it's about making the intent behind the code undeniable.
Tips & Gotchas
- The List Factory Trap: In , never use
items: list = []. This creates a shared mutable default. Always usefield(default_factory=list). - Law of Demeter: If you see chains like
self.order.customer.address.city, you are leaking implementation details. The calling object knows too much about the internal structure of its neighbors. - Async Initializers: Remember that
__init__cannot beasync. If your object requires an asynchronous setup (like a network connection), use a staticcreatefactory method to handle the awaitable logic before returning the instance.
- 25%· software
- 13%· concepts
- 13%· libraries
- 13%· libraries
- 13%· companies
- Other topics
- 25%

Purge These 7 Code Smells From Your Python Code
WatchArjanCodes // 29:43
On this channel, I post videos about programming and software design to help you take your coding skills to the next level. I'm an entrepreneur and a university lecturer in computer science, with more than 20 years of experience in software development and design. If you're a software developer and you want to improve your development skills, and learn more about programming in general, make sure to subscribe for helpful videos. I post a video here every Friday. If you have any suggestion for a topic you'd like me to cover, just leave a comment on any of my videos and I'll take it under consideration. Thanks for watching!