Mastering the Iterator Protocol: Building Scalable Logic with Python Itertools
Overview of the Iterator Protocol
At its core, an iterator is a stateful object that lets you traverse a sequence of data one element at a time. This mechanism matters because it decouples the data’s storage from the logic used to consume it. Instead of loading an entire dataset into memory, you can process elements one at a time as they are requested.
Prerequisites

Before diving into the implementation, you should have a firm grasp of:
- Basic Python syntax and data structures (lists, tuples, and dictionaries).
- The concept of loops and conditional logic.
- Class definitions and dunder (double underscore) methods.
Key Libraries & Tools
- itertools: A standard-library Python module that provides fast, memory-efficient building blocks for creating and combining iterators.
- dataclasses: Used for creating structured data objects that can be made immutable (frozen) for use in specific iterator patterns.
Understanding the Iterable vs. Iterator Distinction
People often use these terms interchangeably, but they represent different roles in the protocol. An iterable is an object capable of returning an iterator (like a list or tuple). An iterator is the actual object that tracks the current state of the traversal.
countries = ("Germany", "France", "Italy")
# Getting an iterator from an iterable
country_iterator = iter(countries)
print(next(country_iterator)) # Germany
print(next(country_iterator)) # France
If you call iter() on an iterator, it simply returns itself. However, calling iter() on an iterable creates a brand-new iterator starting from the beginning. This subtle difference allows multiple independent traversals over the same data source simultaneously.
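The difference can be checked directly. The following is a minimal sketch that reuses the countries tuple from above; the names first and second are only illustrative.

```python
countries = ("Germany", "France", "Italy")

# Each call to iter() on the iterable returns a new, independent iterator.
first = iter(countries)
second = iter(countries)

print(next(first))   # Germany
print(next(first))   # France
print(next(second))  # Germany (second keeps its own position)

# Calling iter() on an iterator returns the very same object.
print(iter(first) is first)  # True
```

Because each iterator tracks its own position, advancing first has no effect on second.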
Implementing Custom Iterators
You can build your own traversal logic by implementing the __iter__ and __next__ methods within a class. This is particularly useful for generating sequences that don't exist in memory, such as a custom range or an infinite counter.
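class NumberIterator:
    def __init__(self, maximum: int):
        self.number = 0          # last value handed out
        self.maximum = maximum   # final value to produce

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal the end of the sequence once maximum has been reached.
        if self.number >= self.maximum:
            raise StopIteration
        self.number += 1
        return self.number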
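A short usage sketch may help here. The class is repeated so that the snippet runs on its own, and the hand-written while loop only approximates what a for loop does; it is not the interpreter's exact implementation.

```python
class NumberIterator:
    def __init__(self, maximum: int):
        self.number = 0
        self.maximum = maximum

    def __iter__(self):
        return self

    def __next__(self):
        if self.number >= self.maximum:
            raise StopIteration
        self.number += 1
        return self.number


# A for loop calls iter() once, then next() until StopIteration is raised.
for value in NumberIterator(3):
    print(value)  # 1, then 2, then 3

# Roughly the same loop written out by hand.
iterator = iter(NumberIterator(3))
collected = []
while True:
    try:
        collected.append(next(iterator))
    except StopIteration:
        break

print(collected)  # [1, 2, 3]
```

Note that this iterator counts from 1 up to and including maximum, since the counter is incremented before the value is returned.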
Advanced Composition with Itertools
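The itertools module provides small building blocks that can be combined to express more complex iteration logic.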
Chaining and Permutations
import itertools
items = ['A', 'B']
more_items = ['C', 'D']
# Combine sequences lazily into a single iterator
combined = itertools.chain(items, more_items)
# Find all unordered pairs (combinations of length 2)
pairs = list(itertools.combinations(items + more_items, 2))
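Since the heading mentions permutations while the snippet uses combinations, the following sketch contrasts the two and shows that chain yields its inputs lazily. The letters list and the ordered_pairs name are assumptions introduced for illustration.

```python
import itertools

items = ['A', 'B']
more_items = ['C', 'D']

# chain() is lazy: it yields from each input in turn without copying them.
combined = itertools.chain(items, more_items)
print(list(combined))  # ['A', 'B', 'C', 'D']

letters = ['A', 'B', 'C']  # illustrative input

# combinations: order does not matter, so ('A', 'B') appears only once.
pairs = list(itertools.combinations(letters, 2))
print(pairs)  # [('A', 'B'), ('A', 'C'), ('B', 'C')]

# permutations: order matters, so ('A', 'B') and ('B', 'A') both appear.
ordered_pairs = list(itertools.permutations(letters, 2))
print(ordered_pairs)
# [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]
```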
Functional Transformations with Starmap
starmap is a powerful alternative to standard mapping when your data is already grouped into tuples. It unpacks the arguments for you automatically.
data = [(2, 6), (8, 4), (5, 3)]
# Multiplies X * Y for each tuple
totals = list(itertools.starmap(lambda x, y: x * y, data))
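To see what the unpacking saves, here is a small comparison with plain map, offered as a sketch. Using operator.mul in place of the lambda is a substitution made for this example, not something the original snippet does.

```python
import itertools
import operator

data = [(2, 6), (8, 4), (5, 3)]

# map() passes each tuple as one argument, so the function must unpack it.
with_map = list(map(lambda pair: pair[0] * pair[1], data))

# starmap() unpacks each tuple into positional arguments: func(*pair).
with_starmap = list(itertools.starmap(operator.mul, data))

print(with_map)      # [12, 32, 15]
print(with_starmap)  # [12, 32, 15]
```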
Syntax Notes & Best Practices
- StopIteration: Always raise this exception in __next__ to signal the end of the sequence. For loops handle this exception automatically.
- Frozen Dataclasses: When iterating over sets of objects, ensure your dataclasses are declared with frozen=True so they are hashable.
- Readability: While you can chain multiple itertools functions, avoid "one-liners" that become impossible to debug. Break complex chains into intermediate variables with descriptive names.
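The frozen dataclass point can be illustrated with a minimal sketch. Point and MutablePoint are hypothetical classes invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int


@dataclass
class MutablePoint:  # same fields, but not frozen
    x: int
    y: int


# Frozen instances are hashable; equal instances collapse to one set member.
points = {Point(1, 2), Point(1, 2), Point(3, 4)}
print(len(points))  # 2

# A default (non-frozen) dataclass defines __eq__ without __hash__,
# so its instances cannot be placed in a set.
try:
    {MutablePoint(1, 2)}
except TypeError as error:
    print(f"TypeError: {error}")
```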
Tips & Gotchas
Iterators are one-time use. Once you exhaust an iterator (by reaching the end), it is spent. If you need the data again, you must create a new iterator instance. A common mistake is trying to iterate over the same iterator variable twice and wondering why the second loop produces no output.
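The following sketch reproduces that mistake and one way around it. The variable names are illustrative.

```python
numbers = [1, 2, 3]
iterator = iter(numbers)

first_pass = list(iterator)   # consumes every element
second_pass = list(iterator)  # the iterator is already exhausted

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []

# Asking the iterable for a new iterator starts from the beginning again.
fresh_pass = list(iter(numbers))
print(fresh_pass)   # [1, 2, 3]
```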
