How do I use the collections module for specialized data structures?
· Category: Python Programming
Short answer
The collections module provides specialized container datatypes that extend built-in dict, list, tuple, and set functionality. Counter counts hashable objects, defaultdict supplies default values for missing keys, deque offers fast appends and pops from both ends, and namedtuple creates tuple subclasses with named fields.
Steps
- Import the needed class from
collections. - Instantiate and use it like a regular container with extra features.
- Choose the right tool for the task to simplify logic and improve performance.
from collections import Counter, defaultdict, deque, namedtuple
# Counter
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = Counter(words)
print(counts.most_common(2)) # [('apple', 3), ('banana', 2)]
# defaultdict
groups = defaultdict(list)
for word in words:
groups[len(word)].append(word)
print(dict(groups))
# deque
queue = deque()
queue.append("a")
queue.appendleft("b")
print(queue.popleft()) # b
# namedtuple
Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)
print(p.x, p.y)
Tips
Countersupports arithmetic:c1 + c2,c1 - c2, andc1 & c2.dequeis ideal for queues and sliding windows because it is O(1) at both ends.defaultdictavoidsKeyErrorboilerplate when building nested structures.- Use
typing.NamedTupleinstead ofnamedtuplefor type hints and default values.
Common issues
- Modifying a
namedtuplefield raisesAttributeError; convert to a dataclass if mutability is needed. Counterreturns zero for missing keys instead of raisingKeyError, which can mask logic errors if you are not careful.dequemaxlen drops items from the opposite end when full, which is useful for bounded history but surprising if unintended.