How do I use the collections module for specialized data structures?

· Category: Python Programming

Short answer

The collections module provides specialized container datatypes that extend built-in dict, list, tuple, and set functionality. Counter counts hashable objects, defaultdict supplies default values for missing keys, deque offers fast appends and pops from both ends, and namedtuple creates tuple subclasses with named fields.

Steps

  1. Import the needed class from collections.
  2. Instantiate and use it like a regular container with extra features.
  3. Choose the right tool for the task to simplify logic and improve performance.
from collections import Counter, defaultdict, deque, namedtuple

# Counter
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
counts = Counter(words)
print(counts.most_common(2))  # [('apple', 3), ('banana', 2)]

# defaultdict
groups = defaultdict(list)
for word in words:
    groups[len(word)].append(word)
print(dict(groups))

# deque
queue = deque()
queue.append("a")
queue.appendleft("b")
print(queue.popleft())  # b

# namedtuple
Point = namedtuple("Point", ["x", "y"])
p = Point(10, 20)
print(p.x, p.y)

Tips

  • Counter supports arithmetic: c1 + c2, c1 - c2, and c1 & c2.
  • deque is ideal for queues and sliding windows because it is O(1) at both ends.
  • defaultdict avoids KeyError boilerplate when building nested structures.
  • Use typing.NamedTuple instead of namedtuple for type hints and default values.

Common issues

  • Modifying a namedtuple field raises AttributeError; convert to a dataclass if mutability is needed.
  • Counter returns zero for missing keys instead of raising KeyError, which can mask logic errors if you are not careful.
  • deque maxlen drops items from the opposite end when full, which is useful for bounded history but surprising if unintended.