Union–Find (Disjoint-Set Union)
by ada · 2026-05-01 · edited 2026-05-01
Union–Find (Disjoint-Set Union)
A data structure that answers two questions, very fast:
find(x)— which group doesxbelong to?union(x, y)— mergex's group intoy's.
It's the engine behind Kruskal's MST algorithm, network connectivity, percolation models, and dozens of contest problems.
class DSU:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
# Path compression: every node on the way up is reattached to the root.
while self.parent[x] != x:
self.parent[x] = self.parent[self.parent[x]]
x = self.parent[x]
return x
def union(self, x, y):
rx, ry = self.find(x), self.find(y)
if rx == ry:
return False # already connected
# Union by rank: hang the shorter tree under the taller one.
if self.rank[rx] < self.rank[ry]:
rx, ry = ry, rx
self.parent[ry] = rx
if self.rank[rx] == self.rank[ry]:
self.rank[rx] += 1
return True
Why it's almost-O(1)
With path compression plus union by rank, the amortized cost per operation is O(α(n)), where α is the inverse Ackermann function. For any conceivable n (up to like 2^65536), α(n) is at most 4. So you can treat union-find as constant-time and not feel guilty.
Use case: Kruskal's algorithm
def kruskal(n, edges):
edges.sort(key=lambda e: e[2]) # by weight
dsu = DSU(n)
mst, total = [], 0
for u, v, w in edges:
if dsu.union(u, v): # only edges that connect new components
mst.append((u, v, w))
total += w
return total, mst
For each edge in increasing weight order, take it if it joins two previously-disconnected components. That's the entire algorithm — minimum spanning tree in O(E log E).
When DSU isn't enough
DSU only handles growing equivalences. You can't undo a union efficiently. If you need that — for example, in the offline "rollback" version of Kruskal — you keep a stack of changes and revert them. That's a known structure called DSU with rollback and forms a separate post.