Disjoint set data structure
Previous: Data structure
The disjoint set data structure, also called a union-find data structure, is used in many graph problems to represent a collection of disjoint sets. It provides two main operations: union for joining sets, and find for determining which set an element belongs to.
Useful to think of electricity as an example. If there are a lot of nodes, connecting them will put them in the same circuit, but if two nodes aren’t connected, then they aren’t in the same circuit.
Two key optimizations: union by size/rank (attach smaller tree under larger) and path compression (flatten tree during find). Together these give nearly constant amortized time: O(α(n)) where α is the inverse Ackermann function.
struct Node {
int parent;
int size;
};
int find(vector<Node>& set, int x) {
while (set[x].parent != x) {
int parent = set[x].parent;
set[x].parent = set[parent].parent; // path compression
x = parent;
}
return x;
}
bool unite(vector<Node>& set, int x, int y) {
x = find(set, x);
y = find(set, y);
if (x == y) return false;
if (set[x].size < set[y].size) swap(x, y);
set[y].parent = x;
set[x].size += set[y].size;
return true;
}