INB371 Week 11 Lecture
Minimum Cost Spanning Tree
Disjoint Set Data Structure
Single Source Shortest Path
Kruskal’s algorithm is important for next assignment
A spanning tree is a special kind of subgraph, where it’s fully connected but there’s no cycles of edges.
If you have N vertices, a spanning tree has exactly N-1 edges.
There can be a number of spanning trees for any connected graph. But you don’t want any spanning
tree, you want to minimize the costs wherever possible. To keep the costs down further, you can use a
minimum-cost spanning tree.
Place each vertex in its own cluster or set.
Take the edge with the smallest weight.
If e connected two vertices in different clusters, then e is added to the minimum spanning tree
and the two clusters connected by e are merged into a single cluster.
If e connects two vertices which are already in the same cluster, ignore it.
If the edge connects two vertices in its own cluster, that means there’s already a connection, so if
there’s another edge there’s a cycle between the two vertices.
Design decisions for Kruskal’s Algorithm
- Need a sorted list of the edges in the graph in vector/priority queue based on minimum heap.
- Need an efficient datastructure for detecting cycles.
o Set based on a Hash Table.
Downside to hashtable is that merging clusters would be O(n2), which isn’t
o Binary Search Tree
Searching is O(n log n), which is inefficient
Efficient Union and Find Operations
Union and Find involve manipulating objects of many types.
Giving objects an index makes it easier to understand. Disjoint sets work by changing the index of an object in an array to be the same ‘set’.
When referring to mutually connected vertices, you can call them connected components.
Each union reduces by 1 the number of components.
We need to think about finding components (e.g. are they in the same set) and union commands
(replacing sets containing two items by their union). These both need to be efficient as possible for use
in high object count scenarios.
An integer array of size N.
Interpretation: objects p and q are connected if they have the same value in the integer array for their
e.g. if p = id*3+ = 3 and q = id*6+ = 6, they aren’t in the same set. If p = id*3+ = 6 and q = id*6+ = 6, they are
Downside to quickfind is that as you merge sets, you have to change every element in the array, so for
large counts of items your efficiency goes down as its O(N). The tree is flat, but it’s expensive to keep
An integer array of size N
Interpretation: id[i] is parent of I, root of I is id[id[id[id[ etc. Basically, array indexes inside array indexes
become a form of tree in itself.
e.g. check if p and q have the same root, if not, set the id of q’s root to the id of p’s root. Find is now
logN instead of O(1).
Downside: trees can ge very tall, and find is too expensive (could be N steps). Need to do a Find to do a
Modify quick union to avoid tall trees, if we can reduce the height of trees, find becomes moer efficient.
To do this we need to keep track of the size of each component, And we balance this by linking the
smaller set of elements to the larger set of elements.
Reducing Height of Trees
To implement this, you need create a new array of size N to keep track of the size of
We change the union to merge the smaller tree into the larger tree, and to update the