## Monday, July 5, 2010

### Prim's Algorithm

This algorithm was first propsed by Jarnik, but typically attributed to Prim. It starts from an arbitrary vertex (root) and at each stage, add a new branch (edge) to the tree already constructed, much like a mould. The algorithm halts when all the vertices in the graph have been reached.

To understand the problem statement refer here.

This is similar to  djikstra's algorithm where we were clear where to begin with as we were given the source vertex, but here we have to choose arbitrarily.

This strategy is greedy in the sense that at each step the partial spanning tree is augmented with an edge that is the smallest among all possible adjacent edges.

Example

We can now select the edge with min weight between 3 edges - 2,3,4. Lets select 2 as it is min.

Now we have 3 edges - 3 ,4 and 5, from the nodes that we have visited, so lets take 3 as it is minimum, but it doesn't add any new node, so we just discard it. (Marker in the below diagram means that we are just covering it)

Now we have 2 options of edges - 4 or 5, and as 4 is minimum we select it, it adds the last nodes also to the visited nodes, and hence prim's algo is complete.

So edges 1,2 and 4 make MST.

MST-PRIM

Input: A weighted, undirected graph G=(V, E, w)
Output: A minimum spanning tree T.

```T={}

Let r be an arbitrarily chosen vertex from V.

X = {r}

WHILE | U| < n

DO

Find u in X and v in V-X such that the edge (u, v) is a smallest edge between X-V.

T = T U{(u, v)} //add u,v to T

X= X U {v}    //add v to vertices covered set U
```

Analysis
The algorithm spends most of its time in finding the smallest edge. So, time of the algorithm basically depends on how  do we search this edge.

Just find the smallest edge by searching the adjacency list of the vertices in V. In this case, each iteration costs O(m) time, yielding a total running time of O(mn) OR O(|E|.|V|).

Can we do better?
We will focus on adjacency lists only. Can we do better? Answer is yes. As we saw in the main WHILE loop, we are getting the minimum of the edge lengths crossing the frontier, and getMin() of heap will be very useful, which will speed up repeated getMin() elements on the edges.

Heap operation costs
• Insert element ----- O(log n)
• Get Min element ----O(log n)
More on heap - here.

This is similar to djikstra''s algo where we stored edges, but here we have to vertices.

Prim's Algorithm with Binary heap
By using binary heaps, the algorithm runs in O(m log n).

Invariant 1 : elements in the heap will be vertices   i.e. vertices of V-X
Invariant 2 : for v ∈ V-X, key[v] = cheapest edge (u,v) with u ∈ X. If this edge don't exists, put infinity.

Check - Can initialize heap with O (m+log n) = O(m log n) preprocessing.

Now we have to make sure that above 2 variants hold when we execute WHILE loop.
So, first variant will hold as we have initialized the heap with vertices only.

Making the 2nd variant
eg.
```when v added to X{
foreach edge (v,w) in E {
if(w belongs to V-X){
Delete w from heap
Recompute key[w] = min ([key, (v-w))
re-insert into heap
}
}
```

This ensures that 2nd variant holds. Deleting the vertex at some position is little tricky, so we have to do some book keeping to keep the index of the vertex as well.

Analysis:
• n-1 Inserts during pre-processing
• n-1 extract min (one per iteration while loop)
• each edge (v,w) triggers one delete/insert combo, implies O(m) time.
So, overall O(m log n) time.

Fibonacci heap
By using Fibonacci heaps, the algorithm runs in O(m + n log n) = O (m log n)

Correctness of Prim's algorithm
Theorem - Prim's algorithm always computes an MST.

Part 1 - Prim's algorithm atleast computes a spanning tree T*
Part 2 - T* is an MST

To be done later.