Sunday, August 4, 2013

Dijkstra's Shortest Path Algorithm

It's Dijkstra's algorithm,  that lets you find single-source shortest paths.

Problem
Problem - Single source shortest paths
Input -

  • Directed graph G=(V,E)
  • Source vertex s 
Output - foreach v∈V, compute L(v)=length of a shortest s-v path in G.


Assumptions:

  1. There is a path from vertex s to vertex v (otherwise, the path length is infinity)
  2. Edge lengths are non-negative (there are other algorithms for this called , but most famous being Bellman ford algorithm which uses dynamic programming)
Example of shortest path
Consider the graph:

Consider the source vertex as a.
Shortest path from a to b is 2, as we can first go to a, and then b. So direct path is of less use here.
Shortest path from a to c is 1, which is a direct path.
Now, here the shortest path from a to d, is a,c,b, d with the total length of 6, though we go zig zag, all the other routes are not good, like a TO c TO d leads 7 and similarly a TO b TO d leads 7 :

Why another shortest-path algorithm?
One might as why use Dijkstra's Algorithm if you can already find the shortest path via Breadth First Search. However, BFS actually only works when all the edge length are one (or equal weight). It only tells you how many edges it takes to get from s to v. So in BFS, shortest path is defined by the minimum number of vertexes treversed and it is same as minimum number of edges plus one.

But then you can replace each edge e with with a directed path of length(e) edges. So if edge length 3, we can have 3 edges of equal length. This will make the graph way too big. It is too wasteful and it won't be in linear time anymore. So, even if BFS runs in linear time, it is not actual linear time.

Thirdly, you could always just add a large constant to each edge length so that there are no negative edge lengths. This seems plausible but then it will skew the shortest path results. If we add 5 to every edge length, a path with 10 edges will be increased by 50 while a path with only 5 edges would only be increased by 25.

But Dijkstra's algorithm gives shortest path with non-negative weights in generic way.

Pseudocode for directed graph
Notations
X - set containing vertices for which shortest path is correctly found. So, it will contain the vertices processed so far.
During the algorithm, we have to consider X as some kind of mould, which will grow and grow taking 1 vertex and cover the entire grow
A[s] - Shortest path computed for each vertex from source vertex s.
B - [completed shortest paths(A contains distances and B contains the actual path) Note that B is not required in actual implementation, just used for book-keeping"
V - Set containing all the vertices
V-X - - Set containing vertices not conquered

Initialize:

X = {s} 
A[s] = 0 
B[s] = empty path //B from s to s is 0

Main Loop:
//We have to grow X as mould, which will increase 1 node by 1 node
//We have to process vertex X which is just 1 hop away from X, we grow X by 1 node
//V total number of nodes
while (X != V) [when not all vertices have been processed]
     - among all edges (v, w) in E, with v in X and w not in X, pick one that minimizes A[v] + l(vw) [Dijkstra's greedy criterion]
     - call this edge (v*, w*)
     - add w* to X
     - set A[w*] := A[v*] + l(v*w*) //greedy approach
     - set B[w*] := B[v*] U (v*, w*)

Example
(Taken from Algorithms: Design and Analysis, Part 1 by Tim Roughgarden - coursera) :

Consider the following graph:

Lets start with while loop, X=s, and A[s]=0, B[s]= null


Now we use Djikstra's greedy score for u,w such that new score = A[v]+lvw. From s, we have 2 paths - s-v and s-w. Lets |d| we use as shortest distance of s to other vertex. |s-v| = 1 and |s-w| = 4. Following the greedy approach, we decide that we choose |s-v| as it has smaller amount. So, now A[v] = 1 i.e. the shortest distance currently and B[v] = (s,v) which has corresponding path.

Now we go to next iteration of while loop, with our new set X = s,v. Also, we will see all edges which are crossing the frontier, with tail in X, and head outside of X. So we have 3 edges - v-w, s-w, v-t.
Now, applying the greedy formula, |v-w| = |s-v|+dvw. So we have |v-w| = 1+2 = 3.

|s-w| = 4, and A(|v-t|) = A(v)+|v-t|= 1+6 = 7. So we have to choose |v-w| as it is shortest with 3.

Now, A = 3, B(w) = (s,v,w). X now contains - s,v,w

So, we have 3 vertices covered.


Now , we come to last iteration. We have only 1 vertex left which is not in X, i.e. w. So, to bring in t, we have 2 paths - v-t,w-t.
|v-t| = A(v)+|v-t| = 1+6 = 7
|w-t| = A(w)+|w-t| = 3+3 = 6

So, we have to add t to X. A[t] = 6. B[t] = s,v,w,t.

Why Dijkstra's algorithm don't work for negative edges?
Suppose, most negative edge has weight of -10, adding 10 to all the paths, making all edge weights non-negative. But it will not solve the problem, as number of nodes in the path matters. Suppose, we have 2 vertices s and t, and they have 2 paths - one with 5 nodes and other with 2 nodes. So adding 10 to individual paths change the path length by 50 in first case and 20 in second case, hence changing the path lengths and affecting the shortest path.

Running time of dijkstra's algorithm
Running time of dijkstra's algorithm depends on the implementation method used, we will come to that soon.

Adjacency list implementation : If we use graph with adjacency list, running time is Θ(|V|.|E|), where V is number of vertex and E is number of edges. We can see that as there are V-1 iterations of while loop, and we iterate over edges whether it is part of X or not, which is not constant but proportional to E.

How can we do better? We have to change the data structure used. So we used Heap.

Heap implementation : O(|E| log |V|) - Blazingly fast :) Let see the implementation now.


Implementing Dijkstra's algorithm
Here we will discuss the dijkstra's algorithm using Heaps (Normal implementation can be done using adjacency list)
Heaps are perfectly balanced binary tree and has following property:
- node key <= children keys (In case of minHeap)
Heap ADT's operations - ExtractMin (extract min element from heap), Insert, delete. You can see heap introduction here. Also, you can read about heap here.

Invariant 1
Coming to dijkstra's algorithm, heaps should be used to vertices and not edges.
We have 2 sets : X and V-X, where X is vertices explored, and V-X vertices unexplored.

Invariant 2
As we are storing vertices and not edges, then we have to be smart enough to use edge length in key, as we have to pickup min of edges.
k[v] = smallest djikstra's greedy score of an edge u,v


Rest will be done later.

Please let me know if article seems unclear to you. Thanks.

0 comments:

Post a Comment