Friday, August 30, 2013

find four elements in array whose sum equal to a given number X

This can be solved efficiently via using HashTables.

We can have a hashtable sums sums will store all possible sums of two different elements. For each sum S it returns pair of indexes i and j such that a[i] + a[j] == S and i != j. But initially it's empty, we'll populate it on the way. So, this can be done in O(n^2) time.

Pseudocode


for (int i = 0; i < n; ++i) {
    // 'sums' hastable holds all possible sums a[k] + a[l]
    // where k and l are both less than i

    for (int j = i + 1; j < n; ++j) {
        int current = a[i] + a[j];
        int rest = X - current;
        // Now we need to find if there're different numbers k and l
        // such that a[k] + a[l] == rest and k < i and l < i
        // but we have 'sums' hashtable prepared for that
        if (sums[rest] != null) {
            // found it
        }
    }

    // now let's put in 'sums' hashtable all possible sums
    // a[i] + a[k] where k < i
    for (int k = 0; k < i; ++k) {
        sums[a[i] + a[k]] = pair(i, k);
    }
}

Thanks.

Sunday, August 25, 2013

DFS (Depth first search) for graph

What is dfs?
 This algorithm explores aggressively and backtracks only when necessary. This is similar to how we play a maze game.

To code DFS iteratively, we can mimic BFS , but use a Stack instead of a Queue. Another way to code this would be to do it recursively. DFS is very naturally phased as recursive example. We will see both of implementation later.

 DFS(graph G, vertex s)
     mark s as explored
     for every edge (s, v):
          if v unexplored
               DFS(G, v)

Example : 
Consider the graph:
So, our strategy will be to explore aggressively and only backtrack when necessary.
Example taken from coursera course by Tim Roughgarden in Algorithms: Design and Analysis, Part 1.

Now, we have starting vertex as s.  From s we have to go to some node, lets go to a.

From as we can't go to b, as it is s's immediate neighbor, but we can go to a's immediate neighbor. So we go to c:

From c, we go to e.
From e we can go to d only as it is the only neighbor which is unexplored. So d is 5th node we see.
From d we can either go to c or we can go b. Lets suppose we go to c. But c is already explored at number 3, so we retreat, and we then go to b.
Now at b we go to s, but we have explored that, then we go to a, we have explored. So we have explored all the options from b, so we retreat to d, then we see that at d also we have explored all the options, so we retreat to e, from e to c, from c to a, and then we come to s back. From s we have not went through edge s-b, so we traverse and again retreat as we have already seen b. Now we have covered all the edges.

Basic DFS Properties
At the end of DFS, all vertices that could have been reached will have been reached.
Time complexity - 
Running time for adjacency list- O(|V|+|E|) OR O(n+m) where |V| and |E| is number of vertices and edges for adjacency list representation of graph.
Running time for adjacency matrix - O(|V|2) OR O(n2) as we have to traverse through the whole row until we find an edge.

Application of DFS
  1. Topological sorting
  2. Finding SCC(strongly connected component)

Dijkstra's algorithm for equal weight edges using BFS

This is the application of BFS (breadth first search)

dist(v) = fewest number of edges on path s to v.

Here is the pseudocode:
Initialize dist(v) = {0 if v=s, infinity if v != s}
foreach vertex v in all vertices
{
  if(v==s)
    dist(v) = 0;
  else
    dist(v) = infinity;
}

Now apply BFS, and hen considering edge (v, w), if w is unexplored, then set dist(w) = dist(v)+1, keeping rest of the BFS same.

Note - BFS computes the shortest path, only when the length of all the edge is same (say 1)

Consider the graph:
Initially, dist(s) = 0 and rest of the nodes have dist = infinity.

So, applying bfs now, where we find a , we add to queue, mark dist(a) = dist(s)+1 = 0+1 = 1.
So, dist(v)=1. Now we add b, and do the same. So dist(b) = 1.

Now we deque a, and add c. While adding c, we will find dist(c) = dist(a) +1, so dist (c) and so on.

In order to retrieve the shortest path from the origin to a node, you need to maintain two items for each node in the graph: it's current shortest distance, and the preceding node in the shortest path. Initially all distances are set to infinity, and all predecessors are set to empty. Consider following example:
A - B - C
 |           |
D          E

In this example, we set A's distance to zero, and then proceed with the BFS. On each step you check if you can improve the distance of a descendant, i.e. the distance from the origin to the predecessor plus the length of the edge that you are exploring is less than the current best distance for the node in question. If you can improve the distance, set the new shortest path, and remember the predecessor through which that path has been acquired. When the BFS queue is empty, pick a node (in your example, it's E) and traverse its predecessors back to the origin. This would give you the shortest path.

Here is the pseudo code of what we talked about:
function Dijkstra(Graph, source):
       for each vertex v in Graph:                                // Initializations
           dist[v] := infinity ;                                  // Unknown distance function from 
                                                                  // source to v
           previous[v] := undefined ;                             // Previous node in optimal path
       end for                                                    // from source
       
       dist[source] := 0 ;                                        // Distance from source to source
       Q := the set of all nodes in Graph ;                       // All nodes in the graph are
                                                                 // unoptimized – thus are in Q
      while Q is not empty:                                      // The main loop
          u := vertex in Q with smallest distance in dist[] ;    // Source node in first case
          remove u from Q ;
          if dist[u] = infinity:
              break ;                                            // all remaining vertices are
          end if                                                 // inaccessible from source
          
          for each neighbor v of u:                              // where v has not yet been 
                                                                 // removed from Q.
              alt := dist[u] + dist_between(u, v) ;
              if alt < dist[v]:                                  // Relax (u,v,a)
                  dist[v] := alt ;
                  previous[v] := u ;
                  decrease-key v in Q;                           // Reorder v in the Queue
              end if
          end for
      end while
      return dist;
endfunction

For detailed algorithm of dijkstra's algorithm, please refer here.

BFS (Breadth first search) on Graph

Algorithm
  • Starting at some arbitrarily chosen vertex s (s stands for start vertex) , we mark v so that we know we've visited it, process v, and 
  • then visit i.e. mark the vertex as visited and process all of v's neighbors. 
  • Now that we've visited and processed all of v's neighbors, 
  • we need to visit and process all of v's neighbors neighbors
Pseudocode
BFS(graph G, vertex s)
[all nodes initially unexplored]
-mark s as explored
let Q = queue data structure, initialized with s
while (Q not empty)
     remove the first node of Q, call it v
     for each edge (v, w):
          if w is unexplored
               mark w as explored
               add w to Q (at the end)

We visit each node only once and each edge at most twice (for edge(v, w), we might encounter it when we're at node v and at node w).

Example
Consider the graph
Let s be the starting vertex.
Layer 0 is node s, then a and b are layer 1. So, when layer 0 is processed, we add layer 1 to queue.
Layer n+1 gets added to the queue when we process layer n.


Time Complexity: 

Adjacency list implementation of graph - O(|V|+|E|) OR O(n+m) where |V| and |E| is number of vertices and edges for adjacency list representation of graph.
Running time for adjacency matrix - O(|V|2) OR O(n2)

where V - number of vertices and E number of edges which can be reached from the starting vertex. Also this is time complexity for adjacency list implementation.

Applications of BFS



Saturday, August 24, 2013

Deleting the element from the heap

Here we will discuss deletion of element from min heap.

Pseudocode for deletion from heap
  1. Find the index of the element to be deleted from the heap. This takes O(n) time.(If you already have the index, please skip this step)
  2. Remove the element from the heap
  3. Move the root element to the element to be deleted location
  4. Move the last element to the first position
  5. Decrease array size by 1
  6. Now bubble down on the array(like we did in extract min). Bubbling down (or sifting down) is done as follows : 
    • if current node has no children, it is over;
    • if current node has one child: check, if heap property is broken, then swap current node's value and child value; so we bubble down the child;
    • if current node has two children: find the smaller of them. If heap property is broken, then swap current node's value and selected child value; bubble down the child.
  7. Repeat step 6, until the heap property is satisfied.
Example
Consider the following heap:
Suppose we want to delete element 9. So, we first nullify it and move to 4 to 9th location:

Now, as the root position is empty, we move last element i.e. 13 to root.
Now the process of bubbling down starts. We see that at the root node, 12 is greater than 4 and 8. So we swap it with smaller of children i.e. 4.

But again the heap property is not restored, as 13 is still larger than 4 and 4. So bubbling down again:
But still 13 is greater than 11, and hence heap property is still not satisfied:
Finally, min heap property is satisfied, and we are done.

Thanks.