## Sunday, August 4, 2013

### BFS and Undirected Connectivity

Let G=(V,E) be an undirected graph (Note - for directed graph, better algorithms are available)

Connected Components - the pieces of G, where  graphs are connected in equivalence class.

Formal definition of max connectivity - Equivalence relation between 2 vertices u and v for every u to v path in G.

Equivalence relation between 2 vertices means they have to satisfly 3 conditions of equivalence
1) Reflexive - Everything has to be related to itself
2) Symmetric - If u is related to v, then v is related to u, which is not a problem for undirected graph
3) Transitive - If u and v are related and v and w are related, then u and w are related

Application
What is use of this application -
1) Check if network is disconnected
2) clustering of highly related object which can help in graph visualization

Pseudocode
Let G=(V, E) be an undirected graph. The connected components of G are the "pieces' of G. The goal is to compute all connected components.
To compute all components:
```Mark all nodes unexplored
//suppose nodes have names say 1 to n
for i=1 to n
if i not yet explored
BFS(G, i)
```

The call to BFS discovers precisely i's connected component. This algorithm can't miss any nodes because it actually walks through every node so it must find every connected component. If we wanted to find the number of connected components, we could keep a count variable and increment it each time we call BFS.

Example
(Taken from coursera course - Tim Roughgarden in Algorithms: Design and Analysis, Part 1)
Consider the graph :
So we have nodes marked from 1 to n. We start with node 1, and after reaching there we start BFS.

Starting the BFS here. We add 3 and 5 queue, process them and de-queue them.

On de-queuing 5, we will add 7 and 9 in some order.

Now there are no other connected nodes in this component. So we come out of BFS loop. We go to next node i.e. 2.
Again we start BFS from node 2, and we only get 4.
As there no other edges we come out of this component to outer for loop.

Now in the outer loop we go to node 3, which is explored, 4 - explored, 5 - explored, and then we come to 6 which is not explored.

We apply bfs again, and also traverse 8 and 10, and come out of bfs loop.
Now we go to for loop, and see 7 is explored, 8-explored, 9-explored and finally 10 is also explored.

Running time - O (V+E)
For each node, we have to do constant time proportional to number of vertices V (like initializing all the nodes to unexplored, then running in loop and setting them explored etc) and as far as the edges, we have to consider them only when we are in for loop of processing, though it is not required for initalizing calls etc, it is per edge.