Thursday, April 8, 2010

Detecting a loop in a linked list

This updated information has been further expanded upon on my new website. You can find the updated details here: https://k5kc.com/cs/algorithms/linked-list-cycle-1-detect-cycle/.
Problem : Given a singly linked list, find if there exist a loop.

Example
A -> B -> C -> D -> E -> C [ E again point to C]

Solutions
Method 1 - Using hashing
Traverse the list one by one and keep putting the node addresses in a Hash Table. At any point, if NULL is reached then return false and if next of current node points to any of the previously stored nodes in Hash then return true.

Method 2 - Visited flag
This solution requires modifications to basic linked list data structure.  Have a visited flag with each node.  Traverse the linked list and keep marking visited nodes.  If you see a visited node again then there is a loop. This solution works in O(n) but requires additional information with each node.
A variation of this solution that doesn’t require modification to basic data structure can be implemented using hash.  Just store the addresses of visited nodes in a hash and if you see an address that already exists in hash then there is a loop.

Method 3 - Floyd’s Cycle-Finding Algorithm i.e. Using 2 pointers

Have 2 pointers to start of the linked list. Increment one pointer by 1 node and the other by 2 nodes. If there's a loop, the 2nd pointer will meet the 1st pointer somewhere. If it does, then you know there's one.This is also called tortoise and hare approach.

Here is some code

p=head;
q=head->next;

while(p!=NULL && q!=NULL)
{
  if(p==q)
  {
    //Loop detected!
    exit(0);
  }
  p=p->next;
  q=(q->next)?(q->next->next):q->next;
}

// No loop.

Time Complexity: O(n)
Auxiliary Space: O(1)

Dry running the code
head -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7 -> 8
                  ^                        |
                  |                        |
                  +------------------------+

Starting A at 1 and B at 2, they take on the following values:
A   B
=   =
1   2
2   4
3   6
4   8
5   4
6   6

Note that there is no reason that you need to use the number two. Any choice of step size will work (except for one, of course). See the proof here.


How this work - intuitive approach
Consider two pointers: a slow pointer S that increments by one node at each step, and a fast pointer F that increments by two nodes at each step (i.e. it is twice as fast as S). Both pointers start at the same time from the beginning of an n-node loop. In the time S covers n nodes. F will have covered 2n nodes and they will both meet at the start of the loop.
Now, let us say that the slow pointer S starts at the beginning of the loop, and the fast pointer F starts at node k (where k < n) of the loop. As these two pointers move along the loop, they will meet at node (n-x).
What we really need to do is figure out x, as it will give us the node at which the two pointers meet inside the loop.
  1. When S takes n/2 steps, it will be at node n/2. During the same time, F will have taken 2(n/2) = n steps, and it will be at node (k+n). Since the we are inside a loop, F will be effectively back at node k.
  2. In order for the two pointers to meet at node (n-x), S needs to take a further (n-x-n/2)=(n/2-x) steps and it will end up at node n-x. During the same time, F will have taken 2*(n/2-x)=(n-2x) steps and will be at node k+(n-2x). Given our assumption that both S and F meet at the same node:
  n-x = k+n-2x
 =>   x = k
This means that if S starts from the start of the loop, and F starts k nodes into the loop, both of them will meet at node (n-k), i.e k nodes from the end of the loop. This is a key insight.

Once you know a node within the loop, there's an O(n) guaranteed method to find the start of the loop. The solution is discussed here.

 But can we do better, and answer is yes.

Method 4 - Floyd’s cycle-finding algorithm (modified by Richard Brent)

Richard Brent described an alternative cycle detection algorithm, which is pretty much like the hare and the tortoise [Floyd's cycle] except that, the slow node here doesn't move, but is later "teleported" to the position of the fast node at fixed intervals.
The description is available here : http://www.siafoo.net/algorithm/11 Brent claims that his algorithm is 24 to 36 % faster than the Floyd's cycle algorithm. O(n) time complexity, O(1) space complexity. (Courtesy)


public static boolean hasLoop(Node root){
    if(root == null) return false;

    Node slow = root, fast = root;
    int taken = 0, limit = 2;

    while(slow.next != null && fast.next != null){
        fast = fast.next;
        taken++;
        if(slow == fast) return true;

        if(taken == limit){
            taken = 0;
            limit <<= 1;   // equivalent to limit *= 2;
            slow = fast;  // teleporting the turtle (to the hare's position) 
        }
    }
    return false;
}

Method 5 - Reverse a list
If there are no requirements on using extra memory and keeping the list in its original state very neat and simple approach can be uses. The idea is based on the fact that if you reverse the list which has loops you will end-up on the head node while if list doesn’t have loops you will stop at the tail.

public bool hasLoop(Node head)
{
  Node first=null;
  Node second=null;
  Node tmp=head;

  while(tmp) {
     first=second;
     second=tmp;
     tmp=tmp.Next;

     if(tmp==head)
            return true;

     second.Next=first; 
  }

  return false;
} 

Once detecting the loop, we may like to find the start of the loop or delete it.


References

1 comment: