Thursday, January 5, 2012

Removing a loop in a singly linked list

Problem:
Remove a loop from a singly linked list if the list has a loop.

Example
For example, 1->2->3->4->1 (4 points to the head node 1) should become 1->2->3->4 while 1->2->3->4 should stay the same 1->2->3->4.

Solution 
 The problem has 3 steps :
  1. Detect if there is a loop in the list (already solved here)
  2. Identify the start of the loop (already discussed here)
  3. Delete the loop, which we will discuss here. Please go through 1 and 2.

Once you know where the loop starts, it's easy to identify the last element in the list since it's the element in the list following the start of the loop that ends up pointing back to the start of the loop. It is then trivial to set the next pointer/reference of this element to null to correct the cyclic link list (not circular linked list which is where the last elements points back to the first - this would be a specific instance of cyclic lists).
  1. Floyd's cycle detect algorithm, also called the tortoise and hare algorithm as it involves using two pointers/references that move at different speeds, is one way of detecting the cycle. If there is a cycle, the two pointers (say p1 and p2) will end up pointing to the same element after a finite number of steps. Interestingly, it can be proved that the element at which they meet will be the same distance to the start of the loop (continuing to traverse the list in the same, forward direction) as the start of the loop is to the head of the list. That is, if the linear part of the list has k elements, the two pointers will meet inside the loop of length m at a point m-k from the start of the loop or k elements to the 'end' of the loop (of course, it's a loop so it has no 'end' - it's just the 'start' once again). And that gives us a way to find the start of the loop:
  2. Once a cycle has been detected, let p2 remain pointing to the element where the loop for the step above terminated but reset p1 so that it's pointing back to the head of the list. Now, move each pointer one element at a time. Since p2 began inside the loop, it will continue looping. After k steps (equal to the distance of the start of the loop from the head of the list), p1 and p2 will meet again. This will give you a reference to the start of the loop.
  3. It is now easy to set p1 (or p2) to point to the element starting the loop and traverse the loop until p1 ends up pointing back to the starting element. At this point p1 is referencing the 'last' element list and it's next pointer can be set to null.


Here is the C++ implementation of the algorithm:
struct Node
{
  int data;
  Node* next;
};

void removeLoop(Node* head)
{
  if (head == NULL)
    return;

  Node* slow = head;
  Node* fast = head;
  Node* last = NULL;
  
  bool hasLoop = false;

  while (fast->next != NULL && fast->next->next != NULL) 
  {
    last = slow;
    slow = slow->next;
    fast = fast->next->next;
    
    if (slow == fast)
    {
      hasLoop = true;
      break;
    }
  }

  
  if (hasLoop)
  {
    slow = head;

    while (slow->next != fast->next)
    {
      slow = slow->next;
      fast = fast->next;
    }
    
    if (slow == head && fast == head )
      last->next = NULL;
    else
      fast->next = NULL;
  }
}

Explanation: our method takes only the reference to the list's head node as an argument. The first while loop determines if the list has loop. And, the second while loop points the loop's end node to null, terminating the loop:
  1. If the list is empty (head points to NULL), there is nothing to do so return.
  2. Next, we need three different pointers. slow will go through the list one node after another. fast moves through the list two nodes at a time, so it goes twice as fast as the slow pointer. And, last refers to the previous node visited by the slow pointer.
  3. We just have to loop through the list until fast reaches NULL because if the linked list doesn't have loop, fast will reach NULL eventually. Moreover, if the list does have a loop, we'll break out of the while loop immediately.
  4. As the while loop runs, we check if slow and fast pointer point to the same node. When they do, it means the list has a loop. This is a very common technique to find out if a linked list has loop. Here is a post about that technique if you are not familiar with it.
  5. As soon as we know that the list has loop, we set the flag hasLoop to true. And then, we break out of the while loop immediately because fast will never reach NULL and the while loop will never end.
  6. If the list has loop as indicated by the flag hasLoop, we enter the second while loop to remove that the list's loop.

    a) First, we set the slow pointer back to the list's head node.

    b) Then, we move both the slow and fast pointer one node at a time. They will eventually meet each other because the list still has a loop.

    c) We stop the loop right before the pointers meet. That's how we get the reference to the end node of the loop:

    d) If the slow and fast pointer both point to the head of the list, then the list is circular, meaning that the end node points back to the head node of the list (ie. 1->2->3->4->1). Thus, we'll get the reference to the end node by using the last pointer. Remember that last is pointing to either the head of the list or the end node of the loop after the first while loop. That's why if slow and fast point at the head, then slow must be pointing at the end node.

    e) On the other hand,if the end node is pointing to any node in the list other than the head node (ie. 1->2->3->4->2), then slow and fast is pointing to the end node and last is pointing to the head node. Thus, we use fast pointer instead of last pointer to break the loop.

This algorithm takes O(n) time complexity and O(1) space complexity. It's very efficient!!

0 comments:

Post a Comment