Thursday, January 5, 2012

Remove duplicates from an unsorted linked list

Problem
Write code to remove duplicates from an unsorted linked list.

FOLLOW UP: How would you solve this problem if a temporary buffer is not allowed?

Example  
Input/Output
Original linked list:
2 -> 3 -> 2 -> 5 -> 2 -> 8 -> 2 -> 3 -> 8 -> null
After deleting duplicate node:
2 -> 3 -> 5 -> 8 -> null

We have already seen deleting duplicates from the sorted list. Now we will see deleting from the unsorted array.

Solutions

METHOD 1 - Using two loops
This is the simple way where two loops are used. Outer loop is used to pick the elements one by one and inner loop compares the picked element with rest of the elements.
Algorithm uses two iterating pointers called current and runner.
Current does a normal iteration of the list while the runner iterates the previous node of current. runner checks for a match with the current as they iterate through the list. At any point of time the runner encounters only one duplicate.

public void deleteDup()
 {
  if(head == null)
   return;
  ListNode<T> previous = head;
  ListNode<T> current = previous.next;
  while(current != null)
  {
   ListNode<T> runner = head;
   while(runner != current)
   {
    if(runner.data == current.data)
    {
     ListNode<T> temp = current.next;
     previous.next = temp;
     current = temp;
     break;
    }
    runner = runner.next;
   }
   if(runner == current)
   {
    previous = current;
    current = current.next;
   }
  }
 }

Time complexity - O(n^2)

METHOD 2 - Use Sorting 

  1. 1) Sort the elements using Merge Sort. We will soon be writing a post about sorting a linked list. O(nLogn)
  2. 2) Remove duplicates in linear time using the algorithm for removing duplicates in sorted Linked List. O(n)
Please note that this method doesn’t preserve the original order of elements.
Time Complexity: O(nLogn)

METHOD 3 - Use Hashing
  1. We traverse the link list from head to end. 
  2. For every newly encountered element, we check whether it is in the hash table: if yes, we remove it; otherwise we put it in the hash table.

Time Complexity: O(n) on average (assuming that hash table access time is O(1) on average).

Follow up question's solution is method 1, as we don't want to use any extra space.


0 comments:

Post a Comment