Thursday, January 5, 2012

Find the First Duplicated Integer in an Array containing numbers from 1 to N Without Using Extra Memory

Question: There is a size-N array of integers whose values range from 1 to N. Some integers are duplicated. Find the first duplicate and its index without using any extra memory (not even O(1)).

Solution


There are a few keys in the question that we should consider:
  1. The array size is a finite number N which is also the upper limit for any integer inside the array. Hence, if the array size is 10, then there are ten elements in the array and each of those elements is between 1 and 10.
  2. Some elements are just duplicates. That means the array may not contain all numbers between 1 and N.
  3. We need to find only the first duplicate. Thus, as soon as we discover one, we can stop.
  4. No extra memory is allowed, not even O(0). That means we can't use any extra storage or declare any new variable inside our algorithm because that is O(1).
To solve this problem, we must keep track of elements in order to figure out which one is duplicate. However, regardless of method, we must not use any extra memory. Here is where we must depend on the information given by the problem.

Method 1 - Using modulo operator

We know that each element's value is between 1 and N, so if we increase that value by (N + 1), the modulus of that value and N + 1 is still the same. For example, modulus of 2 and (N + 1) is 2 and modulus of 2 + N + 1 is still 2. Thus, we can mark a visited element by adding N + 1 into its value or any element's value in the array because that doesn't change its modulus with N + 1 at all!

Furthermore, we know that each element's value is between 1 and N. Hence, we can use each element's value as a key and the element at array[key] as the mark. For example, if the current element is 4 then we can check if its value has been visited, and thus a duplicate, by looking at the value of the element at index 4 in the array, if the current element is 2 the we check the value of the element at index 2 in the array, and so on.
  • One final point, we need to traverse the array and check each element, so we need a loop that run till we find the duplicate or till we reach the end of the array. The problem with such loop is that it needs additional counter / sentinel value as the loop's termination condition! That requires memory allocation. Fortunately, we know that the first element is not a duplicate because there is no other element in front of it. Thus, we can use the first element in the array as our counter.

void print_repeats(unsigned a[], unsigned n)
{
    unsigned i, _2n = 2*n;
    for(i = 0; i < n; ++i) 
       if(a[a[i] % n] < _2n) a[a[i] % n] += n;
    for(i = 0; i < n; ++i) 
       if(a[i] >= _2n) printf("%u ", i);
    
}
This algorithm is short but difficult to understand without doing an example, so lets take a look at the algorithm implementation and go straight to an example:

Dry run
Input - n be 7 and array be {1, 2, 3, 1, 3, 6, 6}
Output - 1

first foreach -
i = 0, a[a[0]% 7 ] = a[1] = 2 < 14 => a[1] = 10 and array = {1,9 , 3,1,3,6,6,}
i = 1, a[a[1]%7]   = a[2] = 3 <14 => a[2]  = 10 and array = {1,9,10,1,3,6,6}
i = 2, a[a[2]%7] = a[3] = 1 < 14 => a[3] = 8 and array = {1,9,10,8,3,6,6}
i = 3, a[a[3]%7] = a[1] = 9< 14 => a[1] = 16 and array = {1,16,10,8,3,6,6}
Ideally we will break here as we need only 1 as answer. We can continue if want the other answers as well, like 3 and 6.
i = 4, a[a[4]%7] = a[3] = 8 <14 => a[3] = 15 and array = {1,16,10,15,3,6,6}
i = 5,a[a[5]%7] = a[6] = 6 < 14 => a[6] = 13 and array = {1,16,10,15,3,6,13}
i = 6, a[a[6]%7] = a[13%7] = 6 =<14 => a[6] = 20 and array becomes {1,16,10,15,3,6,20} and indices where number is more than 2n is our answer.

Method 2 - Using sign

traverse the list for i= 0 to n-1 elements
{
  check for sign of A[abs(A[i])] ;
  if positive then
     make it negative by   A[abs(A[i])]=-A[abs(A[i])];
  else  // i.e., A[abs(A[i])] is negative
     this   element (ith element of list) is a repetition
}

Here is the code:
void printRepeating(int arr[], int size)
{
  int i;
  printf("The repeating elements are: \n");
  for (i = 0; i < size; i++)
  {
    if (arr[abs(arr[i])] >= 0)
      arr[abs(arr[i])] = -arr[abs(arr[i])];
    else
      printf(" %d ", abs(arr[i]));
  }
}

Time complexity - O(n), space complexity - O(1), we can make it O(0) for space.



Reference

3 comments:

  1. let the array be 3 1 2 1 3 4

    first repeated integer is 3 here

    start from the first and go till the end

    add n+1 to the a[a[i]] where i is the current index.

    iterate again dividing the values by n+1 if values comes 1 . the remainder is the repeated one.

    ReplyDelete
  2. We can find out the duplicate element by using Set, list or trvaersing using loop on array.
    Below link can be useful to find out the algorithm to find duplicate or repeated elements in an array in java

    Find out duplicate or repeated elements in an array in java

    ReplyDelete
  3. You said the array elements are all larger than 0. You never said we cannot modify the original array.
    My way of solving this is:
    A[6] = {3 1 2 1 3 4}
    At index 0, mark A[abs(A[0])] = A[3] = -abs(A[3]) = -1
    A[6] = {3 1 2 -1 3 4}
    At index 1, mark A[abs(A[1])] = A[1] = -abs(A[1]) = -1
    A[6] = {3 -1 2 -1 3 4}
    At index 2, mark A[abs(A[2])] = A[2] = -abs(A[2]) = -2
    A[6] = {3 -1 -2 -1 3 4}
    At index 3, check A[abs(A[3])] = A[1] < 0.
    At here we know some other index has value A[3]. Thus return abs(A[3]) = 1

    If array has value 0, this method fails. If array element > array size -1, fail. If array has negative value, fail. But given the conditions you provided, the above algorithm works.

    ReplyDelete