Friday, February 14, 2014

Find kth largest in an Array

Problem : 
 You have an array of numbers; you need to find the k-th largest in the arra
 Solution

Method 1 - Tournament principle

We have to find the element which has competed with largest element, and is just (k-1)th level.We have discussed this principle here.

Method 2 - Using max heap

Consider the array of length n. Here is the algo : 
  • Build a max heap in O(n) ; 
  • Now remove k elements from the heap; wherein each removal costs log(n) time to maintain the heap property. So total time complexity = O(k logn) 
  • To understand building MAX heap in O(n) check heap.

Method 3 - Modified bubble sort running k times
1) Modify Bubble Sort to run the outer loop at most k times.
2) Print the last k elements of the array obtained in step 1.
Time Complexity: O(nk)
Like Bubble sort, other sorting algorithms like Selection Sort can also be modified to get the k largest elements.

Method 4 - Use temporary array
K largest elements from arr[0..n-1]
1) Store the first k elements in a temporary array temp[0..k-1].
2) Find the smallest element in temp[], let the smallest element be min.
3) For each element x in arr[k] to arr[n-1]
If x is greater than the min then remove min from temp[] and insert x.
4) Print final k elements of temp[]
Time Complexity: O((n-k)*k). If we want the output sorted then O((n-k)*k + klogk)

Method 5 - Use Min Heap
This method is mainly an optimization of method 4. Instead of using temp[] array, use Min Heap.
1) Build a Min Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array. O(k)
2) For each element, after the kth element (arr[k] to arr[n-1]), compare it with root of MH.
       a) If the element is greater than the root then make it root and call heapify for MH
       b) Else ignore it.
// The step 2 is O((n-k)*logk)
3) Finally, MH has k largest elements and root of the MH is the kth largest element.
Time Complexity: O(k + (n-k)Logk) without sorted output. If sorted output is needed then O(k + (n-k)Logk + kLogk)
All of the above methods can also be used to find the kth largest (or smallest) element.


Method 6 - Use Sorting
1) Sort the elements in descending order in O(nLogn)
2) Print the first k numbers of the sorted array O(k).
Time complexity: O(nlogn)

Method 7 - Use Order Statistics
If you want a true O(n) algorithm, as opposed to O(kn) or something like that, then you should use quickselect (it's basically quicksort where you throw out the partition that you're not interested in). This is called finding the k-th order statistics.

There's a very simple randomized algorithm (called quickselect) taking O(n) average time, and a pretty complicated non-randomized algorithm taking O(n) worst case time. There's some info on Wikipedia, but it's not very good.
Everything you need is in these powerpoint slides. Also, refer quicksort and randomized selection to find introductory help on randomized algorithm.


Just to extract the basic algorithm of the O(n) worst-case algorithm:

Select(A,n,i):
    Divide input into ⌈n/5⌉ groups of size 5.

    /* Partition on median-of-medians */
    medians = array of each group’s median.
    pivot = Select(medians, ⌈n/5⌉, ⌈n/10⌉)
    Left Array L and Right Array G = partition(A, pivot)

    /* Find ith element in L, pivot, or G */
    k = |L| + 1
    If i = k, return pivot
    If i < k, return Select(L, k-1, i)
    If i > k, return Select(G, n-k, i-k)


The algorithm is discussed in detail here.

Method 8 - Compare 3 elements if N=3 (like wise for N)
int third_biggest(int *arr, int size)
{
 int first = INT_MIN, second = INT_MIN, third = INT_MIN;
 for(int i=0;i first)
  {
   third = second;
   second = first;
   first = arr[i];
  }
  else if(arr[i] > second)
  {
   third = second;
   second = arr[i];
  }
  else if(arr[i] > third)
  {
   third = arr[i];
  }
 }
 return third;
}

Ofcourse, method 7 is best :)

0 comments:

Post a Comment