Problem
OR
Right FIND MEDIAN returns 2 as median
Now input is -2 -4
So Stream = 0 1 2 3 -2 -2 -4
Sorted Stream would be -4 -2 0 1 2 3 4 and FIND MEDIAN returns 1
We maintain two heaps: minHeap and maxHeap. If we represent these two heaps in terms of list, minHeap is in descending order whereas maxHeap is in ascending order. If we manage to maintain these two heaps while getting new numbers, the median is either in the first place of minHeap or in the first place of maxHeap (or the average of these two).
Pseudocode
Consider the example of sequence of numbers: 1, 2, 3, 4 and 5. The maintainence of minHeap, maxHeap are shown below:
After 1 coming:
minHeap: 1
maxHeap:
median : 1
After 2 coming:
minHeap: 1
maxHeap: 2
median : 1.5
After 3 coming:
minHeap: 1
maxHeap: 2 3
median : 2
After 4 coming:
minHeap: 1 2
maxHeap: 3 4
median : 2.5
After 5 coming:
minHeap: 1 2
maxHeap: 3 4 5
median : 3
Java code
Method 2 - Using counting sort for integers
We can use counting sort for integers, which will give us A[n/2] as median. But of course this will not work for
Method 3 - Using Balanced Binary Search Trees
At every node of BST, maintain number of elements in the subtree rooted at that node. We can use a node as root of simple binary tree, whose left child is self balancing BST with elements less than root and right child is self balancing BST with elements greater than root. The root element always holds effective median.
If left and right subtrees contain same number of elements, root node holds average of left and right subtree root data. Otherwise, root contains same data as the root of subtree which is having more elements. After processing an incoming element, the left and right subtrees (BST) are differed utmost by 1.
Self balancing BST is costly in managing balancing factor of BST. However, they provide sorted data which we don’t need. We need median only. The next method make use of Heaps to trace median.
Now the height of left and right subtrees across the root is same and we stop.
Finding height of a binary search tree can be done in O(log h) where h is the height of tree
Method 4 - Using skiplist and queue
See the answer here.
Method 5 - P2 algorithm
See the answer here.
Thanks. Please share if you have some other ways.
References
Numbers are randomly generated and passed to a method. Write a program to find and maintain the median value as new values are generated.OR
You are given a stream of numbers which can be positive or negative. You are required to provide an operation FIND MEDIAN..which when invoked should be able return the median of the numbers in stream(in say O(1) time)
OR
Find the rolling median?
Example
0 1 2 3 4Right FIND MEDIAN returns 2 as median
Now input is -2 -4
So Stream = 0 1 2 3 -2 -2 -4
Sorted Stream would be -4 -2 0 1 2 3 4 and FIND MEDIAN returns 1
Solution
Method 1 - Using heapWe maintain two heaps: minHeap and maxHeap. If we represent these two heaps in terms of list, minHeap is in descending order whereas maxHeap is in ascending order. If we manage to maintain these two heaps while getting new numbers, the median is either in the first place of minHeap or in the first place of maxHeap (or the average of these two).
Pseudocode
Step 1: Add next item to one of the heaps if next item is smaller than maxHeap root add it to maxHeap, else add it to minHeap Step 2: Balance the heaps (after this step heaps will be either balanced or one of them will contain 1 more item) if number of elements in one of the heaps is greater than the other by more than 1, remove the root element from the one containing more elements and add to the other one if some one calls getMedian...then If the heaps contain equal elements; median = (root of maxHeap + root of minHeap)/2 Else median = root of the heap with more elements
Consider the example of sequence of numbers: 1, 2, 3, 4 and 5. The maintainence of minHeap, maxHeap are shown below:
After 1 coming:
minHeap: 1
maxHeap:
median : 1
After 2 coming:
minHeap: 1
maxHeap: 2
median : 1.5
After 3 coming:
minHeap: 1
maxHeap: 2 3
median : 2
After 4 coming:
minHeap: 1 2
maxHeap: 3 4
median : 2.5
After 5 coming:
minHeap: 1 2
maxHeap: 3 4 5
median : 3
Java code
public static class maxComparator implements Comparator<Double> { public int compare(Double o1, Double o2) { return -o1.compareTo(o2); } } static PriorityQueue<Double> minHeap = new PriorityQueue<Double>(11, new maxComparator()); static PriorityQueue<Double> maxHeap = new PriorityQueue<Double>(); public static void addToHeap(double number) { if (minHeap.size() == maxHeap.size()) { if (minHeap.isEmpty()) minHeap.add(number); else if (number > minHeap.peek()) maxHeap.add(number); else { // number < minHeap.peek() maxHeap.add(minHeap.poll()); minHeap.add(number); } } else if (minHeap.size() > maxHeap.size()) { if (number > minHeap.peek()) maxHeap.add(number); else { // number <= minHeap.peek() maxHeap.add(minHeap.poll()); minHeap.add(number); } } else { // minHeap.size() < maxHeap.size() if (number < maxHeap.peek()) minHeap.add(number); else { // number >= maxHeap.peek() minHeap.add(maxHeap.poll()); maxHeap.add(number); } } } public static double getMedian() { if (minHeap.size() == maxHeap.size()) return (minHeap.peek() + maxHeap.peek()) / 2; else if (minHeap.size() > maxHeap.size()) return minHeap.peek(); else return maxHeap.peek(); }
Method 2 - Using counting sort for integers
We can use counting sort for integers, which will give us A[n/2] as median. But of course this will not work for
Method 3 - Using Balanced Binary Search Trees
At every node of BST, maintain number of elements in the subtree rooted at that node. We can use a node as root of simple binary tree, whose left child is self balancing BST with elements less than root and right child is self balancing BST with elements greater than root. The root element always holds effective median.
If left and right subtrees contain same number of elements, root node holds average of left and right subtree root data. Otherwise, root contains same data as the root of subtree which is having more elements. After processing an incoming element, the left and right subtrees (BST) are differed utmost by 1.
Self balancing BST is costly in managing balancing factor of BST. However, they provide sorted data which we don’t need. We need median only. The next method make use of Heaps to trace median.
- Data Structure providing sorted order is a Binary Search Tree. However worst case it can be linear list.
- Store the numbers in a BST..if the tree is not fully balanced i.e. it's left or right skewed use tree rotations across the root in O(1) to make the height of left and right subtree same.
- Now root node would be at the median. This can be found in O(1) time.
Now the height of left and right subtrees across the root is same and we stop.
Method 4 - Using skiplist and queue
See the answer here.
Method 5 - P2 algorithm
See the answer here.
Thanks. Please share if you have some other ways.
References
0 comments:
Post a Comment