**Problem**

I have only x GB RAM in computer. I have a text file of 10 GB data.This file contains numbers. x is less than 10. How will I sort them?Solution

**Generic algorithm**

Here is the generic algorithm:

- split the file into parts (buffers) that you can sort in-memory
- then when all buffers are sorted take 2 (or more) at the time and merge them (like merge sort) until there's only 1 buffer remaining which will be the sorted file

**Example x = 4**

Given a 10GB file of integers and a RAM that can hold only 4GB values, how would you sort the integers in the file. Here is how we will continue:

- Divide the file in 3 parts(3GB+3GB+4GB,(each part must be less than 4GB )) and sort each chunk using any O(nlogn) sorting algo(as it is less than 4GB we can apply any in-memory sorting algo).
- Once each chunk of file sorted, bring 1GB of each chunk in memory, so it'll occupy 3GB(1+1+1), now sort this 3GB data(by 3-way merge sort),
- Note that we still have 1 GB ram left. Now, we select a minimum number of the sorted elements,add that in remaining 1GB, and while selecting this number, bring a number from corresponding chunk to replace it, finally write sorted 1GB to secondary storage. So, we have functioning of minHeap.

__Dry running the algo__

let say we divide file in three chunk A,B,C and after sorting individully, content of these parts is like: A{12,18,20,25,33,35,37},B{8,13,14,40,41,45,47},C{2,15,50,70,72,75,78}.

Now suppose in memory, we have {12,18,20,25} {8,13,14,40} {2,15,50,70} respectively from A,B,C. now we'll select 2 from C's part as its minimum, so put this in remaining 1GB and replace 2 in memory by an element of chunk C. i.e. insert 72 in C's part in memory.. next replace 8 by 41 and so on.. we are maintaining a min heap.

**References**

http://stackoverflow.com/questions/6988611/sorting-10gb-data-in-1-gb-memory-how-will-i-do-it

http://www.careercup.com/question?id=13717672

## 0 comments:

## Post a Comment