Friday, January 6, 2012

2 Sum Problem : Given an integer array and a number T, find all unique pairs of (a, b) whose sum is equal to T

This updated information has been further expanded upon on my new website. You can find the updated details here: https://k5kc.com/cs/algorithms/2sum-problem/.
You are given an array of n integers and a target sum T. The goal is to determine whether or not there are two numbers x,y in A with x+y=T.

Example : Suppose we have an int array = {5, 3, 7, 0, 1, 4, 2} and T = 5. The unique pairs that sum up to 5 are (5, 0) (3, 2) and (1, 4).

There are three approaches to solve this problem - 1) brute force, 2) sort the array and use binary and search, and 3) Using the hashtable. 
Please scroll down, if you are only interested in the best approach i.e. approach 3 using hashtables.

Approach 1 : Brute force method
The first approach is to use brute force which gives time complexity of O(n^2) and space complexity of O(n). We basically just have to loop through the array and for each number we add it to each of other numbers and see if they sum up to 5. If so, we print out the pair. Here is the code for brute force approach:
void findPairOfSum(int arrayOfNum[], int arraySize, int sum)
{
  for (int i = 0; i < arraySize; i++)
  {
    for (int j = i; j < arraySize; j++)
    {
      if (arrayOfNum[i] + arrayOfNum[j] == sum)
        cout << "(" << arrayOfNum[i] << ", " << arrayOfNum[j] << ")" << endl;
    }
  }
} 
The second approach is to use a hash table to store the difference between sum and each of the elements in the array. Then as we loop through the array, check each element against the hash table. If the element presents, then print out the key value pair. For example, if we hash the example array we'll have this hash table:
Key   Value
5        5 - 5 = 0
3        5 - 3 = 2
7        5 - 7 = -2
0        5 - 0 = 5
1        5 - 1 = 4
4        5 - 4 = 1
2        5 - 3 = 2

This approach will have the time complexity of O(n) and space complexity of O(n). Thus, chose your method wisely, depending on your need (speed or space efficiency).

2nd Approach - Use sorted array
A better way would be to sort the array. This takes O(n log n)
Then for each x in array A, use binary search to look for T-x. This will take O(nlogn).
So, overall search is  O(n log n)

Hers is how the pseudo code will look:
arr = {};//some array
sortedArr = sort(arr);
for( i = 0;i < arr.length - 1; i++)
{
   x = arr[i];
   bool found = binarySearch(sortedArr, T-x);//Search for T-x in sorted Arrary
   if(found)
    print "pair", x, T-x;
}

Approach 2b - Using sorting but using variant of binary search
Assuming array sorted in ascending order. Now we take first and last element, and sum them up. If sum is equal to T, we have found the pair, but if sum is greater than T, we reduce right pointer by 1, or increment left pointer otherwise.

arr = {};//some array
sortedArr = sort(arr);
left = start;
right= arr.length;
while(left < right)
{
   x = arr[left];
   y = arr[right];
   sum = x+y;
   if(sum == T)
      found=true;
   if(sum > T)
      right--;
   if(sum < T)
      left++;  
   
   if(found)
    print "pair", x, T-x;
}

3rd and Best - Use hash table
I have already mentioned in problem in the application of hash table here.
The best way would be to insert every element into a hash table (without sorting). This takes O(n) as constant time insertion.
Then for every x, we can just look up its complement, T-x, which is O(1).
Overall it takes will be O(n).

Here is how the pseudocode will look.
Let arr be the given array.
And T be the given sum

for (i=0 i<arr.length - 1 ;i++)
{
  hash(arr[i]) = i  // key is the element and value is its index.
}

for (i=0 i<arr.length - 1; i++)
{
  if (hash(T - arr[i]) != i ) // if T - ele exists and is different we found a pair
    print "pair i , hash(T - arr[i]) has sum T"
  
}


Please let us know if you know any other approach. Thanks.

34 comments:

  1. how to build the hash table in the approach 3. also i think you should check if hash(T-arr[i]) is empty first.

    ReplyDelete
    Replies
    1. Hi Jianchen, I think we can use hashmap or we can use element as an index in the array. I think only way hash() is empty when the given array is null or has no element. In that case we will not enter into any for loop.

      -Kinshuk

      Delete
  2. This comment has been removed by the author.

    ReplyDelete
  3. What if there are duplicates in the array? We should keep track of the number of occurrences of each number and update as we go.

    ReplyDelete
    Replies
    1. Still it work, as we have to just find the sum. If 1 element is in the hashmap, and say other element repeats, we won't care as their sum will not be equal to given sum T.

      Delete
    2. If we have below input ?
      Array: [3,3]
      Target Sum: 6

      Delete
    3. when we see 3, we check if `target-3` exists in map. It doesnt, so we add 3 to map. Then, next time, we see 3 again, and this time `target-3` exists in map, hence that is the pair. I have updated my code here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table-. (apologies for such a delayed response)

      Delete
  4. I don't want to find unique pairs,I want all the numbers that form the sum.
    eg: array = {5, 6, 7, 4, 1, 3, 2, 8, 9, 10} and T = 15
    result:
    1+9+5=15
    5+10=15
    6+4+5=15
    2+8+5=15
    3+2+6+4=15 ... etc

    ReplyDelete
    Replies
    1. Hi Vikram, this is a different problem all together. Please refer here for the solution - http://codereview.stackexchange.com/questions/56270/list-all-possible-numbers-from-a-given-array-that-sums-up-to-a-given-number.

      Delete
    2. If we use below array({ 2, 6, 8, 4, 5, 7, 3, 9, 1, 5 }) for sum of number 10 - then the first solution will get fail.
      The solution is

      private static void findPairsForXWay1(int[] pairArray, int x) {
      for (int i = 0; i < pairArray.length; i++) {
      for (int j = i+1; j < pairArray.length; j++) {
      if(pairArray[j] + pairArray[i] == x){
      System.out.println("{" + pairArray[i] + "," + pairArray[j] + "}");
      }

      }

      }
      }

      Delete
  5. The hashmap one is a fucking genius. Got excited a bit :x

    ReplyDelete
  6. The hashmap solution will double print many pairs. How do you handle that?

    ReplyDelete
    Replies
    1. Hi Anonymous, we will break out of the loop as soon as find the first pair :). Thanks

      Delete
  7. How to handle if negative & positive numbers exist and T(positive)?

    ReplyDelete
    Replies
    1. This problem will not occur if you use hashmap, instead of array. But here is a solution via array approach.
      Then we can create 2 arrays, hashN and hashP for negative and positive numbers respectively. While inserting, we can have something like hashN[absolute(arr[i])] = arr[i], and TP will fill normally, without absolute i.e. hashP[arr[i]] = arr[i]. Now iterate over the array arr again, and if (arr[i] > T) , then you need negative number, so refer hashN[T-arr[i]], if it exists then you got the number otherwise its a problem. Please let me know if I wasn't clear enough.

      - Kinshuk

      Delete
  8. hi, i think the last solution also takes O(n^2) you are making hashtable.get operation inside for loop hash table search also has worst case of of O(n) for search operation... so overall it will be O(n^2)

    ReplyDelete
    Replies
    1. Hi Janardan, creation of hashtable takes O[n] time. Getting a single element in hashtable takes O(1) times, hence as we are in the loop, that takes another O(n) time, hence O(n) as whole. Please let me know if that works for you.

      Delete
  9. Hi The first approach for array({ 2, 6, 8, 4, 5, 7, 3, 9, 1, 5 } ) and sum for number 10 will get fail.The solution for 1st approach is given below.

    private static void findPairsForXWay1(int[] pairArray, int x) {
    for (int i = 0; i < pairArray.length; i++) {
    for (int j = i+1; j < pairArray.length; j++) {
    if(pairArray[j] + pairArray[i] == x){
    System.out.println("{" + pairArray[i] + "," + pairArray[j] + "}");
    }

    }

    }
    }

    ReplyDelete
  10. Can you please elaborate on the hash map solution. So say I have Key =5 and Value =0. What does that mean. How do I know that it is the answer. Not getting what you mean by "Then as we loop through the array, check each element against the hash table. If the element presents, then print out the key value pair." Can you please explain with an example?

    ReplyDelete
    Replies
    1. Hi Chintan, we add all the elements in the array to the hashset. This will take O(n) time and O(n) space of hashmap. Now, iterate over the array, one by one. Now, for each element el in the array, check if K-el exists in the hashmap. If it exists, we have found the solution.

      Delete
  11. Does anybody has working code for the third approach. I have tried from my side but my code doesn't seem to work. Can anybody help? Below is my code.

    import java.util.*;
    public class Range {

    public static void main(String[] args) {
    // TODO Auto-generated method stub
    Hashtable mapping = new Hashtable();
    int a[]= {5,4,3,0,1,7,2};
    k = 5;

    for (int i=0; i < a.length; i++){
    mapping.put(a[i], i);
    }
    for (int i=0; i < a.length; i++){
    if (mapping.containsKey(k - a[i])){
    int value = (Integer)mapping.get(k-a[i]);
    if (value != i){
    System.out.println((Integer)mapping.get(k-a[i]));
    System.out.println("////////////////");
    }
    }
    }

    ReplyDelete
    Replies
    1. Hi, I have updated my code here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table-. (apologies for such a delayed response)

      Delete
  12. Does anybody has working code for the third approach. I have tried from my side but my code doesn't seem to work. Can anybody help? Below is my code.

    import java.util.*;
    public class Range {

    public static void main(String[] args) {
    // TODO Auto-generated method stub
    Hashtable mapping = new Hashtable();
    int a[]= {5,4,3,0,1,7,2};
    k = 5;

    for (int i=0; i < a.length; i++){
    mapping.put(a[i], i);
    }
    for (int i=0; i < a.length; i++){
    if (mapping.containsKey(k - a[i])){
    int value = (Integer)mapping.get(k-a[i]);
    if (value != i){
    System.out.println((Integer)mapping.get(k-a[i]));
    System.out.println("////////////////");
    }
    }
    }

    ReplyDelete
  13. Suppose we have
    array = {1, 1, 1}
    T = 2
    Then the latest hash-approach will fail.

    And the second thought about 'arr.length - 1'
    Why we always iterate 'all except the last elem'?

    ReplyDelete
    Replies
    1. Hi, good point. I have updated my code here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table-. (apologies for such a delayed response)
      Also, arr.length - 1 is still okay, because I was assuming arrays as 0-indexed. So, the last element will be at `arr.length - 1`.

      Delete
  14. Approach 2b will result in infinite loop. You aren't changing any index when you found a pair. It'll keep logging the same pair.

    ReplyDelete
    Replies
    1. Hi Ocemyn, good point. I have updated my code here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table-. (apologies for such a delayed response)

      Delete
  15. Shouldn't while loop should be like this below in approach 2B:

    while(left0){
    right--;
    }
    if(sum<0){
    left++;
    }
    if(found){
    cout<<"("<<x<<", "<<y<<")";
    found = false;
    }
    }

    Your's code will iterates for infinite times according to me for this input 1,3,2,5,4,-5,0,0,1,-3.

    ReplyDelete
    Replies
    1. Hi Michael, good point. I have updated my code here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table-. (apologies for such a delayed response)

      Delete
  16. None of these answers answer the question, you need to exclude duplicates with another array or break in the key mapper
    const getUniquePairs = (t, arr) => {
    const map = {}
    for (let i = 0; i < arr.length; i++) {
    const currentInt = arr[i]
    map[currentInt] = i
    }
    let values = []
    for(let i = 0; i < arr.length; i++) {
    if (arr[i] < t && map[t - arr[i]] !== i && !values.includes(arr[i]) && !values.includes(t - arr[i])) {
    values = [...values, arr[i], t - arr[i]]
    console.log(arr[i], t - arr[i], 'INDEXES:', i, map[t - arr[i]])
    }
    }
    }

    ReplyDelete
    Replies
    1. Hi Evan, good point. I have updated my code here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table-. (apologies for such a delayed response)

      Delete
  17. In this solution, condition " if (hash(T - arr[i]) != i )" used is wrong because it return true if element is not found
    example : if not found in hash table,
    then if(NULL!=i) which is true

    ReplyDelete
    Replies
    1. Good point. I updated the pseudocode here: https://k5kc.com/cs/algorithms/2sum-problem/#method-3---use-hash-table- (apologies for such a delayed response)

      Delete