Thursday, January 5, 2012

Maximum continuous sum subarray problem

Problem

You are given an array of integers (both positive and negative). Find the continuous sequence with the largest sum. Return the sum.
EXAMPLE
Input: {2, -8, 3, -2, 4, -10}
Output: 5 (i.e., {3, -2, 4})

Solution 

Method 1 - Brute force
The outer loop picks the beginning element, the inner loop finds the maximum possible sum with first element picked by outer loop and compares this maximum with the overall maximum. Finally return the overall maximum. The time complexity of the Naive method is O(n^2).

Method 2 - Divide and conquer
1) Divide the given array in two halves
2) Return the maximum of following three
….a) Maximum subarray sum in left half (Make a recursive call)
….b) Maximum subarray sum in right half (Make a recursive call)
….c) Maximum subarray sum such that the subarray crosses the midpoint
The lines 2.a and 2.b are simple recursive calls. How to find maximum subarray sum such that the subarray crosses the midpoint? We can easily find the crossing sum in linear time. The idea is simple, find the maximum sum starting from mid point and ending at some point on left of mid, then find the maximum sum starting from mid + 1 and ending with sum point on right of mid + 1. Finally, combine the two and return.

int maxCrossingSum(int arr[], int l, int m, int h)
{
    // Include elements on left of mid.
    int sum = 0;
    int leftSum = Integer.MIN;
    for (int i = m; i >= l; i--)
    {
        sum = sum + arr[i];
        if (sum > leftSum)
          leftSum = sum;
    }
 
    // Include elements on right of mid
    sum = 0;
    int rightSum = Integer.MIN;
    for (int i = m+1; i <= h; i++)
    {
        sum = sum + arr[i];
        if (sum > rightSum)
          rightSum = sum;
    }
 
    // Return sum of elements on left and right of mid
    return leftSum + rightSum;
}
 
// Returns sum of maxium sum subarray in aa[l..h]
int maxSubArraySum(int arr[], int l, int h)
{
   // Base Case: Only one element
   if (l == h)
     return arr[l];
 
   // Find middle point
   int m = (l + h)/2;
 
   /* Return maximum of following three possible cases
      a) Maximum subarray sum in left half
      b) Maximum subarray sum in right half
      c) Maximum subarray sum such that the subarray crosses the midpoint */
   return Math.max(maxSubArraySum(arr, l, m),
              maxSubArraySum(arr, m+1, h),
              maxCrossingSum(arr, l, m, h));
}

Time Complexity:  O(nLog n)
maxSubArraySum() is a recursive method and time complexity can be expressed as following recurrence relation.
T(n) = 2T(n/2) + Θ(n)
The above recurrence is similar to Merge Sort and can be solved either using Recurrence Tree method or Master method. It falls in case II of Master Method and solution of the recurrence is Θ(nLogn).

Method 3 - Dynamic Programming using Kadane's algorithm (Best)
This problem can be solved by using a modified version of Kadane's Algorithm. The strategy is to calculate the sum of each subarray and keep track of the sum, the start index and the end index of the subarray that gives the largest sum. Moreover, we calculate the sum of a subarray by adding the sum of the previous subarray with an additional element. For example, to calculate the sum of {5, 2, -1}, we add the sum of {5, 2}, which is 7, to the value of the next element, which is -1.

Kadane's algorithm/pseudocode
Initialize:
    maxSoFar = 0
    maxEndingHere = 0

Loop for each element of the array
  (a) maxEndingHere = maxEndingHere + a[i]
  (b) if(maxEndingHere < 0)
            maxEndingHere = 0
  (c) if(maxSoFar < maxEndingHere)
            maxSoFar = maxEndingHere
return maxSoFar
In simpler terms, iterate over the array and have 2 checks:
  • if greater intermediate sum is found, store it constituents as a temporary result;
  • independently, if sum got negative, reset it to 0 and prepare to build a new sequence from next scan position.
Code in java
public static void findMaxSumSequence (int inputArray[])
{
    if (inputArray == null || inputArray.length == 0)
        throw new IllegalArgumentException("Array is empty");

    int size = inputArray.length;

    int maxSum = inputArray[0];
    int maxStartIndex = 0;
    int maxEndIndex = 0;

    int curSum = inputArray[0];
    int curStartIndex = 0;


    for (int i = 1; i < size; i++)
    {
        if (curSum < 0)
        {
            curSum = 0;
            curStartIndex = i;
        }

        curSum = curSum + inputArray[i];

        if (curSum > maxSum)
        {
            maxSum = curSum;
            maxStartIndex = curStartIndex;
            maxEndIndex = i;
        }
    } 

    out.println( "Start index: " + maxStartIndex + " End index: " 
            + maxEndIndex + " Max sum: " + maxSum );
}

Time Complexity: O(n)

Explanation
This method accepts an array and its size as arguments. It prints out the start index, end index and the sum of the subarray that yields the max sum. Here are the main steps:
  1. First, check if the array size is 0. If it is so, we throw an exception because there is nothing we need to do when the array is null.
  2. Then, initialize variables: maxSum is the largest sum found. maxStartIndex and maxEndIndex are respectively the start index and end index of the subarray that yields the max sum. curSum is the sum of the current subarray that we're examining. curStartIndex is the start index of the current subarray we're checking.
  3. Next, we loop through the array and start calculating the sum of subarrays one after another:

    If the sum of the current subarray is less than 0, we reset curSum to 0. Why? If the last subarray's sum is negative, we will only decrease the next subarray's sum by adding the previous subarray's sum with an additional number. For example, if the previous subarray's sum is -2, and the next element is 3, it's better to reset the sum to 0 and add 3 into 0 than to add -2 to 3.

    curSum = curSum + inputArr[i] calculates the sum of the current subarray by adding the sum of the previous subarray with the next value in the array.

    After that, if the sum of the current subarray is greater than the max sum then we replace the max sum with the sum of the current subarray. We also change the maxStartIndex to the start index of the current subarray and the maxEndIndex to the current index i.

    When the loop ends, maxSum will contain the largest sum found. maxEndIndex and maxStartIndex contain respectively the end and start index of the subarray that gives the largest sum. Thus, we just have to print out those values.
If you have any comment or another solution to this problem, please post in the comment below. I would love to learn about it. Thanks for reading and until next post!

References

2 comments: