Monday, March 24, 2014

Sieve of Eratosthenes

Problem
Given a number n, print all primes smaller than or equal to n. It is also given that n is a small number.

Example
For example, if n is 10, the output should be “2, 3, 5, 7″. If n is 20, the output should be “2, 3, 5, 7, 11, 13, 17, 19″.

Algorithm
The sieve of Eratosthenes is one of the most efficient ways to find all primes smaller than n when n is smaller than 10 million or so (Ref Wiki).

Following is the algorithm to find all the prime numbers less than or equal to a given integer n by Eratosthenes’ method:
  1. Create a list of consecutive integers from 2 to n: (2, 3, 4, …, n).
  2. Initially, let p equal 2, the first prime number.
  3. Starting from p, count up in increments of p and mark each of these numbers greater than p itself in the list. These numbers will be 2p, 3p, 4p, etc.; note that some of them may have already been marked.
  4. Find the first number greater than p in the list that is not marked. If there was no such number, stop. Otherwise, let p now equal this number (which is the next prime), and repeat from step 3.
When the algorithm terminates, all the numbers in the list that are not marked are prime.

Here is wikimedia image which can show you what we just wrote:
Following is C++ implementation of the above algorithm. In the following implementation, a boolean array arr[] of size n is used to mark multiples of prime numbers.

C++ code
// marks all mutiples of 'a' ( greater than 'a' but less than equal to 'n') as 1.
void markMultiples(bool arr[], int a, int n)
{
    int i = 2, num;
    while ( (num = i*a) <= n )
    {
        arr[ num-1 ] = 1; // minus 1 because index starts from 0.
        ++i;
    }
}
 
// A function to print all prime numbers smaller than n
void SieveOfEratosthenes(int n)
{
    // There are no prime numbers smaller than 2
    if (n >= 2)
    {
        // Create an array of size n and initialize all elements as 0
        bool arr[n];
        memset(arr, 0, sizeof(arr));
 
        /* Following property is maintained in the below for loop
           arr[i] == 0 means i + 1 is prime
           arr[i] == 1 means i + 1 is not prime */
        for (int i=1; i<n; ++i)
        {
            if ( arr[i] == 0 )
            {
                //(i+1) is prime, print it and mark its multiples
                printf("%d ", i+1);
                markMultiples(arr, i+1, n);
            }
        }
    }
}

Java code
Consider the code below, which creates the boolean array of initialCapacity, and marks indices in it to false, for p,2p,3p,...where p is a prime and p is 2 to begin with.

public static void prime( int initialCapacity){
  int index=2;//2 is first prime
  ArrayList<Integer> listOfPrimeNumbers = new ArrayList<Integer>( );
  boolean[] isPrimeNumber = new boolean[initialCapacity + 1]; // boolean defaults to
  // false
  isPrime[0] = isPrime[1] = false;//1st 2 numbers are not prime
  for (int i = index; i <= initialCapacity; i++)
     isPrimeNumber[i] = true;
  while ( index <= initialCapacity )
  {
    
    if (isPrimeNumber[index]) {
      listOfPrimeNumbers.add(index);
    }
    for (int j = index; j * index <= initialCapacity; j++) {
      isPrimeNumber[index * j] = false;
    }

    // Now mark the multiple of i as non-prime number
    index++;
  }
}

You can set the initial capacity of listOfPrimeNumbers, because you can estimate how many prime numbers are under N. See
http://en.wikipedia.org/wiki/Prime_number_theorem
but basically n / (ln n) should be the initial capacity of the listOfPrimeNumbers. This will ensure that your program does not constantly resize the list under the covers, which can be expensive.
That is, if you really want to be efficient. If you dont care, then just set the initial capacity of that list to something higher. Right now you have it set to the default, which means your listOfPrimeNumbers will have expand.

Time complexity -
 Time complexity of calculating all primes below n in the random access machine model is O(n loglog n) operations, a direct consequence of the fact that the prime harmonic series asymptotically approaches log log n. It has an exponential time complexity with regard to input size, though, which makes it a pseudo-polynomial algorithm.

The bit complexity of the algorithm is O(n (log n) (log log n)) bit operations with a memory requirement of O(n)

References
geeksforgeek ,
stackoverflow 1, 2, 3,

Thanks.

0 comments:

Post a Comment