Sunday, January 3, 2010

Count bits set in an integer

Question: Write a function that determines the number of bits set to 1 in the binary representation of an integer.

Method 1 - Simple method (Right shift integer until it becomes 0)

Going by the brute force approach, it would be easy to come up with the following solution. If the least significant bit of a number is 1, it will return a result of 1 when we AND it with 1. Using this logic, we can shift bits in the number to the right and keep testing if the least significant bit is 1. When there are no 1′s left in the number, it will become zero. Here’s some code to illustrate this simple logic.

C code
int numOnesInInteger( int num )
{
     int numOnes = 0;
     while( num != 0 )
     {
          if( ( num & 1 ) == 1 ) {
              numOnes++;
          }
          num = num >> 1;
     }
     return numOnes;
}


Time Complexity: (-)(logn) (Theta of logn)

Method 2 - Brian Kernighan’s Algorithm using & operator
Think we can do better? The above algorithm runs in O(n) where n in the number of bits in the integer. Maybe we can make this a tad better. But this would need some real creative bit manipulation skills. So let’s think, what happens to a number when it we AND it with (number – 1). Let’s take 7 (0111) as an example.

7 & 6 = 0111 & 0110 = 0110 = 6

6 & 5 = 0110 & 0101 = 0100 = 4

4 & 3 = 0100 & 0011 = 0000 = 0

Do you see a pattern yet? Essentially, when you AND a number with (number – 1), the last 1 bit in the number gets set to zero. So if we continue to set every 1 bit in the number to zero, the number will eventually become zero. If we use this logic to count the number of 1s, we have an algorithm that runs in O(m) where m is the number of bits set to 1 in the number. Here’s some code.

C code
int numOnesInInteger( int num )
{
     int numOnes = 0;
     while( num != 0 ){
          num = num & (num – 1);
          numOnes++;
     }
     return numOnes;
}

Time Complexity: O(logn)

Method3 - Using lookup table
This method is very popular because it uses a lookup table. This speeds up the computation. What it does is it keeps a table which hardcodes the number of bits set in each integer from 0 to 256.

For example

0 - 0 Bit(s) set.
1 - 1 Bit(s) set.
2 - 1 Bit(s) set.
3 - 2 Bit(s) set.
...

So here is the code...

C code
const unsigned char LookupTable[] = { 0, 1, 1, 2, 1, 2, 2, 
3, 1, 2, 2, 3, 2, 3, 3, 4, 
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8};


unsigned int num; 
unsigned int ctr; // Number of bits set.

ctr = LookupTable[num & 0xff] + 
LookupTable[(num >> 8) & 0xff] + 
LookupTable[(num >> 16) & 0xff] + 
LoopupTable[num >> 24];

Time complexity -  We can count bits in O(1) time using lookup table. Though space complexity is large.

Please see http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetTable for details.

Method 4 -From Hacker's Delight, p. 66, Figure 5-2
int pop(unsigned x)
{
    x = x - ((x >> 1) & 0x55555555);
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
    x = (x + (x >> 4)) & 0x0F0F0F0F;
    x = x + (x >> 8);
    x = x + (x >> 16);
    return x & 0x0000003F;
}
Executes in ~20-ish instructions (arch dependent), no branching.

Hacker's Delight is delightful! Highly recommended.

Method 5 - Divide and conquer
This algorithm is based on Divide and Conquer Algorithm. Suppose there is a 8bit integer 213(11010101 in binary), the algorithm works like this(each time merge two neighbor blocks):

+-------------------------------+
| 1 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |  <- x
|  1 0  |  0 1  |  0 1  |  0 1  |  <- first time merge
|    0 0 1 1    |    0 0 1 0    |  <- second time merge
|        0 0 0 0 0 1 0 1        |  <- third time ( answer = 00000101 = 5)
+-------------------------------+

unsigned int count_bit(unsigned int x)
{
  x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
  x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
  x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
  x = (x & 0x0000FFFF) + ((x >> 16)& 0x0000FFFF);
  return x;
}


Method 5 - XOR the number and -negative of number, till number becomes 0

public static int myBitCount(long L){
      int count = 0;
      while (L != 0) {
         count++;
         L ^= L & -L; 
      }
      return count;
  }


Some non-language agnostic solution. 
Java
If you happen to be using Java, the built-in method Integer.bitCount will do that.

GNU compiler
On the GNU compiler for example you can just use:
  int __builtin_popcount (unsigned int x);
In the worst case the compiler will generate a call to a function. In the best case the compiler will emit a cpu instruction to do the same job faster.
The GCC intrinsics even work across multiple platforms. Popcount will become mainstream in the x86 architecture, so it makes sense to start using the intrinsic now. Other architectures have the popcount for years.

Use of counting bits set
 There are couple of uses of counting set bits :
  • Check if number is power 2 - referred as method 1 here.
  • You can count number of bits to be flipped for converting number A to B - here.
Thanks.

References
stackoverflow

0 comments:

Post a Comment