Friday, January 6, 2012

Find all subsets of a given set OR Find power set of a given set

Problem
Write a method that returns all subsets of a set.
Example
If we're given a set of integers such that S = {1, 2, 3}, how can we find all the subsets of that set? For example, given S, the set of all subsets i.e. P(S) we get are {}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, and {1, 2, 3}.

Here is {} is empty set denoted by Φ.

If S has n elements in it then P(s) will have 2^n elements.P(S) is also called power set.

Solution
This can be solved in couple of ways using recursion, backtracking.

Method 1 - Using recursion
If we have to find the power set of a,b,c
Recursively, P(abc) = {abc} + P(ab) + P(ac), and so on
Suppose we are given a set A = {1,2,3}, which is the originalSet. We want to get the outputSet.
Pseudocode
set powerSet(A){
  add set A to outputSet;
  for each item in A {
   let subset = A - item,
   add powerset(substring) to outputSet
  }
  return outputSet;      
}

Here is the code in java
public static <T> Set<Set<T>> powerSet(Set<T> originalSet) {
    Set<Set<T>> sets = new HashSet<Set<T>>();
    if (originalSet.isEmpty()) {
     sets.add(new HashSet<T>());
     return sets;
    }
    List<T> list = new ArrayList<T>(originalSet);
    T head = list.get(0);
    Set<T> rest = new HashSet<T>(list.subList(1, list.size())); 
    for (Set<T> set : powerSet(rest)) {
     Set<T> newSet = new HashSet<T>();
     newSet.add(head);
     newSet.addAll(set);
     sets.add(newSet);
     sets.add(set);
    }  
    return sets;
}

And a test, given your example input:
 Set<Integer> mySet = new HashSet<Integer>();
 mySet.add(1);
 mySet.add(2);
 mySet.add(3);
 for (Set<Integer> s : SetUtils.powerSet(mySet)) {
     System.out.println(s);
 }

Can we get rid of loops? Yes we can, here is the pseudocode:
powerset(originalSet) {
  add originalSet to set;
  add powerset2(originalSet,0) to set;
  return set
}

powerset2(originalSet,pos) {
  if pos<length(originalSet) then
    let subSet = (string excluding the char at pos)
    add powerset(subSet) to set
    add powerset2(originalSet,pos+1) to set
  else 
    add "" to set
  endif
  return set
}

Here is the code in java
public static <T> Set<Set<T>> powerSet2(Set<T> originalSet) {
    Set<Set<T>> sets = new HashSet<Set<T>>();
    if (originalSet.isEmpty()) {
        sets.add(new HashSet<T>());
        return sets;
    }
    List<T> list = new ArrayList<T>(originalSet);
    T head = list.get(0);
    Set<T> rest = new HashSet<T>(list.subList(1, list.size()));
    addSets(sets, powerSet(rest), head);
    return  sets;
}

private static <T> void addSets(Set<Set<T>> sets, Set<Set<T>> setsToAdd, T head) {
    Iterator<Set<T>> iterator = setsToAdd.iterator();
    if (iterator.hasNext()) {
        Set<T> set = iterator.next();
        iterator.remove();
        Set<T> newSet = new HashSet<T>();
        newSet.add(head);
        newSet.addAll(set);
        sets.add(newSet);
        sets.add(set);
        addSets(sets, setsToAdd, head);
    }
}

Time complexity - O(2^n)

Method 2 - Using bit strings

Consider the example
Set  = [a,b,c]
power set size = pow(2, 3) = 8
Run for binary counter = 000 to 111

Value of Counter          Subset
--------------------------------------
   000                      Empty set
   001                      a
   011                      ab
   100                      c
   101                      ac
   110                      bc
   111                      abc

To solve this problem there are two techniques that we need to understand:
  1. Determine the number of subsets using bit strings: we know that the number of subsets equals to 2 to the power of the number of elements in the superset: #subsets = 2 ^ #elements. For instance, if our superset is S = {1, 2, 3}, then it has 2^3 = 8 subsets. If our superset has four elements then it has 2^4 = 16 subsets. Moreover, by shifting the bit representation of the number 1 by n, we also get 2^n. Thus, if we shift the bit string of 1 by the number of elements in the superset, we'll get the number of subsets for that superset. For example, if we have S = {1, 2, 3}, then there are 1 << 3 = 2^3 subsets in S.
  2. Keeping track of data using bit manipulation: for this problem, we will use a bit string to keep track of subsets and their elements. Take S = {1, 2, 3} as an example. If the bit string is 100 then the subset we're working on has only one element which is 1. If the bit string is 110 then the subset we're working on has two elements which are 1 and 2. We will use & and << bit operators to figure out which bit is 1 in the bit string.
If you are not familiar with bit manipulation please read "Bit manipulation tips & tricks" part 1 and part 2. Here is the code for the algorithm written in Java:


public static void findSubsets(int array[])
{
  int numOfSubsets = 1 << array.length; //2^n

  for(int i = 0; i < numOfSubsets; i++)
 {
   int pos = array.length - 1;
   int bitmask = i;

   System.out.print("{");
   while(bitmask > 0)
   {
    if((bitmask & 1) == 1)
      System.out.print(array[pos]+",");
    bitmask >>= 1;
    pos--;
   }
   System.out.print("}");
 }
}
Code explanation:
  1. First, we find out the number of subsets that the superset has by shifting the bit representation of 1 by the number of elements in the superset.
  2. Next, we just loop through each subset and generate its elements accordingly.
  3. Inside the loop, we use pos and bitmask to keep track of the element. Specifically, bitmask is the bit string that represents elements in the current subset. And, we use post to retrieve the correct element from the superset.
  4. The while loop will add the correct element to the subset. Note that bitmask & 1 equals to 1 only when bitmask has a '1' bit at the last position. For example, bitmask = "001" or "011" will make bitmask & 1 equal to 1. That's when we'll add an element into the subset. Why does it work? Well, for each iteration of the while loop, we'll shift bitmask by one bit position to the right (bitmask >>= 1) and we decrement pos by 1 (pos--). Thus, whenever there is a '1' bit at the last bit position, we know exactly which element to add into the subset. If you are confused, don't worry because we'll do an example!
  5. After finishing one subset, we go to a new line and continue processing the rest.
Dry running the above code
set is {1, 2, 3, 4, 5}
i from 0 to 31:
i = 00000 => print {}
i = 00001 => print {5} (or 1(if pos=0, and pos is incremented, 
                            the order in which you do it really shouldn't matter)
i = 00010 => print {4}
    00011 => print {5, 4}
    00100 => print {3}
    00101 => print {5, 3}
    00110 => print {4, 3}
    00111 => print {5, 4, 3}
        ...
    11111 => print {1, 2, 3, 4, 5}


Time complexity - O(n.2^n)

References
http://stackoverflow.com/questions/2779094/what-algorithm-can-calculate-the-power-set-of-a-given-set
http://stackoverflow.com/questions/1670862/obtaining-powerset-of-a-set-in-java
http://stackoverflow.com/questions/14224953/get-all-subsets-of-a-set
http://stackoverflow.com/questions/4640034/calculating-all-of-the-subsets-of-a-set-of-numbers
http://www.geeksforgeeks.org/power-set/

2 comments:

  1. What is the complexity of this code?

    ReplyDelete
    Replies
    1. Sorry for the late response. The time complexity of method 2 will be O(n.2^n). I know it may not help now as I am over an year late, but still replying.

      Delete