Monday, April 26, 2010

Scrabble algorithm

If you are playing scrabble and you have n letters, what are all the possible words that you can make from these n letters?

It looks as if it would involve permutations and combinations both of the letters or a union of permutations and combinations for n letters. Surprisingly, it is much simpler.

Below are the possible words for letters 'abc'. I have highlighted the patterns in them.

a
ab
abc
ac
acb
b
ba
bac
bc
bca
c
ca
cab
cb
cba

For n letters, it looks like we have to get permutations of n letters plus permutations of n-1 letters and so on till permutations of each letter which is O(nCn * n^n) + O(nCn-1 * n-1^n-1) + ... O(nC1 * 1^1). The time complexity is quite high.

I think I said simpler. :D 

All possible words for n letters can be obtained by using the permutations algorithm of n letters with a very simple mod. The changes are in blue.


permutation(char[] in):
    //initially, used[] set to false and depth is 0
    permuteWords(in, out, used, 0)

permuteWords(char[] in, buffer out, bool[] used, int depth):
    print out
    if depth = in.length:
        print out
        return
    for (int i = 0; i < in.length; i++):
        if used[i] = true:
            continue
        used[i] = true
        out.append(in[i])
       
        permuteWords(in, used, out, depth +1)
       
        out.length = out.length -1
        used[i] = false


Time complexity is the same as that of permutations O(n^n).

A better way is to have each word added to a list (the list would be created in permutation method and passed as parameter to permuteWords method) for extensions of the problem. 

Possible extensions to this scrabble problem are:

  • Find the longest word. 
  • Find the word with maximum points given a map of points to each letter.
  • You have a dictionary, how would you use it to get the valid words? For this, I would assume that the dictionary has a function like boolean isValidWord(String s) which would indicate if a particular word is valid or not. Here is another interesting thought with the dictionary extension which requires a trie structure. I could use a dictionary, to see if I can bust out of permuting further as mostly not all permutations are going to be valid words. For example, if you have a dictionary implemented as a trie with a function boolean hasPrefix(String s) or hasPrefix(char c) and letters 'aabc' and suppose that the dictionary does not have a trieNode structure a-a, you do not have to permute further for the unused characters 'bc'. Below is the changed code that I came up with (not tested). 

Set getValidWordsPermutation(char[] in):
    Set validWords = new HashSet();
    //initially, used[] set to false and depth is 0
    permute
Words(in, out, used, 0, validWords);
    return validWords;

permute
Words(char[] in, buffer out, bool[] used, int depth, Set validWords):
    // if the current prefix is not present in the dictionary, then do not permute further with the rest of the unused letters
    if !dict.hasPrefix(out): //or dict.hasPrefix(
out[out.length()])
        return
   
    //if current prefix is a word, add to the set
    if dict.isValidWord(out):
        validWords.add(out)

    if depth = in.length:
        return

    for (int i = 0; i < in.length; i++):
        if used[i] = true:
            continue
        used[i] = true
        out.append(in[i])
       
        permuteWords(in, used, out, depth +1,
validWords)
       
        out.length = out.length -1
        used[i] = false


Using the dictionary, only those trieNodes forming valid words possible from the n letters are traversed in the dictionary trie [O(|V|+|E|) time] and not all permutations [O(n^n) time]. If all permutations are valid words OR if there is no such hasPrefix(String s) function for trie, then it takes O(n^n) time - which is better than O(nCn * n^n) + O(nCn-1 * n-1^n-1) + ... O(nC1 * 1^1). 

0 comments:

Post a Comment