Monday, April 26, 2010

Important ds question

a) Write a function that validates whether a tree is a BST or not. b) Write a function to find the common ancestor of two given nodes of binary tree. c) In a BST, value of  two of the nodes got exchanged by mistake. Write a function to correct the BST. d) In a array, there exists a majority element(a no. repeated atleast n/2 + 1 times) and rest of the no.s can be anything. find that majority element in a single pass without using extra space. e) Find the no. of root nodes in a Binary tree. f) write a function to create  a...

Longest common subsequence : Dynamic programming

m: A B C B D A B yn: B D C A B A The LCS is B C B A For arbitrary inputs, it is NP-hard. For constant inputs, DP gives the solution in polynomial time ex: O(n^k). Also, there can be more than one LCSs and the solution for that is exponential time ex: O(k^n). Here we maintain an LCS matrix of m*n size. Initially, for i= 0 or j = 0, LCSij = 0. Then we use the following optimal substructure. LCSij = 0                                       ...

GCD and LCM

Finding the GCD GCD for 60 and 40 is 20:  60 = 40 * 1 + 20 40 = 20 * 2 + 0 LCM for 60 and 40 is 120: 60*40 / 20 Recursive solution for GCD int gcd(a, b): if b==0: return a return gcd(b, a%b) Iterative solution for GCD int gcd(a, b): while (b != 0): //swap a with b, and b with a%b temp = a%b a = b b = t return a Finding the LCM Method1 : Using GCD int lcm(a, b): return (a*b)/gcd(a, b) Method 2: Solution: least common multiple (LCM) is the lowest value that is a multiple of both...

External Sorting - How do you sort a million 32-bit integers in 2MB RAM?

A million 32-bit integers is 4MB. We cannot fit them all in the given RAM size and we have to do external sorting. It is an open ended question which is interactive. Let's assume that - The list of numbers is on disk (as we cannot fit them all in RAM). There is a sort function which sorts the given list effectively i.e. we do not have to worry about a particular sorting algorithm. There is lots of disk space.  As we are short on RAM, the time complexity is not a constraint. This is an important one. I would: - divide the...

maximum index of an array

Space for 200,000,000 integers : 800MB 256,000,000 integers : 1024MB Integer.MAX_VALUE (2,147,483,647) : ~8600MB Integer.MAX_VALUE/2 (1,073,741,823) : ~4300MB //arrayindextest.java class arrayindextest {     public static void main(String[] args) {         int[] a = new int[256000000]; //or 200000000         System.out.println(a.length);     } } -Xmx set max heap size flag -Xms set min heap size flag 200,000,000 integers : 800MB With new int[200000000]:...

Missing integers

If you have a cheap mainstream commodity hardware consisting of a decent processor and 1GB RAM, how would you find a few (more than one) missing integers in a list saved on disk containing all of the unique 2^32 integers? One approach is to use an array of boolean flags and marking them as true for present integers (index of the array denotes flag for the integer). Traversing the flags array again would give the missing integers. However, 2^32 integers each of 4B require 16GB of space alone. A boolean array for 2^32 locations each of 1B...

print combinations

All possible combinations of a 3 digit binary number are 2c1 * 2c1 * 2c1 = 8 (000, 001, 010, 011, 100, 101, 110, 111). Each digit can be 0 or 1 so 2c1. Below is the code for it. Starting with 000, we get the next combination by incrementing the LSbs and carrying out the ripple effect with the for loop. void printCombinations(int digits):     char[] out = new char[digits]     for (int i = 0; i < out.length; i++):         out[i] = '0'     while (true):    ...

Trie examples

I mentioned Tries before here and here. I thought of two more popular examples. Google Suggestions (now in production) and Gmail autocomplete logic would be implemented as a Trie (prefix tree). I could think of two ways this could be done using a Trie. The first way is to DFS from the current prefix or node to get to all the possible words below. This approach is time-intensive due to its recursive nature and the user probably would...

Scrabble algorithm

If you are playing scrabble and you have n letters, what are all the possible words that you can make from these n letters? It looks as if it would involve permutations and combinations both of the letters or a union of permutations and combinations for n letters. Surprisingly, it is much simpler. Below are the possible words for letters 'abc'. I have highlighted the patterns in them. a ab abc ac acb b ba bac bc bca c ca cab cb cba For n letters, it looks like we have to get permutations of n letters plus permutations of n-1 letters...

Permutations and Combinations of string

Permutations of string: For a string of length n, there are n! permutations. For string abcd, there are 24 permutations. We would have a wrapper permutation method which takes the string and calls the recursive permute method. The basic idea is that the permutations of string abc are a + permutations of string bc, b + permutations of string ac and so on. The permute method uses a bool[] used to indicate which characters of the string are already used in the buffer out (to avoid repetitions). The variable depth is used to indicate...

Tries

Trie (pronounced as try even though it stands for tree and retrieval) is a prefix tree data structure. You can read more on it on topcoder and wiki. Trie mostly has a root node as an empty string ("") and each node has 26 child pointers letters (A to Z) at least. As we add words to it, the trie grows. Here is an example of how a trie looks (from www.danvk.org) after adding a few words to it. Note that the complete words are double rounded....

superstring

Some code for superstring given two strings. It assumes that first string is longer than second string. If the lengths of the strings first and second are m and n, it takes O(mn) time (as m >= n, the other two loops run for O(n^2) time). There are three loops, first loop looks for second string within the first string, second loop looks for the second string before the first string and third loop looks for after. A list of results is maintained from which the string which is least lengthy and lexicographically smallest is returned as...

Find the distinct words from a text file

The data structure of the trie and its addWord() method remains the same. class trieNode:     trieNode[26] children     char c     int countPrefix // words with prefix from root to current node     int countWord // complete words for this current node void addWord(trieNode node, string word):         //end of the word         if (word = ""):             node.countWord = node.countWord...

DFS and BFS on trees, with and without recurssion

BFS (Breath First Search) with and without recurssion  DFS with Recursion and without recursion...

Remove duplicates from the sorted array

Problem   Remove duplicates from the sorted array Example [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5] -> [1, 2, 3, 4, 5] Method 1- using 3 pointers //using 3 pointers fixDuplicatesInSortedArray(Integer[] a): start = 0 end = 0 destination = 0 while(start < a.length): while(end < a.length && a[end] == a[start]): end++ end-- //copy the distinct element a[destination] = a[start] destination++ start = end + 1 end = start //null...

print all word combinations from phone number

Print all words combinations from phone number. ex: 111-1111 -> AAA-AAAA, AAA-AAAB ... CCC-CCCC char charkeyMap(int digit, int position) ex: charkeyMap(1, 0) = 'A' charkeyMap(1, 1) = 'B' charkeyMap(1, 2) = 'C' void printCombinations(int[] in):     char[] out = new char[in.length]         //first combination ex: 111-1111 -> AAA-AAAA     for(int i = in.length -1; i >= 0; i--):         out[i] = charkeyMap(in[i], 0)     int i    ...

String matching

The naive brute force approach: It is very simple to code but there are little things to look for (as least I have to) with respect to optimization. The outer loop runs for text, the inner loop runs for pattern with one more condition -  i+ len(pattern) <= len(text). int bruteForce (text, pattern)     for (i =0; i < len(text); i++):         for (j = 0; j < len(pattern) && i+ len(pattern) <= len(text); j++): //instead of i+j < len(text)        ...

Sorting binary array - Array containing only 0 and 1 in one pass OR two color sort

Problem Sort a binary array - array containing 0s and 1s in 1 pass. This is also called two color sort. Example Input - Binary unsorted array A = {0,1,1,0,0,1}; Output - binary sorted array B = {0,0,0,1,1,1} Solution This is similar to quicksort partition algorithm. Though normal sorting takes O(n log n) time, but we have to just partition 0 from 1, and hence it should only take time of partitioning - O(n). Lets see how? We take low at index 0 of array, high at the of the array. Now we will follow this pseudocode: Pseudocode: low = 0;...