Monday, May 16, 2011

What is BST or Binary Search tree?

A binary search tree (BST) is a tree which has following property :
Every node's left sub-tree has a key less than the node's key, and every right sub-tree has a key greater than the node's key.

The major advantage of binary search trees is that the related sorting algorithms and search algorithms, such as in-order traversal, can be very efficient. This data structure is perfect for advanced interview questions. See the figure below for this tree:

From this figure we can portray the following:
6 - Root node of the binary tree
2 - Parent node of the nodes 1 & 5
2 & 8 - Children nodes of 6 and these nodes are in a level one more than the root node 6
8 - Right child node of 6
2 - Left child node of 6
1 & 6 - These node are without a child, so they are referred to as leafs or external nodes
2 & 8 - These nodes have children/child, so they are referred to as internal nodes

Depth or Hieght of this binary tree - The longest path from the root node 6 to any leaf node, for example 1 node. Therefore, the depth of this binary tree is two. In the almost balanced tree, it is nearly log n(to base 2) and nearly n in case of almost unbalanced binary tree.

Indegree of a node - The number of edges merging into a node. For example, indegree of the node 2 is one i.e. one edge merges.
Outdegree of a node - The number of edges coming out from a node. For example, outdegree of the node 2 is two i.e. two edges comes out of this root node.

Implementing BST / ADT
A binary tree is made of nodes, where each node contains a "left" pointer, a "right" pointer, and a data element. The "root" pointer points to the topmost node in the tree. The left and right pointers recursively point to smaller "subtrees" on either side. A null pointer represents a binary tree with no elements -- the empty tree.
Here is the example is this:


 So, here root is 5, which has left node as 3, and right node as 9. Also, note that 9 has no right node and 1,4,6 are leaf nodes as they have both right and left pointers null.

Following will be class structure of BST (following c/cpp) :
struct node {
  int data;
  struct node* left;
  struct node* right;
} ;


Note: Pointer Changing Code There is a common problem with pointer intensive code: what if a function needs to change one of the pointer parameters passed to it? For example, the insert() function below may want to change the root pointer. In C and C++, one solution uses pointers-to-pointers (aka "reference parameters"). That's a fine technique, but here we will use the simpler technique that a function that wishes to change a pointer passed to it will return the new value of the pointer to the caller. The caller is responsible for using the new value. Suppose we have a change() function that may change the the root, then a call to change() will look like this... // suppose the variable "root" points to the tree root = change(root); We take the value returned by change(), and use it as the new value for root. This construct is a little awkward, but it avoids using reference parameters which confuse some C and C++ programmers, and Java does not have reference parameters at all. This allows us to focus on the recursion instead of the pointer mechanics.
BST Speciality
Basically, binary search trees are fast at insert and lookup. The next section presents the code for these two algorithms. When we search for 4 (tree above) , we know, it is less than 5, so we go left, we get 3. We know 4 is more than right, so we go to right and finally find it.

Height of Binary tree

Height of the binary tree plays special role in lookup and other time operation time complexity. For balanced binary search tree, with N nodes can locate the item in order lg(N) time (log base 2), which is equivalent to depth of BST. By balanced I mean that height of left and right part of the tree are equal:


 But for unbalanced tree like this one (see below), the time complexity for lookup is O(n), which equivalent to height of BST.



Therefore, binary search trees are good for "dictionary" problems where the code inserts and looks up information indexed by some key. The lg(N) behavior is the average case -- it's possible for a particular tree to be much slower depending on its shape. So, in any case lookup time is O (height)

Operations of BST
 Following are the operations we perform on BST
  • Lookup
  • Insert
  • Delete
Other operations:
  • Getting Maximum -This is the same concept as minimum but you move to the right.
  • Finding Predecessor -
    -- Easy Case - If k's left subtree is not empty, just return the max key in its left subtree.
    -- Harder Case - Go up parent pointers until you reach one that is less than you.
  • In-Order Traversal -Let R be the root node of the search tree, with substrees Tl and Tr. Recurse on Tl. Next print out R's key. Lastly, recurse on Tr.
Lookup
Here is the code for lookup
/*
Given a binary tree, return true if a node
with the target data is found in the tree. Recurs
down the tree, chooses the left or right
branch by comparing the target to each node.
*/
static int lookup(node* node, int target) {
// 1. Base case == empty tree
// in that case, the target is not found so return false
    if (node == NULL) {
    return(false);
    }
    else {
// 2. see if found here
    if (target == node->data)
         return(true);
    else {
// 3. otherwise recur down the correct subtree
        if (target < node->data)
             return(lookup(node->left, target));
        else return(lookup(node->right, target));
    }
  }
}


Insert 
We piggy back on search only.We took the convention that if the key value of the inserted node is equal to the one present in tree, we move to the left of the tree.

/*gets the node of tree with value initialized with data*/
node* newNode(int data) {
  node* temp = (node*) malloc(sizeof(node)); // "new" is like "malloc"
  temp->data = data;
  temp->left = NULL;
  temp->right = NULL;

  return(temp);
}

/*
Give a binary search tree and a number, inserts a new node
with the given number in the correct place in the tree.
Returns the new root pointer which the caller should
then use (the standard trick to avoid using reference
parameters).
*/
struct node* insert(struct node* node, int data) {
// 1. If the tree is empty, return a new, single node
    if (node == NULL) {
        return(newNode(data));
    }
    else {
    // 2. Otherwise, recur down the tree
    if (data <= node->data) 
        node->left = insert(node->left, data);
    else 
        node->right = insert(node->right, data);
    
     return(node); // return the (unchanged) node pointer
   }
}

Getting the Min and Max of BST

Getting the Predecessor of node

Getting the Successor of a node

Traversals in BST - Inorder, PreOrder and Post Order

Deleting a Node from BST 

Select and Rank

 








0 comments:

Post a Comment