Wednesday, January 4, 2012

Red Black tree

A red-black tree is a type of self-balancing binary search tree, a data structure used in computer science, typically used to implement associative arrays. The original structure was invented in 1972 by Rudolf Bayer who called them "symmetric binary B-trees", but acquired its modern name in a paper in 1978 by Leo J. Guibas and Robert Sedgewick, due to printing press technology of that time. Unfortunately, they were not able to print the paper in red black ink as was supposed, but the name stayed.

It is complex data structure, but has good worst-case running time for its operations and is efficient in practice: it can search, insert, and delete in O(log n) time, where n is the number of elements in the tree.

Properties/Invariants of RBT
A Red black tree has an additional field called color in BST.
A red-black tree is a binary search tree which has the following red-black properties OR invariants: 
  1. Every node is either red or black.
  2. Every NULL and root is black.
    (Note that NULL means whatever is the pointer from the leaf node)
  3. If a node is red, then both its children are black. So it also implies - no 2 reds in a row
    (Note that this doesn't mean, if node is black it's children should be red)
  4. Every simple path from a given node to a descendant leaf( OR NULL )contains the same number of black nodes.i.e. no. of black nodes from a node to the bottom of the tree is same, no matter which path you choose to go, this height is called black-height(x) for node x.
    So, when we traverse to NULL, it implies unsuccessful search as we can't find the node to be searched but reached the NULL or descendant leaf.
These properties forces tree to have height = log n, in which last 2 properties play main role.


Examples

Example 1 - RBT not possible with 3 nodes (kind of non-example)
Claim - A chain of length 3 cannot be a Red black tree
Proof -Consider the following tree below, and we have to color it such that it satisfies all the invariants of RBT.


To, satisfy the 2nd condition, we have to set node 1 as black, as the root must be black.
Now, the node 2 and 3 can be colored in 4 ways, Suppose we make node 2 as red node, then by invariant 3, node 3 should be black.


Now, the invariant 4 will be broken. It says that for one un-successful run, you pass through exactly same number of black nodes. Suppose we search 0 in the tree, then it will go to 1, and return, as left pointer of node 1 is NULL, and number of black nodes encountered is 1 :

Now, suppose we search 4 in the tree, then we keep on going right until we encounter null after 3, and here number of  black nodes encountered is 2 (both node 1 and 3) :


So, invariant 4 is violated. No matter what you try, some or the other invariant is violated.

This was kind of un-example, lets now move to actual example.

Example 2 :
One tree that is easily make red black tree is perfectly balanced tree, for example consider the following tree.

Can we color this tree RBT?

Suppose we color all the 3 nodes black :


invariant 1 - satisfied as nodes are either black or red
invariant 2 - satisfied as roots node 5 is black and so is leaf node 3 and 7
invariant 3 - satisfied as there is no red node, but if it had been there, its children would have been black.
invariant 4 - satisfied, as no matter what we un-successful search we do, we always encounter 2 black nodes.

hence, its a rbt now.

Example 3 : 
Now lets take a very very simple case of insertion in the binary tree of example 2.

(Note insertion in RBT is not that simple, please refer here for in-depth insertion tutorial on RBT, this example is take to understand RBT only)

Suppose we want to insert the element 6, of-course we will insert below node 7, but what color we give the node 6?


Suppose, we color 6 as black (image below) , but it will break invariant 4. Suppose we unsuccessfully search for 1, we traverse 2 black nodes(i.e. 5 and 1), but if we search for 5.5 we will traverse 3 black nodes(i.e. 5,7 and 6).
So, lets color it red, all the invariants will match now.



Also, it is not possible to color the nodes uniquely.In the above example if we re-color 6 to black and 7 and 3 to red, everything is still ok :


Height guarantee
Benefit of color and variant 3 and 4 in RBT is we have height guarantee.
Every red-black tree with n nodes has a height less than or equal to 2log2(n+1).
 

RBT Operations
RBT operations are same as BST (binary search tree) with insertions and deletions a little complex, to help us with guaranteed height.
Thanks.

0 comments:

Post a Comment