Saturday, August 29, 2009

Deleting a node from a Binary Search Tree

Suppose we want to delete a node k. First we need to search for the node k. Offcourse k should not be null.

Then we have three cases:

1. The node to be deleted has no children - The memory occupied by this node must be freed and either the left link or the right link of the parent of this node must be set to NULL.

2. The node to be deleted has exactly one child -
We have to adjust the pointer of the parent of the node to be deleted such that after deletion it points to the child of the node being deleted.

3. The node to be deleted has two children -
We need to find the inorder successor of the node to be deleted.
The data of the inorder successor must be copied into the node to be deleted and a pointer should be setup to the inorder successor. This inorder successor would have one or zero children.
This node should be deleted using the same procedure as for deleting a one child or a zero child node.
Thus the whole logic of deleting a node with two children is to locate the inorder successor, copy its data and reduce the problem to a simple deletion of a node with one or zero children.

Case 1: The node to be deleted has no children
Example. Remove 1 from a BST.



 Remove 1 , that's all.


Case 2: The node to be deleted has 1 child
Example. Remove 18 from a BST.

After removing the node, we just append 1 to 5.


Case 3: Node to be deleted has 2 children
This is the most complex case. Consider the tree below , suppose we want to delete the node 3 from the tree. Here we update the node such that we will reduce it to case 1 or 2. 


How do we calculate inorder successor or predecessor?
Here we have to find the inorder successor or predecessor L of node K - the node which has to be deleted. So here the next small element which is smaller than 3 is 2, i.e. L. 

We already know how to calculate predecessor L. What it depends on is whether the node to be deleted has non-empty left sub tree. If the tree has non-null left sub tree then just go to the right most child of the tree. So, here because we have 2 children, so this case will always prevail. 



Approaching the problem now

The same approach can be utilized to remove a node, which has two children:
  • find a minimum value in the right subtree(this is the inorder successor of the node to be deleted).
  • replace value of the node to be removed with found minimum. Now, right subtree contains a duplicate!
  • apply remove to the right subtree to remove a duplicate, that is call remove function(in our case delete) with right subtree now as root node.
Apply this on the example:
Finding the inorder successor
Swap inorder successor with node to be deleted , here 3
Now, we have to delete 3, which fits in the case 1 - node with no child, just delete it :)


Here is the function I implemented:

A node can be removed from a Binary Search Tree using two approaches.
1. Double Pointer Approach
2. Single Pointer Approach

Double pointer approach:

void remove(BinaryTreeNode **node, int data)
{
   if(*node == NULL)
   {
     std::cerr << "ERROR: Node does not exist\n";
   }
 
   if(data == (*node)->data)
   {
     if((*node)->right == NULL)
     {
       *node = (*node)->left;
     }
     else if((*node)->left == NULL)
     {
       *node = (*node)->right;
     }
     else
     {
       BinaryTreeNode **successor = getSuccessor(&(*node)->left);
       (*node)->data = (*successor)->data;
       remove(successor, (*successor)->data);
     }
   }
   else if(data < (*node)->data)
   {
     remove(&(*node)->left, data);
   }
   else
   {
     remove(&(*node)->right, data);
   }
}
 
//Tricky
BinaryTreeNode** getSuccessor(BinaryTreeNode **node)
{
  BinaryTreeNode **tmp = node;
  while((*tmp)->right != NULL)
    tmp = &(*tmp)->right;
  return tmp;
}

Single pointer approach
Note the difference between single pointer and double pointer approach. The difference help in clarifying how pointers work in C/C++

BinaryTreeNode* remove(BinaryTreeNode *node, int data)
{
  if(node == NULL)
  {
    std::cerr << "ERROR: Node does not exists\n";
    return node;
  }
 
  if(data == node->data)
  {
    BinaryTreeNode *retval = NULL;
  
    if(node->left == NULL)
    {
      retval = node->right;
      delete node;
      return retval;
    }
    else if(node->right == NULL)
    {
      retval = node->left;
      delete node;
      return retval;
    }
    else
    {
       BinaryTreeNode *successor = getSuccessor(node->left);
       node->data = successor->data;
       node->left = remove(node->left, successor->data);
    }
  }
  else if(data < node->data)
  {
    node->left = remove(node->left, data);
  }
  else
  {
    node->right = remove(node->right, data);
  }
 
  return node;
}
 
BinaryTreeNode* getSuccessor(BinaryTreeNode *node)
{
  while(node->right != NULL)
    node = node->right;
  return node;
}

Implementing the same in java
private BinaryTreeNode remove(BinaryTreeNode node, int data)
{
  if(node == null)
  {
    System.out.println("ERROR: Number does not exist");
    return null;
  }
 
  if(data == node.data)
  {
    if(node.left == null)
    {
      return node.right;
    }
    else if(node.right == null)
    {
      return node.left;
    }
    else //both the children exists
    {
      BinaryTreeNode successor = getSuccessorNode(node.left);
      node.data = successor.data;
      node.left = remove(node.left, successor.data);
    }
  }
  else if(data < node.data)
  {
    node.left = remove(node.left, data);
  }
  else
  {
    node.right = remove(node.right, data);
  }
 
  return node;
}
 
private BinaryTreeNode getSuccessorNode(BinaryTreeNode node)
{
  while(node.right != null)
  {
    node = node.right;
  }
  return node;
}


0 comments:

Post a Comment