This updated information has been further expanded upon on my new website. You can find the updated details here: https://k5kc.com/cs/algorithms/subtree-of-another-tree-problem/.
You have two very large binary trees: T1, with millions of nodes, and T2, with hundreds of nodes. Create an algorithm to decide if T2 is a subtree of T1.
Solution
Method 1 - Traversing through T1 and finding subtree at each node, if T1's node = T2.root
Here's what we will do:
- Traverse T1.
- If the current node is equal to the root node of T2, traverse both of the trees (T2 and the current subtree of T1) at the same time.
- Compare the current node. If they are always equal, T2 is a subtree of T1.
public static boolean isSubtree(BinaryTreeNode<Integer> T1, BinaryTreeNode<Integer> T2) { if (T1 == null) return false; else if (T1.data.equals(T2.data) && isTwoTreeEqual(T1, T2)) { return true; } else { return isSubtree(T1.left, T2) || isSubtree(T1.right, T2); } } public static boolean isTwoTreeEqual(BinaryTreeNode<Integer> root1, BinaryTreeNode<Integer> root2) { if (root1 == null && root2 == null) return true; else if (root1.isLeaf() && root2.isLeaf()) return root1.data.equals(root2.data); else if (!root1.isLeaf() && !root2.isLeaf()) { if (!root1.data.equals(root2.data)) return false; return isTwoTreeEqual(root1.left, root2.left) && isTwoTreeEqual(root1.right, root2.right); } else return false; }
Though this is good, but we may run out of memory stack, as we are recursing in 2 functions, and isTwoTreeEqual can be improved.
public static boolean containsTree(BinaryTreeNode<Integer> T1, BinaryTreeNode<Integer> T2) { if (T2 == null) return true; // The empty tree is always a subtree else return subTree(T1, T2); } private static boolean subTree(BinaryTreeNode<Integer> r1, BinaryTreeNode<Integer> r2) { if (r1 == null) return false; // big tree empty & subtree still not found. if (r1.data == r2.data) { if (matchTree(r1, r2)) return true; } return (subTree(r1.left, r2) || subTree(r1.right, r2)); } private static boolean matchTree(BinaryTreeNode<Integer> r1, BinaryTreeNode<Integer> r2) { if (r2 == null && r1 == null) return true; // nothing left in the subtree if (r1 == null || r2 == null) return false; // big tree empty & subtree still not found if (r1.data != r2.data) return false; // data doesn’t match return (matchTree(r1.left, r2.left) && matchTree( r1.right, r2.right)); }
Time complexity -O(n+m), where n is size of T1, and m of T2
If k is the number of occurrences of T2’s root in T1, the worst case runtime can be characterized as O(n + k * m).
Method 2 -Pre process the tree T1
- Pre-process the T1 tree, keeping the list of all possible subtree roots (the cache list will have a millions of entries)
- Sort the cached list of roots by the descending order of data, kept in root. You may choose more elegant sorting function, for example, parse a character tree into string.
- Use binary search to locate the necessary sub-tree. You may quickly reject subtrees, with the number of nodes, not equal to the T2 number of nodes (or with the different depth).
Reference - stackoverflow
0 comments:
Post a Comment