Science  People  Locations  Timeline
Index: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

Home > Binary search tree


 Contents
In computer science, a binary search tree (BST) is a binary tree where every node has a value, every node's left subtree has values less than or equal to the node's value, and every right subtree has values greater. Note that this requires a linear order on the values. A new node is added as a leaf. There is a sort algorithm based on binary search trees, and also a search algorithm. An in-order traversal of a binary search tree will visit the values in increasing order.

1 Operations

1.1 Search

Searching a binary tree for a specific value is easy due to its special properties. We begin by examining the root. If the value equals the root, we have found it. If it is less than the root, then it must be in the left subtree, so we recursively search the left subtree in the same manner. Similarly, if it is greater than the root, then it must be in the right subtree, so we recursively search the right subtree in the same manner. If we reach a nil leaf, then the item is not where it would be if it were present, so it is not in the tree at all. A comparison may be made with binary search, which operates in nearly the same way but using random access on an array instead of following links. Here is the search algorithm in the Python programming language:

def search_binary_tree(treenode, value): if treenode is None: return None # failure left, nodevalue, right = treenode if nodevalue > value: return search_binary_tree(left, value) elif value > nodevalue: return search_binary_tree(right, value) else: return nodevalue

Since this operation can follow any path from the root to a leaf, it requires worst-case time proportional to the height of the tree.

1.2 Insertion

Insertion is almost identical to search: we search for the value, but if we do not find the value we continue searching along either the left or right branch. Eventually we will reach a nil leaf, and then we simply add the value at that position. Put differently, we examine the root and recursively add it to the left subtree if the new value is less than or equal the root, or the right subtree if the new value is greater than the root. Here is the insert algorithm in Python:

def binary_tree_insert(treenode, value): if treenode is None: return (None, value, None) left, nodevalue, right = treenode if nodevalue > value: return (binary_tree_insert(left, value), nodevalue, right) else: return (left, nodevalue, binary_tree_insert(right, value))

Since this operation can follow any path from the root to a leaf, it requires worst-case time proportional to the height of the tree.

1.3 Deletion

Deleting a node which has no children is easy: we just delete it. If a node has only one child, we delete it and replace it with its child. The tricky part comes when we want to delete a node with two children. Call the node being deleted N. In order to do this, we need to replace the value of N with some other value from the tree which is greater than or equal to all nodes in N's left subtree, and less than or equal to all nodes in N's right subtree.

Where will we find such a value? We observe that are two such values: the largest value in N's left subtree, found by repeatedly following right links, and the smallest value in N's right subtree, found by repeatedly following left links. Once we find it, we copy its value into N, and then delete it. Since either of these nodes has less than two children, we already know how to delete it. In a good implementation, it's recommended to avoid consistently using one of these nodes, because this can unbalance the tree.

Here is Python code for deletion:

def binary_tree_delete(treenode, value): if treenode is None: return None ; Value not found left, nodevalue, right = treenode if nodevalue = value: if left is None: return right elif right is None: return left else: maxvalue, newleft = find_remove_max(left) return (newleft, maxvalue, right) def find_remove_max(treenode): left, nodevalue, right = treenode if right is None: return (nodevalue, left) else: (maxvalue, newright) = find_remove_max(right) return (maxvalue, (left, nodevalue, newright))

Although this operation doesn't always traverse the tree down to a leaf, this is always a possibility; thus in the worst case, it requires time proportional to the height of the tree. It doesn't require more even when the node has two children, since it still follows a single path and visits no node twice.



Read more »

Non User