Advanced Trees

Advanced Trees

Definitions:

Red/Black Tree:

It was invented by Rudolf Bayer in 1972.  Red-black trees are one of the preferred methods of maintaining binary search trees.  An n node red-black tree has the height of O(logn).  Moreover balancing time for such a tree on account of insertion and deletion is O(logn).  A red-black tree is a binary search tree with the following properties:

  • Every node is colored either red or black.
  • The root node is black.
  • The value of any node is greater than the value of its left child and less than the value of its right child.
  • Every red node that is not a leaf node has only black children.
  • Every path from the root to a leaf contains the same number of black nodes.
  • All leaf nodes are black.

A red-black tree contains one more field for storing the color the node either red or black.

Splay Tree:

A splay tree is a binary search tree.  It was invented by Daniel Sleator and Robert Tarjan.  All normal operations on a splay tree are based on one basic operation called splaying.  Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree.  To do this, the element is first found with a standard tree search, then tree rotations are performed in a specific fashion to bring the element to the top.  Thus in a splay tree certain elements will be accessed more often than others, thereby, resulting in good performance for certain applications such as implementing caches.  It performs basic operations such as insertion, look-up and removal in O(logn) average time.

Trie:

A trie is a data structure used for storing strings.  One node is reserved for every common prefix.  The strings are stored in extra leaf nodes.

Patricia Tree:

It is a data structure used to represent a Trie in compact form where all nodes with one child are merged with their parent.

Suffix Tree:

It is a patricia tree corresponding to a given string.  It is a compact representation of a trie corresponding to the suffixes of a given string where all nodes with one child are merged with their parent.

Multi-Suffix Tree:

A multi suffix tree is a suffix tree extended to multiple strings by concatenating the strings.

Bucket Trie:

A bucket trie is a type of trie where leaf nodes are buckets which can hold k strings.  Thus the size of the bucket is generally taken to be fixed.

Compact trie:

It is a trie in which non-branching subtrees leading to leaf nodes are cut off.

Digital Tree:

A tree for storing strings in which nodes are organized by substrings common to two or more strings.

Digital Search Tree:

It is a data structure, a trie, which stores the strings in internal nodes, so there is no need for extra leaf nodes to store the strings.

Complete Binary Tree:

A full binary tree is a tree in which every node has zero or two children.  A perfect binary tree is a complete binary tree in which leaves are at the same depth.  Sometimes the perfect binary tree is also called the complete binary tree.

Tagged Union:

A tagged union is a data structure used to hold a value that could take on several different values of fixed type.  Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use.  It is also called a variant, variant record, or disjoint union.  Like ordinary unions, tagged unions can save storage by overlapping storage areas for each type since only one is in use at a time.  In languages with tagged unions, a tree node is often a tagged union of two types of nodes, one of which is a 3-tuple data, left child and right child and the other of which is a leaf node, which contains no data and functions much like the null value in a language with pointers.

Binary Space Partitioning:

It is a method for recursively subdividing a space into convex sets.  This subdivision gives rise to a representation of the scene by means of a tree data structure.

Binary Space Partition Trees:

Binary Space Partition Trees (BSP) were introduced by Fuchs, Kedem and Naylor around 1980.  BSP tree is a hierarchical subdivisions of n dimensional space into convex subspaces.  Each node has a front and back leaf.  Starting off with the root node, all subsequent insertions are partitioned by the hyperplane of the current node.  In two-dimensional space, a hyperplane is a line.  In three-dimensional space, a hyperplane is a plane.  The end goal of a BSP tree is for the hyerplanes of the leaf nodes to be trivially behind or in-front of the parent hyperplane.

Quadtree:

A quadtree is a tree data structure in which each internal node has up to four children.  Quadtrees are more often used to partition a two dimensional space by recursively subdividing it into four quadrants.  Some common applications of quadrees are image representation, spatial indexing etc.  Quadtrees are two dimensional analog of octrees.

Octrees:

An octree is a tree data structure mainly used for organizing 3-dimensional data.  Each node of an octree has eight children.

R-trees:

R-trees are tree data structures similar to B-trees, but are used for spatial access methods i.e. for indexing multi-dimensional information.  Each node of an R-tree has a variable number of entries.  Each entry within a non-leaf node stores two pieces of data; a way of identifying a child node and the bounding box of all entries within this child node.  A new element will go into the leaf node that requires the least enlargement in its bounding box.  The insertion and deletion algorithms use the bounding boxes from the nodes to ensure that nearby elements are placed in the same leaf node.  Different algorithms can be used to split nodes when they become too full, resulting in the “Quadratic” and “Linear” R-tree subtypes.

Hash Table:

A hash table is a data structure that provides fast lookup of a record indexed by a key where the domain of the key is too large for simple indexing; as would occur if an array were used.  Like arrays, hash tables can provide O(1) lookup with respect to the number of records in the table.  Hash tables are often used to implement associative arrays.

Parse Tree:

A parse tree is a grammatical structure represented as a tree data structure.  Grammar is the study of the rules governing the use of a language.  The subfields of grammar are phonetics, phonology, morphology, syntax and semantics.  Grammar is part of the general study of language called linguistics.

B+ Tree: 

A B+ tree index takes the form of a balanced tree in which every path from the root to the tree leaf is of the same length.  Each non-leaf node in the tree has between ‘n/2’ and ‘n’ children, where n is fixed for a particular tree.  The B+ tree structure creates performance overhead on insertion and deletion and adds space overhead.  The space overhead is acceptable when we consider the performance benefits of the B+ tree structure.  Each node of a B+ tree contains a pointer and the associated search key value.  The pointer in the leaf node of the B+ tree points either to a file with the associated search key value or to a bucket of pointers. Each leaf can hold upto (n-1) values.  The minimum value that a leaf node may contains is (n-1)/2.  The non-leaf nodes of the B+ tree form a multilevel sparse index on the leaf nodes.  The structure of the non-leaf node is the same as the leaf node, except that all pointers are pointers to tree nodes.  A non-leaf node may hold upto ‘n’ pointers but must hold atleast (n/2) pointers.  ‘B’ in the ‘B+’ trees stands for balanced.  It is the balance property of B+ trees that ensures good performance for lookup, insertion and deletion of records.  The main difference between a B tree index and a B+ index is that a B tree eliminates the redundant storage of search key values.  A B tree allows search key value to appear only once.  Thus, B+ tree is preferred for its structural simplicity.