Van Emde Boas tree
Van Emde Boas tree (or Van Emde Boas priority queue or vEB tree) is a tree data structure which implements an associative array with m-bit integer keys. It performs all operations (insert, delete, lookup, maximum, minimum, successor and predecessor) in O(log log M) time, where M is the maximum number of elements that can be stored in the tree.
It was formulated by a team led by Dutch computer scientist Peter van Emde Boas in 1975.
Van Emde Boas tree is an associated array implemented through a multiway tree structure.
It is expected to perform five operations:
- find(k): Find the key k and the value it points to.
- predecessor(k): Return the largest key in domain less than or equal to k.
- successor(k): Return the smallest key in domain less greater or equal to k.
- insert(k): Insert a key k into the tree.
- delete(k): Delete the key k from tree.
Van Emde Boas tree or vEB stores the keys in a bounded universe and performs above operations in double logarithmic time complexity with respect to universe size.
vEB tree’s other than being able to solve predecessor problem extremely fast, can also be used as an implementation of priority queue (which is faster than a heap implementation ).
vEB tree for a universe U containing [0 . U-1] is implemented by dividing the universe into √ U vEB trees, where the i th subtree would be responsible for universe of size √ U containing the keys [(i)√ U . (i+1)√ U -1]. The data structure is implemented recursively, where it shrinks by a factor of √ U at every level.
The vEB tree is a modification of proto-vEB structure. A vEB tree stores following information:
- children array of size √ U that points to children vEB trees. child i points to vEB tree responsible for the range
[(i)√ U . (i+1)√ U -1]. - min to store minimum element in the universe of the tree.
- max to store maximum element in the universe of the tree.
- summary , an auxillary vEB tree to keep track of which subtree/child is empty.
- A variable u to keep track of universe size.
When tree is empty, min and max are both NULL.
The element stored in min does not appear in any children trees, it is only stored at min, while element stored at max does.
When the universe size is 2, vEB does not need an array. It can just store the values in min and max attributes.
At the base level, a vEB tree is just a direct-addressing with added attributes.
A vEB tree can ve represented as:
A vEB that stores values 10, 12, 13, 14.
Algorithm
- Because of the way tree is indexed, a key k belongs to any one of √ U subtrees. Three function are used:
high(k, U) = floor(k/√U) low(k, U) = k mod(√U) index(k, k', U) = k√U + k'
The function high tells what subtree k belongs to
low tells the position of k in that subtree
The function index tells what is the actual key given the subtree number and position in the subtree.
The value of U is the universe size of tree in which the function is called. Also, it should be noted that universe size must be of form 2 2 x .
Successor operation
- successor(T, k): This is a recursive funtion where T is the vEB tree to find the successor of k in T. The algorithm will be similar for predecessor.
function successor(T, k): if k < T.min: return T.min else if k >T.max: return T.u # universe size used as null value i = high(k, T.u) # subtree index j = low(k, T.u) # key for the subtree if j < T.children[i].max: return k - j + successor(T.children[i], j) # otherwise find next subtree for successor return k - j + T.children[successor(T.summary, i)].min
predecessor function is implmented similarly.
Insert operation
function insert(T, k): # if tree is empty, i.e. first insertion if T.min > T.max: T.min = k T.max = k return # if key is new T.min, instead of k insert old T.min in subtrees # set k as new T.min if k < T.min: swap(k, T.min) # since T.max is also stored in subtrees, no need to swap if k >T.max: T.max = k # if at lowest vEB, i.e. the leaf tree if T.u == 2: # note that k can be 0 or 1 T.max = k return i = high(k, T.u) # subtree index j = low(k, T.u) # key for the subtree insert(children[i], j) # if the recursive insertion created a new subtree, add it to summary vEB if children[i].max == children[i].min: insert(T.summary, i)
Delete operation
function delete(T, k): # if only one element in tree if T.min == T.max: T.min = inf T.max = -inf return # if k is T.min, assign new value to it and return since min is not # stored in children if k == T.min: i = T.summary.min # index of first subtree T.min = index(i, T.children[i].min, T.u) # declared above return i = high(k, T.u) # index of subtree that contains k j = low(k, T.u) # key for the subtree # if last subtree if u == 2: if k == 1: if min == 1: min = T.u max = -1 else if min == 0: max = 0 else: # i.e. k == 0 if max == 0: min = T.u max = -1 else if max == 1 min = 1 return delete(T.children[i], j) if T.children[i] is empty: delete(T.summary, i) # if deleted element was T.max if k == T.max: # if no subtrees exist, tree just has one element stored at t.min if T.summary is empty: T.max = T.min else: # else find new max element i = T.summary.max # index of last subtree T.max = index(i, T.children[i].max, T.u)
Complexity
Implementations
/* * Code for van emde boas tree. Try to implement function for predecessor as * an exercise. */ #include #include #include class vEB < int u; int min; int max; vEB *summary; std::vectorchildren; int high(int k) < int x = std::ceil(std::sqrt(u)); return std::floor(k / x); >int low(int k) < int x = std::ceil(std::sqrt(u)); return k % x; >int index(int k, int kk) < return (k * (int)std::sqrt(u)) + kk; >public: vEB(int U) < // u = std::pow(std::sqrt(U), 2); u = U; min = U; max = -1; if (u <= 2)< summary = nullptr; children = std::vector(0, nullptr); > else < int children_count = std::ceil(std::sqrt(u)); int children_size = std::ceil(u / std::sqrt(u)); summary = new vEB(children_count); children = std::vector(children_count, nullptr); for(int i = 0; i < children_count; ++i)< children[i] = new vEB(children_count); >> > void insert(int k) < if(min >max) < min = max = k; return; >if(k < min)< int temp; temp = min; min = k; k = temp; >if(k > max) max = k; if(u == 2) < max = k; return; >int i = high(k); int j = low(k); children[i]->insert(j); if(children[i]->max == children[i]->min) summary->insert(i); > void remove(int k) < if(min >max) return; if(min == max) < min = u; max = -1; return; >if(u == 2) < if(k == 1)< if(min == 1)< min = u; max = -1; >else if(min == 0) max = 0; > else if(max == 0) < min = u; max = -1; >else if(max == 1) min = 1; return; > if(k == min)< int i = summary->min; min = index(i, children[i]->min); return; > int i = high(k); int j = low(k); children[i]->remove(j); if(children[i]->min > children[i]->max)< summary->remove(i); > if(k == max)< if(summary->min > summary->max) < max = min; >else< i = summary->min; max = index(i, children[i]->max); > > > int getMin() < return min; >int getMax() < return max; >int universe() < return u; >int successor(int k) < if(k max) return u; int i = high(k); int j = low(k); if(j max)< int res = children[i]->successor(j); if(res >= (u / (int)std::sqrt(u))) return u; return k - j + res; > else< int res = children[summary->successor(i)]->min; if(res >= children[summary->min]->u) return u; return index(summary->successor(i), res); > > >; int main()
Applications
- vEB trees were made specefically to solve predecessor problem in double logarithmic time.
- vEB trees can be used as extremely fast priority queues and find uses in routing applications.
- If universe size is reasonable, vEB trees are basically faster binary search trees.
References/ Further reading
- CLRS chapter 20 to know more about vEB trees and prototype structure, and motivations beind it.
- Lecture slides from stanford cs166.
- Lecture by Eric Demaine on youtube
Yash Aggarwal
Intern at OpenGenus (2019) and Weir Minerals (2019) | B. Tech (2016 to 2020) in Computer Science at National Institute of Engineering, Mysore
Vote for Author of this article:
OpenGenus Tech Review Team
Data Structures
John Conway's Game of Life
The Game of Life is a cellular automaton created in 1970 by the British mathematician John Horton Conway. The game is a zero-player game, where you include an initial input and watch how the board evolve through the generations and if the life will prosper or will be extinct over time.
Gabriel Lechenco
Counting Sort
Counting sort is an algorithm for sorting integers in linear time. It can perform better than other efficient algorithms like Quick Sort, if the range of the input data is very small compared to the number of input data. It is a stable, non-comparison and non-recursive based sorting
Rohit Kumar
Источник
Van Emde Boas Tree | Set 1 | Basics and Construction
It is highly recommended to fully understand Proto Van Emde Boas Tree. Van Emde Boas Tree supports search, successor, predecessor, insert and delete operations in O(lglgN) time which is faster than any of related data structures like priority queue, binary search tree, etc. Van Emde Boas Tree works with O(1) time-complexity for minimum and maximum query. Here N is the size of the universe over which tree is defined and lg is log base 2. Note: Van Emde Boas Data Structure’s key set must be defined over a range of 0 to n(n is positive integer of the form 2 k ) and it works when duplicate keys are not allowed. Abbreviations:
) is an abbreviation for VEB containing u number of keys.
Structure of Van Emde Boas Tree: Van Emde Boas Tree is a recursively defined structure.
- u: Number of keys present in the VEB Tree.
- Minimum: Contains the minimum key present in the VEB Tree.
- Maximum: Contains the maximum key present in the VEB Tree.
- Summary: Points to new VEB( ) Tree which contains overview of keys present in clusters array.
- Clusters: An array of size each place in the array points to new VEB( ) Tree.
See the image below to understand the basics of Van Emde Boas Tree, although it does not represent the actual structure of Van Emde Boas Tree: Basic Understanding of Van Emde Boas Tree:
- Van Emde Boas Tree is recursively defined structure similar to Proto Van Emde Boas Tree.
- In Van Emde Boas Tree, Minimum and Maximum queries works in O(1) time as Van Emde Boas Tree stores Minimum and Maximum keys present in the tree structure.
- Advantages of adding Maximum and Minimum attributes, which help to decrease time complexity:
- If any of Minimum and Maximum value of VEB Tree is empty(NIL or -1 in code) then there is no element present in the Tree.
- If both Minimum and Maximum is equal then only one value is present in the structure.
- If both are present and distinct then two or more elements are present in the Tree.
- We can insert and delete keys by just setting maximum and minimum values as per conditions in constant time( O(1) ) which helps in decreasing recursive call chain: If only one key is present in the VEB then to delete that key we simply set min and max to the nil value. Similarly, if no keys are present then we can insert by just setting min and max to the key we want to insert. These are O(1) operations.
- In successor and predecessor queries, we can take decisions from minimum and maximum values of VEB, which will make our work easier.
In Proto Van Emde Boas Tree the size of universe size is restricted to be of type 2 2k but in Van Emde Boas Tree, it allows the universe size to be exact power of two. So we need to modify High(x), low(x), generate_index() helper functions used in Proto Van Emde Boas Tree as below.
) ), which is basically the cluster index in which the key x is present.
High(x) = floor(x/ceil( title="Rendered by QuickLaTeX.com" height=28 width=37 style=vertical-align:-7px>))
) which is its position in the cluster.
Low(x) = x % ceil( title="Rendered by QuickLaTeX.com" height=28 width=37 style=vertical-align:-7px>)
- generate_index(a, b) : It will return position of key from its position in cluster b and its cluster index a.
generate_index(a, b) = a * ceil( title="Rendered by QuickLaTeX.com" height=28 width=37 style=vertical-align:-7px>) + b
Construction of Van Emde Boas Tree: Construction of Van Emde Boas Tree is very similar to Proto Van Emde Boas Tree. Difference here is that we are allowing the universe size to be any power of two, so that high(), low(), generate_index() will be different.
To construct, empty VEB: The procedure is the same as Proto VEB just two things minimum and maximum will be added in each VEB. To represent that minimum and maximum is null we will represent it as -1.
Note: In the base case, we just need minimum and maximum values because adding a cluster of size 2 will be redundant after the addition of min and max values.
Below is the implementation:
Источник