Segment Tree in Data Structure

Overview

What is a segment tree? A flexible tree-based data structure that is used to effectively solve problems involving range query operations.

Takeaways

A Segment Tree is a data structure that allows answering range queries over an array effectively, while still being flexible enough to allow modifying the array.

What is Segment Tree?

A segment tree enters the scene, where other techniques such as prefix computation fail. Let's understand this contrast between the two.

Suppose we need to compute the sum of the elements of an array within a range. The prefix sum array method yields an O(1) solution, while the segment tree would result in an O(log n) solution. Doesn't sound very helpful?

What if the operation was finding the maximum/minimum element within a range, bitwise OR over a range, gcd over a range, etc.?

Suppose we have an array Arr = [0,1,2,3,4,5]. Its prefix sum array will be preSum = [0,1,3,6,10,15]. Now to calculate the sum of elements within index bounds [1,4] we use ans = preSum[4]-preSum[0]. Easy peasy.

Unfortunately, the prefix minimum element array will be preMin = [0,0,0,0,0,0]. Is there any way of calculating the minimum element within the same range [1,4]? NO! :( Not with prefix technique. Fortunately, with segment trees' versatility, such problems can be solved efficiently.

Prefix computation does enable making updates to the array, but it needs to recompute the whole array making O(n) time complexity overhead per update. One would not only update the element at a given index in the array but will have to also make changes in prefix array.

Thus prefix computation is definitely not the smart choice to make when updation is required. The segment tree is!

Now imagine if we could update such ranges in logarithmic time as well, wouldn’t it be cool?

Let’s see how segment tree achieves it!

A segment tree is implemented in the array-based tree format { C.P. GUYS COULD RELATE }. The parent-child hierarchy is such that each array block represents a node by storing its value. For ith node, its left and right children are stored in (i * 2)th and (i * 2 + 1)th block respectively.

The entire array of elements to be operated upon is present in the segment tree in chunks of many sizes referred to as segments; hence the name segment tree 😉.

The divide and conquer strategy here dictates that during traversal along some path from top to bottom in the tree, at each level the sample space is halved. This tremendously improves efficiency during searching, updation, etc. Let’s have a look at a pictorial representation of a segment tree:

Pictorial Representation of a Segment Tree

As observed from the diagram, The root of the tree is a complete segment, i.e. carries information related to all the elements of the array. For the next level, this data in root is divided into two halves.

The first half is assigned to the left subtree (precisely the left child) and the other half to the right one. This procedure is followed by subsequent nodes, recursively.

This brings us to leaves in the tree, which hold individual elements of the array. Thus all the leaves collectively represent the entire data (array elements) upon which the segment tree is built, individually.

Implementation of Segment Tree

The ‘buildTree()’ function builds a segment tree using the array ‘v’ and stores it into the array ‘tree’.

At each level of recursion, the range is divided into two halves.
Then recursively the left and right children nodes are built for the first half and the other half of the range respectively.
The current node will then hold the sum of its children’s values. The base condition is the case of a leaf, which is when the size of the range is one.

Suppose an array arr = [1,2,3]. It will be stored in the tree in 3 layers of nodes/blocks:

C++

JAVA

PYTHON

Operations of Segment Tree

This example is an implementation of a segment tree specifically as a solution to the problem of finding the sum of element in an array segment in range-based queries, and therefore is not a formal generic segment tree implementation as it would vary with each problem, just as the various coffees do.

Query

The ‘query()’ function takes in as parameters the boundaries of the query, ‘l’ and ‘r’, and returns the sum of the elements in this range of ‘l’ and ‘r’ in the array. Thus a query call of range [ 1,2 ] would return the sum of elements present in the array within the index bounds of 1 and 2.

the-sum-of-values-segment-tree segment-tree-the-sum-of-values

Taking an example shown in the above cool gif, a query of [0,2] is easily retrieved as [0,2] happens to be present in a node already. But for a query [0,4] some extra muscle is required.
We don't have [0,4] on a node, but what we have are components like [0,2], [3,5], etc. Its almost like figuring out something from pieces of a map 😁.
We've got to extract and refine information from these components and merge them to obtain the desired result, i.e. an answer to the query [0,4].
We get the value of [0,2] from the pre-existing node of same label beneath [0,5]. How do we get [3,4]? From [3,5]. Instead of using the value present at node [3,5], one would look at its depth and to one's wonder, node [3,5] happens to have [3,4] as its left child.
This node is exactly what we need (remember [0,2] from previous case). Now we have the sum of arrays within indices range [0,2] and [3,4] as well which on addition would yield the desired output which is query(0,4). Bravo!

C++

JAVA

PYTHON

Update

The update operation takes in the new value to be set and the corresponding index. After making the update to the tree array, it needs to make similiar relevant changes to the actual segment tree. This is done by calling the ‘updateValueUtil()’ function which recursively updates the segment tree. To understand this better let us take a look at the previous example. Suppose we need to set the element present at index 2 to the value of 10.

We first update the tree array at index 2 to 10, and then call the ‘updateValueUtil()’ function to update the segment tree.
Now begins the real adventure. The recursive function takes a bunch of parameters- the difference this update will produce, the index of the current node in the segment tree, start index and end index of the current range in recursion.
If current node index is out of range, we return. If the current node index is within the range, we update the value of the node and then recursively call the function on the left and right child of the current node. Thus the new value at node [2,2] will be 10.
This affects a couple of other nodes in the tree, naturally the ones having ranges inclusive of 2. These are [0,2] and [0,5]. Note that the old value at node/index 2 was 7 and the value of dif is 10-7=3. At [0,5] dif will be added to set its new value as 18(=15+3). Finally comes the turn of root node [0,5]. Its value will be updated as 47=(44+3).

C++

JAVA

PYTHON

Complexities of Segment Tree

The time complexity of tree construction is O(n), n being the number of elements put in the tree, as n calls to ‘buildTree()’ will be made overall, as there are (2*n)-1 nodes in the tree and value of each node is computed only once during tree construction. The time complexity of a query operation is $O( log_2{n})$ as a maximum of 4 nodes will be visited at any level of recursion and the height of the tree is $log_2{n}$ which corresponds to the number of levels. The space complexity of a segment tree is O(n).

Uses Cases of Segment Tree

Computational geometry: A common use case is in solving geometric problems that involve querying a range of points, such as finding the count of points that lie in a specific rectangular region or determining the closest point to a given point. This is achieved by storing the points in the segment tree and using its efficient search and update properties to answer these queries. Furthermore, segment trees can be utilized in higher dimensions for problems such as rectangle or box queries in 2D or 3D space. These applications make segment trees a powerful tool in many areas of computational geometry.
Geographic information systems: Segment trees in Geographic Information Systems (GIS) are crucial for spatial range queries, such as identifying all geographic features within a specific area. Their efficient search and update capabilities also enable rapid nearest neighbor queries to identify the closest feature to a given point. These functions make segment trees a valuable tool for GIS data management and analysis.
Machine learning: In machine learning, segment trees can be utilized in reinforcement learning algorithms, specifically in prioritized experience replay, which is used to enhance the learning process. Prioritized experience replay selectively samples important experiences to learn from based on a priority value. The segment tree data structure, with its efficient range query handling and point update capabilities, is used to store and retrieve these experiences. It allows the algorithm to efficiently sample experiences based on their priority values, improving the learning efficiency and effectiveness of the model.

Other Range-Query Data Structures

Sqrt-decomposition - answers each query in O(√n), preprocessing done in O(n).
Fenwick tree - answers each query in O(logN), preprocessing done in O(NlogN).

Conclusion

Used for efficiently solving problems involving range query operations.
A divide and conquer algorithm.
Time Complexity for tree construction is O(n).
Time complexity for querying and updation are O(log n).
Applicable where pre-computation fails or is costly.
Alternatives to segment tree are Binary Indexed Tree and SQRT tree.

See More

Segment Tree with Lazy Propagation -Types of Trees in Data Structures