Bucket Sort Algorithm

Bucket sort efficiently organizes elements by distributing them into distinct buckets, facilitating subsequent sorting using alternative algorithms. This technique, pivotal in software engineering interviews, categorizes data into buckets based on uniform distribution. Once grouped, each bucket undergoes sorting, and the sorted elements are then amalgamated into a cohesive, ordered sequence. Bucket sort's approach streamlines sorting processes, ensuring effective data arrangement through systematic bucket allocation and subsequent organization.

Uniformly Distributed Data

Uniformly distributed data has difference between every adjacent element almost the same.
Examples and graph to represent uniformly distributed data.

An array is said to be uniformly distributed if the difference between each adjacent element inside the array is almost the same. Or, in other terms, the elements of the array spread evenly throughout the whole span/range of the array.

For example:

Consider this array: [10, 21, 29, 41, 52] The difference between each adjacent term is almost equal to 10. Hence, this array has uniformly distributed data and can be sorted using the bucket sort algorithm.
Consider another array: [1,4,23,5,44,9,6,43] This array is not uniformly distributed because the number of elements between the range [0-10] = 5 (i.e., 1,4,5,9, and 6), whereas the number of elements between [10-20] is 0, and the same for the range [30-40]. The data is scattered over different data ranges but is concentrated within some specific ranges, whereas sparse among the others. Hence, this array is not uniformly distributed.

This is how a uniform distribution data looks like: The complete data is scattered equally within each range and hence depicts uniformly distributed data.

what is bucket sort

Scatter-Gather-Approach

Solving large and complex problems can be a little hard at times. Scatter-gather-approach tries to simplify such complex problems by first scattering the complete data into clusters, solving these sub-problems individually, and finally gathering them all together to get the final result. This is how Bucket sort uses scatter-gather-approach:

Bucket sort scatter-gather-approach

Works on Uniformly Distributed data
Follows Scatter-Gather-Approach
Sorting technique with time complexity O(n)

Bucket sort is a sorting technique that uses the Scatter-Gather-Approach to sort the array.

It divides the unsorted array into separate groups and calls them buckets. Sort the individual buckets, and then gather them all together to form the final sorted array.

If the elements of the array are floats ranging between 0 and 1, we primarily make 10 buckets, numbered from 0 to 9, and then insert elements into these buckets depending upon their most significant number. The bucket index is calculated as: (int) (elementNumber * 10)
If the elements of the array are integers, as shown in the figure below, we simply calculate the range: range = (maximumNumber - minimumNumber) / noOfBuckets and divide the whole range into buckets and then perform bucket sorting.

Graphically, this is how the bucket sort with integer elements look like:

bucket sort with integer elements

Working of Bucket Sort

Let us understand how the algorithm works with the help of the above example. Consider this unsorted array

bucket sort unsorted array

Step 1:

Since we have the array with integer elements, we'll calculate the range first.

Hence, we'll have buckets of the following ranges: [1-5] [6-10] [11-15] [16-20] [21-25]

bucket sort range

Step 2: Scatter

Now, iterate through the unsorted array and keep inserting the numbers in the bucket of their corresponding range.

For example:

and so on..

Finally, the buckets will look like this:

Bucket Sort with unsorted array

Step 3:

Now sort all the elements in each of the buckets, and the sorted buckets look like this:

sorted buckets

Step 4: Gather

At last, visit each bucket and gather all the numbers together. Merge them all, and we'll get the sorted array.

Hence, the sorted array is: Bucket sorted array

PseudeCode

Hence, we have the sorted array.

Bucket Sort Algorithm for floating points numbers

Below written is the complete code for bucket sort.

At first, we create a vector, bucket and then create n buckets inside this vector. The whole array is then traversed, and elements are placed in the bucket corresponding to their range.

JAVA Code

Output:

Note: This applies if the elements of the array range between 0 and 1. What if the elements are integers? In that case, we'll set the range of each bucket a little differently, which we will learn in the next section.

Following images show the working of code with the inputs taken above:

working of bucket sort code working of bucket sort coding bucket sort working coding bucket sort working coding inputs

Bucket Sorting for Integer elements

Find range = (max - min) / noOfBuckets
bucketIndex = (array[i] - minimum) / range
Finally, sort and gather

For integer elements, we need to follow the following algorithm:

Calculate the maximum and the minimum element of the array
Calculate the range: range = (maximum - minimum) / n, where n is the number of buckets (Given as parameter)
Create n empty buckets and initialize them with 0
Loop through the unsorted array and perform the following: a) Calculate bucketIndex bucketIndex = (array[i] - minimum) / range b) Insert the ith element of the array into the bucket[bucketIndex]
Sort the individual buckets
Gather all the elements together

Hence, we have the sorted array with us. This is how it works: Scatter Sort Gather

Python code - Bucket Sort with Integer elements

Output

Bucket Sort time complexity

Best Case Time Complexity: O(n+k)
Average Case Time Complexity: O(n)
Worst Case Time Complexity: O(n^2^)

Best Case Time Complexity:

If the array elements are uniformly distributed, bucket size will almost be the same for all the buckets. Hence, this will be the best case which will take up the least amount of time.

Sorting time complexity will reduce even further if all the elements inside each bucket are already sorted.

To create n buckets and scatter each element from the array, time complexity = O(n). If we use Insertion sort to sort each bucket, time complexity = O(k). Hence, best case time complexity for bucket sort = O(n+k), where n = number of elements, and k = number of buckets

Worst Case Time Complexity

If the array elements are not uniformly distributed, i.e., elements are concentrated within specific ranges.

This will result in one or more buckets having more elements than other buckets, making bucket sort like any other sorting technique, where every element is compared to the other. Time complexity increases even further if the elements in the array are present in the reverse order. If insertion sort is used, the worst-case time complexity can go up to O(n^2^).

Bucket Sort Space Complexity

Space Complexity : O(n+k)

Space Complexity for bucket sort is O(n+k), where n = number of elements in the array, and k = number of buckets formed Space taken by each bucket is O(k), and inside each bucket, we have n elements scattered. Hence, the space complexity becomes O(n+k).

Applications

Uniformly Distributed data
Floating point numbers between range 0.0 to `1.0

Bucket Sort is a very different type of sorting algorithm as it does not involve direct comparison between the numbers. It can only be applied to uniformly distributed data.
Whenever we have floating-point numbers between 0 and 1, bucket sort might be the best sorting approach. Reason - if we use merge-sort, quick-sort, heap-sort, etc, the problem will take a minimum of O(nlogn) time complexity. Also, counting sort cannot be used because floating point numbers cannot be used as index. Hence, bucket sort is best suited for sorting the array with floating point numbers of range [0.0-1.0].

Supercharge Your Coding Skills! Enroll Now in Our DSA Course and Master Algorithmic Excellence.

Conclusion

The bucket Sort algorithm sorts the elements of the array by first segregating the array into a number of buckets, sorting each bucket, and then gathering the elements back to form the sorted array.
Bucket Sort is used to sort an array where elements are uniformly distributed, or where the elements of the array range between 0 and 1.
Bucket sort can exhibit the best case time complexity of O(n+k), where n is the number of buckets and k is the bucket size.
The way buckets are provided ranges differ in cases when the elements of the array are floats and integers. This is discussed in detail above.