Counting Sort Algorithm

Video Tutorial
FREE
Counting Sort thumbnail
This video belongs to
DSA Problem Solving for Interviews using Java
11 modules
Certificate

Counting sort is a sorting technique which is based on the range of input value. It is used to sort elements in linear time. In Counting sort, we maintain an auxiliary array which drastically increases space requirement for the algorithm implementation.

Takeaways

  • Counting sort is a stable sorting technique, which is used to sort objects according to the keys that are small numbers. It counts the number of keys whose key values are the same.

  • Complexity of counting sort

    • Time complexity - O(nlog(n)nlog(n))
    • Space complexity - O(n)

Sorting is one of the most commonly used computer algorithms in everyday life. Right from the beginning of our school life, we have been sorted according to the lexicographical order, or to simplify, according to our names, like a typical English dictionary.

counting sort algorithm

If that wasn’t enough, we were also sorted based on the marks in examinations. Yikes!! We’re surrounded by sorting. The competitive programming competitions that all of us like so much also sort the contestants based on the number of problems solved.

Definition of Counting Sort

Sorting is arranging the elements of a data set in such an order that each element is greater (or equal to) than the previous element and less than (or equal to) than the next element.

The criteria for lesser or greater is based on the type of elements present in the data set.

aryan, bunty, harbhajan, ……., rahul, sachin, yuvraj Names sorted in lexicographical order


Since the applications of sorting are practically infinite, we should know how and which algorithm we should use to sort different types of data sets, as one might be useful for some specific data sets and might not be so useful for other types.

Today we will be discussing one of the most simple and elegant sorting algorithms out there, Counting Sort.

Counting Sort Algorithm

In the given example, the array represents the marks of students in an exam. The total number of marks for the exam is fixed, denoted as T. The range of marks that a student can score is also fixed, which is [0, T]. Therefore, all the values in the array, representing the marks of the students, will always fall within this range. Can we use that to our advantage? Oh yes we can!!

Since the values will always lie in a fixed range, we can count the total number of occurrences of each distinct value in the array.

value: `0 1 2 3 4 5 6 7 8 9 10`

count: `0 2 1 0 2 1 4 0 1 3 1`

As we can see above, the total occurrences of 6 in the above array is 4 whereas for 3, it is 0. Similarly, we have calculated the occurrences for all possible values, or the range of the array.

We are using each index of the count array to store the number of occurrences, i.e. the frequency of a particular value in the marks array. Neat right?

  • Now how do we create the final sorted array of marks from the count array?

  • We take each element of the count array, and check if the count at a particular index, count[i]is greater than 0 or not, where 0 <= i <= max and max is the maximum element of the array.

  • If the value of count[i] is equal to 0, it means the value "i" is not present in the original array at all, i.e. no student got this number of marks. Hence, this value will not be part of the final sorted array as well. We can just skip/ignore it.

  • If the value of count[i] is greater than 0, it means the value “i” of the occurs "count[i]" number of times in the original array. So, we can just insert the value "i" into the sorted array "count[i]" number of times as well. Let’s see how this would work.

Assume that result is the final sorted array that we will create, which of course will have the same size as the initial marks array.


Now let’s start with the first index i = 0. Since, the count of 0 is, count[0] = 0, it means no student scored 0 marks. Yayyy!!! We can just skip it.

Now for the next element, i = 1, and its count is, count[1] = 2. It means there are two students who scored 1 mark. Thus, we will add two 1's in the result array as well.


result: 1 1 - - - - - - - - - - - - -

Now, i = 2, and it’s count is, count[2] = 1. So we’ll append just one 2 in the result array.


result: 1 1 2 - - - - - - - - - - - -

Similarly, for i = 6, the count is, count[6] = 4. And we’ll append four 6's in the result array.


result: 1 1 2 4 4 5 6 6 6 6 - - - - -

We repeat this until we have appended each element in the result array based on it’s frequency.


result: 1 1 2 4 4 5 6 6 6 6 8 9 9 9 10

And voila, we’ll have the final result array which is sorted in the increasing order of the marks scored by the students.

The magical part is that since, we start appending elements from the lowest index of the count array upto the highest index, the values are automatically inserted in a sorted manner in the final sorted array. How elegant is that!! 😀

Flow Chart for Counting Sort Algorithm

flowchart of counting sort

counting sort algorithm flowchart

Algorithm of Counting Sort

Below is the algorithm of Counting Sort.

  1. Initialise n = size of the array.

  2. Run a loop to traverse the array and find the maximum element of the array. Let’s call it max.

  3. Initialise a new array called the count of size max+1. We will use the count array to store the frequencies of all the elements in the range [0, max].

  4. Initialise i = 0. if i < n, increment the number of occurrences of arr[i] in the count array, i.e. count[arr[i]]++.

  5. Initialise lastIndex = 0. This variable will tell us the position at which we need to append an element in the sorted array at any point of time. Also, to save space, we will overwrite the original array to store the sorted array.

  6. Initialise i = 0. Run a loop over the count array.

    • if i > max, it means we have finished traversing the count array and we have completely filled the sorted array. We can return the sorted array, i.e. arr after this step.
    • else if count[i] > 0, we will run a loop for count[i] times, and append the element i at the lastIndex into the sorted array, i.e. arr[lastIndex++] = i.

Implementation of Counting Sort

Counting sort in Java

Output:

Counting sort in C/C++

Output:

Counting sort in Python

Output:

Time Complexity of Counting Sort Algorithm

In Counting Sort, the algorithm involves looping over the original array of size N and the count array of size K. Both of these loops have linear time complexity, O(N) and O(K) respectively. In the final step, while appending elements into the sorted array, there is a nested while loop. However, the inner loop runs only count[i] times, where count[i] represents the frequency of the element. Since the sum of all count[i] values is equal to N, the size of the array, the total complexity remains O(N + K). Therefore, the overall complexity of Counting Sort is O(N + K) in general.

Now, let’s see the best, worst and average case complexities of counting sort.

  1. Best Case: If the array has only one unique element which is 0, i.e. all the elements of the array are equal to 0, then the maximum element of the array is 0 as well. So, the total complexity of the algorithm becomes O(N + 0) = O(N).

  2. Worst Case: If the size of the array is N, the other factor which can affect the complexity is K, the maximum element of the array. If the array has a very large value for the maximum element of the array, i.e. K ~ CN. Therefore, the total complexity of the algorithm becomes O(N + CN) ~ O(C*N).

  3. Average Case: On average, if the array has the maximum element which is the approximately equal to the size of the array, then the total complexity becomes O(N + N) = O(2*N).

Space Complexity of Counting Sort Algorithm

As it’s evident from the algorithm, we use an additional array of size K, which is the maximum element of the array to store the frequencies of all the elements in the range [0, K]. Thus, the space complexity largely depends on the value of K, and is equivalent to O(K).

P.S. The extra variables used like max, lastIndex and looping variables are constant since they don’t vary with the input size of the array.

Applications of Counting Sort Algorithm

Counting Sort is not a stable sort algorithm, meaning the relative order of elements in the original array is not preserved in the final sorted array. If stability is required, Counting Sort may not be the best choice.

The time and space complexity of Counting Sort are largely dependent on the maximum element of the array. If the maximum value is very large, the complexity increases. The algorithm may not work if the maximum value exceeds the size limit of the array, typically around the range of 10610^6 to 10710^7. In such cases, alternative approaches like using hashing can be considered to overcome the limitations. Counting Sort is most suitable when the range of the dataset is limited and linear time complexity is a crucial requirement.

Conclusion

Today we learnt about a very elegant algorithm to sort a data set. The power of Counting Sort is it’s linear running time. If the data set is huge but the range of elements in the array is fairly less, counting sort is actually much more powerful than algorithms like merge and heap sort. But it’s applications are limited due to it’s big memory constraint.

Thanks for reading. May the code be with you 🙂

Code with Precision! Enroll in Scaler Academy's Online DSA Certification Course and Master Efficient Algorithms.