Longest Consecutive Subsequence

Learn via video course
FREE
View all courses
DSA Problem Solving for Interviews using Java
DSA Problem Solving for Interviews using Java
by Jitender Punia
1000
4.9
Start Learning
DSA Problem Solving for Interviews using Java
DSA Problem Solving for Interviews using Java
by Jitender Punia
1000
4.9
Start Learning
Topics Covered

Problem Statement

We are given an unsorted integer array nums, and we have to find the length of the longest consecutive sequence of elements in it.

The order of elements in the array shouldn't matter as we are looking for a subsequence and not a subarray.

For example, the longest consecutive subsequence in the array [8, 2, 5, 6, 7, 3], is [5, 6, 7, 8].

Note: The algorithm should run in O(n) time, where n is the size of the input array nums.

Example 1:

Input:

Output:

:::

Explanation:

The longest consecutive subsequence in the input array is [6, 7, 8, 9]; therefore, its length is 4.

:::

Example 2:

Input:

Output:

Explanation: The longest consecutive subsequence in the input array is [-1, 0, 1, 2, 3, 4, 5], therefore its length is 7.

Constraints

  • 0nums.length1050 \leq nums.length \leq 10^5
  • 109nums[i]109-10^9 \leq nums[i] \leq 10^9

Approach 1: Brute Force Approach

The Brute Force approach is the most inefficient way in terms of time complexity as in this approach, we will be considering each number in numsnums and check if that is the part of the longest consecutive subsequence or not.

Algorithm

This algorithm considers each number in the numsnums array and counts as high as the sequence goes from that. The greatest count recorded is returned as the length of the longest consecutive subsequence.

This is an optimal algorithm as it checks for every possibility and returns an accurate answer.

Implementation (C++ 20, Java 1.8, Python 3)

Implementation in C++ 20

Implementation in Java 1.8

Implementation in Python 3

The output for all of the implementations will be the same as the problem and the algorithm are the same.

Output

The longest consecutive sequence in the array [3, 4, 6, 7, 8] will be [6, 7, 8].

Complexity analysis

Time Complexity: O(n3)O(n^3)

The for loop in the longestConsecutive() function runs exactly nn times and the while loop nested inside it also runs in O(n)O(n) time as currNum is getting incremented by 11 in each iteration. Then on each iteration in the while loop, a lookup in the array numsnums is performed to check if currNum+1currNum+1 is present in it or not, this also costs O(n) runtime.

Therefore, the runtime of this brute force algorithm is actually the same as three nested O(n) loops.

Space Complexity: O(1)

This algorithm only allocates only a handful of integers that too independent of n.

Approach 2: Naive Approach (Sorting)

This algorithm is what comes to everyone's mind upon seeing this problem, as it is the simplest to perceive and think of.

The intuition behind this algorithm is that it will be easier to find sequences of consecutive numbers in a sorted array than in an unsorted one.

Algorithm

In this algorithm, we are first sorting the numsnums array and considering each number to compare to its previous number. To store the longest length recorded and the current length of the subsequence, we will initialize two variables - longest and current.

Now, we will consider each number (except the first, as we need to compare each number to its previous number). If the current number and previous number are equal, then our sequence is neither extended nor broken, and we can simply move to the next number.

If the current number and previous number are unequal, we will check if the current number is one more than the previous (extends the sequence). If it does, then we will add 11 to our current count (current) and move forward, and if it doesn't, we will store our current count and reset it to 1 (to include the count of the current number that broke the sequence).

In the end, we will return the longest count (longest) as the length of the longest consecutive sequence of numsnums.

Implementation (C++ 20, Java 1.8, Python 3)

Implementation in C++ 20

Implementation in Java 1.8

Implementation in Python 3

The output for all of the implementations will be the same as the problem and the algorithm are the same.

Output

The longest consecutive sequence in the array [-1, -2, -3, -4, 0, 8, 4] will be [-4, -3, -2, -1, 0].

Complexity analysis

Time Complexity: O(nlogn)O(n \; log \; n)

The for loop inside the longestConsecutive()longestConsecutive() function runs exactly nn times, but it is dominated by the time taken to sort the numsnums array (O(nlognO(n \; log \; n). Therefore, the asymptotic time complexity will be O(n)+O(nlogn)O(n) + O(n \; log \; n), i.e., O(nlogn)O(n \; log \; n).

Space Complexity: O(1)O(1)
This algorithm only allocates only a handful of integers that are independent of nn.

This algorithm is better than the Brute Force algorithm as it has lesser time complexity.

Approach 3: Using a Priority Queue

In this approach, we will be using a priority queue to find the length of the longest subsequence in an array.

Note: A priority queue is a special type of queue in which all the elements are associated with a priority value. In this case, the bigger numbers are a higher priority than the smaller numbers.

Algorithm

At first, we will create an integer priority queue p_nums and copy all the elements of nums to it. A prevprev variable will be initialized and assigned the value at the top of p_nums.

Then we will run a while loop till p_nums becomes empty. Inside the while loop, we will check if the top of p_nums is equal to prev, if yes, the current number will be ignored, and we will move pop out the topmost element. Then we will check if the top of the queue (current number) is l greater than the previous number or not, if yes, our current count will be incremented by l, and the topmost element will be popped out from the queue.

If none of these conditions satisfies, we will simply reset the current count to 1 and assign the topmost element of the queue to the previous number before popping it out from the queue.

We will keep track of the greatest count recorded in each iteration, and after the end of the while loop, the greatest count will be returned as the length of the longest consecutive subsequence of nums.

Note: This algorithm only works with a min-heap priority queue but priority queues in C++ are max-heap by default.

To solve this problem we will initialize our priority queue using the greater<>, our priority queue initialization will look like this:

priority_queue<int, vector<int>, greater<>> var_name;

Here, greater<> is an object class for greater-than equality comparison, which converts max-heap priority queue to min-heap priority queue.

Implementation (C++ 20, Java 1.8, Python 3)

Implementation in C++ 20

Implementation in Java 1.8

Implementation in Python 3

The output for all of the implementations will be the same as the problem and the algorithm are the same.

Output

The longest consecutive sequence in the array [0, 2, 3, 5] will be [2, 3].

Complexity analysis

Time Complexity: O(nlogn)

The main while loop in this algorithm runs exactly nn time, but inside the while loop, we are also using a priority queue simultaneously, which has a time complexity of O(nlogn). Therefore, the total time complexity will be O(nlogn).

Space Complexity: O(n)O(n)
This algorithm has a linear space complexity as a priority queue is maintained throughout the algorithm of size n.

Approach 4: Using a HashSet

Using a HashSet, we can optimize our brute force solution to a great extent! None of our previous solutions has reached O(n)O(n) time complexity, but that will not be the case with this approach (making it the only acceptable solution for this problem).

Algorithm

This algorithm is more similar to the brute force approach than any other algorithm as it has only two changes.

  1. The numbers are stored in a HashSet, allowing O(1) lookups.
  2. An attempt to build a sequence from numbers that are already not a part of a longer sequence.

The 2nd2nd point is accomplished by ensuring that the number which precedes the current number by 11 is not present in the HashSet, because if it were present, we would have ignored the current number as that number would have been a part of a longer sequence.

For example, if nums = [4, 3, 2, 6], the longest consecutive sequence will not start from 4 or 3 as they will be a part of the subsequence starting from 2.

Implementation (C++ 20, Java 1.8, Python 3)

Implementation in C++

Implementation in Java 1.8

Implementation in Python 3

The output for all of the implementations will be the same as the problem and the algorithm are the same.

Output

The longest consecutive sequence in the array [0, 2, 3, 5] will be [2, 3].

Complexity analysis

Time Complexity: O(n)O(n)
This algorithm appears to be of O(n^2) time complexity due to a nested while loop inside a for loop, but on inspecting closely, it is realized that it is a linear time complexity algorithm as the for loop and the while loop runs for nn times throughout the algorithm. In other words, the while loop is amortized.

This means that the time complexity of the algorithm is O(n+n), i.e., O(n).

Space Complexity: O(n)
This algorithm has a linear space complexity as a HashSet is maintained throughout the algorithm of nn number of elements.

This algorithm is accepted as a solution to the problem and is more efficient than any other algorithm due to its lowest time complexity.

Conclusion

  • The problem was to find the length of the longest consecutive subsequence in an array.
  • We can find the solution by using Brute Force (three nested loops) approach, it is an optimal solution, but it has O(n^3), which is not preferable at all.
  • The Sorting approach can also be used to find the solution, it is more efficient than the brute force approach, but still, it cannot be accepted as a solution due to its high time complexity.
  • The Priority Queue approach works, but that is just unnecessary, first of all, it's difficult to intuitively come up with this approach, everyone will prefer the sorting approach, and the sorting approach is a more efficient solution of the two (as the sorting approach doesn't take any auxiliary space).
  • The HashSet is the best solution for this problem as it has the lowest time complexity of all O(n).

FAQs

Q.What is a Priority Queue?

A.A Priority Queue is a special type of a queue data structure. Each of its elements is assigned a priority. Elements with a higher priority are served before the elements with a lower priority.
For example, Java's implementation of a Priority Queue is min-heap, i.e., smaller numbers are a higher priority than larger numbers, whereas, C++'s implementation of a Priority Queue is max-heap, i.e., larger numbers have a higher priority than smaller numbers.

Q.When are HashMaps used?

A.HashMaps are used when we have unique keys with their values to be stored. We use HashMaps to search for items based on their keys to save time as it has a O(1)O(1) lookup time.

Q.Why only the HashSet solution is accepted by this problem and not other approaches?

A.The HashSet solution is accepted and not the others because only the HashSet solution works with O(n)O(n) time complexity, others have a higher time complexity which doesn't satisfy the problem's needs.

Q.What are HashSets?

A.HashSets are an unordered sequence of elements that only contain unique elements. They are called unordered sets in C++ and sets in Python.

Q.Why did we use a HashSet instead of a HashMap in the last approach?

A. We could have used HashMaps in that approach, but the use of HashSet makes much more sense. A HashSet only has unique values, and there is no use of duplicate values in this algorithm. It takes less memory to store the same number of elements in a HashSet than in a HashMap, as a Hashmap requires two objects for every element (key and value).