Leetcode: 41. First Missing Positive
Problem Statement: Given an unsorted array \(nums\), return the smallest missing positive integer. The restriction is that the algorithm should have time complexity of \(O(n)\) and space complexity of \(O(1)\).
The hint about the time complexity made me to focus on what decision I could take while processing each element of the array. By any mean, this is the wrong way to think, but I got some wrong ideas before finding the right one.
Wrong idea 1: Suppose that \(\text{nums}=[3, 2, 1]\). What about if we track the intervals created after iterating over each element. After we pass through number 3, we know that the answer is either \(1..2\) or \(4\). After the number \(2\), the intervals become \(1..1\) and \(4\). After the number \(1\), we know that the solution needs to be \(4\). The idea works for this example, but it doesn't for \(\text{nums}=[5, 4, 2, 1]\). I tried to fix this idea, but it would require extra memory to do.
Wrong idea 2: After the failure tentative, I started to ponder what is the relationship between the answer and the numbers on the array. For example, the answer is \(3\) if the input is \([5, 4, 2, 1]\). Is there a way to discover this number using a formula? I tried without success to find the number by looking to the sum of the numbers in the array. For the example, the sum of the array is \(12\) while \(1+2+3+4+5=15\), so \(15-12=3\) which is the number missing in the array. Note that it is using the sum \(1+2+...+(n+1)\) to find the missing number and not \(1+...+n\). The problem is that the array can have negative numbers and numbers greater than or equal to \(n\). Those numbers are useless to this idea and having more than one of those would make the difference of the sums to represent multiple numbers. Failed, but this observation led me to the right idea.
Suppose that the array is \([5, 4, 2, 1]\). The answer should be 1 if 1 isn't in the array. The answer should be 2 if 2 isn't in the array and so on. In both cases, these numbers are contained in the array. Can we pre-process the input in a way to make easy to solve the problem? For example, suppose that the array is sorted: \([1, 2, 4, 5]\). If there is 1 in the array, it should be in the first position, 2 should be in the second position and so on. The first position of the array that breaks this assertion is the answer. In this example, the number 4 is in the third position. So, 3 is the answer that we are looking for. We can't use sort
since the solution must be \(O(n)\) and the sort would make it \(O(n\log n)\). But that is not the only problem. Suppose that the array is \([5, -1, 2, 1]\). The sorted array is \([-1, 1, 2, 5]\) and the suggested solution would return 1, since -1 is sitting on the first position. The conclusion here is that we don't really want to sort the array, but re-arrange its elements in a way that the idea works. For the second example, we would like the array to re-arranged as follow: \([1, 2, -1, 5]\) or \([1, 2, 5, -1]\). The position of -1 and 5 doesn't matter, but 1 should be on the first position and 2 should on the second position. To do so, we can iterating over each number and either skip it if there is no "right" place for it in the array (e.g. -1 and 5 in the last example), or swap it with the number that sits on its position. Let's run this process on the input \([5, -1, 2, 1]\).
Input | Output | Explanation |
\([\underline{5}, -1, 2, 1]\) | \([\underline{5}, -1, 2, 1]\) | Nothing to do. |
\([5, \underline{-1}, 2, 1]\) | \([5, \underline{-1}, 2, 1]\) | Nothing to do. |
\([5, -1, \underline{2}, 1]\) | \([5, \underline{2}, -1, 1]\) | Swap 2 with -1. |
\([5, 2, -1, \underline{1}]\) | \([\underline{1}, 2, -1, 5]\) | Swap 1 with 5. |
Is there an edge case for this solution? Suppose that the array is \([4, 0, 1, 2]\). If we swap 4 and 2 and move to the next index, we would have the array as \([2, 0, 1, 4]\). The problem here is that 2 wouldn't have a chance to be moved to its position. So, the swap operation should be applied as many times is necessary, until the number of the source position is either one that should be skipped or the correct number.
Input | Output | Explanation |
\([\underline4, 0, 1, 2]\) | \([2, 0, 1, \underline4]\) | Swap 4 with 2. |
\([\underline2, 0, 1, 4]\) | \([0, \underline2, 1, 4]\) | Swap 2 with 0. |
\([\underline0, 2, 1, 4]\) | \([\underline0, 2, 1, 4]\) | Nothing to do. |
\([0, 2, \underline1, 4]\) | \([\underline1, 2, 0, 4]\) | Swap 1 with 0. |
The last edge case is when you swap two equal numbers: \([1, 1]\).
- Time complexity: \(O(n)\)
- Space complexity: \(O(1)\)
def solve(nums): def swap(i): v = nums[i] if v <= 0 or v > len(nums) or v == i + 1: return False if nums[i] == nums[v - 1]: return False nums[i], nums[v - 1] = nums[v - 1], v return True for i in range(len(nums)): while swap(i): pass for i, v in enumerate(nums): if v != i + 1: return i + 1 return len(nums) + 1 assert solve([1, 2, 0]) == 3 assert solve([3, 4, -1, 1]) == 2 assert solve([7, 8, 9, 11, 12]) == 1 class Solution: def firstMissingPositive(self, nums: List[int]) -> int: return solve(nums)