Overview
Binary search is a search algorithm that returns the index of a target value within a sorted array. Instead of iterating through each element of the array, it utilizes the sorted nature of the array to search more efficiently and improve the time complexity from $O(n)$ to $O(logn)$.
Problem
Given a sorted array
nums
and a target valueT
, return the index ofT
or -1 if it doesn’t exist.
Algorithm
- Initialize a left boundary $L=0$ and a right boundary $R=length($
nums
$)$ - Initialize a middle pointer $M = (L + R) / 2$
- If the value at index $M$ is ever equal to
T
, return $M$; otherwise shrink $L$ or $R$ accordingly - If no value is equal to
T
, return $-1$
Implementation
Variation 1: Return Early
def search(self, nums: List[int], target: int) -> int:
left, right = 0, len(nums)-1
while left <= right:
mid = (left + right) // 2
if nums[mid] == target:
return mid
elif nums[mid] > target:
right = mid - 1
else:
left = mid + 1
return -1
Variation 2: Return After Exiting
def search(self, nums: List[int], target: int) -> int:
left, right = 0, len(nums)-1
while left < right:
mid = (left + right) // 2
if nums[mid] >= target:
right = mid
else:
left = mid + 1
return left if nums[left] == target else -1
Analysis
Time Complexity
$O(logn)$: In the worst case, there are $O(logn)$ comparisons made where $n$ is the number of array elements. The entire array will be searched, where half the array is discarded after every iteration.
Space Complexity
$O(1)$: There are no additional data structures used.
Explanation
While the overall algorithm is straightforward, binary search’s implementation can be tricky due to different edge cases and variations. The following sections were heavily inspired by this Leetcode discussion.
Left/Right Boundaries
When searching for a target in a sorted array, these pointers are almost always initialized as:
left, right = 0, len(nums) - 1
However, both pointers must cover the entire search space. If the problem was instead “Find the insert position”, it is possible to insert after the last element. In this case, the boundaries would be initialized as:
left, right = 0, len(nums)
Middle Pointer Calculation
The standard middle pointer calculation is:
mid = (left + right) // 2
However, there are two subtleties to consider: overflow and even middle selection.
1. Overflow
Languages with signed integers (ie. Java, C++) have the possibility of overflowing from arithmetic operations. To circumvent this, the following calculation can be used instead:
mid = left + (right - left) // 2
2. Even Middle Selection
In arrays with an even number of elements, there are two possibitilies for the “middle” element; it can either be the element on the middle left or the middle right.
mid = left + (right - left) // 2
|
[1, 2, 3, 4, 5, 6]
|
mid = left + (right - left + 1) // 2
This selection will directly affect how the boundaries should be shrunk, as the wrong choice can lead to an infinite loop.
While Loop Condition
Two common options for the while loop condition are while l <= r
and while l < r
.
1. while l <= r
This should be used when the result is returned from inside the loop. The termination condition is when l > r
, which signifies that the array does not contain the target. Note that this may not apply to all questions, as not all problems involve searching for the existence of a target.
2. while l < r
This should be used when the result is returned from outside the loop. The termination condition is when l == r
, which signifies when the loop exits that the two boundaries point to the same value, presumably the target.
Shrinking the Boundaries
Shrinking the boundary will depend on whether the value at the current index is greater or less than the target. If the value is greater, then that means the target is guaranteed to be on its left (since the array is sorted), so the right boundary is moved. If the value is less than the target, then the target will be on its right, so the left boundary is moved.
There are two possibilities to consider when shrinking the boundaries - whether to include the middle pointer or not. This will depend on both the middle pointer calculation and the conditional check prior to the boundary move.
Assuming a while condition of l < r
, the following would not work:
mid = left + (right - left) // 2 # this line is incorrect
if nums[mid] > target: # given this condition
right = mid - 1
else:
left = mid
This can be fixed by changing the conditional to include the target value:
mid = left + (right - left) // 2
if nums[mid] >= target: # correct
right = mid - 1
else:
left = mid
Or by rounding the middle pointer up during its calculation:
mid = left + (right - left + 1) // 2 # correct
if nums[mid] > target:
right = mid - 1
else:
left = mid
A trick to easily verify if the middle pointer calculation and boundary shrinking logic are correct is to visualize how it would behave on an array with two elements. If the logic is incorrect, then the boundaries won’t move which will result in an infinite loop.