You don’t actually have to be able to write a binary search.
But let’s look at how to anyway.
Invariants are things that must always be true if the program is working properly.
BHSawesome has talked a bit about “preconditions” and “postconditions” of methods. Those are kinds of invariants.
Preconditions are things that must be true before a method is called.
They can be facts about the arguments to the method or facts about the state of various objects.
If they are not true the invariant is said to be “violated” and the method is under no obligation to work correctly.
Postconditions are things that must be true after a method completes. If a method does not always establish its postcondition it is buggy.
Callers of a method need to be able to rely on its postcondition.
Loops are tricky.
Maybe even more tricky than methods.
Loop invariants are things that must be true every time through the loop.
If we make sure the invariants are true before the loop starts and are always true at the end of an iteration, assuming they were at the start, then we can know they will always be true.
Signature looks like:
public int search(int[] nums, int target) {
I.e. take an array of int
s, and a number we are looking
for, and return the index where the number occurs in the array or -1
if it isn’t there.
The array must be sorted. (This is an example of a precondition.)
Let’s define two variables low
and
high
such that if target
is in
nums
it is at an index, i
such that:
low ≤ i < high
If we we can shrink the range in a way that maintains the invariant
we’ll either find target
or end up with an empty range
in which case the only possibility is that target
was
never in the range to start with.
So we can start here:
int low = 0;
int high = nums.length;
Clearly if target
is in nums
it is at some
legal index which is greater than or equal to 0 and less than
nums.length
.
Now we loop until we find target
or the interval
shrinks down to an empty interval.
while (low < high) {
As long as low < high
the interval isn’t empty yet.
When low == high
the interval is empty and if we
haven’t found target
, yet it’s not there.
int mid = low + (high - low) / 2;
int midVal = nums[mid];
This is what makes this a binary search—we split the range in half.
The textbook tells you to compute mid
like this:
int mid = (low + high) / 2;
This is actually buggy.
However it’s a sutble enough bug that it existed in the Java standard library for nine years. You can read about it here.
The problem is if low + high
overflows, we end up with
a negative value for mid
.
if (target < midVal) {
// Target is below mid so in interval low ≤ i < mid
high = mid;
} else if (target > midVal) {
// Target is above mid so in interval mid + 1 ≤ i < high
low = mid + 1;
} else {
// Target is at mid so we've found it
return mid;
}
Note that in the two cases where we don’t return, we cut the
interval between low
ad high
in half but
maintain the invariant.
}
return -1;
}
If the loop ends, then we didn’t find target
so we
return -1
.
In actual binary search routines, such as in the JDK, they return something better than -1 namely:
return -(low + 1);
This is a negative number, so not a possible index, but one that can be translated back to the position where the item should have been (or would need to be inserted to keep the array sorted.)
What’s the must number of times the loop will run for an array of eight elements?
What’s the must number of times the loop will run for an array of n elements?
They may ask you to trace a binary search to determine how many times the loop will run for a given set of arguments.
That’s kind of silly.
But it is what it is.