Binary search

Writing a binary search

You don’t actually have to be able to write a binary search.

But let’s look at how to anyway.

Side quest: Invariants

Invariants are things that must always be true if the program is working properly.

BHSawesome has talked a bit about “preconditions” and “postconditions” of methods. Those are kinds of invariants.

Preconditions

Preconditions are things that must be true before a method is called.

They can be facts about the arguments to the method or facts about the state of various objects.

If they are not true the invariant is said to be “violated” and the method is under no obligation to work correctly.

Postconditions

Postconditions are things that must be true after a method completes. If a method does not always establish its postcondition it is buggy.

Callers of a method need to be able to rely on its postcondition.

Loop invariants

Loops are tricky.

Maybe even more tricky than methods.

Loop invariants are things that must be true every time through the loop.

If we make sure the invariants are true before the loop starts and are always true at the end of an iteration, assuming they were at the start, then we can know they will always be true.

A basic binary search method

Signature looks like:

public int search(int[] nums, int target) {

I.e. take an array of ints, and a number we are looking for, and return the index where the number occurs in the array or -1 if it isn’t there.

The array must be sorted. (This is an example of a precondition.)

Invariant

Let’s define two variables low and high such that if target is in nums it is at an index, i such that:

low ≤ i < high

If we we can shrink the range in a way that maintains the invariant we’ll either find target or end up with an empty range in which case the only possibility is that target was never in the range to start with.

Establish the invariant

So we can start here:

    int low = 0;
    int high = nums.length;

Clearly if target is in nums it is at some legal index which is greater than or equal to 0 and less than nums.length.

Start of loop

Now we loop until we find target or the interval shrinks down to an empty interval.

    while (low < high) {

As long as low < high the interval isn’t empty yet.

When low == high the interval is empty and if we haven’t found target, yet it’s not there.

Find the midpoint

      int mid = low + (high - low) / 2;
      int midVal = nums[mid];

This is what makes this a binary search—we split the range in half.

A common bug

The textbook tells you to compute mid like this:

      int mid = (low + high) / 2;

This is actually buggy.

However it’s a sutble enough bug that it existed in the Java standard library for nine years. You can read about it here.

The problem is if low + high overflows, we end up with a negative value for mid.

The meat of the loop

      if (target < midVal) {
        // Target is below mid so in interval low ≤ i < mid
        high = mid;
      } else if (target > midVal) {
        // Target is above mid so in interval mid + 1 ≤ i < high
        low = mid + 1;
      } else {
        // Target is at mid so we've found it
        return mid;
      }

Note that in the two cases where we don’t return, we cut the interval between low ad high in half but maintain the invariant.

After the loop

    }
    return -1;
  }

If the loop ends, then we didn’t find target so we return -1.

A subtlety

In actual binary search routines, such as in the JDK, they return something better than -1 namely:

return -(low + 1);

This is a negative number, so not a possible index, but one that can be translated back to the position where the item should have been (or would need to be inserted to keep the array sorted.)

Run time of binary search

What’s the must number of times the loop will run for an array of eight elements?

What’s the must number of times the loop will run for an array of n elements?

Annoying exam questions

They may ask you to trace a binary search to determine how many times the loop will run for a given set of arguments.

That’s kind of silly.

But it is what it is.