Binary Search
Definition, how to implement, and some considerations.
A simple game
Imagine, you’re with a friend, who suddenly says to you: “I’m thinking of a number between 1 and 100, guess which one”. You start giving guesses, and each time you make a suggestion, you would get an indication: “It’s more”, “It’s less”, “You guessed it”.
How would you find the correct number?
The iterative way
A first solution would be to start with 1, and say each number one after the other until you get the right one. It works, but it could be very long. Imagine the number is 100, you would have had to say a hundred words before finding it. It’s long, fastidious, not to mention completely boring for each players.
The fun way
Or you could really try guessing and give a random number between 1 and 100. Say 14.
“It’s more” — Says your friend.
Now you know you have to guess a number between 15 and 100. So you take another guess, say 63.
“It’s less”
Now you know the number is between 15 and 62. So you say 42.
“It’s less”. Hmm, 15 to 41. It’s getting close.
“27?”
“You guessed it!”
YES! Well done! And it only took you 4 guesses. A lot better than 100 isn’t it?
What happened here is that with each guess you made, you’ve been excluding a part of the solution, progressively reducing the interval in which you could choose your guesses. The more guesses you make, the smaller the interval becomes, and the faster it is to find the solution.
Well, this is the main principle of the binary search. With one minor twist: it does not guess at random, but always try the item located at the central position of the interval. That way, the interval size gets divided by 2 after each iteration.
Usage
So Binary Search is a really fast search algorithm that allows you to determine quickly if an item is present in an array. One thing to note is that in the previous game example, we were considered numbers from 1 to 100, ordered.
This is a requirement for Binary Search. The considered array of items must be sorted. The type of items does not matters, as long as the items can be compared and sorted. It can be numbers, or string ordered lexicographically, or even objects if their classes implement the comparison operator.
Complexity
Runtime
For the linear search, we check the elements one after the other, so it has an O(n) complexity.
For Binary Search however, we always divide by 2 the size of the array where we’re searching. Suppose an array of size 8. In a worst case scenario, first guess will move the size to 4, second guess to 2, third guess to 1. Once the size is 1, we make a final guess that can be true or false, then we’re done. So for an array of size 8, we made 4 guesses.
Then what about an array of size 16? Well, after first guess, we get an array of size 8. Which we know takes 4 guesses maximum. So for a size of 16, we have 4 + 1 = 5 guesses at top.
See where this is going?
Every time we double the size of the array, we only need one more guess, at most. Which means the complexity of the Binary Search is actually lg(n) (base-2 log).
Space
When implementing the Binary Search in an iterative way, as described earlier, we are only keeping in memory the min and max index. Nothing else. So Binary Search space complexity is O(1).
However, a second implementation is possible, in a recursive way. In that case the calls will be stacked, and if K guesses are needed to find the solution, then K calls are needed. Which means for a recursive implementation, Space Complexity is also lg(n).
Implementation
Linear
Here is the pseudocode for a linear implementation. The function returns the index of the found element, or -1 when not found.
Recursive
Here is the same algorithm, implemented recursively. The conditions are the same, it just calls itself recursively instead of just updating the min and max.
Optimisation
Now this works well in most of cases. However, there can be some edge cases which might throw some exceptions at runtime.
A first one is to imagine you have a really big array. When executing that line:
m = floor((min + max) / 2)
It is possible that min + max results in a number too big that results in an arithmetic overflow.
One way to deal with this is by a simple trick:
m = floor(min +( max — min) / 2)
That’s all for the general view of Binary Search. Stay tuned for some other cool algorithm descriptions!