Finding Primes

So, when I do interviews, I almost always ask a math question. I don't ask a "brain teaser"--I find even the basics of mathematics to be challenging enough for most interviewees.

(And I've interviewed plenty of non-Americans. So, this post is not a diss on the American math education system, though it could possibly use some refinement. Different post.)

One question that has really shown to be a litmus test is the problem of detecting whether or not a natural number is prime. I don't count the one interviewee who didn't know the definition of a prime number (oh, dear). For the rest of them, though, I've found that only a small percentage can do even the most basic solution.

My favorite response was just a short time ago. After I posed the problem to the candidate, he said, "Now, if you had given me the Fibonacci problem, I could do that!"

Why is this problem so hard?

Mostly, I think people get tied up with efficiency. There is a brute force way that works great:


for ( int i = 2; i < number; i++ ) {
if ( number % i == 0 ) {
return false;
}
}

return true;


The majority of people that can't find a good solution get hung up on how inefficient the algorithm is because you are testing unnecessary numbers. For example, does one really need to test divisibility by 4? Of course not--you already eliminated that when you tested 2.

The mind-bending response that I get from some is, "well I would just have a list of all the primes from 2 up to that number available and would test the number's divisibility against those". Good idea, but how do you create that list of primes? Isn't that why you are creating the method anyway?

For fun, we could create a really terrible recursive solution based on that:


public boolean isPrime(int num) {
for ( int i = 2; i < number; i++ ) {
if ( isPrime(i) && num % i == 0 ) {
return false;
}
}

return true;
}


That said, I have never once gotten that as a response even though so many people say they only want to test the primes beneath that number against it.

So, both are inefficient, though O(n) ain't bad. Can we improve it? You betcha.

First of all, we don't need to test everything up to that number. Other than the number 2, there is no number on the natural number line that is evenly divisible by the number immediately beneath it. 3 isn't divisible by 2, 53 isn't divisible by 52, etc. Also, other than 3 and 4, there is no number on the natural number line that is evenly divisible by the number - 2. 5 isn't divisible by 3, 74 isn't divisible by 72.

In fact, speaking of 72, let's see what its factors are:

2 x 36
3 x 24
4 x 18
6 x 12
8 x 9
9 x 8
12 x 6
18 x 4
24 x 3
36 x 2

The first thing that you noticed is that no factor is above half 72. That is true for all numbers. Thus, we don't need to test above half way because we know for sure that none of the numbers above half will be factors.

We can do better, though. Do you notice how we just invert and repeat ourselves when listing the factors after we pass a certain point? Here, it is at 8x9. We certainly don't need to test above that since multiplication is commutative. Where is that point for other numbers.

How about 64?

2 x 32
4 x 16
8 x 8
16 x 4
2 x 32

Or how about 154?

2 x 77
7 x 22
11 x 14
14 x 11
22 x 7
77 x 2

You see the pattern? Right in the center, the factors invert and repeat. That center is the square root.

So, a more efficient algorithm would be:

double lim = Math.sqrt(num);
for ( int i = 2; i <= lim; i++ ) {
if ( lim % i == 0 ) {
return false;
}
}
return true;

That's O(sqrt(n)), which I think is pretty good.

If you were bored up until now (why are you still reading, then?), here is where things get interesting.

The fact is that that is a very brute force way to do it. It's probably the best we have as a general approach for discovering if a particular number is prime. What about getting a list back of all the primes beneath a given number?

So, I've always wanted to ask this question, but I think I'd definitely get a blank stare from 99% of candidates.

Somehow, the size of this question gives more room for bits to compress into a faster algorithm that the brute force way.

Let's review the brute force way really quick, though. To return a list of primes beneath a given number, one needs to test each one for primacy and add them to the list if they pass the test:

public static boolean isPrime(int num, List primes) {
double sqrt = Math.sqrt(num);
for ( Integer prime : primes ) {
if ( prime > sqrt ) {
return true;
}
if ( num % prime == 0 ) {
return false;
}
}
return true;
}

public static List findPrimes(int max) {
List primes = new ArrayList();
for ( int i = 2; i <= max; i++ ) {
if ( isPrime(i, primes) ) {
primes.add(i);
}
}
return primes;
}

You'll notice that I am adding one efficiency here which is only testing against those that I have already discovered to be prime in previous iterations. Regardless, though, this is still about O(n*sqrt(n)).

Enter the awesome Eratosthenes. He offered a different way to discover primes. He recognized that with the premise that k was prime, we could assert the following inductive reasoning:

All numbers k*n are composite where n = {2, 3, ... }, so

k = 1 is composite
k = k + 1: If unmarked, k is prime: Mark all k*n as composite; otherwise, k is composite

Following his sage wisdom, we get the following algorithm:

public static List findPrimes(int max) {
boolean[] isComposite = new boolean[max];
isComposite[0] = true;
for ( int i = 2; i <= max; i++ ) {
if ( !isComposite[i-1] ) {
primes.add(i);
for ( int j = i*2; j <= max; j = j + i ) {
isComposite[j-1] = true;
}
}
}

return primes;
}

We could probably trim some cycles of off this to reduce its coefficient, but, overall this is a huge improvement.

Here are the numbers:

primes under 100000: 9592; 33 ms
primes under 100000: 9592; 8 ms

primes under 1000000: 78498; 277 ms
primes under 1000000: 78498; 18 ms

primes under 10000000: 664579; 3872 ms
primes under 10000000: 664579; 551 ms

primes under 100000000: 5761455; 88055 ms
primes under 100000000: 5761455; 8635 ms

Anyway, there's your nerdy math update for the day.

Josh Cummings

"I love to teach, as a painter loves to paint, as a singer loves to sing, as a musician loves to play" - William Lyon Phelps

0 comments: