You are currently browsing the monthly archive for December 2009.

My favorite kind of puzzle is one that sounds like it’s going to be an absolute nightmare, but turns out to have a particularly beautiful solution. It’s even better if you don’t really need to know any math to solve it. Here’s such an example, which absolutely delighted me when I first solved it.

Playing cards

An irrelevant picture

I presented it to two of my friends on a long train journey — one of them is a mathematician (in fact he’s a research fellow at Queens’ College, Cambridge) and the other works for a pharmaceuticals company. The non-mathematician got it first. It really is a problem that anyone can have a crack at.

First, I’ll describe a warm-up puzzle. It’s kind of a barometer for the real puzzle.

The challenge: produce a deck of cards, each with a whole number painted on it, with which you can make any number between 1 and 1 million by picking out some of your cards and adding together their values. You should do this with as few cards as possible.

Got that? You may want to stop reading and think about it for a second, because the solution is coming up in the next paragraph. I’ve found that computer scientists and programmers tend to get this one faster than most people — often with less than a second’s thought.

The answer is that you should label your cards with the powers of two, so that your deck starts 1, 2, 4, 8 … If you didn’t come up with this solution, the way to think about it is to figure out what cards you’d need to make each of the numbers from 1 up to 1 million. You need to be able to make 1, so you’d better have a card labeled ‘1’. Similarly you need a card labeled ‘2’. But with those two cards you can already make 3, so the next card that needs to be added to your deck is labeled ‘4’. After a while you should spot the pattern, and come up with an argument for why this will work for every number up to a million.
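The reasoning above is easy to check by machine. Here’s a minimal Python sketch that builds the powers-of-two deck and makes a given number greedily (which is just reading off its binary representation):

```python
# Sketch: build the powers-of-two deck and pick cards for a target number.
TARGET = 1_000_000

deck = []
total = 0
while total < TARGET:
    deck.append(1 << len(deck))  # next power of two: 1, 2, 4, 8, ...
    total += deck[-1]

def make(number, deck):
    """Return a subset of cards summing to `number`, greedily from the top."""
    picked = []
    for card in reversed(deck):
        if card <= number:
            picked.append(card)
            number -= card
    return picked if number == 0 else None

print(len(deck))  # 20 cards suffice, since 2^0 + ... + 2^19 = 1,048,575
print(make(1_000_000, deck))
```

The greedy choice works here precisely because each card exceeds the sum of all the smaller ones, so skipping it can never be recovered later.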

Now here’s the real puzzle, which is a bit more tricky, but fun to think about. I’m not going to give the answer now — I’ll post it in a couple of days.

Produce a deck of cards with which you can make any number between 1 and a million by picking cards and adding together their values, even if you have lost one of the cards in the deck.

The requirement that your deck is insensitive to the loss of a card makes the problem trickier to think about, but a similar approach to the one described for the easier problem can work. For example, you know that you need to include two cards labeled ‘1’, so that even if you lose one of them you can still pick cards that add up to give 1.

For those inclined to a challenge, there are three levels of answer:

  1. Describe the deck of cards that solves the puzzle (easy, requires almost no math.)
  2. Prove that you can make any number between 1 and a million, even if you lose a card (not too hard, requires a little bit of math.)
  3. Prove that this is the minimal deck of cards that solves the puzzle (difficult, but still needs no math beyond high school.)
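Without giving the answer away, here’s a small Python sketch of a brute-force checker for the lost-card property, demonstrated on a miniature range (the example decks are mine, not part of the puzzle’s solution):

```python
def subset_sums(cards):
    """All values obtainable by adding up some subset of `cards`."""
    sums = {0}
    for c in cards:
        sums |= {s + c for s in sums}
    return sums

def robust(deck, limit):
    """True if every number 1..limit can still be made after
    removing any single card from the deck."""
    targets = set(range(1, limit + 1))
    for i in range(len(deck)):
        remaining = deck[:i] + deck[i + 1:]
        if not targets <= subset_sums(remaining):
            return False
    return True

# A miniature example: this 4-card deck covers 1..4 even with a card lost,
# while a plain powers-of-two deck fails as soon as the '1' goes missing.
print(robust([1, 1, 2, 2], 4))  # True
print(robust([1, 2, 4], 4))     # False
```

Checking candidate decks this way is feasible for small ranges, which is a handy barometer before tackling the full 1-to-a-million version on paper.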

I learned this from my office mate, Toby Wood, but I’m not certain where he got it from. If you know, please leave a comment!

My last post, on the 17 x 17 challenge, ended with a solution to a problem which I thought sounded incredibly naïve* but which turned out to be correct. It got me thinking about a problem that I was posed when I was interviewing to work at investment banks. In fact I was asked the same question in interviews for two different banks (potential applicants take note).


Sketch of the problem as presented to me

It goes like this: You find yourself in a dark cave, which has three tunnels leading from it (see the crude sketch on the right). One of the tunnels leads outside, and it takes 3 hours to traverse it. The other two tunnels connect to each other, returning you to the cave, and they take 1 hour to walk down. If you choose tunnels at random each time you’re faced with a choice, how long will it take you on average to get outside?

I’m about to tell you the answer, so maybe you want to stop and think about it for a bit. The point I’m making is that even though the calculation you have to do is a bit involved (although simple to anyone who’s ever studied Markov chains) there’s a cheeky shortcut: you can get the answer just by assuming that you take each of the tunnels once, and add up the times to traverse them, getting an answer of 1+1+3=5 hours.

Is this just a coincidence? To find out, let’s generalise the problem. Now we find ourselves in a cave with n tunnels, of which m lead to an exit and n-m dump you back in the cave. Each of the tunnels takes time \tau_i to traverse, for i=1,…,n, and is chosen with probability p_i. We’ll adopt the convention that the first m tunnels are the ones that lead to an exit, and the last n-m are the return tunnels (notice that our generalisation allows the possibility that a return tunnel might take a different amount of time to traverse if we go in the opposite direction).

To work out the average escape time, we notice that if we pick an escape tunnel then we get out in the time it takes to traverse that tunnel. If we pick a return tunnel, then we’ve increased our expected time to leave by the time it took to traverse that tunnel. If the time to leave is T, then we write the expected leaving time as E[T], and

E[T] = \sum_{i=1}^m p_i\tau_i + \sum_{j=m+1}^n p_j(\tau_j + E[T]).

After a couple of lines of algebra we see that the expected time to leave is given by

E[T] = (\sum_{i=1}^n p_i\tau_i) / (1 - \sum_{j=m+1}^n p_j)

or, written in everyday language:

Avg time to leave = (Avg time to traverse a tunnel) ÷ (Chance of picking an exit tunnel)

That’s it! The average time to escape only depends on the average time it takes to traverse a randomly picked tunnel, and the probability of picking a tunnel which leads to an exit. It’s completely insensitive to the specifics of the problem, namely how long the individual tunnels are, and what the precise probabilities of picking each tunnel are. In the special case where each of the tunnels is equally likely, all you have to do is add up the time it takes to traverse each of the tunnels, and divide by the number of tunnels that lead to an exit.
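It’s easy to sanity-check this formula by simulation. Here’s a minimal Python sketch of a Monte Carlo estimate for the original three-tunnel cave, where the formula predicts (5/3) ÷ (1/3) = 5 hours:

```python
import random

def average_escape_time(times, n_exits, trials=200_000, seed=1):
    """Monte Carlo estimate of the mean time to escape the cave.
    `times[:n_exits]` are exit tunnels; the rest return you to the cave."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        while True:
            i = rng.randrange(len(times))  # each tunnel equally likely
            total += times[i]
            if i < n_exits:
                break  # escaped
    return total / trials

# Original problem: one 3-hour exit tunnel, two 1-hour return tunnels.
print(round(average_escape_time([3, 1, 1], n_exits=1), 2))  # close to 5.0
```

Changing the tunnel times or adding extra return tunnels and re-running is a quick way to convince yourself the shortcut really is general.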

This appeals to me because a problem which has a lot of fine detail turns out to be insensitive to the fine detail. I’d be interested to hear about other examples of this phenomenon.

* Is it pretentious to write ‘naïve’ instead of ‘naive’? Answers, opinions and furious rants on a postcard, please.

A recent post on bit-player drew my attention to William Gasarch’s 17 x 17 rectangle challenge. In short, Gasarch is offering $289 (289 = 17 x 17) for the solution to the following puzzle.

6 x 6 coloring

A coloring of a 6 x 6 grid which satisfies the 'no rectangles' constraint.

The problem is to take a rectangular n x m grid, and color each of the squares with one of four colors, so that there are no rectangles in the grid whose corners all have the same color. For example, on the right you can see a 6 x 6 grid which satisfies this constraint.

It turns out that if the size of the grid is 19 x 19 or bigger, then you can prove that it’s impossible to color the grid without creating a rectangle. In all cases up to a 16 x 16 grid, Gasarch has an example of a coloring that satisfies the no-rectangles constraint. However, the 17 x 17 and 18 x 18 cases are still open. To win the $289, you need to send Gasarch an example of a rectangle-free coloring of the 17 x 17 grid.

If it turns out that it’s possible to color the grid in this way, then the proof that you can do so is simple — you just draw a picture of the coloring (it’s easy to get a computer to check that it doesn’t have any rectangles). However, if it turns out to be impossible then it can be very hard (read: long and ugly) to prove that that’s the case. This is why the problem is so appealing to people with very little experience in mathematics: to solve the problem, all you have to do is draw a picture!

Of course, it’s easier said than done. There are 4^{289} different ways to color a 17 x 17 grid, and very few of them will be rectangle free. If you were to write a computer program to check all possible colorings, it would take longer than the age of the universe to run.

My first thought after reading the bit-player post was that a simulated annealing algorithm might do the trick. Simulated annealing is a computational technique which is based on the way that metals can be cooled slowly in order to minimize the number of defects, effectively finding a minimal energy solution to the problem of arranging the atoms in the metal. Simulated annealing tends to work well in situations that have a large number of possible states, and a large number of local minima — just as the rectangles problem does.
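As a sketch of what such an algorithm might look like (this is my own toy version, not a serious attempt on the 17 x 17 grid; the cooling schedule and step count are guesses, and recomputing the full energy after every move is the simple-but-slow choice):

```python
import math
import random
from itertools import combinations

def energy(grid):
    """Number of monochromatic rectangles (all four corners one color)."""
    n = len(grid)
    count = 0
    for r1, r2 in combinations(range(n), 2):
        for c1, c2 in combinations(range(n), 2):
            if grid[r1][c1] == grid[r1][c2] == grid[r2][c1] == grid[r2][c2]:
                count += 1
    return count

def anneal(n, colors=4, steps=20_000, t0=2.0, seed=0):
    """Simulated annealing: recolor one cell at a time, accepting worse
    moves with probability exp(-dE / T) as T cools toward zero."""
    rng = random.Random(seed)
    grid = [[rng.randrange(colors) for _ in range(n)] for _ in range(n)]
    e = energy(grid)
    for step in range(steps):
        t = t0 * (1 - step / steps) + 1e-9
        r, c = rng.randrange(n), rng.randrange(n)
        old = grid[r][c]
        grid[r][c] = rng.randrange(colors)
        new_e = energy(grid)  # a real run would update this incrementally
        if new_e <= e or rng.random() < math.exp((e - new_e) / t):
            e = new_e
        else:
            grid[r][c] = old  # reject the move
    return grid, e
```

A rectangle-free coloring is exactly a grid whose energy reaches zero, so this doubles as a verifier for any candidate solution.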

I’m working on this algorithm at the moment, and hopefully I’ll have a post about that in the near future. But while I was thinking about the problem, I became distracted by another question: if you color the grid randomly, how many same-color rectangles do you create on average? It’s easy to write a computer program that uses Monte Carlo to approximate this number, but I wanted to do it analytically.

It seems to be a difficult problem, because each cell in the grid can potentially be part of many rectangles. It seems as though there’s a lot of interdependence between the colors of different cells, which will make the calculation very difficult indeed.

Undeterred, I decided to try a naive calculation, in the hope that even though it was wrong, it wouldn’t be too wrong, and it would give me some insight into how to do the full calculation. First I worked out how many rectangles it’s possible to make on an n x n grid by joining corners. After some thought, you can see that this number N is given by:

N = \sum_{i=1}^{n-1} \sum_{j=1}^{n-1} (n-i)(n-j) = \tfrac{1}{4}n^2(n-1)^2
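A quick way to see (and check) this closed form: a rectangle is determined by a choice of two distinct rows and two distinct columns, so N should equal C(n,2)². A short Python check confirms the double sum, the binomial count, and the closed form all agree:

```python
# Check the closed form against a direct count of the double sum.
from math import comb

def n_rectangles(n):
    """Number of axis-aligned rectangles with corners on an n x n grid."""
    return sum((n - i) * (n - j)
               for i in range(1, n) for j in range(1, n))

for n in range(2, 11):
    assert n_rectangles(n) == comb(n, 2) ** 2 == n**2 * (n - 1)**2 // 4

print(n_rectangles(17))  # 18496 potential rectangles in a 17 x 17 grid
```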

To be counted, a potential rectangle needs all four of its corners to be the same color. Each cell has a probability of 1/4 of being any particular color, so the probability of four given cells all being one particular color is (1/4)^4, but this could happen with any of the four possible colors, giving a total probability of 4(1/4)^4=1/64. Therefore, according to this dodgy calculation, the average number of rectangles R  in an n x n grid is

R = N/64 = \frac{n^2(n-1)^2}{256}

So how does this rough calculation compare with the numerically computed results? Take a look for yourselves. The dots are the numerical approximations (which should be pretty accurate) and the dashed line is the formula I derived above:

Average number of rectangles

Average number of rectangles plotted against grid size.

It turns out that this formula gives precisely the right value for the expected number of rectangles in a random coloring! So now my question is: is it actually the case that, for some subtle reason, the calculation I did above is valid after all, even though intuitively it seems that it should be completely wrong? Or is it just a coincidence that it comes out with exactly the right answer? I may be thinking about this one for a while.
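For anyone who wants to reproduce the numerics, here’s a sketch of the kind of Monte Carlo check described above (my assumption about how such a comparison would be run, with uniformly random 4-colorings):

```python
import random

def count_rectangles(grid):
    """Count monochromatic rectangles in a square grid of colors."""
    n = len(grid)
    total = 0
    for r1 in range(n):
        for r2 in range(r1 + 1, n):
            for c1 in range(n):
                for c2 in range(c1 + 1, n):
                    if grid[r1][c1] == grid[r1][c2] \
                            == grid[r2][c1] == grid[r2][c2]:
                        total += 1
    return total

def monte_carlo(n, trials=5_000, seed=0):
    """Average rectangle count over random 4-colorings of an n x n grid."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        grid = [[rng.randrange(4) for _ in range(n)] for _ in range(n)]
        total += count_rectangles(grid)
    return total / trials

n = 6
print(monte_carlo(n))           # Monte Carlo estimate
print(n**2 * (n - 1)**2 / 256)  # the formula: 3.515625
```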

Everyone knows that movie sequels nearly always suck, and that trilogies are the worst of the bunch. If you want a sure-fire way to ruin a good movie, then make two sequels.

But how does this piece of folk knowledge really hold up? I decided to investigate this afternoon, and work out quantitatively (using science) just how bad the second and third movies in a trilogy are.

My first task was to make a list of trilogies — using a combination of asking my friends on Twitter and my officemates, I got a list of 38 well known trilogies, including The Godfather, Jaws, Star Wars (I-III and IV-VI) and Die Hard. I then visited the IMDB page for each movie and recorded its aggregate score out of ten. These could then be plotted on a graph, to indicate visually just how much sequels suck:

IMDB movie ratings

Ratings out of ten for 38 movie trilogies.

One unsurprising result is that the second movie in a trilogy tends to be worse than the first, and the third movie tends to be worse than both of those. The average rating for the first movie was 7.58, as compared to 6.82 for the second movie and 6.38 for the third.

Something a bit more surprising is that the second and third movies in a trilogy aren’t uniformly worse — instead there tends to be more of a spread in quality, whereby second movies are much more variable in quality than first movies, and third movies are even more so. This can be seen in the standard deviations of the ratings: 0.73 for the first movie, 1.2 for the second movie and 1.4 for the third.

Finally, as with any rule, there are exceptions. The most striking is Sergio Leone’s Man With No Name trilogy, comprising A Fistful of Dollars, For a Few Dollars More, and The Good, The Bad and the Ugly. Each of these movies was better than the last, with the third movie getting an average rating of 9.0, the highest of any third movie in a trilogy.

Similarly, The Lord of the Rings maintained high ratings throughout, with the third movie being rated the best of the trilogy, although in this case the effect isn’t as dramatic. It’s noticeable that with both these exceptions, the three movies were conceptualized as a trilogy, planned that way from the beginning, rather than the second and third movies being made to cash in on the success of the first. Perhaps there is something to artistic merit after all.

I can make the full data set available on request, if anyone thinks they can do something cool with it.

In the last post we looked at the Mandelbrot set, a beautiful 2d fractal with infinite complexity along its boundary. The challenge is to find a three dimensional version of the set, the Mandelbulb, which we can then lovingly render with dynamic lighting and shadows.

The Mandelbulb

The three-dimensional Mandelbulb

Mathematicians were doing work related to the Mandelbrot set in the very early 20th century. However, it wasn’t until the 1980s that computers were powerful enough to draw high quality pictures of it. The Mandelbulb is even more complicated, and requires even more computational power (once you’ve located the object, you then need to render it — this is where the computer spends most of its time). This is why we’ve not been able to see pictures of the Mandelbulb until very recently.

The first problem is determining the rule which will generate the Mandelbulb. For technical reasons there is no such thing as a 3d analogue of the complex numbers (the quaternions are the closest 4d analogue) and thus coming up with the ‘correct’ map is tricky. The key is to re-use the ideas of multiplying the radius, rotating the angles and then translating. Mathematicians use a set of coordinates for 3d space called spherical polar coordinates, which tell you the location of a point in terms of the distance from the origin, r, and two other angles, \theta and \phi.

For those with the appropriate background, the map which takes you from one point to the next is given by composing a stretching and rotation:

(r,\theta,\phi) \to (r^n,n\theta,n\phi)

with a constant translation:

(x,y,z) \to (x+a,y+b,z+c)

Mandelbulb Zoom

A zoom of the Mandelbulb

We use spherical coordinates to do the first transformation, and cartesian coordinates to do the second. If the sequence of iterates remains bounded when using the parameters a, b and c then the point with coordinates (a,b,c) is part of the Mandelbulb. If the sequence diverges, then it is not.

We’ve introduced a parameter n into the map — this tells you how much to rotate the angles. If n is 2, that means we double the angles. If n is 3 we triple them, and so forth (in the 2d Mandelbrot set we only looked at n = 2). In three dimensions, the low values of n don’t give us a very interesting object. As we increase n the object gets progressively more complex. It turns out that n = 8 gives a nice balance between beauty and novelty.
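The two transformations can be sketched in a few lines of Python. One caveat: conventions for the angles \theta and \phi vary between renderers, so the version below (with \theta measured from the z-axis) is one common choice, not the only one:

```python
import math

def in_mandelbulb(cx, cy, cz, n=8, max_iter=40, bailout=2.0):
    """Iterate the power-n spherical map from the origin, translating by
    (cx, cy, cz) each step; return True if the orbit stays bounded."""
    x, y, z = 0.0, 0.0, 0.0
    for _ in range(max_iter):
        r = math.sqrt(x*x + y*y + z*z)
        if r > bailout:
            return False  # the orbit has escaped
        # spherical polar coordinates of the current point
        theta = math.acos(z / r) if r > 0 else 0.0
        phi = math.atan2(y, x)
        # stretch the radius, rotate the angles, then translate
        rn = r ** n
        x = rn * math.sin(n * theta) * math.cos(n * phi) + cx
        y = rn * math.sin(n * theta) * math.sin(n * phi) + cy
        z = rn * math.cos(n * theta) + cz
    return True

print(in_mandelbulb(0, 0, 0))  # True: the origin stays put
print(in_mandelbulb(3, 0, 0))  # False: escapes immediately
```

Sampling this test over a 3d lattice of (cx, cy, cz) values locates the object; the expensive part, as noted above, is then rendering it with lighting and shadows.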

Another Mandelbulb Zoom

Another zoom of the Mandelbulb

Now that we have the ingredients, all we need to do is complete the renders. This is a very computationally intensive process, but the results are worth it! By playing around with the color of lighting and the depth of shadows, we can create some truly stunning images — certainly the most striking examples of fractal beauty that I’ve ever seen.

All of the information in this post is sourced from the Mandelbulb page at Skytopia, which is well worth a visit!

It sounds like a poorly thought through Pokémon name from 1995. But in fact the Mandelbulb is a beautiful three-dimensional mathematical object that generalises the two-dimensional Mandelbrot set, which has become well known as an example of a fractal.

The Mandelbrot Set

The Mandelbrot set (black) with additional coloring (blue to white).

The Mandelbrot set is generated by a very simple idea: we take a point in the 2d plane, and use that as the input to a rule which generates a new point. We then use the new point as the input to the same rule, to generate a third point, then a fourth point, and so on. If we start with the point z_0 = 0 (the origin) we generate the sequence z_1, z_2, z_3,\dots from it. This simple but powerful idea, iteration, is at the heart of chaos theory and fractal images.

The rule that generates the Mandelbrot set is given by the following formula, where c, z_n and z_{n+1} are complex numbers:

z_{n+1} = z_n^2 + c

If we think of a point in the plane defined by its magnitude (or distance from the origin) and the angle it makes with the horizontal, then this rule simply says: “to get the next point in the sequence you square the magnitude, double the angle, and then shift the point by a constant amount c.”

There are two possibilities when we start iterating points in this way: for some values of c the points get further and further away from where they started, and head off to infinity (we say that the sequence diverges) but for other values of c they never stray further away than some limit (in which case we say that the sequence is bounded.) To make the Mandelbrot set we color a point c black if the sequence for that value of c is bounded, and leave it white if the sequence diverges.

The boundary of the Mandelbrot set is infinitely complicated: no matter how far you zoom into it, you never stop seeing more and more detail. This never-ending detail is what qualifies it as a fractal. We can make prettier pictures by coloring the points that diverge according to how long they take to diverge: in the image above the points that diverge quickly are shown in blue, whereas points that diverge only after many iterations (i.e. they’re nearly in the Mandelbrot set) are shown in white.
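Both the membership test and the coloring come from the same few lines of code. Here’s a minimal Python sketch (Python’s complex numbers make it especially short; the iteration cap is a practical stand-in for “bounded”):

```python
def mandelbrot_iterations(c, max_iter=100):
    """Iterate z -> z^2 + c from z = 0. Return the step at which |z|
    exceeds 2 (a certificate of divergence), or None if still bounded
    after max_iter steps -- the escape count is what drives the coloring."""
    z = 0
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return i
    return None

print(mandelbrot_iterations(0))    # None: the origin is in the set
print(mandelbrot_iterations(1))    # a small number: diverges quickly
print(mandelbrot_iterations(-1))   # None: settles into a period-2 cycle
```

The threshold of 2 is safe to use because once |z| exceeds 2 the sequence is guaranteed to run off to infinity.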

The challenge of the Mandelbulb is to extend this concept into three dimensions, to allow fully 3d renders that incorporate dramatic lighting and shadows. As we’ll see this isn’t as easy as it sounds at first.