A recent post on bit-player drew my attention to William Gasarch’s 17 x 17 rectangle challenge. In short, Gasarch is offering $289 (289 = 17 x 17) for the solution to the following puzzle.

6 x 6 coloring

A coloring of a 6 x 6 grid which satisfies the 'no rectangles' constraint.

The problem is to take a rectangular n x m grid, and color each of the squares with one of four colors, so that there are no rectangles in the grid whose corners all have the same color. For example, on the right you can see a 6 x 6 grid which satisfies this constraint.

It turns out that if the size of the grid is 19 x 19 or bigger, then you can prove that it’s impossible to color the grid without creating a rectangle. In all cases up to a 16 x 16 grid, Gasarch has an example of a coloring that satisfies the no-rectangles constraint. However, the 17 x 17 and 18 x 18 cases are still open. To win the $289, you need to send Gasarch an example of a rectangle-free coloring of the 17 x 17 grid.

If it turns out that it’s possible to color the grid in this way, then the proof that you can do so is simple — you just draw a picture of the coloring (it’s easy to get a computer to check that it doesn’t have any rectangles). However, if it turns out to be impossible then it can be very hard (read: long and ugly) to prove that that’s the case. This is why the problem is so appealing to people with very little experience in mathematics: to solve the problem, all you have to do is draw a picture!

Of course, it’s easier said than done. There are 4^{289} different ways to color a 17 x 17 grid, and very few of them will be rectangle free. If you were to write a computer program to check all possible colorings, it would take longer than the age of the universe to run.

My first thought after reading the bit-player post was that a simulated annealing algorithm might do the trick. Simulated annealing is a computational technique which is based on the way that metals can be cooled slowly in order to minimize the number of defects, effectively finding a minimal energy solution to the problem of arranging the atoms in the metal. Simulated annealing tends to work well in situations that have a large number of possible states, and a large number of local minima — just as the rectangles problem does.

I’m working on this algorithm at the moment, and hopefully I’ll have a post about that in the near future. But while I was thinking about the problem, I became distracted by another question: if you color the grid randomly, how many same-color rectangles do you create on average? It’s easy to write a compute program that uses Monte Carlo to approximate this number, but I wanted to do it analytically.

It seems to be a difficult problem, because each cell in the grid can potentially be part of many rectangles. It seems as though there’s a lot of interdependence between the colors of different cells, which will make the calculation very difficult indeed.

Undeterred, I decided to try a naive calculation, in the hope that even though it was wrong, it wouldn’t be too wrong, and it would give me some insight into how to do the full calculation. First I worked out how many rectangles it’s possible to make on an n x n grid by joining corners. After some thought, you can see that this number N is given by:

N = \sum_{i=1}^{n-1} \sum_{j=1}^{n-1} (n-i)(n-j) = \tfrac{1}{4}n^2(n-1)^2

For each potential rectangle, to be counted it needs all four of its corners to be the same color to be included in our count. Each cell has a probability of 1/4 of being any particular color, so the probability of getting all four the same is (1/4)^4, but this could happen with any of the four possible colors, giving a total probability of 4(1/4)^4=1/64. Therefore, according to this dodgy calculation, the average number of rectangles R  in an n x n grid is

R = N/64 = \frac{n^2(n-1)^2}{256}

So how does this rough calculation compare with the numerically computer results? Take a look for yourselves. The dots are the numerical approximations (which should be pretty accurate) and the dashed line is the formula I derived above:

Average number of rectangles

Average number of rectangles plotted against grid size.

It turns out that this formula gives precisely the right value for the expected number of rectangles in a random coloring! So now my question is: is it actually the case that, for some subtle reason, the calculation I did above is valid after all, even though intuitively it seems that it should be completely wrong? Or is it just a coincidence that it comes out with exactly the right answer? I may be thinking about this one for a while.