You are currently browsing the category archive for the ‘Math’ category.

This is based on a neat little post that I saw on Simon Frankau's blog that I thought I'd provide a few more details on, as well as bringing it to a wider audience.

Some higher-level data structures in object-oriented programming languages have *dynamic memory allocation*. This is used to create objects (usually things that behave like lists) that can grow and shrink as the program executes.

When initializing a regular array in C or Java, you have to specify the size of the array on creation, for example with

int myArray[] = new int[10];

in Java (actually Java has a wide variety of syntaxes for array creation, which I won’t go into). Importantly, your array has the size you initially specify for the rest of time. When it’s created, you are allocated a sequential block of memory that’s exactly big enough to hold the elements of the array (it’s sequential so that looping through the elements of the array is fast). If you find that you need more storage space at any point in time, you’re out of luck.

Fortunately, most high-level languages provide a data structure which is able to grow, for example C++'s std::vector or Java's ArrayList and Vector. When these are created they are allocated a block of memory. If it ever turns out that you need more space, then they allocate a new block of memory to allow you to extend the array: for example, if you have an array of size 10 and then try to add an 11th element:

ArrayList<String> myArray = new ArrayList<>(10);
// ... after adding ten elements ...
myArray.add("foo");

But how much new memory should they allocate?

A little bit of investigation reveals that this varies across languages and across data structures. Commonly, a multiplicative growth factor is applied. For example, with a growth factor of two, you would double the size of the allocated block each time. To ensure that the array elements remain sequential in memory (so that it’s still fast) you probably want to allocate a new block and copy across the elements in the old block – so your strategy works like this:

- Allocate a new block of memory that is bigger than the current block by some factor
- Copy the old data into the new memory block
- Start writing the new elements here

As Simon points out, a natural strategy is to double the size of the array each time, but this is probably a bad choice. This is because at each stage you assign a new memory block of size 2^n, but the total size of the memory blocks used so far (assuming your initial array size is rescaled to be 1) is 1 + 2 + 4 + … + 2^(n-1) = 2^n - 1, which is less than 2^n, so you can't reuse any of the old memory. Since you're now only dealing with the newly allocated memory, all this old memory goes to waste.

What would be a better factor? Let's say we used a factor x. Then initially we'd have a block of (rescaled) length 1. We allocate a new block of length x. When we extend again we allocate a block of length x^2. We need to keep the old block of length x around to copy over the old data, and we can't fit our block of size x^2 in the space of size 1, so we allocate this as a new block too. But when we allocate the next block of size x^3, we might be able to fit it in the two old blocks, of size 1 and x. This works if

x^3 ≤ 1 + x

which you can solve numerically and find that there is a solution x ≈ 1.3247. What if you were willing to wait for one more allocation before reusing the memory? Then we'd try to fit a block of size x^4 into a space of size 1 + x + x^2, so we solve

x^4 ≤ 1 + x + x^2

for a solution of x ≈ 1.4656. How about if we were willing to wait for n allocations? Then we'd be trying to solve

1 + x + x^2 + … + x^n ≥ x^(n+2)

which is more tricky. However, if we notice that the left-hand side is a geometric series, then we can sum it up and multiply through by (x - 1) to get

x^(n+1) - 1 ≥ x^(n+3) - x^(n+2)

This still looks a bit hopeless, but when n is large the three terms involving powers of x will dominate the '-1' and we can neglect it. This corresponds to an allocation strategy where we know that we will be able to reuse a block of memory at some point in the future. We get:

x^(n+1) ≥ x^(n+3) - x^(n+2)

This is looking more hopeful. Now we can divide through by x^(n+1) and rearrange, giving (at the critical growth factor) a familiar-looking equation:

x^2 = x + 1

Why does this look so familiar? It's the equation defining the well-known *golden ratio*, whose positive solution is exactly (1 + √5)/2 ≈ 1.618! Now, you wouldn't want to use exactly this value to do your memory reallocation, because you'd end up waiting forever to be able to reuse your old memory. But Simon suggests that a strategy of using a number a bit less than this, say 1.5, might work out.
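As a sanity check on that limit, a quick bisection (my own sketch, not from Simon's post) finds the critical growth factor for each waiting time n, i.e. the root of x^(n+2) = 1 + x + … + x^n:

```java
// Solve x^(n+2) = 1 + x + ... + x^n for the critical growth factor by
// bisection. As n grows, the root should approach the golden ratio.
public class GrowthFactor {
    // f(x) = x^(n+2) - (1 + x + ... + x^n); negative below the root
    static double f(double x, int n) {
        double sum = 0;
        for (int i = 0; i <= n; i++) sum += Math.pow(x, i);
        return Math.pow(x, n + 2) - sum;
    }

    public static double criticalFactor(int n) {
        double lo = 1.0, hi = 2.0; // f(1) < 0 and f(2) > 0, so the root is bracketed
        for (int iter = 0; iter < 200; iter++) {
            double mid = (lo + hi) / 2;
            if (f(mid, n) < 0) lo = mid; else hi = mid;
        }
        return (lo + hi) / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1, 2, 5, 20, 100})
            System.out.printf("wait %3d allocations -> factor %.6f%n", n, criticalFactor(n));
        // The factors creep up towards 1.618034..., never reaching it.
    }
}
```

Sure enough, the critical factor increases towards the golden ratio from below, which is why a factor a bit under it, like 1.5, still lets you reuse old memory after only a modest wait.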

And guess what? Java’s ArrayList uses exactly this factor to allocate new memory. I don’t know if it reuses its previously allocated memory in the style described here – that’s something to find out for the future.
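To make the growth strategy concrete, here is a toy growable array using the 1.5x rule, in the spirit of ArrayList (an illustrative sketch of mine, not the JDK source):

```java
// A toy growable int array using a 1.5x growth factor, in the spirit of
// java.util.ArrayList (an illustrative sketch, not the real JDK source).
public class GrowArray {
    private int[] data;
    private int size = 0;

    public GrowArray(int capacity) { data = new int[capacity]; }

    public void add(int value) {
        if (size == data.length) {
            // grow by 50%: newCapacity = oldCapacity + oldCapacity / 2
            int newCapacity = data.length + (data.length >> 1);
            int[] bigger = new int[newCapacity];
            System.arraycopy(data, 0, bigger, 0, size); // copy over the old block
            data = bigger;
        }
        data[size++] = value;
    }

    public int capacity() { return data.length; }
    public int size() { return size; }

    public static void main(String[] args) {
        GrowArray a = new GrowArray(10);
        for (int i = 0; i < 40; i++) a.add(i);
        System.out.println(a.capacity()); // prints 49
    }
}
```

Following the capacity as elements are added gives the sequence 10, 15, 22, 33, 49: each new block is half as big again as the last.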

I’ve been thinking a little bit about covariance matrices recently, particularly how to estimate them from sample data. A typical problem in finance is to estimate a covariance matrix for a universe of stocks from a finite sample of returns data. The problem is that the number of parameters quickly becomes large, and can completely dwarf the number of samples you have available.

For example, consider trying to estimate the correlations of the stocks in the S&P 500 from a year of daily returns data — overall that's about 500 × 250 ≈ 125,000 pieces of information, to estimate a total of 500 × 499 / 2 ≈ 125,000 parameters. Oops. As if that wasn't bad enough, you are building a matrix with 500 rows out of only about 250 columns — this means that your matrix is going to be of rank 250 *at best*, and possibly lower. In particular, it won't be invertible. If your application requires an invertible covariance matrix — say, if you're building a Markowitz portfolio — then you're already out of luck.

Fortunately there are several techniques that can be employed to improve the estimation of the covariance matrix, and ensure that it has full rank and is invertible. The simplest of these is called *shrinkage*, which is a fancy name for something simple: we transform to the correlation matrix C, scale it toward the identity matrix I, and then transform back into covariance space:

C' = (1 - λ) C + λ I

for some λ between 0 and 1. This ensures that the correlation matrix, and hence the covariance matrix, is invertible. Over the next few posts I'm going to look into some more advanced techniques, including techniques that allow practitioners to incorporate their prior beliefs about market structure into the model.
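As a minimal sketch (with made-up numbers), here's the shrinkage transform applied to a deliberately singular correlation matrix:

```java
// Shrinking a correlation matrix toward the identity: C' = (1 - lambda) C + lambda I.
// Illustrative sketch of the transform; the matrix values are made up.
public class Shrinkage {
    public static double[][] shrink(double[][] corr, double lambda) {
        int n = corr.length;
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                out[i][j] = (1 - lambda) * corr[i][j] + (i == j ? lambda : 0);
        return out;
    }

    public static void main(String[] args) {
        // A singular 2x2 correlation matrix: two perfectly correlated assets
        double[][] corr = {{1.0, 1.0}, {1.0, 1.0}};
        double[][] shrunk = shrink(corr, 0.1);
        // Diagonal stays 1, off-diagonal entries are pulled toward 0,
        // and the determinant becomes positive: the matrix is invertible.
        double det = shrunk[0][0] * shrunk[1][1] - shrunk[0][1] * shrunk[1][0];
        System.out.println(det); // ~0.19
    }
}
```

Even a small λ pulls the off-diagonal entries toward zero, which is enough to make the determinant positive and the matrix invertible.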

A related problem to the last post. You toss a fair coin, and keep going until the number of tails you’ve seen is *n* more than the number of heads. On average, how many times do you toss the coin before you stop?

Unlike the last problem, I don’t know the answer to this one, so I’d be interested in hearing anyone else’s approach. I believe that I have a good method for getting at the answer, but that I’m missing one crucial step that I haven’t yet been able to fill in.

I have a lot of coin-tossing problems floating around in my head, of varying levels of difficulty. Here’s one which requires some thought, but shouldn’t present too much of a challenge if you’ve seen this kind of problem before.

You start flipping a fair coin, and keep going until you’ve seen *n* heads in a row, when you stop. On average, how many times do you flip the coin before stopping?
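If you want to check a candidate answer numerically before committing to it, here's a quick Monte Carlo sketch of mine for small *n* (it estimates, it doesn't prove):

```java
import java.util.Random;

// Estimate the average number of flips until n heads in a row appear.
// A quick simulation for checking answers, not a derivation.
public class HeadsInARow {
    public static double averageFlips(int n, int trials, long seed) {
        Random rng = new Random(seed);
        long totalFlips = 0;
        for (int t = 0; t < trials; t++) {
            int run = 0;   // current run of consecutive heads
            int flips = 0;
            while (run < n) {
                flips++;
                if (rng.nextBoolean()) run++; // heads extends the run
                else run = 0;                 // tails resets it
            }
            totalFlips += flips;
        }
        return (double) totalFlips / trials;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 4; n++)
            System.out.printf("n = %d: average ~ %.2f%n", n, averageFlips(n, 200_000, 42));
    }
}
```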

Let’s play a game. I pick a number — any number at all (whole number, fraction, whatever). I then flip a coin to decide whether to add 10 or subtract 10 from it, and I tell you the result. You win the game if you can tell me what my original number was.

For example, suppose the number I picked was 31. I flip a coin, getting heads, and therefore add 10, getting 41, and that’s the number I tell you. Now you know that I picked either 31 or 51. It seems that you have no way of deciding between those, so your chance of winning the game is 1 in 2.

I contend that you can win the game with a probability of greater than 1 in 2, although I will admit that I can’t tell you how much greater the probability is.

Let’s introduce a bit of terminology. We’ll call the first number that I pick X. This is the number that you’ve got to guess. Then I flip a coin and add or subtract 10 from it, getting Y, which is the number I tell you. So we know that Y = X + 10 or Y = X – 10.

The secret to you winning the game is to pick your own number, Z, before I tell you Y. It doesn’t matter how you pick Z, so let’s just say that you have a computer program that generates a random number for you (from the standard normal distribution, if you want to get picky).

Now you play the following strategy:

- If Y < Z then you guess that X = Y + 10 (ie I subtracted 10)
- If Y > Z then you guess that X = Y – 10 (ie I added 10)

I claim that this strategy will win you the game more than half the time. To see why, we have to consider three cases:

- If X < Z – 10 then we always have Y < Z, so you guess X = Y + 10 and get it right 1 time in 2.
- If X > Z + 10 then we have Y > Z, so you guess X = Y – 10 and get it right 1 time in 2.
- But if X is in the interval [Z – 10, Z + 10] then your guess will always be right. If X = Y – 10 then we always have Y > Z, so you guess X = Y – 10, which is right. And if X = Y + 10 then we have Y < Z, so you guess X = Y + 10, which is right.

As long as there is some chance that X is in this interval, then there is a greater than 50% chance of you winning the game.
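The claim is also easy to test by simulation. Here's a sketch of mine with an arbitrary fixed X; the size of the edge depends on how likely X is to land in the interval [Z - 10, Z + 10]:

```java
import java.util.Random;

// Monte Carlo check of the guessing strategy. X is the hidden number,
// Z is your independent threshold (standard normal). The choice of X
// here is arbitrary, just for illustration.
public class GuessingGame {
    public static double winRate(double x, int trials, long seed) {
        Random rng = new Random(seed);
        int wins = 0;
        for (int t = 0; t < trials; t++) {
            double y = rng.nextBoolean() ? x + 10 : x - 10; // my fair coin flip
            double z = rng.nextGaussian();                  // your number Z
            double guess = (y < z) ? y + 10 : y - 10;       // the strategy above
            if (guess == x) wins++;
        }
        return (double) wins / trials;
    }

    public static void main(String[] args) {
        System.out.println(winRate(3, 100_000, 1));    // nearly 1
        System.out.println(winRate(1000, 100_000, 1)); // about 0.5
    }
}
```

With X = 3 the standard normal Z almost always lands within 10 of X, so the strategy wins nearly every time; with X = 1000 it wins about half the time, and never less than half on average.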

A little while back, science punk Frank Swain posted a mathematical puzzle on his blog. The puzzle can be described simply — space *n* points equally around the unit circle, and draw all possible lines connecting those points. How many triangles are there in the resulting graph?

The first couple of cases, *n = 3* and *n = 4* are easy — the answers are 1 and 8. For *n = 5* it requires a little more thought, but stand-up mathematician Matt Parker has a very clear explanation of why the answer is 35 on his website.

After that, it gets progressively more difficult to count the triangles. The diagram for *n = 6* is on the right. Try keeping track of all those triangles in your head (although CloudoidLtd has a very pretty, and easy to understand, solution for *n = 6* and *n = 7* on his blog.) The answers there are 110 and 287.

I was inspired by a comment on Frank’s blog to try a more traditionally combinatorial approach to solving the problem. The key observation is that each triangle is made up of exactly three chords. These chords can intersect at one of the points around the circle, or they might intersect somewhere inside the circle. We just have to classify the different ways that three chords can intersect to form a triangle, and then count up how many there are in each case.

**A guide to choosing**

The explanation will require that we understand how to calculate the number of ways of choosing a set of *k* objects from a larger set of *n* objects. We use the notation *(n,k)* as a shorthand for this. We pronounce it "*n* choose *k*", and we define it mathematically by

*(n,k) = n! / (k! (n-k)!)*

The exclamation mark is the factorial function, which means that we multiply together every number from *n* all the way down to 1, so that for example 4! = 4 x 3 x 2 x 1 = 24 (we also have the useful convention that 0! = 1). If you’ve not seen this before, it can be a fun exercise to try and work out why *(n,k)* has to take this form.
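In code, it's safer to compute *(n,k)* multiplicatively than via three factorials, which overflow very quickly. Here's a small sketch of mine, using the convention that *(n,k)* = 0 when *k* > *n*:

```java
// Compute the binomial coefficient (n,k) = n! / (k! (n-k)!) using the
// multiplicative form, which avoids overflowing intermediate factorials.
// Returns 0 when k > n, matching the convention used in the post.
public class Choose {
    public static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // exact at every step: result * (n-k+i) is always divisible by i
            result = result * (n - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(choose(4, 2));  // prints 6
        System.out.println(choose(52, 5)); // prints 2598960
    }
}
```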

**Counting the triangles**

We’re trying to count the number of ways that three chords of the circle can intersect to form a triangle. Each chord joins two points around the circle, so three chords involve at most six distinct points, and there are only four cases to consider:

1. The chords form a triangle between three points. This is the simplest case to consider. Every set of three points around the edge of the circle contributes one triangle, so the number of triangles is simply the number of ways of choosing 3 objects from a set of *n* objects, which is (*n,*3).

2. The chords form a zig-zag path connecting four points, so that the triangle has two of its vertices on the circle. There are (*n*,4) ways of choosing a set of four points, and for each set of four points there are four ways we can arrange the chords so that they form a triangle. So this contributes another 4 * (*n*,4) triangles.

3. Two of the chords share a point, and the third chord cuts across them. This involves five points in total. There are (*n*,5) ways of choosing 5 points, and for each set there are five ways to arrange the chords so that they form a triangle. This contributes another 5 * (*n*,5) triangles.

4. None of the chords share a point, so there are six points in total. To make a triangle we have to join opposite points, so there is only one way of arranging the chords so that a triangle is formed. This gives another (*n*,6) triangles.

With all of this counting done, we have a formula for the number of triangles formed by the mystic rose with *n* points, which we denote by *T*(*n*):

*T*(*n*) = (*n*,6) + 5 (*n*,5) + 4 (*n*,4) + (*n*,3).
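The formula is straightforward to put into code. Here's a sketch of mine, using the convention that *(n,k)* = 0 when *k* > *n*:

```java
// The counting formula T(n) = (n,6) + 5 (n,5) + 4 (n,4) + (n,3), with the
// convention that (n,k) = 0 when k > n so that small n work automatically.
public class MysticRose {
    static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;
        long r = 1;
        for (int i = 1; i <= k; i++) r = r * (n - k + i) / i; // exact division
        return r;
    }

    public static long triangles(int n) {
        return choose(n, 6) + 5 * choose(n, 5) + 4 * choose(n, 4) + choose(n, 3);
    }

    public static void main(String[] args) {
        for (int n = 3; n <= 8; n++)
            System.out.println("T(" + n + ") = " + triangles(n));
        // prints 1, 8, 35, 111, 287, 644
    }
}
```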

So how does this formula stack up when we compare it to the results we already know? With suitable adjustments for when *n* < 6 (we take (*n*,*k*) = 0 when *k* > *n*) we get

*T*(3) = 1, *T*(4) = 8, *T*(5) = 35, *T*(6) = 111, *T*(7) = 287 and *T*(8) = 644.

This works for most of the cases, but the actual answers for *n* = 6 and *n* = 8 are 110 and 632. The formula overcounts by one triangle for the 6-sided rose, and it overcounts by 12 triangles for the 8-sided rose. What’s going on?

**Overcounting, and why the problem is hard**

We can see the problem by looking at the 6-sided rose at the beginning of the post. The three chords which cut the circle in half all meet at the center. They don’t form a triangle, but our formula treats them as if they do. This is why it overcounts by 1.

It’s interesting to note that if the points weren’t *equally* spaced around the circle, but instead were *randomly* spaced, then we wouldn’t have this problem. It’s unusual for three lines in a plane to all intersect at the same point — it requires some special structure, and in this case the special structure is given by demanding that all the points lie on the vertices of a regular polygon. Injecting some more randomness into the problem would actually make it easier. The problem would be one of pure *combinatorics* — that is, just a problem of counting. Because of the additional regularity, we have made it a problem of *geometry*. Not only do we have to think about how the points are connected by lines, we also have to worry about the exact locations of the points. *This is why the mystic rose puzzle is hard!*

**Finding the intersections**

Whenever we have three or more chords intersecting at a point inside the circle, our formula is going to overcount the number of triangles, so we need to correct for this somehow.

Intersections at the center of the circle are easy to deal with. When there are *n* points, and *n* is even, then there are *n*/2 chords which pass through the center of the circle. Each set of three chords contributes one false triangle, and there are (*n*/2,3) ways of choosing 3 chords from a set of *n*/2, so we can modify our formula to:

*T*(*n*) = (*n*,6) + 5 (*n*,5) + 4 (*n*,4) + (*n*,3) – *a*(*n*) (*n*/2,3)

where *a*(*n*) = 1 if *n* is even, and *a*(*n*) = 0 if *n* is odd. This formula now gives the correct result for *n* = 6, although it still fails for *n* = 8.

A number of interesting hypotheses spring to mind, which would simplify the problem if they were true:

- The intersections only occur at the center of the circle.
- The intersections all occur along lines joining two points at opposite sides of the circle.
- The intersections all occur along symmetry axes of the rose.
- Except at the center, only three chords ever intersect at one point.

Unfortunately, none of these is true. Hypothesis 1 can be seen to be untrue if you look at the 8-sided roses in the pictures above. Around the center of the rose there are 8 points where three chords intersect. These explain 8 of the overcounted triangles. The other 4 are from the intersection of four chords at the center — every set of three chords intersecting at the center gives a false triangle, and there are (4,3) = 4 ways of choosing a set of three chords from four.

To see why Hypotheses 2 and 4 are false, we have to look at the 12-sided rose. Using the mystic rose drawer at nrich you can draw mystic roses with up to 20 points. For the 12-sided rose pictured on the right, we see that there is an intersection of 4 chords which doesn’t occur along a line joining two points on opposite sides of the circle. Damn!

To see why Hypothesis 3 is false, we have to look at the 16-sided rose. Now the three-point intersection doesn’t occur on a symmetry axis (can you see it? It’s in the top left quadrant of the picture). This means that this intersection is repeated 32 times around the circle, rather than the 16 times that we’d see if the intersection was on a symmetry axis.

One hypothesis that does seem to be true is this one:

5. When *n* is odd, there are no intersections of three or more chords.

This means that our formula should work for all odd *n*, and indeed for every *n* that I’ve checked, the formula holds. I still don’t have a proof, though!

With all of these hypotheses rejected, we have to come up with a criterion for the intersection of three or more chords of the circle, and a way of determining how many such intersections there are when there are *n* points spaced equally around the edge of the circle. This is to come in Part II of this blogpost.

My favorite kind of puzzles are those which sound like they’re going to be an absolute nightmare, but turn out to have a particularly beautiful solution. They’re even better if you don’t really need to know any math to solve them. Here’s such an example, which absolutely delighted me when I first solved it.

I presented it to two of my friends on a long train journey — one of them is a mathematician (in fact he’s a research fellow at Queens’ College, Cambridge) and the other works for a pharmaceuticals company. The non-mathematician got it first. It really is a problem that anyone can have a crack at.

First, I’ll describe a warm-up puzzle. It’s kind of a barometer for the real puzzle.

The challenge: produce a deck of cards, each with a whole number painted on it, with which you can make any number between 1 and 1 million, by picking out some of your cards and adding together the values. You should do this with the least number of cards possible.

Got that? You may want to stop reading and think about it for a second, because the solution is coming up in the next paragraph. I’ve found that computer scientists and programmers tend to get this one faster than most people — often with less than a second’s thought.

The answer is that you should label your cards with the powers of two, so that your deck starts 1, 2, 4, 8 … If you didn’t come up with this solution, the way to think about it is to figure out what cards you’d need to make each of the numbers from 1 up to 1 million. You need to be able to make 1, so you’d better have a card labeled ‘1’. Similarly you need a card labeled ‘2’. But with those two cards you can already make 3, so the next card that needs to be added to your deck is labeled ‘4’. After a while you should spot the pattern, and come up with an argument for why this will work for every number up to a million.
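A few lines of code (mine) confirm the warm-up answer: the 20 cards 1, 2, 4, …, 2^19 cover every total up to a million, because greedily taking the largest card that fits is the same as reading off the binary digits of the target:

```java
// Check the powers-of-two deck: 20 cards (1, 2, 4, ..., 2^19) can make
// every total from 1 to 1,000,000, since 2^20 - 1 = 1,048,575 > 1,000,000.
public class PowersDeck {
    public static boolean canMake(int target, int numCards) {
        // Greedily take the largest card not exceeding what's left; for
        // powers of two this reads off the binary digits of the target.
        int remaining = target;
        for (int i = numCards - 1; i >= 0; i--) {
            int card = 1 << i;
            if (card <= remaining) remaining -= card;
        }
        return remaining == 0;
    }

    public static void main(String[] args) {
        boolean all = true;
        for (int t = 1; t <= 1_000_000; t++) all &= canMake(t, 20);
        System.out.println(all); // prints true
    }
}
```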

Now here’s the real puzzle, which is a bit more tricky, but fun to think about. I’m not going to give the answer now — I’ll post it in a couple of days.

Produce a deck of cards with which you can make any number between 1 and a million by picking cards and adding together their values, *even if you have lost one of the cards in the deck*.

The requirement that your deck is insensitive to the loss of a card makes the problem trickier to think about, but a similar approach to that described for the easier problem can work. For example, you know that you need to include two cards labeled ‘1’, so that even if you lose one of them you can still pick cards that add up to give 1.

For those inclined to a challenge, there are three levels of answer:

- Describe the deck of cards that solves the puzzle (easy, requires almost no math.)
- Prove that you can make any number between 1 and a million, even if you lose a card (not too hard, requires a little bit of math.)
- Prove that this is the minimal deck of cards that solves the puzzle (difficult, but still needs no math beyond high school.)

I learned this from my office mate, Toby Wood, but I’m not certain where he got it from. If you know, please leave a comment!

My last post, on the 17 x 17 challenge, ended with a solution to a problem which I thought sounded incredibly naïve* but which turned out to be correct. It got me thinking about a problem that I was posed when I was interviewing to work at investment banks. In fact I was asked the same question in interviews for two different banks (potential applicants take note).

It goes like this: You find yourself in a dark cave, which has three tunnels leading from it (see the crude sketch on the right). One of the tunnels leads outside, and it takes 3 hours to traverse it. The other two tunnels connect to each other, returning you to the cave, and they take 1 hour to walk down. If you choose tunnels at random each time you’re faced with a choice, how long will it take you on average to get outside?

I’m about to tell you the answer, so maybe you want to stop and think about it for a bit. The point I’m making is that even though the calculation you have to do is a bit involved (although simple to anyone who’s ever studied Markov chains) there’s a cheeky shortcut: you can get the answer just by assuming that you take each of the tunnels once, and add up the times to traverse them, getting an answer of 1+1+3=5 hours.

Is this just a coincidence? To find out, let’s generalise the problem. Now we find ourselves in a cave with *n* tunnels, of which *m* of them lead to an exit and *n-m* dump you back in the cave. Tunnel *i* takes time t_i to traverse, for *i* = 1, …, *n*, and is chosen with probability p_i. We’ll adopt the convention that the first *m* tunnels are the ones that lead to an exit, and the last *n-m* are the return tunnels (notice that our generalisation allows the possibility that a return tunnel might take a different amount of time to traverse if we go in the opposite direction).

To work out what the average escape time is, we notice that if we pick an escape tunnel then we’re free in the time it takes to traverse the tunnel. If we pick a return tunnel, then we’ve increased our expected time to leave by the time that it took to traverse the tunnel. If the time to leave is *T*, then we write the expected leaving time as E[*T*], and

E[*T*] = Σ_{i=1..m} p_i t_i + Σ_{i=m+1..n} p_i (t_i + E[*T*]).

After a couple of lines of algebra we see that the expected time to leave is given by

E[*T*] = (Σ_{i=1..n} p_i t_i) / (Σ_{i=1..m} p_i)

or, written in everyday language:

Avg time to leave = (Avg tunnel length) ÷ (Chance of picking an exit tunnel)

That’s it! The average time to escape only depends on the average time it takes to traverse a randomly picked tunnel, and the probability of picking a tunnel which leads to an exit. It’s completely insensitive to the specifics of the problem, namely how long the individual tunnels are, and what the precise probabilities of picking each tunnel are. In the special case where each of the tunnels is equally likely, all you have to do is add up the time it takes to traverse each of the tunnels, and divide by the number of tunnels that lead to an exit.
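The formula is trivial to check in code. Here's a sketch of mine, run on the original three-tunnel cave:

```java
// Check the escape-time formula on the original cave: tunnels taking
// 3, 1 and 1 hours, each picked with probability 1/3, and only the
// first tunnel leads outside.
public class CaveEscape {
    // E[T] = (average tunnel time) / (probability of picking an exit tunnel);
    // the first numExits entries of p and t are the exit tunnels.
    public static double expectedEscapeTime(double[] p, double[] t, int numExits) {
        double avgTunnel = 0, exitProb = 0;
        for (int i = 0; i < p.length; i++) avgTunnel += p[i] * t[i];
        for (int i = 0; i < numExits; i++) exitProb += p[i];
        return avgTunnel / exitProb;
    }

    public static void main(String[] args) {
        double third = 1.0 / 3.0;
        double[] p = {third, third, third};
        double[] t = {3, 1, 1}; // exit tunnel first, then the two return tunnels
        System.out.println(expectedEscapeTime(p, t, 1)); // ~5.0 hours
    }
}
```

This recovers the 5 hours given by the cheeky shortcut.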

This appeals to me because a problem which has a lot of fine detail turns out to be insensitive to the fine detail. I’d be interested to hear about other examples of this phenomenon.

* Is it pretentious to write ‘naïve’ instead of ‘naive’? Answers, opinions and furious rants on a postcard, please.

A recent post on bit-player drew my attention to William Gasarch’s 17 x 17 rectangle challenge. In short, Gasarch is offering $289 (289 = 17 x 17) for the solution to the following puzzle.

The problem is to take a rectangular n x m grid, and color each of the squares with one of four colors, so that there are no rectangles in the grid whose corners all have the same color. For example, on the right you can see a 6 x 6 grid which satisfies this constraint.

It turns out that if the size of the grid is 19 x 19 or bigger, then you can prove that it’s impossible to color the grid without creating a rectangle. In all cases up to a 16 x 16 grid, Gasarch has an example of a coloring that satisfies the no-rectangles constraint. However, the 17 x 17 and 18 x 18 cases are still open. To win the $289, you need to send Gasarch an example of a rectangle-free coloring of the 17 x 17 grid.

If it turns out that it’s possible to color the grid in this way, then the proof that you can do so is simple — you just draw a picture of the coloring (it’s easy to get a computer to check that it doesn’t have any rectangles). However, if it turns out to be impossible then it can be very hard (read: long and ugly) to prove that that’s the case. This is why the problem is so appealing to people with very little experience in mathematics: to solve the problem, all you have to do is draw a picture!

Of course, it’s easier said than done. There are 4^289 ≈ 10^174 different ways to color a 17 x 17 grid, and very few of them will be rectangle free. If you were to write a computer program to check all possible colorings, it would take longer than the age of the universe to run.

My first thought after reading the bit-player post was that a simulated annealing algorithm might do the trick. Simulated annealing is a computational technique which is based on the way that metals can be cooled slowly in order to minimize the number of defects, effectively finding a minimal energy solution to the problem of arranging the atoms in the metal. Simulated annealing tends to work well in situations that have a large number of possible states, and a large number of local minima — just as the rectangles problem does.

I’m working on this algorithm at the moment, and hopefully I’ll have a post about that in the near future. But while I was thinking about the problem, I became distracted by another question: if you color the grid randomly, how many same-color rectangles do you create on average? It’s easy to write a computer program that uses Monte Carlo to approximate this number, but I wanted to do it analytically.

It seems to be a difficult problem, because each cell in the grid can potentially be part of many rectangles. It seems as though there’s a lot of interdependence between the colors of different cells, which will make the calculation very difficult indeed.

Undeterred, I decided to try a naive calculation, in the hope that even though it was wrong, it wouldn’t be too wrong, and it would give me some insight into how to do the full calculation. First I worked out how many rectangles it’s possible to make on an n x n grid by joining corners. A rectangle is determined by choosing two of the n rows and two of the n columns, so after some thought you can see that this number N is given by:

N = (n(n-1)/2)^2

For each potential rectangle to be counted, it needs all four of its corners to be the same color. Each cell has a probability of 1/4 of being any particular color, so the probability of getting all four corners in one given color is (1/4)^4, but this could happen with any of the four possible colors, giving a total probability of 4 × (1/4)^4 = 1/64. Therefore, according to this dodgy calculation, the average number of rectangles R in an n x n grid is

R = N/64 = n^2 (n-1)^2 / 256

So how does this rough calculation compare with the numerically computed results? Take a look for yourselves. The dots are the numerical approximations (which should be pretty accurate) and the dashed line is the formula I derived above:

It turns out that this formula gives precisely the right value for the expected number of rectangles in a random coloring! So now my question is: is it actually the case that, for some subtle reason, the calculation I did above is valid after all, even though intuitively it seems that it should be completely wrong? Or is it just a coincidence that it comes out with exactly the right answer? I may be thinking about this one for a while.
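For anyone who wants to reproduce the comparison, here's a sketch of mine that evaluates the formula and estimates the same quantity by straightforward Monte Carlo:

```java
import java.util.Random;

// Compare the formula R = n^2 (n-1)^2 / 256 with a Monte Carlo estimate
// of the expected number of same-color rectangles in a random 4-coloring
// of an n x n grid. (My check, not the post's original code.)
public class RandomColoring {
    public static double formula(int n) {
        return n * n * (n - 1) * (n - 1) / 256.0;
    }

    public static double monteCarlo(int n, int trials, long seed) {
        Random rng = new Random(seed);
        long total = 0;
        for (int t = 0; t < trials; t++) {
            int[][] grid = new int[n][n];
            for (int[] row : grid)
                for (int j = 0; j < n; j++) row[j] = rng.nextInt(4); // random color
            // count monochromatic rectangles: two rows and two columns
            for (int r1 = 0; r1 < n; r1++)
                for (int r2 = r1 + 1; r2 < n; r2++)
                    for (int c1 = 0; c1 < n; c1++)
                        for (int c2 = c1 + 1; c2 < n; c2++) {
                            int col = grid[r1][c1];
                            if (grid[r1][c2] == col && grid[r2][c1] == col
                                    && grid[r2][c2] == col) total++;
                        }
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        System.out.println(formula(17));              // prints 289.0
        System.out.println(monteCarlo(6, 20_000, 7)); // close to formula(6) = 3.515625
    }
}
```

Pleasingly, for the 17 x 17 grid the formula gives exactly 289 expected rectangles — the same number as the dollars on offer.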

In the last post we looked at the Mandelbrot set, a beautiful 2d fractal with infinite complexity along its boundary. The challenge is to find a three dimensional version of the set, the Mandelbulb, which we can then lovingly render with dynamic lighting and shadows.

Mathematicians were doing work related to the Mandelbrot set in the very early 20th century. However, it wasn’t until the 1980s that computers were powerful enough to draw high quality pictures of it. The Mandelbulb is even more complicated, and requires even more computational power (once you’ve located the object, you then need to render it — this is where the computer spends most of its time). This is why we’ve not been able to see pictures of the Mandelbulb until very recently.

The first problem is determining the rule which will generate the Mandelbulb. For technical reasons there is no such thing as a 3d analogue of the complex numbers (the *quaternions* are the closest 4d analogue) and thus coming up with the ‘correct’ map is tricky. The key is to re-use the ideas of multiplying the radius, rotating the angles and then translating. Mathematicians use a set of coordinates for 3d space called *spherical polar coordinates*, which tell you the location of a point in terms of its distance from the origin, r, and two angles, θ and φ.

For those with the appropriate background, the map which takes you from one point to the next is given by composing a stretching and rotation:

(r, θ, φ) → (r^n, nθ, nφ)

with a constant translation:

(x, y, z) → (x + a, y + b, z + c)

We use spherical coordinates to do the first transformation, and cartesian coordinates to do the second. If the sequence of iterates remains bounded when using the parameters *a*, *b* and *c* then the point with coordinates (*a*,*b*,*c*) is part of the Mandelbulb. If the sequence diverges, then it is not.

We’ve introduced a parameter *n* into the map — this tells you how much to rotate the angles. If *n* is 2, that means we double the angles. If *n* is 3 we triple them, and so forth (in the 2d Mandelbrot set we only looked at *n* = 2). In three dimensions, the low values of *n* don’t give us a very interesting object. As we increase *n* the object gets progressively more complex. It turns out that *n* = 8 gives a nice balance between beauty and novelty.
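To make the iteration concrete, here is a minimal membership test of mine for the *n* = 8 bulb, using the common escape-radius-of-2 convention (a real renderer does far more than this):

```java
// A minimal membership test for the n = 8 Mandelbulb: iterate
// (r, theta, phi) -> (r^8, 8*theta, 8*phi) followed by the translation
// (a, b, c), and see whether the orbit stays bounded. Sketch only, using
// the usual escape-radius-2 convention; not production renderer code.
public class Mandelbulb {
    public static boolean inBulb(double a, double b, double c, int maxIter) {
        final int n = 8;
        double x = 0, y = 0, z = 0;
        for (int i = 0; i < maxIter; i++) {
            double r = Math.sqrt(x * x + y * y + z * z);
            if (r > 2) return false; // escaped: the orbit diverges
            // convert to spherical polar coordinates
            double theta = (r == 0) ? 0 : Math.acos(z / r);
            double phi = Math.atan2(y, x);
            // stretch the radius and rotate the two angles
            double rn = Math.pow(r, n);
            double nt = n * theta, np = n * phi;
            // back to cartesian coordinates, then translate by (a, b, c)
            x = rn * Math.sin(nt) * Math.cos(np) + a;
            y = rn * Math.sin(nt) * Math.sin(np) + b;
            z = rn * Math.cos(nt) + c;
        }
        return true; // still bounded after maxIter steps: treat as inside
    }

    public static void main(String[] args) {
        System.out.println(inBulb(0, 0, 0, 100));     // prints true
        System.out.println(inBulb(1.5, 1.5, 0, 100)); // prints false
    }
}
```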

Now that we have the ingredients, all we need to do is complete the renders. This is a very computationally intensive process, but the results are worth it! By playing around with the color of lighting and the depth of shadows, we can create some truly stunning images — certainly the most striking examples of fractal beauty that I’ve ever seen.

All of the information in this post is sourced from the Mandelbulb page at Skytopia, which is well worth a visit!