
A few more exercises – one that looks at the way Scheme interprets expressions and a brief glance at special forms, and a couple of numerical ones.

Exercise 1.6

Alyssa P. Hacker doesn’t see why if needs to be provided as a special form. Can’t we just define it in terms of cond? Alyssa defines a new version of if:

(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
        (else else-clause)))

This seems to work:

> (new-if (= 0 1) 0 5)

> (new-if (= 1 1) 0 5)

Alyssa now uses new-if to rewrite the square root program:

(define (sqrt-iter guess x)
  (new-if (good-enough? guess x)
          guess
          (sqrt-iter (improve guess x)
                     x)))

What happens when Alyssa tries to use this program to compute square roots? Explain.

The program will enter into an infinite loop. Because new-if is an ordinary procedure rather than a special form, the interpreter evaluates all three of its operands before applying it, and only then passes the results into cond. In the square root program, the third operand to new-if calls sqrt-iter, which involves another call to new-if, which calls sqrt-iter again, and so on, regardless of whether the guess is already good enough.

In contrast, the special form if evaluates its first argument, and then evaluates only one of the remaining two arguments depending on whether the first argument is true or false — therefore the problem of entering into an infinite loop never arises, as eventually the first argument returns true, which means that the second argument (rather than the third) is the one that gets evaluated.
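The difference can be made visible with a small sketch. The helper noisy is a made-up name for illustration (it is not from SICP); it prints a label and then returns a value, so we can see which operands actually get evaluated:

```scheme
(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
        (else else-clause)))

;; Hypothetical helper: announce that we were evaluated, then return value.
(define (noisy label value)
  (display label)
  (newline)
  value)

;; new-if is an ordinary procedure, so both noisy calls run before
;; new-if is applied: both labels are printed (in an order the
;; language leaves unspecified), and the result is 1.
(new-if #t (noisy "then" 1) (noisy "else" 2))

;; if is a special form: only "then" is printed, and the result is 1.
(if #t (noisy "then" 1) (noisy "else" 2))
```

With a recursive call in place of the harmless (noisy "else" 2), that eager evaluation is exactly what never stops.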

Exercise 1.7

The good-enough? test used to compute square roots will not be very effective for finding the square roots of very small numbers. Also, in real computers, arithmetic operations are almost always performed with limited precision. This makes our test inaccurate for very large numbers. Explain these statements, with examples showing how the test fails for small and large numbers. An alternative strategy for implementing good-enough? is to watch how guess changes from one iteration to the next and stop when the change is a very small fraction of the guess. Design a square root procedure that uses this kind of test. Does it work better for small and large numbers?

The procedure fails for very small numbers because the fixed tolerance of 0.001 is far too coarse at that scale. For example, say we want to compute the square root of 0.000001 (which is 0.001) using this procedure. If our guess was 0.02 (too big by a factor of 20) then our test squares it to get 0.0004, which is certainly within 0.001 of the input, so this guess passes the test and is reported as the answer. Here’s an example, using 1.0 as an initial guess:

> (sqrt 0.000001)

It fails for large numbers because the computer only stores floating point numbers to a limited precision: it uses a fixed number of bits to store the significant digits, and the rest of the bits to store an exponent. The relative size of this limit is often called machine epsilon. Because the precision is relative, the gap between adjacent representable numbers grows with their magnitude. For sufficiently large numbers that gap is bigger than 0.001, so the square of a guess can only come within 0.001 of the input by matching it exactly, and if that never happens the process will not terminate. For example, trying to compute

> (sqrt 1000000000000000)

(that’s 1 followed by 15 zeros) causes the interpreter to hang.
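A quick way to see the gap between neighbouring floating point numbers, assuming the interpreter uses IEEE double precision for inexact numbers (which is the usual case, but still an assumption):

```scheme
;; Near 1.0, adding 0.001 produces a different number.
(= (+ 1.0 0.001) 1.0)     ; #f

;; Near 10^15 adjacent doubles are 0.125 apart, so adding 0.001
;; rounds straight back to the number we started with.
(= (+ 1e15 0.001) 1e15)   ; #t
```

So once the squares of our guesses are around 10^15, a tolerance of 0.001 is finer than the numbers themselves can resolve.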

To get around this, we can redefine good-enough?. We supply it with the previous guess and the new guess, and it returns true when the difference in guesses is smaller than the new guess by some fixed ratio (let’s say 0.000001, or 10^{-6}). Our sqrt-iter process becomes:

(define (sqrt-iter old-guess new-guess x)
  (if (good-enough? old-guess new-guess)
      new-guess
      (sqrt-iter new-guess (improve new-guess x) x)))

with good-enough? now given by:

(define (good-enough? old-guess new-guess)
  (< (abs (- old-guess new-guess)) (* 0.000001 new-guess)))

We can now try computing the square roots of very small and large numbers:

> (sqrt 0.000001)

> (sqrt 1000000000000000)

and see that everything works as we would expect it to.
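For reference, here is a minimal self-contained sketch of the whole revised program. The name my-sqrt is my own choice, used so that the sketch does not clobber the built-in sqrt, and the starting pair 0.0 and 1.0 is an assumption borrowed from the cube root procedure in Exercise 1.8:

```scheme
(define (average x y)
  (/ (+ x y) 2))

(define (improve guess x)
  (average guess (/ x guess)))

;; Stop when the last step changed the guess by less than one
;; millionth of the guess itself (a relative test, not an absolute one).
(define (good-enough? old-guess new-guess)
  (< (abs (- old-guess new-guess)) (* 0.000001 new-guess)))

(define (sqrt-iter old-guess new-guess x)
  (if (good-enough? old-guess new-guess)
      new-guess
      (sqrt-iter new-guess (improve new-guess x) x)))

;; Starting with old-guess 0.0 and new-guess 1.0 guarantees the first
;; test fails, so at least one improvement step is taken.
(define (my-sqrt x)
  (sqrt-iter 0.0 1.0 x))

(my-sqrt 0.000001)          ; close to 0.001
(my-sqrt 1000000000000000)  ; terminates, close to 31622776.6
```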

Exercise 1.8

Newton’s method for cube roots is based on the fact that if y is an approximation to the cube root of x, then a better approximation is given by the value

\dfrac{x/y^2 + 2y}{3}.

Use this formula to implement a cube root procedure analogous to the square root procedure.

We only need to change a few of our procedures. Our newly improved good-enough? procedure is already fine. If we implement a procedure improve-cbrt as

(define (improve-cbrt guess x)
  (/ (+ (/ x (square guess))
        (* 2 guess))
     3))

and the iterative cube root as essentially identical to the iterative square root:

(define (cbrt-iter old-guess new-guess x)
  (if (good-enough? old-guess new-guess)
      new-guess
      (cbrt-iter new-guess (improve-cbrt new-guess x) x)))

and finally

(define (cbrt x)
  (cbrt-iter 0.0 1.0  x))

then we can use our new cube root procedure as we’d expect:

> (cbrt 8)

> (cbrt 100)

Next we’ll look at a way to hide our abstractions from the end user, by defining procedures inside other procedures.


Procedures are like mathematical functions, with one important difference: a mathematical function can provide declarative definitions, whereas a procedure must provide imperative definitions. A declarative statement is a ‘this is true’ statement. An imperative statement is a ‘do this’ statement. We’ll examine the difference by looking at how the square root function is defined in mathematics and in computer science.

1.1.7 Example: Square roots by Newton’s method

When we write sqrt(x) in mathematics, what do we mean? One possible definition is

sqrt(x) is the unique non-negative real number whose square is equal to x

or, expressed in mathematical notation,

\sqrt{x} is the unique y\in\mathbb{R} such that y\geq 0 and y^2=x.

This is a perfectly valid definition, and you can prove that it is well-formed: that is, you can prove that for non-negative numbers x, there really is a unique non-negative number y such that the square of y is x, and therefore that the definition of the ‘sqrt’ function makes sense.

However, just knowing that we have a sensibly defined function is not much help in computer science. We need to know how to compute the function. That is, we need a sequence of statements that lets us deduce the value of the output of the function, given an input (actually we will only compute an approximation to the value of the function, but that’s okay – we can make our approximations really very accurate).

This is what we mean by declarative vs imperative knowledge. According to SICP, mathematics is concerned primarily with declarative (what is) statements, whereas computer science is primarily concerned with imperative (how to) statements. I’d take issue with this, given the existence of a whole subfield of mathematics, numerical analysis, which concerns itself with how best to find solutions to equations. But I’m not here to debate semantics – I’m here to learn about computer science.

So how do we compute square roots?

The most common way is to use Newton’s method of successive approximations (tip: if you’re interviewing for a job that involves mathematical and programming literacy, learn how to compute square roots using Newton’s method. It’s a favourite interview question). Newton’s method is a formula for finding roots of general functions f, but here we only need it in a very simple form. To find the solution of y = sqrt(x), we make an initial guess (say y = 1) and check if it’s accurate enough. If not, then we refine our guess by averaging y and x/y, and check again. We continue iterating in this way until we eventually converge on the solution. You can prove that for any positive initial guess, this method converges to the right solution. Moreover, you can prove that once the guess is close, you roughly double the number of digits of accuracy with each iteration: this is known as quadratic convergence.
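For the curious, here is a sketch of where the averaging rule comes from. The subscripted guesses y_n are my notation, not SICP’s. Taking f(y) = y^2 - x, whose positive root is sqrt(x), the general Newton step simplifies to exactly the average of y and x/y, and the error after a step is proportional to the square of the error before it:

```latex
f(y) = y^2 - x, \qquad f'(y) = 2y

y_{n+1} = y_n - \dfrac{f(y_n)}{f'(y_n)}
        = y_n - \dfrac{y_n^2 - x}{2 y_n}
        = \dfrac{1}{2}\left(y_n + \dfrac{x}{y_n}\right)

y_{n+1} - \sqrt{x} = \dfrac{(y_n - \sqrt{x})^2}{2 y_n}
```

The last line is the quadratic convergence: squaring a small error roughly doubles the number of correct digits.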

Let’s formalize this in terms of procedures. This is just a matter of translating our description of the formula above into Lisp code:

(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess
      (sqrt-iter (improve guess x)
                 x)))

This procedure simply says that, given a guess for the solution, we first check if the guess is good enough; if it is, then we report the guess as our solution, if not, then we improve the guess and apply the procedure again. Note that we haven’t yet defined the procedures good-enough? or improve. We’re using a programming strategy called wishful thinking: we write the program as if the sub-procedures we need exist, and then we write them later.

We already know how to improve our guess: we average the old guess y with x/y, like this:

(define (improve guess x)
  (average guess (/ x guess)))

where the procedure average is defined to be

(define (average x y)
  (/ (+ x y) 2))

We also have to define what we mean by ‘good enough’. We’ll say that our guess is good enough if the square of the guess is within a small distance (say 0.001) of the input. This isn’t a very good test, for several reasons, but it will do for now:

(define (good-enough? guess x)
  (< (abs (- (square guess) x)) 0.001))

Finally, we need a starting guess. We can always guess that the square root of a number is 1:

(define (sqrt x)
  (sqrt-iter 1.0 x))

Note that because Scheme has a rational data type, we need to put 1.0 as our initial guess, rather than 1. If x was an integer and we guessed 1, then the interpreter would use rational arithmetic, and all subsequent operations would result in rational numbers. By guessing 1.0 instead, we force the result of subsequent operations to be floating point numbers.
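A small sketch of the difference, using a single improvement step on the square root of 2. The definitions of average and improve are repeated so that the snippet stands on its own:

```scheme
(define (average x y)
  (/ (+ x y) 2))

(define (improve guess x)
  (average guess (/ x guess)))

;; Exact guess, exact input: the result is the exact rational 3/2.
(improve 1 2)             ; 3/2
(exact? (improve 1 2))    ; #t

;; Inexact guess: the result is the floating point number 1.5.
(improve 1.0 2)           ; 1.5
(exact? (improve 1.0 2))  ; #f
```

With an exact starting guess, every later guess is an exact fraction too, and its numerator and denominator grow quickly from step to step.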

We can now use our new procedure as we’d use any other:

> (sqrt 9)

> (sqrt 100)

> (square (sqrt 1000))

This demonstrates that the simple procedural language we have introduced so far is sufficient for writing numerical programs. Notice that we haven’t introduced any iterative (looping) constructs yet. Instead, we relied on the simple ability of a procedure to call itself.

Now we get onto some more interesting exercises, that explore the way that the Lisp interpreter works.

Exercise 1.4

Our model of evaluation allows for combinations whose operators are compound expressions. Use this observation to describe the behaviour of the following procedure:

(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))

This is cool! We want the function to return a + |b|, which we can do by making it return a + b if b > 0 and a - b otherwise. Because procedures are like any other data type in the language, they can be the value of an expression. We use that to our advantage here, by using an if expression as the operator: it evaluates to the procedure + in the case that b > 0 and to - otherwise, and the result is then applied to a and b. Try doing that in Java!
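A couple of sample calls, as a quick check of the reasoning above:

```scheme
(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))

;; b is positive, so the operator is + and we compute 1 + 2.
(a-plus-abs-b 1 2)    ; 3

;; b is negative, so the operator is - and we compute 1 - (-2).
(a-plus-abs-b 1 -2)   ; 3

;; The operator position can hold any expression that evaluates
;; to a procedure.
((if #f + -) 5 3)     ; 2
```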

Exercise 1.5

Ben Bitdiddle has invented a test to determine whether the interpreter he is faced with is using applicative-order or normal-order evaluation. He defines the following two procedures:

> (define (p) (p))

> (define (test x y)
    (if (= x 0)
        0
        y))

Then he evaluates the expression

> (test 0 (p))

What behaviour will Ben observe with an interpreter that uses applicative-order evaluation? What about normal-order?

The procedure p is defined in terms of itself, so if we ever try to evaluate (p) we will end up in an infinite loop. If we use applicative-order evaluation, then both arguments to test are evaluated before we call test with those arguments. Since this involves evaluating (p), the interpreter will hang.

On the other hand, if we use normal-order evaluation then we expand out the arguments without evaluating them, to get

(if (= 0 0)
    0
    (p))

Now when we evaluate this, we first test (= 0 0), which returns true, so we evaluate the consequent, 0. The alternative, (p), is never evaluated, and the program happily returns 0.
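We can imitate the normal-order behaviour in an applicative-order interpreter by wrapping the dangerous argument in a lambda, so that it is only evaluated if somebody calls it. This is just a sketch of the idea: lazy-test and delayed-y are made-up names, and it is not how a real normal-order interpreter is implemented:

```scheme
(define (p) (p))

;; Hypothetical variant of test: the second argument is a procedure
;; of no arguments (a thunk) that produces y only when called.
(define (lazy-test x delayed-y)
  (if (= x 0)
      0
      (delayed-y)))

;; The lambda is evaluated to a procedure, but its body (p) is never
;; run because x is 0, so this returns 0 instead of hanging.
(lazy-test 0 (lambda () (p)))   ; 0
```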

The first exercise is merely to evaluate some Lisp expressions manually, and then check them in the interpreter. Easy and not too interesting, but I did say that I’d do every exercise, so I guess I’ll plough ahead.

This is a good time to talk about the interpreter I’m using. I initially toyed with downloading the MIT Scheme interpreter, which is a minimalist product that’s designed to be used alongside this course. I also thought about downloading Emacs – you can get implementations with a built-in Lisp interpreter. But I’ve been getting along quite happily with a combination of Vim and TextMate (for Mac OS X) for the past few months, and I’m not sure that learning a new text editor is necessary right now. I also wanted to do some more advanced coding in Lisp (what’s the point of learning a new language if I’m not going to use it?) so I wanted something a bit meatier than MIT Scheme.



I eventually settled on Racket, a bells-and-whistles implementation of Scheme that includes a lot of extra features out of the box – lots of nice data structures, the ability to create GUIs and manipulate graphics, and lots more. It comes with its own interpreter, DrRacket, which is functional and pretty easy to use. A nice touch is that you can load any definitions file into the interpreter to give you a whole load of pre-defined functions – and if the first line of the definitions file is #lang <language-name> then you will only use the features of that language. For example, the first line of the definitions file I’m using for this project is #lang scheme so I don’t have access to all the funky procedures defined in Racket – just the core Scheme language.

On with the exercises.

Exercise 1.1

What is the result printed by the interpreter in response to each of the following expressions?

> 10 

Numerals have a value equal to the number they represent.

> (+ 5 3 4)

This is equivalent to 5 + 3 + 4. Note that the + procedure can take multiple arguments. I haven’t learned how to define a function with a variable number of arguments yet, but hopefully I will soon!

> (- 9 1)

This is equivalent to 9 – 1.

> (/ 6 2)

Equivalent to 6 / 2. Note that we get an integer result.

> (+ (* 2 4) (- 4 6))

Equivalent to (2 * 4) + (4 – 6) = 8 + (-2). Note that there is no need for operator precedence in Scheme, because we always include the parentheses! I wonder if a syntax for Scheme could be defined that allowed you to leave out parentheses when the meaning was clear, and had rules of operator precedence instead. For example, the K programming language that I use at work has simple right-to-left order of evaluation. This can lead to very concise code and lets you easily write one-line expressions that are very powerful, but has many gotchas: for example, 2*4+5 evaluates to 18 in K, rather than 13.

> (define a 3)

Whether or not define statements return a value is implementation-dependent. My interpreter doesn’t give a return value. However, we have now bound the value 3 to the variable a.

> (define b (+ a 1))

Again, there is no return value, but we have bound the value of (+ a 1) to the variable b, so that b now has the value 4.

> (+ a b (* a b))

This is equivalent to a + b + (a * b) where a = 3 and b = 4.

> (if (and (> b a) (< b (* a b)))
      b
      a)

This expression says “if (b > a) and (b < a * b) then return b, else return a”, which becomes “if (4 > 3) and (4 < 12) then return 4, else return 3” so it returns 4.

> (+ 2 (if (> b a) b a))

First evaluate the if statement. It says “if (b > a) return b, else return a” which becomes (by substitution) “if 4 > 3 then return 4, else return 3”, so it returns 4. We then evaluate (+ 2 4), which returns 6.

> (* (cond ((> a b) a)
           ((< a b) b)
           (else -1))
     (+ a 1))

We first evaluate the cond statement. It evaluates to 4, since the first predicate is false but the second one is true. The second argument (+ a 1) evaluates to 4 also, and finally (* 4 4) evaluates to 16.

Exercise 1.2

Translate the following expression into prefix form:

\dfrac{5 + 4 + (2 - (3 - (6 + 4/5)))}{3(6 - 2)(2 - 7)}

I reckon we have:

> (/ (+ 5
        4
        (- 2
           (- 3
              (+ 6
                 (/ 4 5)))))
     (* 3
        (- 6 2)
        (- 2 7)))

-37/150

That return value was a surprise to me! I had assumed that when applying the division operator to two integers, Scheme would either perform integer division (like Java) or return a floating point number (like Python). Apparently it has a built-in rational data type. Which is nice.

Exercise 1.3

Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.

Here’s a pretty ugly answer, using the built-in min function:

> (define (f a b c)
    (- (+ (* a a) (* b b) (* c c))
       (* (min a b c) (min a b c))))

Why is it ugly? Well, it needlessly applies min twice, and it does four multiplications, two additions and a subtraction, when all that’s needed is one addition and two multiplications. How about:

> (define (g a b c)
    (cond ((and (<= a b)
                (<= a c)) (+ (* b b) (* c c)))
          ((<= b c) (+ (* a a) (* c c)))
          (else (+ (* a a) (* b b)))))

This looks uglier, but it’s more efficient: it never performs more than two multiplications and one addition, and never more than three comparisons (the minimum possible, since you have to sort the arguments so that you know which is the smallest, and this takes three comparisons in the worst case). If anyone has a prettier and efficient solution (perhaps one that works for arbitrary lists of arguments?) then I’d like to see it.

We’re going to introduce a simple model for how the interpreter evaluates expressions that you type in, and then go on to look at an important feature of Lisp (well, an important feature of any programming language…): conditionals.

1.1.5 The substitution model for procedure application

A model for how the interpreter applies compound procedures to arguments is as follows:

To apply a compound procedure to arguments, evaluate the body of the procedure with each formal parameter replaced by the corresponding argument.

Let’s illustrate this by evaluating

(sum-of-squares 3 4)

which expands to

(+ (square 3) (square 4))

Now, (square 3) evaluates to 9 and (square 4) evaluates to 16, so we have

(+ 9 16)

and finally we get the result

25

This is called the substitution model for procedure application. This is not how the interpreter really works, but it is a simple model that captures the spirit.

Applicative order and normal order

We said that the interpreter first evaluates the operator and operands, and then applies the procedure to the resulting arguments. This is not the only way to evaluate compound procedures. An alternative model does not evaluate any expressions until their results are needed. Instead, it substitutes expressions for parameters until it obtains an expression involving only primitives, and then performs the whole evaluation.

For example, (sum-of-squares 3 4) would be evaluated first to

(+ (square 3) (square 4))

and then to

(+ (* 3 3) (* 4 4))

and now that we only have primitives, we can begin evaluating the whole expression:

(+ 9 16)

and finally

25

This is called normal-order evaluation, as opposed to applicative-order evaluation which is what the interpreter actually uses. Normal-order evaluation is less efficient (since you can end up evaluating the same expression multiple times) and becomes problematic when we leave the realm of procedures that can be modeled by substitution.
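To see where the repeated work comes from, here is a hand trace of a single call under both models. The traces are written out as comments; the interpreter itself does not print them:

```scheme
(define (square x) (* x x))

;; Applicative order: evaluate the operand once, then apply.
;;   (square (+ 1 2))
;;   (square 3)
;;   (* 3 3)
;;   9

;; Normal order: substitute the unevaluated operand for x first,
;; so (+ 1 2) appears twice and is computed twice.
;;   (square (+ 1 2))
;;   (* (+ 1 2) (+ 1 2))
;;   (* 3 3)
;;   9

(square (+ 1 2))   ; 9 either way
```

Both orders agree on the answer here; they differ only in how much work is done to reach it.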

1.1.6 Conditional expressions and predicates

There is a special construct in Lisp for doing case analysis, called cond which stands for “conditional”. It is used as follows:

> (define (abs x)
    (cond ((> x 0) x)
          ((= x 0) 0)
          ((< x 0) (- x))))

The general form of a conditional expression is

(cond (<p1> <e1>)
      (<p2> <e2>)
      ...
      (<pn> <en>))

i.e. it consists of the symbol cond followed by pairs of expressions in parentheses:

(<p> <e>)

called clauses. The first expression in each pair must be a predicate, i.e. an expression that evaluates to true or false.

Conditional expressions are evaluated as follows. The predicate <p1> is evaluated first. If its value is false then <p2> is evaluated, and so on. This continues until a predicate is found whose value is true, in which case the interpreter returns the value of the appropriate consequent <e> as the value of the cond expression. If none of the predicates evaluate to true, then the value of the cond is undefined.

There is a special symbol else which can be used in place of the final predicate. If all of the earlier predicates are false, this causes the cond to return the value of the final expression <e>. We could also write the absolute value procedure as:

> (define (abs x)
    (cond ((< x 0) (- x))
          (else x)))

There is also the special form if, a restricted form of conditional that can be used when there are exactly two mutually exclusive cases. This gives another way to write the absolute value procedure:

> (define (abs x)
    (if (< x 0)
        (- x)
        x))

The general form of an if expression is

(if <predicate> <consequent> <alternative>)

To evaluate the expression, the interpreter first evaluates the <predicate> part. If it is true, then it returns the value of <consequent> as the value of the if, otherwise it returns the value of <alternative>. It is important to realise that only one of the <consequent> and the <alternative> is ever evaluated.

In addition to primitive predicates like <, > and = there are also logical composition operators, enabling us to construct compound predicates. The three most common are

(and <e1> ... <en>)

which returns false if any of the <e> evaluate to false, and otherwise returns true,

(or <e1> ... <en>)

which returns true if any of the <e> evaluate to true, and otherwise returns false, and

(not <e>)

which returns true if <e> evaluates to false, and false if it evaluates to true.

As an example of use, we could define a predicate to test whether one number is greater than or equal to another by

> (define (>= x y)
    (not (< x y)))

and we can use it:

> (>= 10 2)
> (>= 2 10)

Next I’ll look at some of the exercises.
