You are currently browsing the category archive for the ‘Programming’ category.

The first exercise is merely to evaluate some Lisp expressions manually, and then check them in the interpreter. Easy and not too interesting, but I did say that I’d do every exercise, so I guess I’ll plough ahead.

This is good time to talk about the interpreter I’m using. I initially toyed with downloading the MIT Scheme interpreter, which is a minimalist product that’s designed to be used alongside this course. I also thought about downloading Emacs – you can get implementations with a built-in Lisp interpreter. But I’ve been getting along quite happily with a combination of Vim and TextMate (for Mac OS X) for the past few months, and I’m not sure that learning a new text editor is necessary right now. I also wanted to do some more advanced coding in Lisp (what’s the point of learning a new language if I’m not going to use it?) so I wanted something a bit meatier than MIT Scheme.

DrRacket

DrRacket

I eventually settled on Racket, a bells-and-whistles implementation of Scheme that includes a lot of extra features out of the box – lots of nice data structures, the ability to create GUIs and manipulate graphics, and lots more. It comes with its own interpreter, DrRacket, which is functional and pretty easy to use. A nice touch is that you can load any definitions file into the interpreter to give you a whole load of pre-defined functions – and if the first line of the definitions file is #lang <language-name> then you will only use the features of that language. For example, the first line of the definitions file I’m using for this project is #lang scheme so I don’t have access to all the funky procedures defined in Racket – just the core Scheme language.

On with the exercises.

Exercise 1.1

What is the result printed by the interpreter in response to each of the following expressions?

> 10 
10

Numerals have a value equal to the number they represent.

> (+ 5 3 4)
12

This is equivalent to 5 + 3 + 4. Note that the + procedure can take multiple arguments. I haven’t learned how to define a function with a variable number of arguments yet, but hopefully I will soon!

> (- 9 1)
8

This is equivalent to 9 – 1.

> (/ 6 2)
3

Equivalent to 6 / 2. Note that we get an integer result.

> (+ (* 2 4) (- 4 6))
6

Equivalent to (2 * 4) + (4 – 6) = 8 + (-2). Note that there is no need for operator precedence in Scheme, because we always include the parentheses! I wonder if a syntax for Scheme could be defined that allowed you to leave out parentheses when the meaning was clear, and had rules of operator precedence instead. For example, the K programming language that I use at work has simple right-to-left order of evaluation. This can lead to very code and lets you easily write one-line expressions that are very powerful, but has many gotchas: for example, 2*4+5 evaluates to 18 in K, rather than 13.

> (define a 3)

Whether or not define statements return a value is implementation-dependent. My interpreter doesn’t give a return value. However, we have now bound the value 3 to the variable a.

> (define b (+ a 1))

Again, there is no return value, but we have bound the value of (+ a 1) to the variable b, so that b now has the value 4.

> (+ a b (* a b))
19

This is equivalent to a + b + (a * b) where a = 3 and b = 4.

> (if (and (> b a) (< b (* a b)))
      b
      a)
4

This procedure says “if (b > a) and (b < a * b) then return b, else return a”, which becomes “if (4 > 3) and (4 < 12) then return 4, else return 3” so it returns 4.

> (+ 2 (if (> b a) b a))
6

First evaluate the if statement. It says “if (b > a) return b, else return a” which becomes (by substitution) “if 4 > 3 then return 4, else return 3”, so it returns 4. We then evaluate (+ 2 4), which returns 6.

> (* (cond ((> a b) a)
           ((< a b) b)
           (else -1))
     (+ a 1))
16

We first evaluate the cond statement. It evaluate to 4, since the first predicate is false but the second one is true. The second argument (+ a 1) evaluates to 4 also, and finally (* 4 4) evaluates to 16.

Exercise 1.2

Translate the following expression into prefix form:

\dfrac{5 + 4 + (2 - (3 - (6 + 4/5)))}{3(6 - 2)(2 - 7)}

I reckon we have:

> (/ (+ 5
        4
        (- 2
           (- 3
              (+ 6
                 (/ 4 5)))))
     (* 3
        (- 6 2)
        (- 2 7)))
-37/150

That return value was a surprise to me! I had assumed that when applying the division operator to two integers, Scheme would either perform integer division (like Java) or return a floating point number (like Python). Apparently it has a built-in rational data type. Which is nice.

Exercise 1.3

Define a procedure that takes three numbers as arguments and returns the sum of the squares of the two larger numbers.

Here’s a pretty ugly answer, using the built-in min function:

> (define (f a b c)
    (- (+ (* a a) (* b b) (* c c))
       (* (min a b c) (min a b c))))

Why is it ugly? Well, it needlessly applies min twice, and it does four multiplications, two additions and a subtraction, when all that’s needed is one addition and two multiplications. How about:

> (define (g a b c)
    (cond ((and (<= a b)
                (<= a c)) (+ (* b b) (* c c)))
          ((<= b c) (+ (* a a) (* c c)))
          (else (+ (* a a) (* b b)))))

This looks uglier, but it’s more efficient: it never performs more than two multiplications and one addition, and never more than three comparisons (the minimum possible, since you have to sort the arguments so that you know which is the smallest, and this takes three comparisons in the worst case. If anyone has a prettier and efficient solution (perhaps one that works for arbitrary lists of arguments?) then I’d like to see it.

Advertisements

Pretty basic stuff in these sections, but it explains some very important concepts and syntax, namely the idea of abstraction via the define statement.

1.1.1 Expressions

A programming language allows you to combine simple expressions to form more complex ideas. To do this it has three important mechanisms:

  1. Primitive expressions which represent the simplest entities
  2. Means of combination to build compound elements from simpler ones
  3. Means of abstraction to name compound elements and treat them as primitives

One kind of primitive expression in Scheme is a number. When you type an expression into the Scheme interpreter, the interpreter responds by evaluating the expression and printing its value, for example

> 486
486

I’ll use the convention that an expression prefixed by > is something that you type at the interpreter prompt, and the next line is the interpreter’s response. We form compound expressions using prefix notation, for example

> (+ 5 2)
7
> (- 10 7)
3
> (* 6 9)
63

These expressions are called combinations or S-expressions. The leftmost element is the operator, and the other elements are the operands. The value of the expression is obtained by applying the procedure specified by the operator to the arguments, which are the values of the operands.

Scheme used prefix notation, which looks unfamiliar at first but is just as valid as the standard infix notation, written 5 +2 or 10 - 7. In fact, it can have advantages, for example when procedures take an arbitrary number of arguments:

> (+ 10 7 3 5)
25

We can also easily nest combinations:

> (+ (* 4 2) (- 10 7))
11

1.1.2 Naming and the environment

None of this would be useful without the ability to give names to complex expressions, and refer to them by name later. For example, we can define the name radius to refer to a specific value by using the keyword define

> (define radius 10.0)

We could also define the value of a useful mathematical constant:

> (define pi 3.14159)

We can combine our names to make more complex expressions:

> (define area (* pi radius radius))
> area
314.159

To implement this ability to name expressions and refer to them later, the interpreter must have some way of associating names to values. This mapping is called the environment, or more accurately an environment, since we will see later that a computation may involve many different environments.

Out of a desire to (i) learn a functionalStructure and Interpretation of Computer Programs programming language, and (ii) generally become a better programmer, I’ve decided to work my way through Abelson and Sussman’s Structure and Interpretation of Computer Programs. It’s a bit of a beast, but I’ve read a chunk of it already and its style appears to me. I’ve also been intending to learn a dialect of Lisp for a while, and this seems like a good opportunity.

This is the book which forms the core of the borderline-legendary MIT course 6.001, which provides as good an introduction to the foundations of computer programming as is available anywhere in the world. It will be difficult, but rigorous.

The idea is that I’ll read every word, and do all of the programming exercises. As I finish eash section, I’ll post what I’ve learnt here, together with the source code I used to solve the exercises, and anything else I think will be useful.

I fully expect it to take me a long time (fitting it in around a day job will be tricky) but I’m feeling confident. If anyone has read it (or even better, is reading it at the moment) I’d love to hear from you.

I was linked to a collection of Edsger Dijkstra’s writings today, and spent an interested half hour reading some of them. Djikstra was known for composing his manuscripts in longhand, using a fountain pen. He would then make a dozen or so photocopies, and mail them out to people he knew to be interested in his work. They would then be further disseminated by those people, and so on, until his work had reached a wide audience. Each article was coded with his initials EWD, followed by a three or four digit number — they thus became known as EWDs in the computer science world.

The most interesting article I read today was EWD831, whose full title is Why numbering should start as zero. Djikstra argues persuasively that counting should start from zero, and thus that all arrays should be indexed beginning from zero. This runs counter to the natural intuition (and to what we are taught at school) that counting begins at 1.

In brief, Dijkstra’s argument runs like this. There are four possibilities for notating the range of numbers 1, …, 10 without using ellipsis notation, which can be ambiguous:

  1. 1 ≤ i < 11
  2. 0 < i ≤ 10
  3. 0 < i < 11
  4. 1 ≤ i ≤ 10

There are several desirable properties that we might want our system of notation to have. For example:

  • (a) the difference between the bounds is equal to the number of terms in the sequence
  • (b) two ranges are adjacent if the upper bound of one is equal to the lower bound of the other.

Both 1 and 2 satisfy these properties, but 3 and 4 don’t. If we want to use our notation to specify a range of natural numbers, then it seems sensible to ask for the following two properties:

  • (c) the lower bound is a natural number
  • (d) the upper bound is a natural number, even when defining the empty sequence

Property (c) is only satisfied by 1 and 4 and property (d) is only satisfied by 1 and 3 So we see that it is only notation 1 that satisfied all of the properties we desire.

Now we move on to the question of whether we should start indexing at 0 or at 1. This is simply an issue of aesthetics. If we want to describe the range of an array with N entries, then by starting at 0 we would write the range as 0 ≤ i < N, whereas if we start indexing at 1 then it is written as 1 ≤ i < N+1, which is significantly more ugly. So we see that we should start indexing at 0. Note that programming languages such as C, Java and Python all follow this convention, as do most widely adopted languages. The only commonly used languages I can think of that don’t use it are Fortran and R.