6.3   Machine-Language Programming


This section under construction.


Although the TOY machine language contains only 16 different instruction types, it is possible to perform a variety of interesting computations. In fact, any computation that can be done in the Java programming language on your PC can also be done in TOY (provided you give TOY enough main memory and time). This may come as quite a surprising fact; we will justify it later in Chapter 8. Below, we describe each of the instructions in the TOY language.

Memory-register transfer (opcodes 8 and 9).

To transfer data between registers and main memory, we use the load (opcode 8) and store (opcode 9) instructions. These operations are convenient because it is not possible to perform arithmetic operations directly on the contents of main memory. Instead, the data must be first transferred to registers. There are also circumstances where it is not possible to maintain all of our program's variables simultaneously in registers, e.g., if we need to store more than 16 values. We overcome the 16 register limitation by storing variables in main memory, and transferring them back and forth to registers using the load and store instructions.

Arithmetic operations (opcodes 1 and 2).

The add (opcode 1) and subtract (opcode 2) perform the conventional arithmetic operations. In TOY, all arithmetic operations involve 16 bit two's complement integers, as described in Section 5.1.

Standard input and standard output.

The load and store instructions are also used to access standard input and standard output. The distinguished memory address FF is intercepted by a TOY hardware interrupt: instead of loading or storing information in memory location FF, the data is received from the keyboard or is sent to the screen.

Implications of standard input and output. The standard input and standard output facilities of TOY have a profound effect on what the TOY machine is capable of. An obvious feature is to get information in and out of the machine. This information can be data, but it can also be instructions! Booting a computer is copying a sequence of stored instructions (e.g., the operating system) into the machine. The TOY machine has only a limited memory (256 words plus a few registers). Nevertheless, it is possible to process more information than this. Another advantage of standard input is that it offers a crude form of user interaction.

Flow control (opcodes C and D).

So far, we have seen how to use TOY as a calculator. Branch statements enable us to harness the true power of TOY, much like while loops and if-else conditionals enabled us to harness the power of Java. The value of the program counter controls which statement the TOY machine will execute next. Typically, the program counter is incremented by one at each time step. This causes the instructions to be executed in order, one after the other. The branch if zero (opcode C) and branch if positive (opcode D) enable us to directly change the program counter, thereby altering the flow of control of our programs.

TOY idioms. There are several common idioms or pseudo-instructions in TOY that can be used for common programming tasks. Many of these tricks rely on the fact that register 0 always stores the value 0000

Bit-whacking operators (opcodes 3, 4, 5, and 6).

The bit-whacking operations - bitwise and (opcode 3), bitwise xor (opcode 4), left shift (opcode 5), and right shift (opcode 6) - work just like the analogous ones in Java, except using 16-bit two's complement integers.

Efficient multiplication. Using the bitwise operators, we provide an efficient implementation of multiply.toy. To multiply two 16-bit integers a and b, we let bi denote the ith bit of b. That is,

          b = (b15 × 215) + (b14 × 214) + ... + (b1 × 21) + (b0 × 20)

By distributivity, we obtain:

          a × b = (a × b15 × 215) + ... + (a × b1 × 21) + (a × b0 × 20)

Thus, to compute a × b, it suffices to add the above 16 terms. Naively, this appears to reduce the problem of performing one multiplication to 32 multiplication, two for each of the 16 terms. Fortunately, each of these 32 multiplications are of a very special type. Recall that a × 2i is the same as left shifting a by i bits. Second, note that bi is either 0 or 1; thus term i is either a << i or 0. Program multiply-fast.toy loops 16 times. In iteration i it computes the ith term and adds it to the running total stored in register C. To gain some perspective, recall the standard grade school algorithm for multiplying two decimal integers. The bitwise procedure we just described is really just the grade school algorithm applied to binary integers.

Load address (opcode 7).

The load address instruction (opcode 7) is the most primitive type of assignment statement in the TOY language. The code fragment below initializes register A to 30.

11: 7A30    R[A] <- 0030

When using the load address instruction, we often think of the destination register as storing the memory address of some piece of data. This is especially useful when dealing with arrays. In the C programming language, this type of variable is known as a pointer. However, we can also use the load address instruction to store a small constant into a register, instead of using the load instruction. Note that load address only permits you to assign 8 bit integers (00 through FF) to a register, even though registers are capable of storing 16 bit integers.

Arrays (opcodes A and B).

Arrays are not directly built into the TOY language, but it is possible to achieve the same functionality using the load address (opcode 7), load indirect (opcode A), and store indirect (opcode B) instructions. We illustrate this technique with two examples. First, we will consider a program that reads in a sequence of integers and prints them in reverse order. Then, we will consider a more sophisticated application of arrays that performs the classic insertion sort algorithm on a sequence of integers.

Functions (opcodes E and F).

In Java it is quite useful to divide a program up into smaller functions. We can do the same in TOY. Below is a TOY program that "calls" a multiply function with two arguments and computers their product. Since all of the variables (registers) are global, we need to agree upon a protocol for calling our function. We'll assume that we want to multiply the integers stored in registers A and B, and store their product in register C. Program multiply-function.toy that calls the multiply function twice to compute x × y × z, once to compute x × y, then again to compute (x × y) × z. It uses the jump and link (opcode F) and jump register (opcode E) instructions that are especially designed for this purpose. Instructions 11 and 14 store the value of the program counter in register 3 before jumping to the function located at F0. This makes it possible to return back to the main program, without hardwiring in the address into the program.

Every time the program counter is reset to F0, the old program counter is saved away in register F for future use. Instruction F5 returns from the function by resetting the program counter to the value stored in register F. Note also, that the program counter is incremented before the instruction is executed. Thus, during the first function call register F is 16 and not 15.

Be very careful about which variables you are using when writing machine language functions. There is no such thing as a "local variable." Had we continued to use register 2 as the loop counter in the multiplication function, this would have overwritten register 2 in the main program, which was being used to store the quantity b.

Horner's method. We can use the multiplication function to evaluate polynomials: given integer coefficients an, ..., a2, a1, a0 and an integer x, evaluate the polynomial p(x) = an xn + ... + a2 x2 + a1 x1 + a0 x0 at the integer x. Polynomial evaluation was one raison d'etre for early machines. It has many applications including studying ballistic motion and converting an integer from its decimal representation to hexadecimal.

The brute force algorithm for polynomial evaluation is to sum up the n+1 terms, where term i is the product of ai and xi. To compute xi we could write a power function that multiplies x by itself i-1 times. Horner's method is a clever alternative that is more efficient and easier to code. The basic idea is to judiciously sequence the way in which terms are multiplied. We can rewrite an order 3 polynomial By distributivity, we obtain:

          p(x) = a3 x3 + a2 x2 + a1 x + a0   =  (((a3) x + a2) x + a1) x + a0

Similarly, we can rewrite an order 5 polynomial

          p(x) = a5 x5 + a4 x4 + a3 x3 + a2 x2 + a1 x + a0   =  (((((a5) x + a4) x + a3) x + a2) x + a1) x + a0

Using Horner's method, only n multiplications are required to evaluate an order n polynomial. Moreover, we can translate the method directly into Java or TOY code. Program horner.toy is the TOY version. In the TOY version, we call our multiply function every time we want to multiply two integers.

Horner's method was published in 19th century by British mathematician W. G. Horner, but the algorithm was used by Isaac Newton over a century earlier.

We can use horner.toy to convert a decimal integer to its hexadecimal representation. To convert 76510 to hex, we set the input x = A, n = 3, a2 = 7, a1 = 6, and a0 = 5. Since all arithmetic is performed in hex, the program computes the hexadecimal equivalent of 7 × 102 + 6 × 10 + 5 = 02FD.

Insertion sort. The program insertion-sort.toy reads in a sequence of positive integers from standard input and insertion sorts them. The program terminates upon reading in a nonpositive integer.

Linked lists.

// data
01: 0001                           the constant 1
02: 00D0                           memory address of first node of linked list

// initialize
10: 8101  R1 <- mem[01]            R1 holds 1 
11: 8202  R2 <- mem[02]            x = pointer to first node of linked list

// traverse the linked list
12: 1421    R4 <- R2 + 1           do {
13: A302    R3 <- mem[R2]             k = x.key  
14: 93FF    print R3                  print k to standard output
15: A204    R2 <- mem[R2 + 1]         x = x.next  
16: D212    if (R2 > 0) goto 12    } while (x != null) 
17: 0000  halt               

// data
D0: 0001    D4: 0004    D8: 0000    DC: 0000
D1: 00D6    D5: 0000    D9: 0000    DD: 0000
D2: 0000    D6: 0002    DA: 0003    DE: 0000
D3: 0000    D7: 00DA    DB: 00D4    DF: 0000

This code will traverse the linked list starting at memory location D0, printing out the integer stored in each "node." It will print out 0001 0002 0003 0004. Throughout the computation R1 is always 1. Indexed addressing is used in instructions A304 and A203 Register R2 is a pointer - it is the memory address of the next node. Register R3 is a pointer to the memory address immediately after R2. At each iteration of the loop we print the contents of the value in memory referenced by R2 and use the value in memory referenced by R3 to determine what memory address R2 will store in the next iteration. For the data given above, register R2 will have the values D0, D6, DA, D4, and 00 in that order. This process is repeated until R2 is 0000, i.e., the end of the linked list. In Java, the keyword null plays the role of 0000 and is used to terminate linked lists.

Recursion. You can also do recursion in TOY, but this is rather tricky.


Exercises

  1. Write a program powers2.toy that prints out all of the positive powers of 2 to standard output.
  2. Given an integer x, its Collatz sequence is defined by replacing it with x/2 if x is even, and 3x + 1 is x is odd, and repeating until x is 1. Write a program collatz.toy thats reads an integer x from standard input and prints its Collatz sequence to standard output. Hint: use right shift to perform integer division by two.
  3. Write a program sum_1-n.toy that reads in an integer N from standard input and prints out the sum 1 + 2 + 3 + ... + N.
  4. Write a program min3.toy that reads in three integers from standard input and print to standard output the smallest one.
  5. Write a program max3.toy that reads in three integers from standard input and print to standard output the largest one.
  6. Write a program sort3.toy that reads in three integers from standard input and print them out to standard output in ascending order.
  7. Write a program chop.toy that reads in an integer N from standard input and prints it out as the sum of powers of 2. For example if N = 012A, then the program should print out

    0002
    0008
    0020
    0100
    

    since 012A = 0002 + 0008 + 0020 + 0100.

  8. This question tests the difference between load address, load, and load indirect. For each of the following TOY programs, give the contents of registers 1, 2, and 3 upon termination.

    (a)  10: 7211        (b)  10: 8211         (c)  10: 7211
         11: 7110             11: 8110              11: A102
         12: 2321             12: 2312              12: 2312
         13: 0000             13: 0000              13: 0000
    
  9. Consider the following TOY program. What is the value of register 3 upon termination?

    10: 7101
    11: 7207
    12: 7301
    13: 1333
    14: 2221
    15: D213
    16: 0000
    
  10. Write a program that reads in one integer a from standard input, and outputs a3.
  11. Write a program that reads in an integer a from standard input and prints AAAA to standard output if
    • a = 3
    • a > 3
    • a < 3
    • a != 3
    • a >= 3
    • a <= 3
  12. Suppose that you load the following into locations 10-17 of TOY, set the PC to 10, and press RUN.

    10: 7100   R[1] <- 0000
    11: 8FFF   read RF
    12: 9F15   mem[15] <- RF
    13: 82FF   read R2
    14: 1112   R1 = R1 + R2
    15: C016   pc <- 16
    16: 91FF   write R1
    17: 0000   halt
    

    What, if anything, is printed to standard output if the following data appear on standard input?

    1112 1112
    
    Answer: The TOY programs reads two values from standard input. The first value is placed into memory location 15 where it is eventually executed as code. This inserts a second add instruction.
  13. Repeat the previous question, but now with the following data on standard input.

    C011 C011 1112 1112
    
  14. Write a program that reads in three integers a, b, and c from standard input, and computes the discriminant d = b2 - 4ac. Use the multiplication function described above.
  15. Suppose that you load the following into locations 10-17 of TOY, set the PC to 10, and press RUN. The program reads in an integer from standard input and prints out a single integer to standard output. List all input values between 0123 and 3210 for which the program print 0000.

    10: 8AFF   read R[A]
    11: 7101   R[1] <- 1
    12: 2BA1   R[B] <- R[A] - 1
    13: 3CAB   R[C] <- R[A] & R[B]
    14: 9CFF   write R[C]
    15: 0000   halt
    

    Answer: 0200 0400 0800 1000 2000. It returns 1 for all inputs that have at most one 1 in their binary representation, i.e, the hexadecimal integers 0000, 0001, 0002, 0004, 0008, 0010, ..., 8000.

  16. Suppose that you load the following data into memory locations 30 through 37 before pressing RUN.

    30-37:  0001  0002  0003  0004  0004  0003  0002  0001
    

    set the PC to 10, and press RUN. What will be the contents of memory locations 30 through 37 after running the program?

    10: 7101  R[1] <- 1                 // always 1
    11: 7A30  R[A] <- 30      a         // base memory address of array a[]
    12: 7B08  R[B] <- 8       N         // the number of elements N in a[]
    13: 130B  R[3] <- R[B]    i = N
    14: C320  
    15: 1400  R[4] <- 0       j = 0
    16: 2543  
    17: C51E
    18: 16A4  
    19: A706  
    1A: 1717  
    1B: B706  
    1C: 1414  
    1D: C016  
    1E: 6331  
    1F: C014
    
  17. Translate the above TOY program into Java code by filling in the ????.
    for (int i = N;  i > ???? ;  i = i ????)
        for (int j = 0;  j < ???? ;  j = j ????)
            a[ ???? ] = ???? ;
    
  18. Suppose that you load the following into locations 10-1F of TOY, load the following data into locations 30-37, Suppose that you load the following data into memory locations 30 through 37 before pressing RUN.

    0001  0002  0003  0004  0004  0003  0002  0001
    

    set the PC to 10, and press RUN. What will be the contents of memory locations 30 through 37 after running the program?

  19. Suppose that you load the following into locations 10-1B of TOY, set the PC to 10, and press RUN. Suppose also that the following data is entered from standard input. What value is printed?

    1CAB EF00 0000 4321 1234
    

    10: 7101   R[1] <- 1
    11: 7230   R[2] <- 30
    12: 8AFF      read R[A]
    13: CA17      if (R[A] == 0) goto 17
    14: BA02      mem[R[2]] <- R[A]
    15: 1221      R[2]++
    16: C012      goto 12
    17: 8AFF   read R[A]
    18: 8BFF   read R[B]
    19: FF30   R[F] <- pc; pc <- 30
    1A: 9CFF   write R[C]
    1B: 0000   halt                 
    

  20. Repeat the previous exercise, but with the following data on standard input.

    2CAB EF00 0000 4321 1234
    
  21. The following table shows the contents for the TOY registers and a section of TOY memory. Suppose that you set the program counter to 30 and hit run. What, if anything, is printed to standard output? List the final contents of registers 2 and 3 upon termination.

    R0: 0000 0000 0000 0000 0000 0000 0000 0000
    R8: 0000 0000 0000 0000 0000 0011 0000 0000
    
    20: 0000 0000 0000 0000 0000 0000 0000 0000
    28: 0000 005A 0000 0000 0000 0000 0000 0000
    30: 7101 7200 8329 1221 1331 A303 D333 92FF
    38: 0000 0000 0000 0000 0000 0000 0000 0000
    40: 7101 7200 8329 A403 1224 1331 A303 D343
    48: 92FF 0000 0000 0000 0000 0000 0000 0000
    50: 0003 0000 0005 0000 0004 0052 0000 0000
    58: 0001 0060 0000 0058 0000 0000 0000 0000
    60: 0002 0050 0000 0000 0000 0000 0000 0000
    
  22. Suppose that the data for memory locations D0 through E0 is as follows. What is the result of running linked.toy? Answer: 1 2 3 4 5 6 7.
  23. Change one word of memory in the previous exercise so that it prints out 1 2 6 7 instead of 1 2 3 4 5 6 7. (linked list deletion)
  24. Change three words of memory (overwriting one, and using two more) so that it prints out 1 2 3 4 8 5 6 7. (linked list insertion)

Creative Exercises

  1. Maximum. Write a program max.toy that reads in a sequence of nonnegative integers from standard input and prints out the maximum one. Stop reading in integers as soon as you encounter a negative integer.
  2. Magic swap. Write a TOY code fragment that swaps the contents of registers A and B, without writing to main memory or any other registers. Hint: use the XOR instruction.
  3. 32-bit integers. Represent a 32-bit two's complement integer in TOY with two consecutive words of memory or registers (big endian or little endian). Explain how to add two 32-bit integers.
  4. Gray codes. Write a TOY program graycode.toy that reads in an integer n (between 1 and 15) from standard input and then prints out (i >> 1) ^ i to standard output for i = 2n - 1 through 0. The resulting sequence is called a Gray code of order n. See Exercise XYZ in Section 5.1.

    The Visual X-TOY Simulator uses the LCD display to show standard output. It is instructive to watch the alternate standard output (the tape punch card) and see the individual bits (8 per row).

  5. Greatest common divisor. Write a program gcd.toy that reads in two integers from standard input and prints their greatest common divisor to standard output. Write a function that assumes that registers A and B contain two input integers, output the value in register C, and returns to the address stored in register F. You may use the following (inefficient) algorithm:

    while (b > 0) {
       if (b > a) swap a and b
       a = a - b
    }
    return a
    
  6. One-time pad. Implement a one-time pad in TOY to encrypt and decrypt 256 bit messages. Assume that the key is stored in memory location 30 - 3F and that the input consists of sixteen 16-bit integers.
  7. Dot product. Compute the dot product of two arrays, which start at locations RA and RB and have length RC.
  8. Axpy. Given a scalar a, a vector x, and a vector b, compute the vector ax + b.
  9. Find the singleton number. Suppose that a sequence 2N+1 16-bit integers appears on standard input such that N integers appear exactly twice and one integer appears only once. Write a TOY program to find the singelton integer.

    Hint: XOR all of the integers together.