Chapter 16

In this chapter we discuss the I/O subsystem. The I/O subsystem is the means by which the CPU communicates with the outside world. By “outside world” we mean devices other than the CPU and memory.

As you have learned, the CPU executes instructions, and memory provides a place to store data and instructions. Most programs read data from one or more input devices, process the data, then write the results to one or more output devices.

Typical input devices are keyboards and mice. Common output devices are display screens and printers. Although most people do not think of them as such, magnetic disks, CD drives, and similar storage devices are also considered I/O devices. It may be a little more obvious that a connection with the Internet is also I/O. The reasons will become clearer in this chapter, where we discuss how I/O devices are programmed.

16.1 Memory Timing

Since the CPU accesses I/O devices via the same buses as memory (see Figure 1.1, page 9), it might seem that the CPU could access the I/O devices in the same way as memory. That is, it might seem that I/O could be performed by using the movb instruction to transfer bytes of data between the CPU and the specific I/O device. This can be done with many devices, but there are other issues that must be taken into account in order to make it work correctly. One of the main issues lies in the timing differences between memory and I/O. Before tackling the I/O timing issues, let us consider memory timing characteristics.

Aside: As pointed out in Section 1.2 (page 10), the three-bus description given here shows the logical interaction between the CPU and I/O. Most modern general purpose computers employ several types of buses. The way in which the CPU connects to the various buses is handled by hardware controllers. A programmer generally deals only with the logical view.

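Two types of RAM are commonly used in PCs: static RAM (SRAM), which is fast but relatively large and expensive, and dynamic RAM (DRAM), which is slower but smaller and cheaper.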

Most of the memory in a PC is DRAM because it is much less expensive and smaller than SRAM. Of course, each instruction must be fetched from memory, so slow memory access would limit CPU speed. This problem is solved by using cache systems made from SRAM.

A cache is a small amount of fast memory placed between the CPU and main memory. When the CPU needs to access a byte in main memory, that byte, together with several surrounding bytes, is copied into the cache memory. There is a high probability that the surrounding bytes will be accessed soon, and the CPU can work with the values in the much faster cache. This is handled by the system hardware. See [28] and [31] for more details.

Modern CPUs include cache memory on the same chip, which can be accessed at CPU speeds. Even small cache systems are very effective in speeding up memory access. For example, the CPU in my desktop system (built in 2005) has 64 KB of Level 1 instruction cache, 64 KB of Level 1 data cache, and 512 KB of Level 2 cache (both instructions and data). In contrast, most of the memory in the system consists of 1 GB of DDR 400 memory.

The important point here is that memory is matched to the CPU by the hardware. Very seldom is memory access speed a programming issue.

Aside: There are some cases where knowing how to manipulate memory caches can speed up execution time. The x86 has instructions for working directly with cache. Optimizing cache usage is an advanced topic beyond the scope of this book.

16.2 I/O Device Timing

I/O devices are much slower than memory. Consider a common input device, the keyboard. Typing at 120 words per minute, at an average of five characters per word, is equivalent to 10 characters per second, or 100 milliseconds between characters. A CPU running at 2 GHz goes through 200 million clock cycles during that time, enough to execute approximately 200 million instructions. And the time intervals between keystrokes are very inconsistent. Many will be much longer than this.

Even a magnetic disk is very slow compared to memory. What if the byte that needs to be read has just passed under the read/write head on a disk that is rotating at 7200 RPM? The system must wait for a full revolution of the disk, which takes 8.33 milliseconds. Again, there is a great deal of variability in the rotational delay between reads from the disk.

In addition to being much slower, I/O devices exhibit much more variance in their timing. Some people type very fast on a keyboard, some very slowly. The required byte on a magnetic disk might be just coming up to the read/write head, or it may have just passed. We need a mechanism to determine whether an input device has a byte ready for our program to read, and whether an output device is ready to accept a byte that is sent to it.

16.3 Bus Timing

Thus far in this book buses have been shown simply as wires connecting the subsystems. Since more than one device is connected to the same wires, the devices must follow a protocol for deciding which two devices can use the bus at any given time. There are many protocols in use, which fall into one of two types:

Synchronous: data transfer is controlled by a clock signal. Typically, a centralized bus controller generates the clock signal, which is sent on a separate control line in the bus.
Asynchronous: data transfer is controlled by a “handshaking” exchange between the two devices. Many asynchronous protocols are handled by the devices themselves over the data and address lines in the bus.

Modern computer systems employ both types of buses. A typical PC arrangement is shown in Figure 16.1.

Figure 16.1: Typical bus controllers in a modern PC. The Memory Controller is often called the North Bridge; it provides synchronous communication with main memory and the graphics interface. The I/O Controller is often called the South Bridge; it provides asynchronous communication with the several types of buses that connect to I/O devices.

16.4 I/O Interfacing

In addition to the very wide range in their timing, the I/O devices commonly attached to computers differ greatly in how they handle data. A mouse provides position information. A monitor displays graphic information. Most computers have speakers connected to them. Ultimately, the CPU must be able to communicate with I/O devices in bit patterns at the speed of the device.

The hardware between the CPU and the actual I/O device consists of two subsystems — the controller and the interface. The controller is the portion that works directly with the device. For example, a keyboard controller detects which keys are pressed and converts this to a code. It also detects whether a key is pressed or not. A disk controller moves the read/write head to the requested track. It then detects the sector number and waits until the requested sector comes into position. Some very simple devices do not need a controller.

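The interface subsystem provides registers that the CPU can read from or write to. An I/O device is programmed through the interface registers. In general, the following types of registers are provided: transmit registers, which hold data the CPU sends to the device; receive registers, which hold data the device has for the CPU; status registers, which report the current state of the device; and control registers, which determine how the device operates.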

It is common for one register to provide multiple functionality. For example, there may be one register for transmitting and receiving, its functionality depending on whether the CPU writes to or reads from the register. And it is common for an interface to have more than one register of the same type, especially control registers.

16.5 I/O Ports

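The CPU communicates with an I/O device through I/O ports. The specific port is specified by a value on the address bus. There are two ways to distinguish an I/O port address from a physical memory address: isolated I/O, in which the I/O ports have an address space of their own, and memory-mapped I/O, in which a range of memory addresses is associated with I/O registers instead of physical memory.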

With isolated I/O, the I/O ports can be numbered from 0x0000 to 0xffff. This address space is separate from the physical memory address space. Instructions are provided for accessing the I/O address space. The distinction between the two addressing spaces is made in the control bus.

One instruction to perform input is:

ins source, destination

where s denotes the size of the operand:

s    meaning     number of bits
b    byte        8
w    word        16
l    longword    32

Intel® Syntax

in destination, source

The in instruction moves data from the I/O port specified by the source into the register specified by the destination. The source operand can be either an immediate value, which must fit in eight bits and so can only name ports 0 – 255, or the dx register. The destination must be al, ax, or eax, consistent with the operand size. For example, the instruction

     inb     $4, %al

reads I/O port number 4, placing the value in the al register.

An instruction to perform output is:

outs source, destination

where s denotes the size of the operand:

s    meaning     number of bits
b    byte        8
w    word        16
l    longword    32

Intel® Syntax

out destination, source

The out instruction moves data to the I/O port specified by the destination from the register specified by the source. The destination operand can be either an immediate value, which must fit in eight bits and so can only name ports 0 – 255, or the dx register. The source must be al, ax, or eax, consistent with the operand size. For example, the instruction

     outb    %al, $6

writes the value in the al register to I/O port number 6.

16.6 Programming Issues

One of the primary jobs of an operating system is to handle I/O. The software that does this is called a device handler. The operating system coordinates the activities of all the device handlers so that the hardware is utilized in an efficient manner. In Linux, a device handler may either be compiled into the kernel or in a separate module that is loaded into memory only if needed.

Thus, programming I/O devices generally means changing the operating system kernel. This can be done, but it requires considerably more knowledge than is provided in this book. It is possible to give user applications permission to directly access specific I/O devices, but this can produce disastrous results, especially in a multi-user environment.

We will not do any direct I/O programming in this book, but we will look at the general concepts. Listing 16.1 sketches the general algorithms in C. The code was abstracted from some I/O routines that work with a Dual Universal Asynchronous Receiver/Transmitter (DUART) on a single board computer. It is incomplete code and does not run on any known computer, but it illustrates the basic concepts.

This example uses memory-mapped I/O. The program calls three functions: init_io(), charin(), and charout().

We will examine what each does.

 1  /*
 2   * io_sketch_mm.c
 3   * This code sketches the algorithms to initialize
 4   * a DUART, read one character and echo it using
 5   * memory-mapped I/O.
 6   * WARNING: This code does not run on any known
 7   *          device. It is meant to sketch some
 8   *          general I/O concepts only.
 9   * Bob Plantz - 18 June 2009
10   */
11
12  /* register offsets */
13  #define MR  0x01   /* mode register */
14  #define SR  0x03   /* status register */
15  #define CSR 0x03   /* clock select register */
16  #define CR  0x05   /* command register */
17  #define RR  0x07   /* receiver register */
18  #define TR  0x07   /* transmitter register */
19  #define ACR 0x09   /* auxiliary control register */
20  #define IMR 0x0B   /* interrupt mask register */
21
22  /* status bits */
23  #define RxRDY 1   /* receiver ready */
24  #define TxRDY 4   /* transmitter ready */
25
26  /* commands */
27  #define RESETRECEIVER 0x20
28  #define RESETTRANSMIT 0x30
29  #define RESETERROR    0x40
30  #define RESETMODE     0x10
31  #define TIMER         0xF0
32  #define NOPARITY8BITS 0x13
33  #define STOPBIT2      0x0F
34  #define BAUD19200     0xC
35  #define BAUDRATE (BAUD19200+(BAUD19200<<4))
36  #define ENABLE        0x05
37  #define NOINTERRUPT   0x00
38
39  void init_io();
40  unsigned char charin();
41  void charout( unsigned char c );
42
43  int main() {
44     unsigned char aCharacter;
45
46     init_io();
47     aCharacter = charin();
48     charout(aCharacter);
49
50     return 0;
51  }
52
53  void init_io() {
54     unsigned char* port = (unsigned char*) 0xff000;
55
56     *(port+CR) = RESETRECEIVER;  /* reset receiver */
57     *(port+CR) = RESETTRANSMIT;  /* reset transmitter */
58     *(port+CR) = RESETERROR;     /* clear any errors */
59     *(port+CR) = RESETMODE;      /* make sure we're using MR1 */
60
61     *(port+ACR) = TIMER;         /* baud set 2, crystal divide by 16 */
62     *(port+MR) = NOPARITY8BITS;  /* no parity, 8 bits */
63     *(port+MR) = STOPBIT2;       /* stop bit length 2.000 */
64     *(port+CSR) = BAUDRATE;      /* set baud */
65     *(port+IMR) = 0;             /* turn off interrupts */
66     *(port+CR) = ENABLE;         /* enable receiver and transmitter */
67  }
68
69  unsigned char charin() {
70     unsigned char* port = (unsigned char*) 0xff000;
71     unsigned char character, status;
72
73     do
74     {
75        status = *(port+SR);
76     } while ((status & RxRDY) == 0);  /* wait while receiver not ready */
77     character = *(port+RR);
78     return character;
79  }
80
81  void charout( unsigned char c )
82  {
83     unsigned char* port = (unsigned char*) 0xff000;
84     unsigned char status;
85     do
86     {
87        status = *(port+SR);
88     } while ((status & TxRDY) == 0);  /* wait while transmitter not ready */
89     *(port+TR) = c;
90  }
Listing 16.1: Sketch of basic I/O functions using memory-mapped I/O — C version.

Lines 12 – 37 define symbolic names for values that are used to program the device. Notice that some names have the same value. For example, on lines 17 and 18 the receiver register (RR) and transmitter register (TR) are actually the same register. The CPU receives when it reads from this register and transmits when it writes to it. A similar situation is seen on lines 14 and 15. Reading from register 0x03 provides status information, and the clock selection commands are written to the same register. This illustrates an important point — I/O interface registers are not simply data storage places like CPU registers. It would probably be more accurate to call them “interface ports,” but “registers” is the commonly used terminology.

This example uses memory-mapped I/O, so simple assignment statements are used to access the I/O interface registers. The memory addresses 0xff000 through 0xff020 are associated with I/O registers for this device instead of physical memory. The base address of the device is assigned to a pointer variable on line 54 in the init_io function. Then the commands to initialize the device are written to the appropriate registers on lines 56 – 66. It is not important that you completely understand what this function is doing, but the comments should give you a rough idea.

Lines 56 – 59 assign four different values to the same location:

   *(port+CR) = RESETRECEIVER;  /* reset receiver */ 
   *(port+CR) = RESETTRANSMIT;  /* reset transmitter */ 
   *(port+CR) = RESETERROR;     /* clear any errors */ 
   *(port+CR) = RESETMODE;      /* make sure we're using MR1 */

If these were assignments to an actual memory location or to a CPU register, only the final statement would be required. But the Command Register is an I/O interface register. And as described above, it really is not a storage register, even on the I/O interface. In fact, these are four different commands that are sent to the Command Register “port” on the I/O interface.

The order in which commands are sent to the I/O interface may also be important. For example, on this particular device, the sequence on lines 62 – 63

   *(port+MR) = NOPARITY8BITS;  /* no parity, 8 bits */ 
   *(port+MR) = STOPBIT2;       /* stop bit length 2.000 */

must be performed in this order. There are actually two Mode Registers, which are both accessed through the same I/O interface register. The first time the register is accessed, it is connected to Mode Register 1. This access causes the hardware to automatically switch to Mode Register 2 for all subsequent accesses. Now you can understand the reason for sending the “RESETMODE” command to the Command Register on line 59. It’s important to ensure that the first access will be to Mode Register 1.

When compiling I/O functions, it is very important not to use optimization. If you do, the compiler may try to coalesce command values into one value. (See Exercise 1.)

The next function is charin(). Its job is to read a character from the DUART. In the lab where this code was used, the DUART receiver was connected to a keyboard. The DUART must wait until somebody presses a key on the keyboard, then convert the code for that key to an eight-bit ASCII code representing the character. When the DUART has a character ready to be read from its receiver register, it sets the “receiver ready” bit in its status register to one. The do-while loop on lines 73 – 76 in charin shows how the code must wait for this event.

When the status indicates that a character is ready, line 77 shows how it is read from the receiver register.

The charout() function writes a character to the transmitter. As you might expect, the transmitter was connected to a computer monitor. Although it is clear that keyboard input is very slow, writing on a monitor screen is also slow compared to CPU processing. Thus, we need a similar do-while loop (lines 85 – 88) to wait until the monitor is ready to accept a new character. Once the value provided by the status register shows it is ready, line 89 shows how the character is written to the DUART’s transmitter register.

Listing 16.2 shows the assembly language generated by the gcc compiler for the C program in Listing 16.1. Some comments have been added to explain the general concepts.

  1          .file   "io_sketch_mm.c"
  2          .text
  3          .globl  main
  4          .type   main, @function
  5  main:
  6          pushq   %rbp
  7          movq    %rsp, %rbp
  8          subq    $16, %rsp
  9          movl    $0, %eax
 10          call    init_io
 11          movl    $0, %eax
 12          call    charin
 13          movb    %al, -1(%rbp)
 14          movzbl  -1(%rbp), %eax
 15          movl    %eax, %edi
 16          call    charout
 17          movl    $0, %eax
 18          leave
 19          ret
 20          .size   main, .-main
 21          .globl  init_io
 22          .type   init_io, @function
 23  init_io:
 24          pushq   %rbp
 25          movq    %rsp, %rbp
 26          movq    $1044480, -8(%rbp)  # initialize pointer variable to 0xff000
 27          movq    -8(%rbp), %rax      # base address of DUART
 28          addq    $5, %rax            # address of command register
 29          movb    $32, (%rax)         # reset receiver
 30          movq    -8(%rbp), %rax
 31          addq    $5, %rax
 32          movb    $48, (%rax)         # reset transmitter
 33          movq    -8(%rbp), %rax
 34          addq    $5, %rax
 35          movb    $64, (%rax)         # reset error
 36          movq    -8(%rbp), %rax
 37          addq    $5, %rax
 38          movb    $16, (%rax)         # reset mode
 39          movq    -8(%rbp), %rax      # base address of DUART
 40          addq    $9, %rax            # address of auxiliary control register
 41          movb    $-16, (%rax)        # baud set, crystal rate
 42          movq    -8(%rbp), %rax
 43          addq    $1, %rax
 44          movb    $19, (%rax)
 45          movq    -8(%rbp), %rax
 46          addq    $1, %rax
 47          movb    $15, (%rax)
 48          movq    -8(%rbp), %rax
 49          addq    $3, %rax
 50          movb    $-52, (%rax)
 51          movq    -8(%rbp), %rax
 52          addq    $11, %rax
 53          movb    $0, (%rax)
 54          movq    -8(%rbp), %rax
 55          addq    $5, %rax
 56          movb    $5, (%rax)
 57          popq    %rbp
 58          ret
 59          .size   init_io, .-init_io
 60          .globl  charin
 61          .type   charin, @function
 62  charin:
 63          pushq   %rbp
 64          movq    %rsp, %rbp
 65          movq    $1044480, -8(%rbp)  # initialize pointer variable to 0xff000
 66  .L5:
 67          movq    -8(%rbp), %rax      # base address of DUART
 68          movzbl  3(%rax), %eax       # read status register
 69          movb    %al, -10(%rbp)      #   and save locally
 70          movzbl  -10(%rbp), %eax
 71          andl    $1, %eax            # check receiver status
 72          testl   %eax, %eax          # if bit is 0
 73          je      .L5                 #   recheck
 74          movq    -8(%rbp), %rax      # receiver ready, get DUART address
 75          movzbl  7(%rax), %eax       # read input byte
 76          movb    %al, -9(%rbp)       # store locally
 77          movzbl  -9(%rbp), %eax      # return value
 78          popq    %rbp
 79          ret
 80          .size   charin, .-charin
 81          .globl  charout
 82          .type   charout, @function
 83  charout:
 84          pushq   %rbp
 85          movq    %rsp, %rbp
 86          movl    %edi, %eax
 87          movb    %al, -20(%rbp)
 88          movq    $1044480, -8(%rbp)  # initialize pointer variable to 0xff000
 89  .L8:
 90          movq    -8(%rbp), %rax      # base address of DUART
 91          movzbl  3(%rax), %eax       # read status register
 92          movb    %al, -9(%rbp)       #   and save locally
 93          movzbl  -9(%rbp), %eax
 94          andl    $4, %eax            # check transmitter status
 95          testl   %eax, %eax          # if bit is 0
 96          je      .L8                 #   recheck
 97          movq    -8(%rbp), %rax      # transmitter ready, get DUART address
 98          leaq    7(%rax), %rdx       # address of transmitter register
 99          movzbl  -20(%rbp), %eax     # load byte to send
100          movb    %al, (%rdx)         # send it
101          popq    %rbp
102          ret
103          .size   charout, .-charout
104          .ident  "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
105          .section        .note.GNU-stack,"",@progbits
Listing 16.2: Memory-mapped I/O in assembly language. Comments have been added to explain the code.

The comments on lines 26 – 41 in the init_io function describe how values are written to the appropriate memory addresses, which are mapped to I/O registers.

Lines 66 – 73 in the charin function make up a loop that waits until the receiver has a character ready to be read. The readiness of the receiver is indicated by bit 0 in the status register. The character is read from the receiver register on line 75. A similar loop is used on lines 89 – 96 in the charout function to wait until the status register shows that the transmitter is ready for another character; its readiness is indicated by bit 2. When it is ready, the address of the transmitter register is computed on lines 97 – 98, the byte to be sent is loaded into the eax register on line 99, and it is written to the transmitter register on line 100.

As we saw in Section 16.5, special instructions are required to access isolated I/O. The Linux kernel source includes macros to use these instructions. The macros are defined in the file io.h. Listing 16.3 illustrates the use of these macros to write the same program as in Listing 16.1 if the DUART interface were connected to the isolated I/O system.

 1  /*
 2   * io_sketch_iso.c
 3   * This code sketches the algorithms to initialize
 4   * a DUART, read one character and echo it using
 5   * isolated I/O.
 6   * WARNING: This code does not run on any known
 7   *          device. It is meant to sketch some
 8   *          general I/O concepts only.
 9   * Bob Plantz - 18 June 2009
10   */
11  #include <sys/io.h>
12
13  /* register offsets */
14  #define MR  0x01   /* mode register */
15  #define SR  0x03   /* status register */
16  #define CSR 0x03   /* clock select register */
17  #define CR  0x05   /* command register */
18  #define RR  0x07   /* receiver register */
19  #define TR  0x07   /* transmitter register */
20  #define ACR 0x09   /* auxiliary control register */
21  #define IMR 0x0B   /* interrupt mask register */
22
23  /* status bits */
24  #define RxRDY 1   /* receiver ready */
25  #define TxRDY 4   /* transmitter ready */
26
27  /* commands */
28  #define RESETRECEIVER 0x20
29  #define RESETTRANSMIT 0x30
30  #define RESETERROR    0x40
31  #define RESETMODE     0x10
32  #define TIMER         0xF0
33  #define NOPARITY8BITS 0x13
34  #define STOPBIT2      0x0F
35  #define BAUD19200     0xC
36  #define BAUDRATE (BAUD19200+(BAUD19200<<4))
37  #define ENABLE        0x05
38  #define NOINTERRUPT   0x00
39
40
41  void init_io();
42  unsigned char charin();
43  void charout( unsigned char c );
44
45  int main() {
46     unsigned char aCharacter;
47
48     init_io();
49     aCharacter = charin();
50     charout(aCharacter);
51
52     return 0;
53  }
54
55  void init_io() {
56     outb(RESETRECEIVER, CR);    /* outb(value, port) */
57     outb(RESETTRANSMIT, CR);
58     outb(RESETERROR, CR);
59     outb(RESETMODE, CR);
60     outb(TIMER, ACR);
61     outb(NOPARITY8BITS, MR);
62     outb(STOPBIT2, MR);
63     outb(BAUDRATE, CSR);
64     outb(NOINTERRUPT, IMR);
65     outb(ENABLE, CR);
66  }
67
68  unsigned char charin() {
69     unsigned char character, status;
70
71     do
72     {
73        status = inb(SR);
74     } while ((status & RxRDY) == 0);  /* wait while receiver not ready */
75     character = inb(RR);
76     return character;
77  }
78
79  void charout( unsigned char c )
80  {
81     unsigned char status;
82     do
83     {
84        status = inb(SR);
85     } while ((status & TxRDY) == 0);  /* wait while transmitter not ready */
86     outb(c, TR);
87  }
Listing 16.3: Sketch of basic I/O functions, isolated I/O — C version.

On line 11 we need to include the file containing the macros:

      #include <sys/io.h>

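The use of the outb() macro can be seen in lines 56 – 65. Note that outb() takes the value to be written as its first argument and the port number as its second. On line 73 we see the inb() macro being used to read the status register, and on line 75 it is used to read the receiver register.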

The gcc compiler generates assembly language as shown in Listing 16.4.

  1          .file   "io_sketch_iso.c"
  2          .text
  3          .type   inb, @function      # begin inb function
  4  inb:
  5          pushq   %rbp
  6          movq    %rsp, %rbp
  7          pushq   %rbx
  8          movl    %edi, %eax
  9          movw    %ax, -28(%rbp)
 10          movzwl  -28(%rbp), %edx
 11          movw    %dx, -30(%rbp)
 12          movzwl  -30(%rbp), %edx
 13  #APP
 14  # 48 "/usr/include/x86_64-linux-gnu/sys/io.h" 1
 15          inb %dx,%al                 # read the byte
 16  # 0 "" 2
 17  #NO_APP
 18          movl    %eax, %ebx
 19          movb    %bl, -9(%rbp)
 20          movzbl  -9(%rbp), %eax
 21          popq    %rbx
 22          popq    %rbp
 23          ret
 24          .size   inb, .-inb
 25          .type   outb, @function     # begin outb function
 26  outb:
 27          pushq   %rbp
 28          movq    %rsp, %rbp
 29          movl    %edi, %edx          # first argument: value to write
 30          movl    %esi, %eax          # second argument: port number
 31          movb    %dl, -4(%rbp)
 32          movw    %ax, -8(%rbp)
 33          movzbl  -4(%rbp), %eax
 34          movzwl  -8(%rbp), %edx
 35  #APP
 36  # 99 "/usr/include/x86_64-linux-gnu/sys/io.h" 1
 37          outb %al,%dx                # write the byte
 38  # 0 "" 2
 39  #NO_APP
 40          popq    %rbp
 41          ret
 42          .size   outb, .-outb
 43          .globl  main
 44          .type   main, @function
 45  main:
 46          pushq   %rbp
 47          movq    %rsp, %rbp
 48          subq    $16, %rsp
 49          movl    $0, %eax
 50          call    init_io
 51          movl    $0, %eax
 52          call    charin
 53          movb    %al, -1(%rbp)
 54          movzbl  -1(%rbp), %eax
 55          movl    %eax, %edi
 56          call    charout
 57          movl    $0, %eax
 58          leave
 59          ret
 60          .size   main, .-main
 61          .globl  init_io
 62          .type   init_io, @function
 63  init_io:
 64          pushq   %rbp
 65          movq    %rsp, %rbp
 66          movl    $5, %esi
 67          movl    $32, %edi
 68          call    outb                # outb(RESETRECEIVER, CR);
 69          movl    $5, %esi
 70          movl    $48, %edi
 71          call    outb                # outb(RESETTRANSMIT, CR);
 72          movl    $5, %esi
 73          movl    $64, %edi
 74          call    outb                # outb(RESETERROR, CR);
 75          movl    $5, %esi
 76          movl    $16, %edi
 77          call    outb                # outb(RESETMODE, CR);
 78          movl    $9, %esi
 79          movl    $240, %edi
 80          call    outb                # outb(TIMER, ACR);
 81          movl    $1, %esi
 82          movl    $19, %edi
 83          call    outb                # outb(NOPARITY8BITS, MR);
 84          movl    $1, %esi
 85          movl    $15, %edi
 86          call    outb                # outb(STOPBIT2, MR);
 87          movl    $3, %esi
 88          movl    $204, %edi
 89          call    outb                # outb(BAUDRATE, CSR);
 90          movl    $11, %esi
 91          movl    $0, %edi
 92          call    outb                # outb(NOINTERRUPT, IMR);
 93          movl    $5, %esi
 94          movl    $5, %edi
 95          call    outb                # outb(ENABLE, CR);
 96          popq    %rbp
 97          ret
 98          .size   init_io, .-init_io
 99          .globl  charin
100          .type   charin, @function
101  charin:
102          pushq   %rbp
103          movq    %rsp, %rbp
104          subq    $16, %rsp
105  .L8:
106          movl    $3, %edi            # address of status register
107          call    inb                 # read status
108          movb    %al, -2(%rbp)
109          movzbl  -2(%rbp), %eax
110          andl    $1, %eax            # check receiver status
111          testl   %eax, %eax          # if bit is 0
112          je      .L8                 #   recheck
113          movl    $7, %edi            # ready, address of receiver register
114          call    inb                 # read input byte
115          movb    %al, -1(%rbp)       # store locally
116          movzbl  -1(%rbp), %eax      # return value
117          leave
118          ret
119          .size   charin, .-charin
120          .globl  charout
121          .type   charout, @function
122  charout:
123          pushq   %rbp
124          movq    %rsp, %rbp
125          subq    $24, %rsp
126          movl    %edi, %eax
127          movb    %al, -20(%rbp)
128  .L11:
129          movl    $3, %edi            # address of status register
130          call    inb                 # read status
131          movb    %al, -1(%rbp)
132          movzbl  -1(%rbp), %eax
133          andl    $4, %eax            # check transmitter status
134          testl   %eax, %eax          # if bit is 0
135          je      .L11                #   recheck
136          movzbl  -20(%rbp), %eax     # load byte to send
137          movl    $7, %esi            # address of transmitter
138          movl    %eax, %edi
139          call    outb                # send it
140          leave
141          ret
142          .size   charout, .-charout
143          .ident  "GCC: (Ubuntu/Linaro 4.7.0-7ubuntu3) 4.7.0"
144          .section        .note.GNU-stack,"",@progbits
Listing 16.4: Isolated I/O in assembly language. Comments have been added to explain the code.

Looking at lines 3 – 24 and lines 25 – 42, we see that the inb() and outb() macros generate functions. The actual inb instruction is used on line 15 and outb is used on line 37.

At the points where the macros are called in the C source code, the compiler generates calls to the appropriate function. The first argument to outb(), the value to be written, is passed in edi, and the second argument, the port number, is passed in esi. For example, the C sequence

56     outb(RESETRECEIVER, CR);
57     outb(RESETTRANSMIT, CR);

generates the assembly language (comments added)

 66          movl    $5, %esi
 67          movl    $32, %edi
 68          call    outb                # outb(RESETRECEIVER, CR);
 69          movl    $5, %esi
 70          movl    $48, %edi
 71          call    outb                # outb(RESETTRANSMIT, CR);

16.7 Interrupt-Driven I/O

Reading the code in Section 16.6, you probably realize that the CPU can waste a lot of time simply waiting for I/O devices. Most I/O interfaces include hardware that can send an interrupt signal to the CPU when they have data ready for input or are able to accept output (see Section 15.1, page 874). While waiting for an I/O device, the operating system will suspend the requesting process and allow another process, perhaps being run by another user, to use the CPU.

The device handler for each I/O device that can interrupt includes a special interrupt handler function. The address of each interrupt handler is stored in a table in the operating system. When the requested I/O device is ready for I/O, it sends an interrupt signal to the CPU on the control bus. The device identifies itself to the CPU, and the CPU consults the table to obtain the address of the corresponding interrupt handler. CPU execution control then transfers to the interrupt handler function, which contains code to read from or write to the device as needed. When the interrupt handler function completes its servicing of the I/O device, the last instruction in the function is an iret (see Section 15.5 on page 877). This causes CPU execution control to return to the control flow where it was interrupted.

This is a highly simplified description. The operating system must perform a great deal of “bookkeeping” in this transfer of control. For example, before allowing the interrupt handler function to execute, at least any registers that will be used in the function must be saved. And more than one process may be waiting for I/O to complete. The operating system must keep track of which process is waiting for which I/O device and make sure that the process gets or sends the correct input or output.

Many other issues face the device handler programmer. For example, I/O devices are left to run on their own time, so one device may attempt to interrupt while another device’s interrupt handling function is being executed. The programmer must decide whether the interrupt should be allowed or not. In general, it cannot be ignored because this would cause the loss of I/O data. On the other hand, spending too much time handling the second interrupt may cause the first device to lose data.

16.8 I/O Instructions

opcode    source       destination    page
ins       $imm/%reg    %reg/mem       893
outs      $imm/%reg    %reg/mem       894

s = b, w, l

16.9 Exercises


1. (Section 16.6) Enter the C program in Listing 16.1. Compile it to the assembly language stage (use the -S option) with different levels of optimization, for example -O1 and -O2. Compare the results with the non-optimized version in Listing 16.2.

2. (Section 16.6) Enter the C program in Listing 16.3. Compile it to the assembly language stage (use the -S option) with different levels of optimization, for example -O1 and -O2. Compare the results with the non-optimized version in Listing 16.4.