15 Interrupts and Exceptions

Thus far in this book all programs have been executed under the Linux operating system. An operating system (OS) can be viewed as a set of programs that provide services to application programs. These services allow the application programs to use the hardware, but only under the auspices of the OS.

Linux allows multiple programs to be executing concurrently, and each of the programs is accessing the hardware resources of the computer. One of the jobs of the OS is to manage the hardware resources in such a way that the programs do not interfere with one another. In this chapter we introduce the CPU features that enable Linux to carry out this management task.

The read system call is a good example of a program using the services of the OS. It requests input from the keyboard. The OS handles all input from the keyboard, so the read function must first request keyboard input from the OS. One of the reasons this request must be funneled through the OS is that other programs may also be requesting input from the keyboard, and the OS needs to ensure that each program gets the keyboard input intended for it.
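
As a minimal sketch of what such a request looks like from the application side, the C function below asks the OS for input through the read wrapper. The helper name read_some and its buffer handling are illustrative assumptions, not code from this book.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ask the OS for up to max-1 bytes from descriptor fd
 * and store them in buf as a C string. Returns the number of bytes read,
 * or -1 if max is zero or the OS reports an error. */
ssize_t read_some(int fd, char *buf, size_t max)
{
    if (max == 0)
        return -1;
    ssize_t n = read(fd, buf, max - 1);   /* the OS performs the actual input */
    if (n < 0)
        return -1;
    buf[n] = '\0';                        /* terminate what was received */
    return n;
}
```

Passing STDIN_FILENO as fd requests keyboard input when standard input is a terminal; the program never touches the keyboard hardware itself.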

Once the request for input has been made, it would be very inefficient for the OS to wait until a user strikes a key. So the OS allows another program to use the CPU, and the keyboard notifies the OS when a key has been struck. To avoid losing a character, this notification interrupts the CPU so that the OS can read the character from the keyboard.

Another example comes from something you probably did not intend to do. Unless you are a perfect programmer, you have probably seen a “segmentation fault.” This can occur when your program attempts to access memory that has not been allocated for your program. I have gotten this error (yes, I still make programming mistakes!) when I have made a mistake using the stack, or when I dereference a register that contains a bad address.

In response to any of these events, the CPU performs an operation that is very similar to the call instruction. The value in the rip register is pushed onto the stack, and another address is placed in the rip register. The net effect is that a function is called, just as in the call instruction, but the address of the called function is specified in a different way, and additional information is pushed onto the stack. Before describing the differences, we discuss what ought to occur in order for the OS to deal with each of these events.

15.1 Hardware Interrupts

Keyboard input is a good place to start the discussion. It is impossible to know exactly when someone will strike a key on the keyboard, nor how soon the next key will be struck. For example, suppose a key is struck in the middle of executing the first of two consecutive instructions, a cmpb instruction followed by a je instruction. In order to avoid losing the keystroke, we would like to read the character immediately after the cmpb instruction is executed but before the CPU starts working on the je instruction.

The function that reads the character from the keyboard is called an interrupt handler or simply handler. Handlers are part of the OS. In Linux they can either be built into the kernel or loaded as separate modules as needed.

The timing — between the two instructions — means that the CPU will acknowledge an interrupt only between instruction execution cycles. Just before executing the je instruction the rip register has the address of the instruction, and it is that address that gets pushed onto the stack. That is, since calling a handler occurs automatically and does not involve fetching an instruction, the current value of the rip pushed onto the stack is the correct return address from the handler.

There is another important issue. It is almost certain that the rflags register will be changed by the handler that gets called. When program control returns to the je instruction (which is supposed to depend on the state of the rflags register as a result of executing the cmpb instruction), there is little chance that the program will do what the programmer intended. Thus we conclude that in addition to saving the rip register, the CPU must also save the rflags register before calling the handler.

The next issue is the question of how the CPU knows the address of the appropriate handler to call. In the call instruction, the address of the function to call is specified as an operand to the instruction, usually in the form of a label.

Since the keyboard has no knowledge of the software, there must be some other mechanism for specifying the address of the handler to call. The answer to this problem is that addresses of interrupt handlers are stored in an Interrupt Descriptor Table (IDT). Each possible interrupt in the system is associated with a unique entry in the IDT.

The IDT entries are data structures (128 bits in 64-bit mode, 64 bits in 32-bit mode) called gate descriptors. In addition to the handler address, they contain information that the CPU uses to help protect the integrity of the OS.
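
To make the 128-bit size concrete, here is a sketch of a 64-bit mode interrupt gate descriptor as a C structure. The field layout follows the processor manuals; the names gate_descriptor, make_gate, gate_handler_address, and gate_dpl are illustrative assumptions, and an actual OS defines its own equivalents.

```c
#include <assert.h>
#include <stdint.h>

/* One IDT entry in 64-bit mode: 16 bytes in total. */
struct gate_descriptor {
    uint16_t offset_low;    /* handler address bits 0-15 */
    uint16_t selector;      /* code segment selector used by the handler */
    uint8_t  ist;           /* bits 0-2: interrupt stack table index */
    uint8_t  type_attr;     /* bits 0-3: gate type, bits 5-6: DPL, bit 7: present */
    uint16_t offset_mid;    /* handler address bits 16-31 */
    uint32_t offset_high;   /* handler address bits 32-63 */
    uint32_t reserved;
};

static_assert(sizeof(struct gate_descriptor) == 16, "gate descriptor is 128 bits");

/* Build a present 64-bit interrupt gate (type 0xE) for the given handler. */
struct gate_descriptor make_gate(uint64_t handler, uint16_t selector, unsigned dpl)
{
    struct gate_descriptor g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = selector;
    g.ist         = 0;
    g.type_attr   = (uint8_t)(0x80 | ((dpl & 3u) << 5) | 0x0E);
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_high = (uint32_t)(handler >> 32);
    g.reserved    = 0;
    return g;
}

/* Reassemble the handler address that the CPU would load into rip. */
uint64_t gate_handler_address(const struct gate_descriptor *g)
{
    return (uint64_t)g->offset_low
         | ((uint64_t)g->offset_mid << 16)
         | ((uint64_t)g->offset_high << 32);
}

/* Privilege level stored in the descriptor. */
unsigned gate_dpl(const struct gate_descriptor *g)
{
    return (g->type_attr >> 5) & 3u;
}
```

The handler address is split into three pieces for historical reasons, which is why the sketch needs a function to put it back together.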

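After it has completed execution of the current instruction, the following actions must occur when there is an interrupt from a device external to the CPU: the contents of the rip and rflags registers must be saved, the address of the appropriate handler must be obtained from the IDT, and control must be transferred to that handler.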

15.2 Exceptions

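We next consider exceptions. These are typically the result of a number that the CPU cannot deal with. Examples discussed in this chapter are an attempt to divide by zero and a reference to a memory address that the program is not allowed to access.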

In a perfect world, the application software would include all the checks that would prevent the occurrence of many of these errors. The reality is that no program is perfect, so some of these errors will occur.

When they do occur, it is the responsibility of the OS to take an appropriate action. The currently executing instruction may have caused the exception to occur. So the CPU often reacts to an exception in the midst of a normal instruction execution cycle. The actions that the CPU must take in response to an exception are essentially the same as those for an interrupt: save the rip and rflags registers, obtain the handler address from the IDT, and transfer control to the handler.

Not all exceptions are due to actual program errors. For example, when a program references an address in another part of the program that has not yet been loaded into memory, it causes a page fault exception. The OS must provide a handler that loads the appropriate part of the program from the disk into memory, then continues with normal program execution.
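
Page faults of this harmless kind can be observed from an ordinary program. The sketch below is an assumption-laden illustration rather than the book's own example: it uses a freshly created anonymous memory mapping to stand in for a part of the program that is not yet in memory, and asks the OS through getrusage how many page faults were charged to the process while each page was touched for the first time. The function name faults_while_touching is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Touch `pages` freshly mapped pages and return how many page faults
 * (minor plus major) the OS reported for this process while doing so.
 * Returns -1 if the mapping or the accounting call fails. */
long faults_while_touching(size_t pages)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || pages == 0)
        return -1;
    size_t len = pages * (size_t)page_size;

    unsigned char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    struct rusage before, after;
    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap(mem, len);
        return -1;
    }
    for (size_t i = 0; i < pages; i++)    /* first touch of each page */
        ((volatile unsigned char *)mem)[i * (size_t)page_size] = 1;
    if (getrusage(RUSAGE_SELF, &after) != 0) {
        munmap(mem, len);
        return -1;
    }

    munmap(mem, len);
    return (after.ru_minflt - before.ru_minflt)
         + (after.ru_majflt - before.ru_majflt);
}
```

The count depends on the kernel and its configuration, so the sketch reports whatever the OS says and makes no claim about a particular value. The program itself never notices the faults; the handler resolves each one and execution continues.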

15.3 Software Interrupts

The usefulness of the interrupt/exception handling mechanism for requesting OS services is not apparent until we discuss privilege levels. As mentioned above, one of the jobs of the OS is to keep concurrently executing programs from interfering with one another. It uses the privilege level mechanism in the CPU to do this.

At any given time, the CPU is running in one of four possible privilege levels. The levels, from most privileged to least, are:

0	Provides direct access to all hardware resources. Restricted to the lowest-level operating system functions, e.g., BIOS, memory management.
1	Somewhat restricted access to hardware resources. Might be used by library routines and software that controls I/O devices.
2	More restricted access to hardware resources. Might be used by library routines and software that controls I/O devices.
3	No direct access to hardware resources. Applications programs run at this level.

The OS needs to have direct access to all the hardware, so it executes at privilege level 0. Application programs should be limited, so they execute at privilege level 3. The CPU includes a mechanism for recognizing the privilege level of the memory associated with each program. A program can access memory at a less privileged (numerically higher) level, but not at a more privileged (numerically lower) level. Thus, an application program (running at level 3) cannot access memory that belongs to the OS.
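
The access rule just described reduces to a comparison of level numbers. The following sketch states it in C; the names privilege_level and may_access are illustrative assumptions, and the check the CPU actually performs involves more detail than this one comparison.

```c
#include <assert.h>

/* The four privilege levels; a smaller number means more privilege. */
enum privilege_level {
    LEVEL_OS = 0,
    LEVEL_1 = 1,
    LEVEL_2 = 2,
    LEVEL_APPLICATION = 3
};

/* Simplified rule from the text: code running at level `current` may
 * access memory marked with level `target` only if it is at least as
 * privileged, that is, its level number is not larger.
 * Returns 1 if access is allowed, 0 otherwise. */
int may_access(enum privilege_level current, enum privilege_level target)
{
    return current <= target;
}
```

For example, may_access(LEVEL_APPLICATION, LEVEL_OS) is 0, which is the case that protects OS memory from application programs.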

Gate descriptors include privilege level information in addition to the handler address. The CPU’s interrupt/exception mechanism automatically switches to this privilege level when it calls the handler function. Thus, for example, the keyboard might interrupt during the execution of an application program running at privilege level 3, but its handler function would execute at privilege level 0.

The software interrupt allows an application program to use OS services while still allowing the OS to control this access. The instruction is int $n, where n is the vector number of the handler to call. For example, 32-bit Linux uses int $0x80 to make system calls. The code corresponding to the desired action is loaded into eax and the arguments are loaded into the proper registers before the system call is executed. The recommended technique for making system calls is discussed in Section 15.6 on page 875.

15.4 CPU Response to an Interrupt or Exception

Each entry in the IDT is called a vector. The CPU is hardwired to associate vectors 0 – 31 with specific exceptions. For example, vector number 0 represents a divide-by-zero exception. Vector number 14 is a page fault exception.

Vectors 32 – 255 can be assigned to interrupts, both external and the int $n instruction. These assignments are determined by the OS programmers.
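
The division of the vector numbers can be written down as a small C sketch. The names vector_kind, classify_vector, and exception_name are illustrative assumptions, and the name table lists only the two exceptions mentioned above.

```c
#include <assert.h>

enum vector_kind { VECTOR_EXCEPTION, VECTOR_INTERRUPT, VECTOR_INVALID };

/* Vectors 0-31 are fixed by the CPU for exceptions; vectors 32-255 are
 * assigned to interrupts by the OS programmers. */
enum vector_kind classify_vector(int vector)
{
    if (vector < 0 || vector > 255)
        return VECTOR_INVALID;
    return vector <= 31 ? VECTOR_EXCEPTION : VECTOR_INTERRUPT;
}

/* Names for the two exception vectors mentioned in the text. */
const char *exception_name(int vector)
{
    switch (vector) {
    case 0:  return "divide by zero";
    case 14: return "page fault";
    default: return "not listed in this sketch";
    }
}
```

With this classification, the vector 0x80 used by int $0x80 falls in the range that the OS is free to assign.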

During OS initialization, the address of a handler function is stored in the gate descriptor corresponding to the vector number it is designed to handle. Other information in the gate descriptor causes the CPU to switch to a higher (numerically lower) privilege level, so the handler function has appropriate access to the hardware.

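Whenever an interrupt or exception occurs the CPU executes an exception processing cycle. In this cycle the CPU saves the contents of the rflags and rip registers, together with the current privilege level information, on the stack. It then loads rip with the handler address from the gate descriptor for the vector and switches to the privilege level specified there.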

The CPU continues with a normal instruction processing cycle — fetch the instruction at the address in rip, etc. Thus, control will transfer to the handler function.

Depending upon the nature of an exception and what actually caused it, CPU execution may or may not be returned to the program that was executing when the exception occurred.

15.5 Return from Interrupt/Exception

There is one more part of this puzzle. Since the ret instruction simply pops the value at the top of the stack into the rip register, it will not work for the OS’s handler function. The CPU has another instruction, iret, that correctly pops everything off the stack into the rip and rflags registers and restores the privilege level to where it was before the handler function was invoked. (The privilege level information was also stored on the stack.)
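
The difference between the two kinds of return can be seen in a toy model. The C sketch below is only a simulation under simplifying assumptions: the structure toy_cpu and the functions toy_interrupt, toy_ret, and toy_iret are hypothetical, a single number stands in for the privilege level information, and the real CPU saves more state than is shown here.

```c
#include <assert.h>
#include <stdint.h>

/* A toy CPU: two registers, a privilege level, and a small stack. */
struct toy_cpu {
    uint64_t rip;
    uint64_t rflags;
    unsigned level;         /* current privilege level, 0-3 */
    uint64_t stack[16];
    int      sp;            /* number of values currently on the stack */
};

/* Interrupt entry: save rflags, the privilege level, and rip, then take
 * the new rip and privilege level from the (simulated) gate descriptor. */
void toy_interrupt(struct toy_cpu *cpu, uint64_t handler, unsigned handler_level)
{
    cpu->stack[cpu->sp++] = cpu->rflags;
    cpu->stack[cpu->sp++] = cpu->level;
    cpu->stack[cpu->sp++] = cpu->rip;
    cpu->rip   = handler;
    cpu->level = handler_level;
}

/* ret: pops only the return address. */
void toy_ret(struct toy_cpu *cpu)
{
    cpu->rip = cpu->stack[--cpu->sp];
}

/* iret: pops the return address, the privilege level, and rflags. */
void toy_iret(struct toy_cpu *cpu)
{
    cpu->rip    = cpu->stack[--cpu->sp];
    cpu->level  = (unsigned)cpu->stack[--cpu->sp];
    cpu->rflags = cpu->stack[--cpu->sp];
}
```

In this model, returning from a handler with toy_ret would leave two values on the stack and the CPU still at the handler's privilege level, which is the situation iret exists to prevent.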

15.6 The syscall and sysret Instructions

Using a software interrupt to invoke one of the services provided by the OS is somewhat of an overkill. The x86-64 architecture includes another instruction that causes the CPU to change privilege levels without using the stack or going through the IDT, thus saving execution time. The instruction is syscall. Instead of pushing them onto the stack, it saves the return address in the rcx register and the contents of the rflags register in the r11 register. After syscall has executed, the CPU has been switched to privilege level 0, and the OS has control and can enforce orderly use of the hardware.

The program in Listing 15.1 illustrates the use of syscall to do system calls without using the C libraries. See Exercise 15-1 for using syscall within the C runtime environment.

# myCat.s
# Writes a file to standard out
# Does not use C libraries
# Bob Plantz -- 18 June 2009

# Useful constants
        .equ    STDIN,0
        .equ    STDOUT,1
# from asm/unistd_64.h
        .equ    READ,0
        .equ    WRITE,1
        .equ    OPEN,2
        .equ    CLOSE,3
        .equ    EXIT,60
# from bits/fcntl.h
        .equ    O_RDONLY,0
        .equ    O_WRONLY,1
        .equ    O_RDWR,2
# Stack frame
        .equ    aLetter,-16
        .equ    fd,-8
        .equ    localSize,-16
        .equ    fileName,24
# Code
        .text                        # switch to text segment
        .globl  __start
        .type   __start, @function
__start:
        pushq   %rbp                 # save caller's frame pointer
        movq    %rsp, %rbp           # establish our frame pointer
        addq    $localSize, %rsp     # for local variables

        movl    $OPEN, %eax          # open the file
        movq    fileName(%rbp), %rdi # address of the filename
        movl    $O_RDONLY, %esi      # read only
        syscall                      # request kernel service
        movl    %eax, fd(%rbp)       # save file descriptor

        movl    $READ, %eax          # read first char
        movl    $1, %edx             # 1 character
        leaq    aLetter(%rbp), %rsi  # place to store character
        movl    fd(%rbp), %edi       # file descriptor
        syscall                      # request kernel service

writeLoop:
        cmpl    $0, %eax             # any chars?
        jle     allDone              # no, end of file or error
        movl    $1, %edx             # yes, 1 character
        leaq    aLetter(%rbp), %rsi  # character to write
        movl    $STDOUT, %edi        # standard out
        movl    $WRITE, %eax
        syscall                      # request kernel service

        movl    $READ, %eax          # read next char
        movl    $1, %edx             # 1 character
        leaq    aLetter(%rbp), %rsi  # place to store character
        movl    fd(%rbp), %edi       # file descriptor
        syscall                      # request kernel service
        jmp     writeLoop            # check the char
allDone:
        movl    $CLOSE, %eax         # close the file
        movl    fd(%rbp), %edi       # file descriptor
        syscall                      # request kernel service
        movq    %rbp, %rsp           # delete local variables
        popq    %rbp                 # restore caller's frame pointer
        movl    $EXIT, %eax          # end this process
        movl    $0, %edi             # exit status
        syscall

Listing 15.1: Using syscall to cat a file. Use “ld -e __start -o myCat myCat.o” after assembling this file.
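
For comparison, the same sequence of kernel requests can be sketched in C by passing the system call codes to the syscall library function instead of calling the usual wrappers. This is an illustrative sketch rather than the solution to an exercise: the function name cat_file is an assumption, and the fallback to openat is included only because some architectures do not provide the open system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Copy the file at `path` to descriptor `out`, one byte at a time, using
 * system call codes directly. Returns the number of bytes copied, or -1
 * if the file cannot be opened or a read or write fails. */
long cat_file(const char *path, int out)
{
#ifdef SYS_open
    long fd = syscall(SYS_open, path, (long)O_RDONLY);
#else
    long fd = syscall(SYS_openat, (long)AT_FDCWD, path, (long)O_RDONLY);
#endif
    if (fd < 0)
        return -1;

    long total = 0;
    long n;
    char c;
    while ((n = syscall(SYS_read, fd, &c, 1L)) == 1) {
        if (syscall(SYS_write, (long)out, &c, 1L) != 1) {
            n = -1;                       /* treat a failed write as an error */
            break;
        }
        total++;
    }
    syscall(SYS_close, fd);
    return n < 0 ? -1 : total;
}
```

Calling cat_file("myCat.s", STDOUT_FILENO) would display the listing itself. Reading one byte per request keeps the sketch parallel to the assembly language version; a practical program would use a larger buffer.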

In Section 8.1 (page 542) we saw how to call the write system call function to write characters to standard out (the screen). write and the other system call functions are simply C wrappers that load the proper code in eax and the arguments into the appropriate registers.

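For additional system call codes see the unistd_64.h file on your system. The arguments for each system call are given in the man page for the corresponding C version. For example, the man page for write gives the prototype ssize_t write(int fd, const void *buf, size_t count), so the file descriptor goes in rdi, the address of the first byte in rsi, and the number of bytes in rdx.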

There is a complementary instruction, sysret, which the OS executes in order to return from a system call.

15.7 Summary

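We summarize the differences between a call instruction and an interrupt/exception. The similarities are that both save the return address by pushing the value in the rip register onto the stack, and both transfer control by placing a new address in rip. The differences are that an interrupt/exception also saves the rflags register and privilege level information on the stack, obtains the handler address from a gate descriptor in the IDT rather than from an instruction operand, can switch the CPU to a different privilege level, and requires iret rather than ret to return.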

15.8 Instructions Introduced Thus Far

This summary shows the assembly language instructions introduced thus far in the book. The page number where the instruction is explained in more detail, which may be in a subsequent chapter, is also given. This book provides only an introduction to the usage of each instruction. You need to consult the manuals ([2] – [6], [14] – [18]) in order to learn all the possible uses of the instructions.

15.8.1 Instructions

data movement:
opcode	source	destination	action	page

cbtw			convert byte to word, al → ax	696

cwtl			convert word to long, ax → eax	696

cltq			convert long to quad, eax → rax	696

cwtd			convert word to long, ax → dx:ax	786

cltd			convert long to quad, eax → edx:eax	786

cqto			convert quad to octuple, rax → rdx:rax	786

cmovcc	%reg/mem	%reg	conditional move	706

movs	$imm/%reg	%reg/mem	move	506

movs	mem	%reg	move	506

movsss	$imm/%reg	%reg/mem	move, sign extend	693

movzss	$imm/%reg	%reg/mem	move, zero extend	693

popw		%reg/mem	pop from stack	566

pushw	$imm/%reg/mem		push onto stack	566


s = size suffix b, w, l, or q; w = size suffix l or q; cc = condition codes

arithmetic/logic:
opcode	source	destination	action	page

adds	$imm/%reg	%reg/mem	add	607

adds	mem	%reg	add	607

ands	$imm/%reg	%reg/mem	bit-wise and	747

ands	mem	%reg	bit-wise and	747

cmps	$imm/%reg	%reg/mem	compare	676

cmps	mem	%reg	compare	676

decs	%reg/mem		decrement	699

divs	%reg/mem		unsigned divide	777

idivs	%reg/mem		signed divide	784

imuls	%reg/mem		signed multiply	775

incs	%reg/mem		increment	698

leaw	mem	%reg	load effective address	579

muls	%reg/mem		unsigned multiply	769

negs	%reg/mem		negate	789

ors	$imm/%reg	%reg/mem	bit-wise inclusive or	747

ors	mem	%reg	bit-wise inclusive or	747

sals	$imm/%cl	%reg/mem	shift arithmetic left	756

sars	$imm/%cl	%reg/mem	shift arithmetic right	751

shls	$imm/%cl	%reg/mem	shift left	756

shrs	$imm/%cl	%reg/mem	shift right	751

subs	$imm/%reg	%reg/mem	subtract	612

subs	mem	%reg	subtract	612

tests	$imm/%reg	%reg/mem	test bits	676

tests	mem	%reg	test bits	676

xors	$imm/%reg	%reg/mem	bit-wise exclusive or	747

xors	mem	%reg	bit-wise exclusive or	747


s = size suffix b, w, l, or q; w = size suffix l or q

program flow control:
opcode	location	action	page

call	label	call function	546

iret		return from kernel function	875

ja	label	jump above (unsigned)	683

jae	label	jump above/equal (unsigned)	683

jb	label	jump below (unsigned)	683

jbe	label	jump below/equal (unsigned)	683

je	label	jump equal	679

jg	label	jump greater than (signed)	686

jge	label	jump greater than/equal (signed)	686

jl	label	jump less than (signed)	686

jle	label	jump less than/equal (signed)	686

jmp	label	jump	691

jne	label	jump not equal	679

jno	label	jump no overflow	679

jcc	label	jump on condition codes	679

leave		undo stack frame	580

ret		return from function	583

syscall		call kernel function	587

sysret		return from kernel function	880


cc = condition codes

x87 floating point:
opcode	source	destination	action	page

fadds	memfloat		add	859

faddp			add/pop	859

fchs			change sign	859

fcoms	memfloat		compare	859

fcomp			compare/pop	859

fcos			cosine	859

fdivs	memfloat		divide	859

fdivp			divide/pop	859

filds	memint		load integer	859

fists		memint	store integer	859

flds	memfloat		load floating point	859

fmuls	memfloat		multiply	859

fmulp			multiply/pop	859

fsin			sine	859

fsqrt			square root	859

fsts		memfloat	floating point store	859

fsubs	memfloat		subtract	859

fsubp			subtract/pop	859


s = b, w, l, q; w = l, q

SSE floating point conversion:
opcode	source	destination	action	page

cvtsd2si	%xmmreg/mem	%reg	scalar double to signed integer	845

cvtsd2ss	%xmmreg/mem	%xmmreg	scalar double to single float	845

cvtsi2sd	%reg/mem	%xmmreg	signed integer to scalar double	845

cvtsi2sdq	%reg/mem	%xmmreg	signed integer to scalar double	845

cvtsi2ss	%reg/mem	%xmmreg	signed integer to scalar single	845

cvtsi2ssq	%reg/mem	%xmmreg	signed integer to scalar single	845

cvtss2sd	%xmmreg/mem	%xmmreg	scalar single to scalar double	845

cvtss2si	%xmmreg/mem	%reg	scalar single to signed integer	845

cvtss2siq	%xmmreg/mem	%reg	scalar single to signed integer	845

15.8.2 Addressing Modes

register direct:	The data value is located in a CPU register.
	syntax: name of the register with a “%” prefix.
	example: movl %eax, %ebx

immediate data:	The data value is located immediately after the instruction. Source operand only.
	syntax: data value with a “$” prefix.
	example: movl $0xabcd1234, %ebx

base register plus offset:	The data value is located in memory. The address of the memory location is the sum of a value in a base register plus an offset value.
	syntax: use the name of the register with parentheses around the name and the offset value immediately before the left parenthesis.
	example: movl $0xaabbccdd, 12(%eax)

rip-relative:	The target is a memory address determined by adding an offset to the current address in the rip register.
	syntax: a programmer-defined label
	example: je somePlace

indexed:	The data value is located in memory. The address of the memory location is the sum of the value in the base_register plus scale times the value in the index_register, plus the offset.
	syntax: place parentheses around the comma separated list (base_register, index_register, scale) and preface it with the offset.
	example: movl $0x6789cdef, -16(%edx, %eax, 4)

system call codes and arguments:
function	eax	rdi	rsi	rdx	returns

read	0	file descriptor	pointer to storage area	number of bytes to read	number of bytes read

write	1	file descriptor	pointer to first byte	number of bytes to write	number of bytes written

open	2	pointer to filename	flags	mode	file descriptor

close	3	file descriptor

exit	60
