PART IV INSTRUCTION SET
Chapter 17 80386 Instruction Set
----------------------------------------------------------------------------
This chapter presents instructions for the 80386 in alphabetical order. For
each instruction, the forms are given for each operand combination,
including object code produced, operands required, execution time, and a
description. For each instruction, there is an operational description and a
summary of exceptions generated.
17.1 Operand-Size and Address-Size Attributes
When executing an instruction, the 80386 can address memory using either
16-bit or 32-bit addresses. Consequently, each instruction that uses memory
addresses has associated with it an address-size attribute of either 16 or
32 bits. 16-bit addresses imply both the use of a 16-bit displacement in
the instruction and the generation of a 16-bit address offset (segment
relative address) as the result of the effective address calculation.
32-bit addresses imply the use of a 32-bit displacement and the generation
of a 32-bit address offset. Similarly, an instruction that accesses words
(16 bits) or doublewords (32 bits) has an operand-size attribute of either
16 or 32 bits.
The attributes are determined by a combination of defaults, instruction
prefixes, and (for programs executing in protected mode) size-specification
bits in segment descriptors.
17.1.1 Default Segment Attribute
For programs executed in protected mode, the D-bit in executable-segment
descriptors determines the default attribute for both address size and
operand size. These default attributes apply to the execution of all
instructions in the segment. A value of zero in the D-bit sets the default
address size and operand size to 16 bits; a value of one, to 32 bits.
Programs that execute in real mode or virtual-8086 mode have 16-bit
addresses and operands by default.
17.1.2 Operand-Size and Address-Size Instruction Prefixes
The internal encoding of an instruction can include two byte-long prefixes:
the address-size prefix, 67H, and the operand-size prefix, 66H. (A later
section, "Instruction Format," shows the position of the prefixes in an
instruction's encoding.) These prefixes override the default segment
attributes for the instruction that follows. Table 17-1 shows the effect of
each possible combination of defaults and overrides.
See Also: Tab.17-1
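The combination of segment default and prefixes described above can be sketched in a few lines of Python. This is an illustrative model only, not a reproduction of Table 17-1; the function name effective_sizes and its parameters are hypothetical.

```python
def effective_sizes(d_bit, operand_prefix=False, address_prefix=False):
    """Return (operand_size, address_size) in bits for one instruction.

    d_bit          -- D-bit of the executable-segment descriptor (0 or 1);
                      pass 0 for real mode and virtual-8086 mode.
    operand_prefix -- True if the 66H prefix precedes the instruction.
    address_prefix -- True if the 67H prefix precedes the instruction.
    """
    default = 32 if d_bit else 16   # size selected by the segment default
    toggled = 16 if d_bit else 32   # size selected when a prefix overrides it
    operand_size = toggled if operand_prefix else default
    address_size = toggled if address_prefix else default
    return operand_size, address_size
```

For example, effective_sizes(1, operand_prefix=True) models a 32-bit segment in which a single instruction carries the 66H prefix: the operand size becomes 16 bits while the address size stays at 32 bits.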
17.1.3 Address-Size Attribute for Stack
Instructions that use the stack implicitly (for example: POP EAX) also have
a stack address-size attribute of either 16 or 32 bits. Instructions with a
stack address-size attribute of 16 use the 16-bit SP stack pointer register;
instructions with a stack address-size attribute of 32 bits use the 32-bit
ESP register to form the address of the top of the stack.
The stack address-size attribute is controlled by the B-bit of the
data-segment descriptor in the SS register. A value of zero in the B-bit
selects a stack address-size attribute of 16; a value of one selects a stack
address-size attribute of 32.
See Also: Tab.17-1
17.2 Instruction Format
All instruction encodings are subsets of the general instruction format
shown in Figure 17-1. Instructions consist of optional instruction
prefixes, one or two primary opcode bytes, possibly an address specifier
consisting of the ModR/M byte and the SIB (Scale Index Base) byte, a
displacement, if required, and an immediate data field, if required.
Smaller encoding fields can be defined within the primary opcode or
opcodes. These fields define the direction of the operation, the size of the
displacements, the register encoding, or sign extension; encoding fields
vary depending on the class of operation.
Most instructions that can refer to an operand in memory have an addressing
form byte following the primary opcode byte(s). This byte, called the ModR/M
byte, specifies the address form to be used. Certain encodings of the ModR/M
byte indicate a second addressing byte, the SIB (Scale Index Base) byte,
which follows the ModR/M byte and is required to fully specify the
addressing form.
Addressing forms can include a displacement immediately following either
the ModR/M or SIB byte. If a displacement is present, it can be 8, 16, or
32 bits.
If the instruction specifies an immediate operand, the immediate operand
always follows any displacement bytes. The immediate operand, if specified,
is always the last field of the instruction.
The following are the allowable instruction prefix codes:
F3H REP prefix (used only with string instructions)
F3H REPE/REPZ prefix (used only with string instructions)
F2H REPNE/REPNZ prefix (used only with string instructions)
F0H LOCK prefix
The following are the segment override prefixes, followed by the
operand-size and address-size override prefixes:
2EH CS segment override prefix
36H SS segment override prefix
3EH DS segment override prefix
26H ES segment override prefix
64H FS segment override prefix
65H GS segment override prefix
66H Operand-size override
67H Address-size override
See Also: Fig.17-1
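As a rough illustration of how the prefix codes listed above lead an instruction, the following Python sketch separates leading prefix bytes from the rest of an encoding. The names PREFIXES and split_prefixes are hypothetical, and the sketch does not check whether a given combination of prefixes is legal or meaningful.

```python
# Prefix codes taken from the lists above.
PREFIXES = {
    0xF3: "REP or REPE/REPZ",
    0xF2: "REPNE/REPNZ",
    0xF0: "LOCK",
    0x2E: "CS segment override",
    0x36: "SS segment override",
    0x3E: "DS segment override",
    0x26: "ES segment override",
    0x64: "FS segment override",
    0x65: "GS segment override",
    0x66: "operand-size override",
    0x67: "address-size override",
}

def split_prefixes(code):
    """Return (prefix_names, remaining_bytes) for an instruction encoding."""
    names = []
    position = 0
    # Consume bytes for as long as they are recognized prefix codes.
    while position < len(code) and code[position] in PREFIXES:
        names.append(PREFIXES[code[position]])
        position += 1
    return names, bytes(code[position:])
```

Calling split_prefixes(bytes([0x66, 0x2E, 0xF5])) yields the two prefix names and leaves the single opcode byte F5H (CMC) as the remainder.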
17.2.1 ModR/M and SIB Bytes
The ModR/M and SIB bytes follow the opcode byte(s) in many of the 80386
instructions. They contain the following information:
* The indexing type or register number to be used in the instruction
* The register to be used, or more information to select the instruction
* The base, index, and scale information
The ModR/M byte contains three fields of information:
* The mod field, which occupies the two most significant bits of the
byte, combines with the r/m field to form 32 possible values: eight
registers and 24 indexing modes
* The reg field, which occupies the next three bits following the mod
field, specifies either a register number or three more bits of opcode
information. The meaning of the reg field is determined by the first
(opcode) byte of the instruction.
* The r/m field, which occupies the three least significant bits of the
byte, can specify a register as the location of an operand, or can form
part of the addressing-mode encoding in combination with the mod field as
described above
The based indexed and scaled indexed forms of 32-bit addressing require the
SIB byte. The presence of the SIB byte is indicated by certain encodings of
the ModR/M byte. The SIB byte then includes the following fields:
* The ss field, which occupies the two most significant bits of the
byte, specifies the scale factor
* The index field, which occupies the next three bits following the ss
field, specifies the register number of the index register
* The base field, which occupies the three least significant bits of the
byte, specifies the register number of the base register
Figure 17-2 shows the formats of the ModR/M and SIB bytes.
The values and the corresponding addressing forms of the ModR/M and SIB
bytes are shown in Tables 17-2, 17-3, and 17-4. The 16-bit addressing
forms specified by the ModR/M byte are in Table 17-2. The 32-bit addressing
forms specified by ModR/M are in Table 17-3. Table 17-4 shows the 32-bit
addressing forms specified by the SIB byte.
See Also: Fig.17-2 Tab.17-2 Tab.17-3 Tab.17-4
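The field layout described above can be illustrated with a short Python sketch that splits a ModR/M or SIB byte into its three fields. The function names are hypothetical. Two details are taken from general knowledge of the encoding rather than from the text above: the ss field is interpreted as a power of two (scale factors 1, 2, 4, 8), and in 32-bit addressing an r/m value of 4 with mod other than 3 is assumed to signal that a SIB byte follows.

```python
def decode_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11    # two most significant bits
    reg = (byte >> 3) & 0b111   # next three bits: register or opcode extension
    rm = byte & 0b111           # three least significant bits
    return mod, reg, rm

def decode_sib(byte):
    """Split a SIB byte into (scale_factor, index, base)."""
    ss = (byte >> 6) & 0b11
    index = (byte >> 3) & 0b111
    base = byte & 0b111
    return 1 << ss, index, base  # assumption: ss encodes a factor of 1, 2, 4, or 8

def needs_sib(modrm_byte):
    """Assumed rule for 32-bit addressing: r/m = 4 with mod != 3 selects a SIB byte."""
    mod, _, rm = decode_modrm(modrm_byte)
    return mod != 0b11 and rm == 0b100
```

For instance, the byte D8H has the bit pattern 11 011 000, so decode_modrm(0xD8) returns mod 3, reg 3, and r/m 0.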
17.2.2 How to Read the Instruction Set Pages
The following is an example of the format used for each 80386 instruction
description in this chapter:
CMC -- Complement Carry Flag
Opcode   Instruction   Clocks   Description
F5       CMC           2        Complement carry flag
The above table is followed by paragraphs labelled "Operation,"
"Description," "Flags Affected," "Protected Mode Exceptions," "Real
Address Mode Exceptions," and, optionally, "Notes." The following sections
explain the notational conventions and abbreviations used in these
paragraphs of the instruction descriptions.
17.2.2.1 Opcode
The "Opcode" column gives the complete object code produced for each form
of the instruction. When possible, the codes are given as hexadecimal bytes,
in the same order in which they appear in memory. Definitions of entries
other than hexadecimal bytes are as follows:
/digit: (digit is between 0 and 7) indicates that the ModR/M byte of the
instruction uses only the r/m (register or memory) operand. The reg field
contains the digit that provides an extension to the instruction's opcode.
/r: indicates that the ModR/M byte of the instruction contains both a
register operand and an r/m operand.
cb, cw, cd, cp: a 1-byte (cb), 2-byte (cw), 4-byte (cd) or 6-byte (cp)
value following the opcode that is used to specify a code offset and
possibly a new value for the code segment register.
ib, iw, id: a 1-byte (ib), 2-byte (iw), or 4-byte (id) immediate operand to
the instruction that follows the opcode, ModR/M bytes or scale-indexing
bytes. The opcode determines if the operand is a signed value. All words and
doublewords are given with the low-order byte first.
+rb, +rw, +rd: a register code, from 0 through 7, added to the hexadecimal
byte given at the left of the plus sign to form a single opcode byte. The
codes are--
   rb         rw         rd
   AL = 0     AX = 0     EAX = 0
   CL = 1     CX = 1     ECX = 1
   DL = 2     DX = 2     EDX = 2
   BL = 3     BX = 3     EBX = 3
   AH = 4     SP = 4     ESP = 4
   CH = 5     BP = 5     EBP = 5
   DH = 6     SI = 6     ESI = 6
   BH = 7     DI = 7     EDI = 7
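To make the +rb, +rw, +rd notation and the low-order-byte-first rule concrete, the following Python sketch forms an opcode byte from a base value plus a register code and appends a 4-byte immediate. The helper names are hypothetical. The example relies on the MOV r32, imm32 form being written as B8 +rd id and the MOV r8, imm8 form as B0 +rb ib in the instruction pages.

```python
import struct

# Register codes from the table above; the list position is the code.
RB = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
RW = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]
RD = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def plus_reg(base_opcode, register):
    """Add the register code (0 through 7) to the base opcode byte."""
    for table in (RB, RW, RD):
        if register in table:
            return base_opcode + table.index(register)
    raise ValueError("unknown register: " + register)

def encode_mov_r32_imm32(register, value):
    """Encode MOV r32, imm32 as the opcode byte followed by a 4-byte immediate."""
    if register not in RD:
        raise ValueError("expected a doubleword register")
    opcode = plus_reg(0xB8, register)
    # "<I" packs an unsigned 32-bit value with the low-order byte first.
    return bytes([opcode]) + struct.pack("<I", value & 0xFFFFFFFF)
```

With these definitions, encode_mov_r32_imm32("ECX", 0x12345678) produces the bytes B9 78 56 34 12: the register code 1 is added to B8H, and the immediate is stored low-order byte first.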
17.2.2.2 Instruction
The "Instruction" column gives the syntax of the instruction statement as
it would appear in an ASM386 program. The following is a list of the symbols
used to represent operands in the instruction statements:
rel8: a relative address in the range from 128 bytes before the end of the
instruction to 127 bytes after the end of the instruction.
rel16, rel32: a relative address within the same code segment as the
instruction assembled. rel16 applies to instructions with an operand-size
attribute of 16 bits; rel32 applies to instructions with an operand-size
attribute of 32 bits.
ptr16:16, ptr16:32: a FAR pointer, typically in a code segment different
from that of the instruction. The notation 16:16 indicates that the value of
the pointer has two parts. The value to the left of the colon is a 16-bit
selector or value destined for the code segment register. The value to the
right corresponds to the offset within the destination segment. ptr16:16 is
used when the instruction's operand-size attribute is 16 bits; ptr16:32 is
used with the 32-bit attribute.
r8: one of the byte registers AL, CL, DL, BL, AH, CH, DH, or BH.
r16: one of the word registers AX, CX, DX, BX, SP, BP, SI, or DI.
r32: one of the doubleword registers EAX, ECX, EDX, EBX, ESP, EBP, ESI, or
EDI.
imm8: an immediate byte value. imm8 is a signed number between -128 and
+127 inclusive. For instructions in which imm8 is combined with a word or
doubleword operand, the immediate value is sign-extended to form a word or
doubleword. The upper byte of the word is filled with the topmost bit of the
immediate value.
imm16: an immediate word value used for instructions whose operand-size
attribute is 16 bits. This is a number between -32768 and +32767 inclusive.
imm32: an immediate doubleword value used for instructions whose
operand-size attribute is 32 bits. This is a number between -2147483648 and
+2147483647 inclusive.
r/m8: a one-byte operand that is either the contents of a byte register
(AL, BL, CL, DL, AH, BH, CH, DH), or a byte from memory.
r/m16: a word register or memory operand used for instructions whose
operand-size attribute is 16 bits. The word registers are: AX, BX, CX, DX,
SP, BP, SI, DI. The contents of memory are found at the address provided by
the effective address computation.
r/m32: a doubleword register or memory operand used for instructions whose
operand-size attribute is 32 bits. The doubleword registers are: EAX, EBX,
ECX, EDX, ESP, EBP, ESI, EDI. The contents of memory are found at the
address provided by the effective address computation.
m8: a memory byte addressed by DS:SI or ES:DI (used only by string
instructions).
m16: a memory word addressed by DS:SI or ES:DI (used only by string
instructions).
m32: a memory doubleword addressed by DS:SI or ES:DI (used only by string
instructions).
m16:16, m16:32: a memory operand containing a far pointer composed of two
numbers. The number to the left of the colon corresponds to the pointer's
segment selector. The number to the right corresponds to its offset.
m16 & 32, m16 & 16, m32 & 32: a memory operand consisting of data item pairs
whose sizes are indicated on the left and the right side of the ampersand.
All memory addressing modes are allowed. m16 & 16 and m32 & 32 operands are
used by the BOUND instruction to provide an operand containing the upper and
lower bounds for array indices. m16 & 32 is used by LIDT and LGDT to
provide a word with which to load the limit field, and a doubleword with
which to load the base field of the corresponding Global and Interrupt
Descriptor Table Registers.
moffs8, moffs16, moffs32: (memory offset) a simple memory variable of type
BYTE, WORD, or DWORD used by some variants of the MOV instruction. The
actual address is given by a simple offset relative to the segment base. No
ModR/M byte is used in the instruction. The number shown with moffs
indicates its size, which is determined by the address-size attribute of the
instruction.
Sreg: a segment register. The segment register bit assignments are ES=0,
CS=1, SS=2, DS=3, FS=4, and GS=5.
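The rel8 definition above can be worked through numerically. The Python sketch below computes the target of a rel8 operand and, conversely, the byte to encode for a given target. The function names are hypothetical, and the sketch ignores wraparound at the segment limit.

```python
def sign_extend8(byte):
    """Interpret an 8-bit value as a signed number from -128 to +127."""
    return byte - 0x100 if byte & 0x80 else byte

def rel8_target(instr_offset, instr_length, rel8_byte):
    """Offset reached by a rel8 operand, measured from the end of the instruction."""
    end_of_instruction = instr_offset + instr_length
    return end_of_instruction + sign_extend8(rel8_byte)

def encode_rel8(instr_offset, instr_length, target):
    """Return the rel8 byte that reaches target, or raise if it is out of range."""
    displacement = target - (instr_offset + instr_length)
    if not -128 <= displacement <= 127:
        raise ValueError("target is outside the rel8 range")
    return displacement & 0xFF
```

A 2-byte instruction at offset 100H with a rel8 byte of FEH (that is, -2) therefore targets offset 100H, the instruction itself, because the displacement is applied to the offset of the byte that follows the instruction.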
17.2.2.3 Clocks
The "Clocks" column gives the number of clock cycles the instruction takes
to execute. The clock count calculations make the following assumptions:
* The instruction has been prefetched and decoded and is ready for
execution.
* Bus cycles do not require wait states.
* There are no local bus HOLD requests delaying processor access to the
bus.
* No exceptions are detected during instruction execution.
* Memory operands are aligned.
Clock counts for instructions that have an r/m (register or memory) operand
are separated by a slash. The count to the left is used for a register
operand; the count to the right is used for a memory operand.
The following symbols are used in the clock count specifications:
* n, which represents a number of repetitions.
* m, which represents the number of components in the next instruction
executed, where the entire displacement (if any) counts as one
component, the entire immediate data (if any) counts as one component,
and every other byte of the instruction and prefix(es) each counts as
one component.
* pm=, a clock count that applies when the instruction executes in
Protected Mode. pm= is not given when the clock counts are the same for
Protected and Real Address Modes.
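The counting rule for the symbol m can be expressed as a small Python function. This is only a sketch of the rule as worded above; the function name and its parameters are hypothetical.

```python
def component_count(prefix_bytes, opcode_bytes, has_modrm=False, has_sib=False,
                    has_displacement=False, has_immediate=False):
    """Return m, the number of components in an instruction.

    Each prefix byte, each opcode byte, the ModR/M byte, and the SIB byte
    count individually. A displacement counts once regardless of its size,
    and so does an immediate.
    """
    return (prefix_bytes + opcode_bytes + int(has_modrm) + int(has_sib)
            + int(has_displacement) + int(has_immediate))
```

An instruction with one prefix, one opcode byte, a ModR/M byte, a 32-bit displacement, and a 32-bit immediate occupies 11 bytes but has m = 5 under this rule.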
When an exception occurs during the execution of an instruction and the
exception handler is in another task, the instruction execution time is
increased by the number of clocks to effect a task switch. This parameter
depends on several factors:
* The type of TSS used to represent the current task (386 TSS or 286
TSS).
* The type of TSS used to represent the new task.
* Whether the current task is in V86 mode.
* Whether the new task is in V86 mode.
Table 17-5 summarizes the task switch times for exceptions.
See Also: Tab.17-5
17.2.2.4 Description
The "Description" column following the "Clocks" column briefly explains the
various forms of the instruction. The "Operation" and "Description" sections
contain more details of the instruction's operation.
17.2.2.5 Operation
The "Operation" section contains an algorithmic description of the
instruction which uses a notation similar to the Algol or Pascal language.
The algorithms are composed of the following elements:
Comments are enclosed within the symbol pairs "(*" and "*)".
Compound statements are enclosed between the keywords of the "if" statement
(IF, THEN, ELSE, FI) or of the "do" statement (DO, OD), or of the "case"
statement (CASE ... OF, ESAC).
A register name implies the contents of the register. A register name
enclosed in brackets implies the contents of the location whose address is
contained in that register. For example, ES:[DI] indicates the contents of
the location whose ES segment relative address is in register DI. [SI]
indicates the contents of the address contained in register SI relative to
SI's default segment (DS) or overridden segment.
Brackets are also used for memory operands, where they mean that the contents
of the memory location is a segment-relative offset. For example, [SRC]
indicates that the contents of the source operand is a segment-relative
offset.
A = B; indicates that the value of B is assigned to A.
The symbols =, <>, >=, and <= are relational operators used to compare two
values, meaning equal, not equal, greater or equal, less or equal,
respectively. A relational expression such as A = B is TRUE if the value of
A is equal to B; otherwise it is FALSE.
The following identifiers are used in the algorithmic descriptions:
* OperandSize represents the operand-size attribute of the instruction,
which is either 16 or 32 bits. AddressSize represents the address-size
attribute, which is either 16 or 32 bits. For example,
   IF instruction = CMPSW
   THEN OperandSize = 16;
   ELSE
      IF instruction = CMPSD
      THEN OperandSize = 32;
      FI;
   FI;
indicates that the operand-size attribute depends on the form of the CMPS
instruction used. Refer to the explanation of address-size and operand-size
attributes at the beginning of this chapter for general guidelines on how
these attributes are determined.
* StackAddrSize represents the stack address-size attribute associated
with the instruction, which has a value of 16 or 32 bits, as explained
earlier in the chapter.
* SRC represents the source operand. When there are two operands, SRC is
the one on the right.
* DEST represents the destination operand. When there are two operands,
DEST is the one on the left.
* LeftSRC, RightSRC distinguish between two operands when both are
source operands.
* eSP represents either the SP register or the ESP register depending on
the setting of the B-bit for the current stack segment.
The following functions are used in the algorithmic descriptions:
* Truncate to 16 bits(value) reduces the size of the value to fit in 16
bits by discarding the uppermost bits as needed.
* Addr(operand) returns the effective address of the operand (the result
of the effective address calculation prior to adding the segment base).
* ZeroExtend(value) returns a value zero-extended to the operand-size
attribute of the instruction. For example, if OperandSize = 32,
ZeroExtend of a byte value of -10 converts the byte from F6H to
doubleword with hexadecimal value 000000F6H. If the value passed to
ZeroExtend and the operand-size attribute are the same size,
ZeroExtend returns the value unaltered.
* SignExtend(value) returns a value sign-extended to the operand-size
attribute of the instruction. For example, if OperandSize = 32,
SignExtend of a byte containing the value -10 converts the byte from
F6H to a doubleword with hexadecimal value FFFFFFF6H. If the value
passed to SignExtend and the operand-size attribute are the same size,
SignExtend returns the value unaltered.
* Push(value) pushes a value onto the stack. The number of bytes pushed
is determined by the operand-size attribute of the instruction. The
action of Push is as follows:
   IF StackAddrSize = 16
   THEN
      IF OperandSize = 16
      THEN
         SP = SP - 2;
         SS:[SP] = value; (* 2 bytes assigned starting at byte address in SP *)
      ELSE (* OperandSize = 32 *)
         SP = SP - 4;
         SS:[SP] = value; (* 4 bytes assigned starting at byte address in SP *)
      FI;
   ELSE (* StackAddrSize = 32 *)
      IF OperandSize = 16
      THEN
         ESP = ESP - 2;
         SS:[ESP] = value; (* 2 bytes assigned starting at byte address in ESP *)
      ELSE (* OperandSize = 32 *)
         ESP = ESP - 4;
         SS:[ESP] = value; (* 4 bytes assigned starting at byte address in ESP *)
      FI;
   FI;
* Pop(value) removes the value from the top of the stack and returns it.
The statement EAX = Pop( ); assigns to EAX the 32-bit value that Pop
took from the top of the stack. Pop will return either a word or a
doubleword depending on the operand-size attribute. The action of Pop
is as follows:
   IF StackAddrSize = 16
   THEN
      IF OperandSize = 16
      THEN
         ret val = SS:[SP]; (* 2-byte value *)
         SP = SP + 2;
      ELSE (* OperandSize = 32 *)
         ret val = SS:[SP]; (* 4-byte value *)
         SP = SP + 4;
      FI;
   ELSE (* StackAddrSize = 32 *)
      IF OperandSize = 16
      THEN
         ret val = SS:[ESP]; (* 2-byte value *)
         ESP = ESP + 2;
      ELSE (* OperandSize = 32 *)
         ret val = SS:[ESP]; (* 4-byte value *)
         ESP = ESP + 4;
      FI;
   FI;
   RETURN(ret val); (* returns a word or doubleword *)
* Bit[BitBase, BitOffset] returns the address of a bit within a bit
string, which is a sequence of bits in memory or a register. Bits are
numbered from low-order to high-order within registers and within
memory bytes. In memory, the two bytes of a word are stored with the
low-order byte at the lower address.
If the base operand is a register, the offset can be in the range 0..31.
This offset addresses a bit within the indicated register. An example,
"BIT[EAX, 21]," is illustrated in Figure 17-3.
If BitBase is a memory address, BitOffset can range from -2 gigabits to 2
gigabits. The addressed bit is numbered (BitOffset MOD 8) within the byte at
address (BitBase + (BitOffset DIV 8)), where DIV is signed division with
rounding towards negative infinity, and MOD returns a positive number.
This is illustrated in Figure 17-4.
* I-O-Permission(I-O-Address, width) returns TRUE or FALSE depending on
the I/O permission bitmap and other factors. This function is defined as
follows:
   IF TSS type is 286 THEN RETURN FALSE; FI;
   Ptr = [TSS + 66H]; (* fetch bitmap pointer *)
   BitStringAddr = SHR (I-O-Address, 3) + Ptr;
   MaskShift = I-O-Address AND 7;
   CASE width OF:
      BYTE: nBitMask = 1;
      WORD: nBitMask = 3;
      DWORD: nBitMask = 15;
   ESAC;
   mask = SHL (nBitMask, MaskShift);
   CheckString = [BitStringAddr] AND mask;
   IF CheckString = 0
   THEN RETURN (TRUE);
   ELSE RETURN (FALSE);
   FI;
* Switch-Tasks is the task switching function described in Chapter 7.
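The ZeroExtend, SignExtend, Push, and Pop functions defined above can be modeled in Python as follows. This is a simplified sketch under stated assumptions: a flat bytearray stands in for the stack segment, no limit or protection checks are made, and the names zero_extend, sign_extend, and StackModel are hypothetical.

```python
def zero_extend(value, from_bits, operand_size):
    """Zero-extend a from_bits-wide value to operand_size bits."""
    value &= (1 << from_bits) - 1
    return value & ((1 << operand_size) - 1)

def sign_extend(value, from_bits, operand_size):
    """Sign-extend a from_bits-wide value to operand_size bits."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):          # topmost bit set: negative number
        value -= 1 << from_bits
    return value & ((1 << operand_size) - 1)

class StackModel:
    """Stack with a 16-bit (SP) or 32-bit (ESP) pointer over a flat byte array."""

    def __init__(self, stack_addr_size, pointer, memory_size=0x10000):
        self.stack_addr_size = stack_addr_size   # 16 or 32, as set by the B-bit
        self.pointer = pointer                   # models SP or ESP
        self.memory = bytearray(memory_size)     # assumption: stands in for SS

    def _wrap(self, value):
        # Keep the pointer within the width of SP or ESP.
        return value & ((1 << self.stack_addr_size) - 1)

    def push(self, value, operand_size):
        nbytes = operand_size // 8
        self.pointer = self._wrap(self.pointer - nbytes)
        data = (value & ((1 << operand_size) - 1)).to_bytes(nbytes, "little")
        self.memory[self.pointer:self.pointer + nbytes] = data

    def pop(self, operand_size):
        nbytes = operand_size // 8
        data = self.memory[self.pointer:self.pointer + nbytes]
        self.pointer = self._wrap(self.pointer + nbytes)
        return int.from_bytes(data, "little")
```

As in the text, sign_extend(0xF6, 8, 32) gives FFFFFFF6H and zero_extend(0xF6, 8, 32) gives 000000F6H. Pushing a doubleword onto a StackModel whose pointer is 100H moves the pointer to FCH and stores the low-order byte at that address.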
See Also: Fig.17-3 Fig.17-4 7.5
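The bit-addressing rule and the bitmap test at the core of I-O-Permission can be sketched in Python, whose // and % operators already round toward negative infinity and return a non-negative remainder for a positive divisor. The function names are hypothetical. The I/O sketch omits the TSS type check and the fetch of the bitmap pointer, takes the bitmap directly as a byte sequence beginning at port 0, and assumes that two bytes are read so that a mask straddling a byte boundary is covered. Ports whose two bytes fall outside the supplied bitmap are treated as not permitted.

```python
def bit_location(bit_base, bit_offset):
    """Return (byte_address, bit_number) for a bit in a memory bit string."""
    byte_address = bit_base + (bit_offset // 8)  # floor division, as DIV above
    bit_number = bit_offset % 8                  # always in the range 0..7
    return byte_address, bit_number

def io_permission(bitmap, io_address, width):
    """Return True if every port in the access has a zero bit in the bitmap.

    bitmap     -- bytes-like I/O permission bitmap, bit 0 of byte 0 = port 0
    io_address -- first port of the access
    width      -- 1 (BYTE), 2 (WORD), or 4 (DWORD)
    """
    n_bit_mask = {1: 0b1, 2: 0b11, 4: 0b1111}[width]
    byte_index = io_address >> 3
    mask = n_bit_mask << (io_address & 7)
    if byte_index + 2 > len(bitmap):
        return False                             # assumption: outside the bitmap
    check = int.from_bytes(bitmap[byte_index:byte_index + 2], "little") & mask
    return check == 0
```

For example, bit_location(0x1000, -1) addresses bit 7 of the byte at 0FFFH, and a word access to port 7 is refused if the bit for either port 7 or port 8 is set.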
17.2.2.6 Description
The "Description" section contains further explanation of the instruction's
operation.
17.2.2.7 Flags Affected
The "Flags Affected" section lists the flags that are affected by the
instruction, as follows:
* If a flag is always cleared or always set by the instruction, the
value is given (0 or 1) after the flag name. Arithmetic and logical
instructions usually assign values to the status flags in the uniform
manner described in Appendix C. Nonconventional assignments are
described in the "Operation" section.
* The values of flags listed as "undefined" may be changed by the
instruction in an indeterminate manner.
All flags not listed are unchanged by the instruction.
17.2.2.8 Protected Mode Exceptions
This section lists the exceptions that can occur when the instruction is
executed in 80386 Protected Mode. The exception names are a pound sign (#)
followed by two letters and an optional error code in parentheses. For
example, #GP(0) denotes a general protection exception with an error code of
0. Table 17-6 associates each two-letter name with the corresponding
interrupt number.
Chapter 9 describes the exceptions and the 80386 state upon entry to the
exception.
Application programmers should consult the documentation provided with
their operating systems to determine the actions taken when exceptions
occur.
See Also: Tab.17-6 9.6.1
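The exception notation described above is regular enough to parse mechanically. The Python sketch below splits a name such as #GP(0) into its two-letter mnemonic, interrupt number, and error code. The function and table names are hypothetical. The vector numbers are the standard 80386 interrupt assignments as generally documented, not a copy of Table 17-6, which is not reproduced here.

```python
import re

# Assumption: standard interrupt assignments, listed from general knowledge
# of the architecture rather than copied from Table 17-6.
VECTORS = {
    "DE": 0, "DB": 1, "BP": 3, "OF": 4, "BR": 5, "UD": 6, "NM": 7,
    "DF": 8, "TS": 10, "NP": 11, "SS": 12, "GP": 13, "PF": 14, "MF": 16,
}

def parse_exception(text):
    """Return (mnemonic, vector, error_code) for a name such as '#GP(0)'.

    error_code is the text inside the parentheses, or None if absent.
    """
    match = re.fullmatch(r"#([A-Z]{2})(?:\((.*)\))?", text.strip())
    if match is None:
        raise ValueError("not an exception name: " + text)
    mnemonic, error_code = match.group(1), match.group(2)
    return mnemonic, VECTORS[mnemonic], error_code
```

The error code is returned as text because the instruction pages also use symbolic codes, not only numeric ones.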
17.2.2.9 Real Address Mode Exceptions
Because less error checking is performed by the 80386 in Real Address Mode,
this mode has fewer exception conditions. Refer to Chapter 14 for further
information on these exceptions.
See Also: 14.6
17.2.2.10 Virtual-8086 Mode Exceptions
Virtual 8086 tasks provide the ability to simulate Virtual 8086 machines.
Virtual 8086 Mode exceptions are similar to those for the 8086 processor,
but there are some differences. Refer to Chapter 15 for details.
See Also: 15.3.2