Encoding:

COP1

010001

fmt

0

00000

fs

fd

ABS

000101

6

5

5

5

5

6

Format:

ABS.fmt 

Floating Point Absolute Value

ABS.S fd, fs

MIPS32

Floating Point Absolute Value

ABS.D fd, fs

MIPS32

Floating Point Absolute Value

ABS.PS fd, fs

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Absolute Value

Purpose:

Floating Point Absolute Value

Description:

FPR[fd] = abs(FPR[fs])

The absolute value of the value in FPR fs is placed in FPR fd. The operand and result are values in format fmt.

ABS.PS takes the absolute value of the two values in FPR fs independently, and ORs together any generated exceptions.

The Cause bits are ORed into the Flag bits if no exception is taken.

If FIRHas2008=0 or FCSRABS2008=0 then this operation is arithmetic. For this case, any NaN operand signals invalid operation.

If FCSRABS2008=1 then this operation is non-arithmetic. For this case, both regular floating point numbers and NAN values are treated alike, only the sign bit is affected by this instruction. No IEEE exception can be generated for this case, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of ABS.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

ABS.PS is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

ABS.PS has been removed in Release 6.

Operation:

StoreFPR(fd, fmt, AbsoluteValue(ValueFPR(fs, fmt)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation

Encoding:

COP1

010001

fmt

ft

fs

fd

ADD

000000

6

5

5

5

5

6

Format:

ADD.fmt 

Floating Point Add

ADD.S fd, fs, ft

MIPS32

Floating Point Add

ADD.D fd, fs, ft

MIPS32

Floating Point Add

ADD.PS fd, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Add

Purpose:

Floating Point Add

To add floating point values.

Description:

FPR[fd] = FPR[fs] + FPR[ft]

The value in FPR ft is added to the value in FPR fs. The result is calculated to infinite precision, rounded by using to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.

ADD.PS adds the upper and lower halves of FPR fs and FPR ft independently, and ORs together any generated exceptions.

The Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt. If the fields are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of ADD.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

ADD.PS is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

ADD.PS has been removed in Release 6.

Operation:

StoreFPR (fd, fmt, ValueFPR(fs, fmt) +fmt ValueFPR(ft, fmt))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Inexact, Overflow, Underflow

Encoding:

ADDI

001000

rs

rt

immediate

6

5

5

16

Format:

ADDI rt, rs, immediate

MIPS32, removed in Release 6

Add Immediate Word

Purpose:

Add Immediate Word

To add a constant to a 32-bit integer. If overflow occurs, then trap.

Description:

GPR[rt] = GPR[rs] + immediate

The 16-bit signed immediate is added to the 32-bit value in GPR rs to produce a 32-bit result.

Restrictions:

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Availability and Compatibility:

This instruction has been removed in Release 6. The encoding has been reused for other instructions introduced by

Release 6.

Operation:

if NotWordValue(GPR[rs]) then 
   UNPREDICTABLE
endif
temp = (GPR[rs]31||GPR[rs]31..0) + sign_extend(immediate)
if temp32 != temp31 then
   SignalException(IntegerOverflow)
else
   GPR[rt] = sign_extend(temp31..0)
endif

Exceptions:

Integer Overflow

Programming Notes:

ADDIU performs the same arithmetic operation but does not trap on overflow.

Encoding:

PCREL

111011

rs

ADDIUPC

00

immediate

6

5

2

19

Format:

ADDIUPC rs,immediate

MIPS32 Release 6

Add Immediate to PC (unsigned - non-trapping)

Purpose:

Add Immediate to PC (unsigned - non-trapping)

Description:

GPR[rs] = ( PC + sign_extend( immediate << 2 ) )

This instruction performs a PC-relative address calculation. The 19-bit immediate is shifted left by 2 bits, signextended, and added to the address of the ADDIUPC instruction. The result is placed in GPR rs.

This instruction is both a 32-bit and a 64-bit instruction. The 64-bit result is sign-extended by the same rules that govern sign-extension of virtual addresses in the MIPS64 Architecture, as described by the function effective_address() in the Privileged Resource Architecture.

Restrictions:

None

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Operation:

GPR[rs] =  ( PC + sign_extend( immediate << 2 ) )

Exceptions:

None

Programming Notes:

The term "unsigned" in this instruction mnemonic is a misnomer. "Unsigned" here means "non-trapping". It does not trap on a signed 32-bit overflow. ADDIUPC corresponds to unsigned ADDIU, which does not trap on overflow, as opposed to ADDI, which does trap on overflow.

Encoding:

ADDIU

001001

rs

rt

immediate

6

5

5

16

Format:

ADDIU rt, rs, immediate

MIPS32

Add Immediate Unsigned Word

Purpose:

Add Immediate Unsigned Word

To add a constant to a 32-bit integer.

Description:

GPR[rt] = GPR[rs] + immediate

The 16-bit signed immediate is added to the 32-bit value in GPR rs and the 32-bit arithmetic result is sign-extended and placed into GPR rt.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rs]) then 
   UNPREDICTABLE 
endif
temp = GPR[rs] + sign_extend(immediate)
GPR[rt] = sign_extend(temp31..0)

Exceptions:

None

Programming Notes:

The term "unsigned" in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow. This instruction is appropriate for unsigned arithmetic, such as address arithmetic, or integer arithmetic environments that ignore overflow, such as C language arithmetic.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

ADD

100000

6

5

5

5

5

6

Format:

ADD rd, rs, rt 

MIPS32

Add Word

Purpose:

Add Word

To add 32-bit integers. If an overflow occurs, then trap.

Description:

GPR[rd] = GPR[rs] + GPR[rt]

The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs to produce a 32-bit result.

Restrictions:

If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = (GPR[rs]31||GPR[rs]31..0) + (GPR[rt]31||GPR[rt]31..0)
if temp32 != temp31 then
   SignalException(IntegerOverflow)
else
   GPR[rd] = sign_extend(temp31..0)
endif

Exceptions:

Integer Overflow

Programming Notes:

ADDU performs the same arithmetic operation but does not trap on overflow.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

ADDU

100001

6

5

5

5

5

6

Format:

ADDU rd, rs, rt

MIPS32

Add Unsigned Word

Purpose:

Add Unsigned Word

To add 32-bit integers.

Description:

GPR[rd] = GPR[rs] + GPR[rt]

The 32-bit word value in GPR rt is added to the 32-bit value in GPR rs and the 32-bit arithmetic result is signextended and placed into GPR rd.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then 
   UNPREDICTABLE
endif
temp = GPR[rs] + GPR[rt]
GPR[rd] = sign_extend(temp31..0)

Exceptions:

None

Programming Notes:

The term "unsigned" in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow. This instruction is appropriate for unsigned arithmetic, such as address arithmetic, or integer arithmetic environments that ignore overflow, such as C language arithmetic.

Encoding:

SPECIAL3

011111

rs

rt

rd

ALIGN

010

bp

BSHFL

100000

6

5

5

5

3

2

6

SPECIAL3

011111

rs

rt

rd

DALIGN

01

bp

DBSHFL

100100

6

5

5

5

2

3

6

Format:

ALIGN DALIGN 

Concatenate two GPRs, and extract a contiguous subset at a byte position

ALIGN rd,rs,rt,bp

MIPS32 Release 6

Concatenate two GPRs, and extract a contiguous subset at a byte position

DALIGN rd,rs,rt,bp

MIPS64 Release 6

Concatenate two GPRs, and extract a contiguous subset at a byte position

Purpose:

Concatenate two GPRs, and extract a contiguous subset at a byte position

Description:

GPR[rd] = (GPR[rt] << (8*bp)) or (GPR[rs] >> (GPRLEN-8*bp))

The input registers GPR rt and GPR rs are concatenated, and a register width contiguous subset is extracted, which is specified by the byte pointer bp.

The ALIGN instruction operates on 32-bit words, and has a 2-bit byte position field bp.

The DALIGN instruction operates on 64-bit doublewords, and has a 3-bit byte position field bp.

32-bit word in register rs is right shifted as a 32-bit value by (4-bp) byte positions. These shifts are logical shifts, zero-filling. The shifted values are then or-ed together to create a 32-bit result that is sign-extended to 64bits and written to destination GPR rd.

rd.

Restrictions:

Executing ALIGN and DALIGN with shift count bp=0 acts like a register to register move operation, and is redundant, and therefore discouraged. Software should not generate ALIGN or DALIGN with shift count bp=0.

DALIGN: A Reserved Instruction exception is signaled if access to 64-bit operations is not enabled.

Availability and Compatibility:

The ALIGN instruction is introduced by and required as of Release 6.

The DALIGN instruction is introduced by and required as of Release 6.

Programming Notes:

Release 6 ALIGN instruction corresponds to the pre-Release 6 DSP Module BALIGN instruction, except that

BALIGN with shift counts of 0 and 2 are specified as being UNPREDICTABLE, whereas ALIGN (and DALIGN) defines all bp values, discouraging only bp=0.

Operation:

ALIGN:
 tmp_rt_hi = unsigned_word(GPR[rt]) << (8*bp)
 tmp_rs_lo = unsigned_word(GPR[rs]) >> (8*(4-bp))
 tmp = tmp_rt_hi or tmp_rt_lo
ALIGN on a 32-bit CPU:
 GPR[rd] = tmp
ALIGN on a 64-bit CPU:
 GPR[rd] = sign_extend.32(tmp)
DALIGN:
 if not Are64bitOperationsEnabled()
   then SignalException(ReservedInstruction) endif
 tmp_rt_hi = unsigned_doubleword(GPR[rt]) << (8*bp)
 tmp_rs_lo = unsigned_doubleword(GPR[rs]) >> (8*(8-bp))
 tmp = tmp_rt_hi or tmp_rt_lo
 GPR[rd] = tmp
/* end of instruction */

Exceptions:

ALIGN: None

DALIGN: Reserved Instruction

Encoding:

COP1X

010011

rs

ft

fs

fd

ALNV.PS

011110

6

5

5

5

5

6

Format:

ALNV.PS fd, fs, ft, rs

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Align Variable

Purpose:

Floating Point Align Variable

To align a misaligned pair of paired single values.

Description:

FPR[fd] = ByteAlign(GPR[rs]2..0, FPR[fs], FPR[ft])

FPR fs is concatenated with FPR ft and this value is funnel-shifted by GPR rs2..0 bytes, and written into FPR fd. If

GPR rs2..0 is 0, FPR fd receives FPR fs. If GPR rs2..0 is 4, the operation depends on the current endianness.

Figure 3-1 illustrates the following example: for a big-endian operation and a byte alignment of 4, the upper half of

FPR fd receives the lower half of the paired single value in fs, and the lower half of FPR fd receives the upper half of the paired single value in FPR ft.

The move is non arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is

UNPREDICTABLE.

If GPR rs1..0 are non-zero, the results are UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. The instruction is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if GPR[rs]2..0 = 0 then
   StoreFPR(fd, PS,ValueFPR(fs,PS))
else if GPR[rs]2..0 != 4 then
   UNPREDICTABLE
else if BigEndianCPU then
   StoreFPR(fd, PS, ValueFPR(fs, PS)31..0 || ValueFPR(ft,PS)63..32)
else
   StoreFPR(fd, PS, ValueFPR(ft, PS)31..0 || ValueFPR(fs,PS)63..32)
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Programming Notes:

ALNV.PS is designed to be used with LUXC1 to load 8 bytes of data from any 4-byte boundary. For example:

/* Copy T2 bytes (a multiple of 16) of data T0 to T1, T0 unaligned, T1 aligned.
             Reads one dw beyond the end of T0. */
   LUXC1     F0, 0(T0) /* set up by reading 1st src dw */
   LI       T3, 0     /* index into src and dst arrays */
   ADDIU     T4, T0, 8 /* base for odd dw loads */
   ADDIU     T5, T1, -8/* base for odd dw stores */
LOOP:
   LUXC1     F1, T3(T4)
   ALNV.PS   F2, F0, F1, T0/* switch F0, F1 for little-endian */
   SDC1      F2, T3(T1)
   ADDIU     T3, T3, 16
   LUXC1     F0, T3(T0)
   ALNV.PS   F2, F1, F0, T0/* switch F1, F0 for little-endian */
   BNE       T3, T2, LOOP
   SDC1      F2, T3(T5)
DONE:

ALNV.PS is also useful with SUXC1 to store paired-single results in a vector loop to a possibly misaligned address:

/* T1[i] = T0[i] + F8, T0 aligned, T1 unaligned. */
      CVT.PS.S F8, F8, F8/* make addend paired-single */
/* Loop header computes 1st pair into F0, stores high half if T1 */
/* misaligned */
LOOP:
   LDC1      F2, T3(T4)/* get T0[i+2]/T0[i+3] */
   ADD.PS    F1, F2, F8/* compute T1[i+2]/T1[i+3] */
   ALNV.PS   F3, F0, F1, T1/* align to dst memory */
   SUXC1     F3, T3(T1)/* store to T1[i+0]/T1[i+1] */
   ADDIU     T3, 16    /* i = i + 4 */
   LDC1      F2, T3(T0)/* get T0[i+0]/T0[i+1] */
   ADD.PS    F0, F2, F8/* compute T1[i+0]/T1[i+1] */
   ALNV.PS   F3, F1, F0, T1/* align to dst memory */
   BNE      T3, T2, LOOP
   SUXC1     F3, T3(T5)/* store to T1[i+2]/T1[i+3] */
/* Loop trailer stores all or half of F0, depending on T1 alignment */

Encoding:

PCREL

111011

rs

ALUIPC

11111

immediate

6

5

5

16

Format:

ALUIPC rs,immediate

MIPS32 Release 6

Aligned Add Upper Immediate to PC

Purpose:

Aligned Add Upper Immediate to PC

Description:

GPR[rs] = ~0x0FFFF & ( PC + sign_extend( immediate << 16 ) )

This instruction performs a PC-relative address calculation. The 16-bit immediate is shifted left by 16 bits, signextended, and added to the address of the ALUIPC instruction. The low 16 bits of the result are cleared, that is the result is aligned on a 64K boundary. The result is placed in GPR rs.

This instruction is both a 32-bit and a 64-bit instruction. The 64-bit result is sign-extended by the same rules that govern sign-extension of virtual addresses in the MIPS64 Architecture, as described by the function effective_address() in the Privileged Resource Architecture.

Restrictions:

None

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Operation:

GPR[rs] = ~0x0FFFF &  ( PC + sign_extend( immediate << 16 ) )

Exceptions:

None

Encoding:

ANDI

001100

rs

rt

immediate

6

5

5

16

Format:

ANDI rt, rs, immediate

MIPS32

and immediate

Purpose:

and immediate

To do a bitwise logical AND with a constant

Description:

GPR[rt] = GPR[rs] and zero_extend(immediate)

The 16-bit immediate is zero-extended to the left and combined with the contents of GPR rs in a bitwise logical AND operation. The result is placed into GPR rt.

Restrictions:

None

Operation:

GPR[rt] = GPR[rs] and zero_extend(immediate)

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

AND

100100

6

5

5

5

5

6

Format:

AND rd, rs, rt

MIPS32

and

Purpose:

and

To do a bitwise logical AND.

Description:

GPR[rd] = GPR[rs] and GPR[rt]

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical AND operation. The result is placed into GPR rd.

Restrictions:

None

Operation:

GPR[rd] = GPR[rs] and GPR[rt]

Exceptions:

None

Encoding:

AUI

001111

rs

rt

immediate

DAUI

011101

rs != 00000

rt

immediate

REGIMM

000001

rs

DAHI

00110

immediate

REGIMM

000001

rs

DATI

11110

immediate

6

5

5

16

Format:

AUI DAUI DAHI DATI 

Add Upper Immediate

AUI rt, rs immediate

MIPS32 Release 6

Add Upper Immediate

DAUI rt, rs immediate

MIPS64 Release 6

Doubleword Add Upper Immediate

DAHI rs, rs immediate

MIPS64 Release 6

Doubleword Add Higher Immediate

DATI rs, rs immediate

MIPS64 Release 6

Doubleword Add Top Immediate

Purpose:

Add Immediate to Upper Bits

AUI: Add Upper Immediate

DAUI: Doubleword Add Upper Immediate

DAHI: Doubleword Add Higher Immediate

DATI: Doubleword Add Top Immediate

Description:

AUI:  GPR[rt] = sign_extend.32( GPR[rs] + sign_extend(immediate << 16) )
DAUI: GPR[rt] = GPR[rs] + sign_extend(immediate << 16)
DAHI: GPR[rs] = GPR[rs] + sign_extend(immediate << 32)
DATI: GPR[rs] = GPR[rs] + sign_extend(immediate << 48)

AUI: The 16 bit immediate is shifted left 16 bits, sign-extended, and added to the register rs, storing the result in rt.

AUI is a 32-bit compatible instruction, so on a 64-bit CPU the result is sign extended as if a 32-bits signed address.

DAHI: The 16 bit immediate is shifted left 32 bits, sign-extended, and added to the register rs, overwriting rs with the result.

DATI: The 16 bit immediate is shifted left 48bits, sign-extended, and added to the register rs, overwriting rs with the result.

DAUI: The 16-bit immediate is shifted left 16 bits, sign-extended, and added to the register rs; the results are stored in rt.

In Release 6, LUI is an assembly idiom for AUI with rs=0.

Restrictions:

DAUI: rs cannot be r0, the zero register. The encoding may be used for other instructions or must signal a Reserved

Instruction exception.

DAUI, DAHI, DATI: Reserved Instruction exception if 64-bit instructions are not enabled.

Availability and Compatibility:

AUI is introduced by and required as of Release 6.

DAUI is introduced by and required as of MIPS64 Release 6.

DAUI reuses the primary opcode of JALX. DAUI cannot be trapped (and possibly emulated) on pre-Release 6 systems, while JALX cannot be trapped (and possibly emulated) on Release 6 systems.

DAHI is introduced by and required as of MIPS64 Release 6.

DATI is introduced by and required as of MIPS64 Release 6.

Operation:

AUI:  GPR[rt] = sign_extend.32( GPR[rs] + sign_extend(immediate << 16) )
DAUI: GPR[rt] = GPR[rs] + sign_extend(immediate << 16)
DAHI: GPR[rs] = GPR[rs] + sign_extend(immediate << 32)
DATI: GPR[rs] = GPR[rs] + sign_extend(immediate << 48)

Exceptions:

AUI: None.

DAUI, DAHI, DATI: Reserved Instruction

Programming Notes:

AUI (and DAUI, DAHI and DATI on MIPS64 Release 6) can be used to synthesize large constants in situations where it is not convenient to load a large constant from memory. To simplify hardware that may recognize sequences of instructions as generating large constants, AUI/DAUI/DAHI/DATI should be used in a stylized manner.

To create an integer:

LUI rd, imm_low(rtmp)
ORI rd, rd, imm_upper
DAHI rd, imm_high
DATI rd, imm_top

To create a large offset for a memory access whose address is of the form rbase+large_offset:

AUI rtmp, rbase, imm_upper
DAHI rtmp, imm_high
DATI rtmp, imm_top
LW rd, (rtmp)imm_low

To create a large constant operand for an instruction of the form rd:=rs+large_immediate or rd:=rs-large_immediate:

32-bits:

AUI  rtmp, rs,  imm_upper
ADDIU rd,  rtmp, imm_low

64-bits:

AUI  rtmp, rs, imm_upper
DAHI rtmp, imm_high
DATI rtmp, imm_top
DADDUI rd, rtmp,imm_low

Encoding:

PCREL

111011

rs

AUIPC

11110

immediate

6

5

5

16

Format:

AUIPC rs, immediate

MIPS32 Release 6

Add Upper Immediate to PC

Purpose:

Add Upper Immediate to PC

Description:

GPR[rs] =  ( PC + sign_extend( immediate << 16 ) )

This instruction performs a PC-relative address calculation. The 16-bit immediate is shifted left by 16 bits, signextended, and added to the address of the AUIPC instruction. The result is placed in GPR rs.

In a MIPS64 implementation, the 32-bit result is sign extended from bit 31 to bit 63.

This instruction is both a 32-bit and a 64-bit instruction. The 64-bit result is sign-extended by the same rules that govern sign-extension of virtual address in the MIPS64 Architecture, as described by the function effective_address() in the Privileged Resource Architecture.

Restrictions:

None

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Operation:

GPR[rs] =  ( PC + sign_extend( immediate << 16 ) )

Exceptions:

None

Encoding:

BALC

111010

offset

6

26

Format:

BALC offset

MIPS32 Release 6

Branch and Link, Compact

Purpose:

Branch and Link, Compact

To do an unconditional PC-relative procedure call.

Description:

procedure_call (no delay slot)

Place the return address link in GPR 31. The return link is the address of the instruction immediately following the branch, where execution continues after a procedure call. (Because compact branches have no delay slots, see below.)

A 28-bit signed offset (the 26-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address.

Compact branches do not have delay slots. The instruction after the branch is NOT executed when the branch is taken.

Restrictions:

This instruction is an unconditional, always taken, compact branch. It does not have a forbidden slot, that is, a

Reserved Instruction exception is not caused by a Control Transfer Instruction placed in the slot following the branch.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Release 6 instruction BALC occupies the same encoding as pre-Release 6 instruction SWC2. The SWC2 instruction has been moved to the COP2 major opcode in MIPS Release 6.

Exceptions:

None

Operation:

target_offset = sign_extend( offset || 02 )
GPR[31] = PC+4
PC = PC+4 + sign_extend(target_offset)

Encoding:

pre-Release 6:

REGIMM

000001

00000

BGEZAL

10001

offset

6

5

5

16

Release 6:

REGIMM

000001

0

00000

BAL

10001

offset

6

5

5

16

Format:

BAL offset

Assembly Idiom MIPS32, MIPS32 Release 6

Branch and Link

Purpose:

Branch and Link

To do an unconditional PC-relative procedure call.

Description:

procedure_call

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2-bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Availability and Compatibility:

Pre-Release 6: BAL offset is the assembly idiom used to denote an unconditional branch. The actual instruction is interpreted by the hardware as BGEZAL r0, offset.

Release 6 keeps the BAL special case of BGEZAL, but removes all other instances of BGEZAL. BGEZAL with rs any register other than GPR[0] is required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
      GPR[31] = PC + 8
I+1:  PC = PC + target_offset

Exceptions:

None

Programming Notes:

BAL without a corresponding return should NOT be used to read the PC. Doing so is likely to cause a performance loss on processors with a return address predictor.

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to addresses outside this range.

Encoding:

COP1

010001

BC1EQZ

01001

ft

offset

COP1

010001

BC1NEZ

01101

ft

offset

6

5

5

16

Format:

BC1EQZ BC1NEZ 

Branch if Coprocessor 1 (FPU) Register Bit 0 is Equal to Zero

BC1EQZ ft, offset

MIPS32 Release 6

Branch if Coprocessor 1 (FPU) Register Bit 0 is Equal to Zero

BC1NEZ ft, offset

MIPS32 Release 6

Branch if Coprocessor 1 (FPR) Register Bit 0 is Not Equal to Zero

Purpose:

Branch if Coprocessor 1 (FPU) Register Bit 0 Equal/Not Equal to Zero

BC1EQZ: Branch if Coprocessor 1 (FPU) Register Bit 0 is Equal to Zero

BC1NEZ: Branch if Coprocessor 1 (FPR) Register Bit 0 is Not Equal to Zero

Description:

BC1EQZ:  if FPR[ft] & 1 = 0 then branch
BC1NEZ:  if FPR[ft] & 1 != 0 then branch

The condition is evaluated on FPU register ft.

If the condition is false, the branch is not taken, and execution continues with the next instruction.

A 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address. Execute the instruction in the delay slot before the instruction at the target.

Restrictions:

If access to Coprocessor 1 is not enabled, a Coprocessor Unusable Exception is signaled.

Because these instructions BC1EQZ and BC1NEZ do not depend on a particular floating point data type, they operate whenever Coprocessor 1 is enabled.

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

Exceptions:

Coprocessor Unusable1

Operation:

   tmp = ValueFPR(ft, UNINTERPRETED_WORD)
   BC1EQZ: cond = tmp & 1 = 0
   BC1NEZ: cond = tmp & 1 != 0
   if cond then
      I:    target_PC = ( PC+4 + sign_extend( offset << 2 )
      I+1:  PC = target_PC

Programming Notes:

Release 6: These instructions, BC1EQZ and BC1NEZ, replace the pre-Release 6 instructions BC1F and BC1T. These

Release 6 FPU branches depend on bit 0 of the scalar FPU register.

Note: BC1EQZ and BC1NEZ do not have a format or data type width. The same instructions are used for branches based on conditions involving any format, including 32-bit S (single precision) and W (word) format, and 64-bit D

(double precision) and L (longword) format, as well as 128-bit MSA. The FPU scalar comparison instructions

CMP.condn fmt produce an all ones or all zeros truth mask of their format width with the upper bits (where applicable) UNPREDICTABLE. BC1EQZ and BC1NEZ consume only bit 0 of the CMP.condn fmt output value, and therefore operate correctly independent of fmt.

Encoding:

COP1

010001

BC

01000

cc

nd

0

tf

0

offset

6

5

3

1

1

16

Format:

BC1F cc, offset

MIPS32, removed in Release 6

Branch on FP False

Purpose:

Branch on FP False

To test an FP condition code and do a PC-relative conditional branch.

Description:

if FPConditionCode(cc) = 0 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the FP condition code bit cc is false (0), the program branches to the effective target address after the instruction in the delay slot is executed. An FP condition code is set by the FP compare instruction, C.cond fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    condition = FPConditionCode(cc) = 0
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

This instruction has been removed in Release 6 and has been replaced by the BC1EQZ instruction. Refer to the

'BC1EQZ' instruction in this manual for more information.

Historical Information:

The MIPS I architecture defines a single floating point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control/Status register. MIPS I, II, and III architectures must have the CC field set to 0, which is implied by the first format in the "Format" section.

The MIPS IV and MIPS32 architectures add seven more Condition Code bits to the original condition code 0. FP compare and conditional branch instructions specify the Condition Code bit to set or test. Both assembler formats are valid for MIPS IV and MIPS32.

Encoding:

COP1

010001

BC

01000

cc

nd

1

tf

0

offset

6

5

3

1

1

16

Format:

BC1FL cc, offset

MIPS32, removed in Release 6

Branch on FP False Likely

Purpose:

Branch on FP False Likely

To test an FP condition code and make a PC-relative conditional branch; execute the instruction in the delay slot only if the branch is taken.

Description:

if FPConditionCode(cc) = 0 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the FP Condition Code bit cc is false (0), the program branches to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

An FP condition code is set by the FP compare instruction, C.cond fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

This operation specification is for the general Branch On Condition operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC1F, BC1FL, BC1T, and BC1TL have specific values for

tf and nd.

I:    condition = FPConditionCode(cc) = 0
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BC1F instruction instead.

Historical Information:

The MIPS I architecture defines a single floating point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control/Status register. MIPS I, II, and III architectures must have the CC field set to 0, which is implied by the first format in the "Format" section.

The MIPS IV and MIPS32 architectures add seven more Condition Code bits to the original condition code 0. FP compare and conditional branch instructions specify the Condition Code bit to set or test. Both assembler formats are valid for MIPS IV and MIPS32.

Encoding:

COP1

010001

BC

01000

cc

nd

0

tf

1

offset

6

5

3

1

1

16

Format:

BC1T cc, offset

MIPS32, removed in Release 6

Branch on FP True

Purpose:

Branch on FP True

To test an FP condition code and do a PC-relative conditional branch.

Description:

 if FPConditionCode(cc) = 1 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the FP condition code bit cc is true (1), the program branches to the effective target address after the instruction in the delay slot is executed. An FP condition code is set by the FP compare instruction, C.cond fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    condition = FPConditionCode(cc) = 1
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

This instruction has been replaced by the BC1NEZ instruction. Refer to the ‘BC1NEZ’ instruction in this manual for more information.

Historical Information:

The MIPS I architecture defines a single floating point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control/Status register. MIPS I, II, and III architectures must have the CC field set to 0, which is implied by the first format in the “Format” section.

The MIPS IV and MIPS32 architectures add seven more Condition Code bits to the original condition code 0. FP compare and conditional branch instructions specify the Condition Code bit to set or test. Both assembler formats are valid for MIPS IV and MIPS32.

Encoding:

COP1

010001

BC

01000

cc

nd

1

tf

1

offset

6

5

3

1

1

16

Format:

BC1TL cc, offset

MIPS32, removed in Release 6

Branch on FP True Likely

Purpose:

Branch on FP True Likely

To test an FP condition code and do a PC-relative conditional branch; execute the instruction in the delay slot only if the branch is taken.

Description:

if FPConditionCode(cc) = 1 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the FP Condition Code bit cc is true (1), the program branches to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

An FP condition code is set by the FP compare instruction, C.cond fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

This operation specification is for the general Branch On Condition operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC1F, BC1FL, BC1T, and BC1TL have specific values for

tf and nd.

I:    condition = FPConditionCode(cc) = 1
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BC1T instruction instead.

Historical Information:

The MIPS I architecture defines a single floating point condition code, implemented as the coprocessor 1 condition signal (Cp1Cond) and the C bit in the FP Control/Status register. MIPS I, II, and III architectures must have the CC field set to 0, which is implied by the first format in the "Format" section.

The MIPS IV and MIPS32 architectures add seven more Condition Code bits to the original condition code 0. FP compare and conditional branch instructions specify the Condition Code bit to set or test. Both assembler formats are valid for MIPS IV and MIPS32.

Encoding:

COP2

010010

BC2EQZ

01001

ct

offset

COP2

010010

BC2NEZ

01101

ct

offset

6

5

5

16

Format:

BC2EQZ BC2NEZ 

Branch if Coprocessor 2 Condition (Register) is Equal to Zero

BC2EQZ ct, offset

MIPS32 Release 6

Branch if Coprocessor 2 Condition (Register) is Equal to Zero

BC2NEZ ct, offset

MIPS32 Release 6

Branch if Coprocessor 2 Condition (Register) is Not Equal to Zero

Purpose:

Branch if Coprocessor 2 Condition (Register) Equal/Not Equal to Zero

BC2EQZ: Branch if Coprocessor 2 Condition (Register) is Equal to Zero

BC2NEZ: Branch if Coprocessor 2 Condition (Register) is Not Equal to Zero

Description:

BC2EQZ:  if COP2Condition[ct] = 0 then branch
BC2NEZ:  if COP2Condition[ct] != 0 then branch

The 5-bit field ct specifies a coprocessor 2 condition.

A 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address. Execute the instruction in the delay slot before the instruction at the target.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

If access to Coprocessor 2 is not enabled, a Coprocessor Unusable Exception is signaled.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

Exceptions:

Coprocessor Unusable, Reserved Instruction

Operation:

   tmpcond = Coprocessor2Condition(ct)
   if BC2EQZ then 
    tmpcond = not(tmpcond)
   endif
   if tmpcond then
      PC = PC+4 + sign_extend( immediate << 2 ) )
   endif

Implementation Notes:

As of Release 6 these instructions, BC2EQZ and BC2NEZ, replace the pre-Release 6 instructions BC2F and BC2T, which had a 3-bit condition code field (as well as nullify and true/false bits). Release 6 makes all 5 bits of the ct condition code available to the coprocessor designer as a condition specifier.

A customer defined coprocessor instruction set can implement any sort of condition it wants. For example, it could implement up to 32 single-bit flags, specified by the 5-bit field ct. It could also implement conditions encoded as values in a coprocessor register (such as testing the least significant bit of a coprocessor register) as done by Release

6 instructions BC1EQZ/BC1NEZ.

Encoding:

COP2

010010

BC

01000

cc

nd

0

tf

0

offset

6

5

3

1

1

16

Format:

BC2F cc, offset

MIPS32, removed in Release 6

Branch on COP2 False

Purpose:

Branch on COP2 False

To test a COP2 condition code and do a PC-relative conditional branch.

Description:

if COP2Condition(cc) = 0 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the COP2 condition specified by cc is false (0), the program branches to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    condition = COP2Condition(cc) = 0
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

This instruction has been replaced by the BC2EQZ instruction. Refer to the 'BC2EQZ' instruction in this manual for more information.

Encoding:

COP2

010010

BC

01000

cc

nd

1

tf

0

offset

6

5

3

1

1

16

Format:

BC2FL cc, offset

MIPS32, removed in Release 6

Branch on COP2 False Likely

Purpose:

Branch on COP2 False Likely

To test a COP2 condition code and make a PC-relative conditional branch; execute the instruction in the delay slot only if the branch is taken.

Description:

if COP2Condition(cc) = 0 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the COP2 condition specified by cc is false (0), the program branches to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

This operation specification is for the general Branch On Condition operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC2F, BC2FL, BC2T, and BC2TL have specific values for

tf and nd.

I:    condition = COP2Condition(cc) = 0
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BC2F instruction instead.

Encoding:

COP2

010010

BC

01000

cc

nd

0

tf

1

offset

6

5

3

1

1

16

Format:

BC2T cc, offset

MIPS32, removed in Release 6

Branch on COP2 True

Purpose:

Branch on COP2 True

To test a COP2 condition code and do a PC-relative conditional branch.

Description:

 if COP2Condition(cc) = 1 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the COP2 condition specified by cc is true (1), the program branches to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    condition = COP2Condition(cc) = 1
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

This instruction has been replaced by the BC2NEZ instruction. Refer to the ‘BC2NEZ’ instruction in this manual for more information.

Encoding:

COP2

010010

BC

01000

cc

nd

1

tf

1

offset

6

5

3

1

1

16

Format:

BC2TL cc, offset

MIPS32, removed in Release 6

Branch on COP2 True Likely

Purpose:

Branch on COP2 True Likely

To test a COP2 condition code and do a PC-relative conditional branch; execute the instruction in the delay slot only if the branch is taken.

Description:

 if COP2Condition(cc) = 1 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If the COP2 condition specified by cc is true (1), the program branches to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

This operation specification is for the general Branch On Condition operation with the tf (true/false) and nd (nullify delay slot) fields as variables. The individual instructions BC2F, BC2FL, BC2T, and BC2TL have specific values for

tf and nd.

I:    condition = COP2Condition(cc) = 1
      target_offset = (offset15)GPRLEN-(16+2) || offset || 02
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BC2T instruction instead.

Encoding:

BC

110010

offset

6

26

Format:

BC offset 

MIPS32 Release 6

Branch, Compact

Purpose:

Branch, Compact

Description:

PC = PC+4 + sign_extend( offset << 2)

A 28-bit signed offset (the 26-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address.

Compact branches have no delay slot: the instruction after the branch is NOT executed when the branch is taken.

Restrictions:

This instruction is an unconditional, always taken, compact branch. It does not have a forbidden slot, that is, a

Reserved Instruction exception is not caused by a Control Transfer Instruction placed in the slot following the branch.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Release 6 instruction BC occupies the same encoding as pre-Release 6 instruction LWC2. The LWC2 instruction has been moved to the COP2 major opcode in MIPS Release 6.

Exceptions:

None

Operation:

target_offset = sign_extend( offset || 02 )
PC = ( PC+4 + sign_extend(target_offset) )

Encoding:

BEQL

010100

rs

rt

offset

6

5

5

16

Format:

BEQL rs, rt, offset

MIPS32, removed in Release 6

Branch on Equal Likely

Purpose:

Branch on Equal Likely

To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description:

if GPR[rs] = GPR[rt] then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are equal, branch to the target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
         condition = (GPR[rs] = GPR[rt])
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BEQ instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

BEQ

000100

rs

rt

offset

6

5

5

16

Format:

BEQ rs, rt, offset

MIPS32

Branch on Equal

Purpose:

Branch on Equal

To compare GPRs then do a PC-relative conditional branch.

Description:

if GPR[rs] = GPR[rt] then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are equal, branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
         condition = (GPR[rs] = GPR[rt])
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

BEQ r0, r0 offset, expressed as B offset, is the assembly idiom used to denote an unconditional branch.

Encoding:

REGIMM

000001

rs

BGEZALL

10011

offset

6

5

5

16

Format:

BGEZALL rs, offset

MIPS32, removed in Release 6

Branch on Greater Than or Equal to Zero and Link Likely

Purpose:

Branch on Greater Than or Equal to Zero and Link Likely

To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] >= 0 then procedure_call_likely

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Branch-and-link Restartability: GPR 31 must not be used for the source register rs, because such an instruction does

not have the same effect when reexecuted. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by reexecuting the branch when an exception occurs in the branch delay slot.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] >= 0GPRLEN
      GPR[31] = PC + 8
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to addresses outside this range.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BGEZAL instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

REGIMM

000001

rs

BGEZAL

10001

offset

6

5

5

16

Format:

BGEZAL rs, offset

MIPS32, removed in Release 6

Branch on Greater Than or Equal to Zero and Link

Purpose:

Branch on Greater Than or Equal to Zero and Link

To test a GPR then do a PC-relative conditional procedure call

Description:

 if GPR[rs] >= 0 then procedure_call

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed.

Availability and Compatibility

This instruction has been removed in Release 6 with the exception of special case BAL (unconditional Branch and

Link) which was an alias for BGEZAL with rs=0.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Branch-and-link Restartability: GPR 31 must not be used for the source register rs, because such an instruction does

not have the same effect when reexecuted. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by reexecuting the branch when an exception occurs in the branch delay slot or forbidden slot.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] >= 0GPRLEN
      GPR[31] = PC + 8
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to addresses outside this range.

BGEZAL r0, offset, expressed as BAL offset, is the assembly idiom used to denote a PC-relative branch and link.

BAL is used in a manner similar to JAL, but provides PC-relative addressing and a more limited target PC range.

Encoding:

REGIMM

000001

rs

BGEZL

00011

offset

6

5

5

16

Format:

BGEZL rs, offset

MIPS32, removed in Release 6

Branch on Greater Than or Equal to Zero Likely

Purpose:

Branch on Greater Than or Equal to Zero Likely

To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] >= 0 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] >= 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BGEZ instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

REGIMM

000001

rs

BGEZ

00001

offset

6

5

5

16

Format:

BGEZ rs, offset

MIPS32

Branch on Greater Than or Equal to Zero

Purpose:

Branch on Greater Than or Equal to Zero

To test a GPR then do a PC-relative conditional branch

Description:

 if GPR[rs] >= 0 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than or equal to zero (sign bit is 0), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] >= 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

Encoding:

BGTZL

010111

rs

0

00000

offset

6

5

5

16

Format:

BGTZL rs, offset

MIPS32, removed in Release 6

Branch on Greater Than Zero Likely

Purpose:

Branch on Greater Than Zero Likely

To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] > 0 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] > 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BGTZ instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

BGTZ

000111

rs

0

00000

offset

6

5

5

16

Format:

BGTZ rs, offset

MIPS32

Branch on Greater Than Zero

Purpose:

Branch on Greater Than Zero

To test a GPR then do a PC-relative conditional branch.

Description:

 if GPR[rs] > 0 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are greater than zero (sign bit is 0 but value not zero), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] > 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

Encoding:

SPECIAL3

011111

00000

rt

rd

BITSWAP

00000

BSHFL

100000

SPECIAL3

011111

00000

rt

rd

DBITSWAP

00000

DBSHFL

100100

6

5

5

5

5

6

Format:

BITSWAP DBITSWAP 

Swaps (reverses) bits in each byte

BITSWAP rd,rt

MIPS32 Release 6

Swaps (reverses) bits in each byte

DBITSWAP rd,rt

MIPS64 Release 6

Swaps (reverses) bits in each byte

Purpose:

Swaps (reverses) bits in each byte

Description:

GPR[rd].byte(i) =reverse_bits_in_byte(GPR[rt].byte(i)), for all bytes i

Each byte in input GPR rt is moved to the same byte position in output GPR rd, with bits in each byte reversed.

BITSWAP is a 32-bit instruction. BITSWAP operates on all 4 bytes of a 32-bit GPR on a 32-bit CPU. On a 64-bit

CPU, BITSWAP operates on the low 4 bytes, sign extending to 64-bits.

DBITSWAP operates on all 8 bytes of a 64-bit GPR on a 64-bit CPU.

Restrictions:

BITSWAP: None.

Availability and Compatibility:

The BITSWAP instruction is introduced by and required as of Release 6.

The DBITSWAP instruction is introduced by and required as of Release 6.

Operation:

BITSWAP:
   for i in 0 to 3 do  /* for all bytes in 32-bit GPR width */
      tmp.byte(i) = reverse_bits_in_byte( GPR[rt].byte(i) )
   endfor
   GPR[rd] = sign_extend.32( tmp )
DBITSWAP: 
   for i in 0 to 7 do  /* for all bytes in 64-bit GPR width */
      tmp.byte(i) = reverse_bits_in_byte( GPR[rt].byte(i) )
   endfor
   GPR[rd] = tmp
   where
      function reverse_bits_in_byte(inbyte)
         outbyte7 = inbyte0
         outbyte6 = inbyte1
         outbyte5 = inbyte2
         outbyte4 = inbyte3
         outbyte3 = inbyte4
         outbyte2 = inbyte5
         outbyte1 = inbyte6
         outbyte0 = inbyte7
         return outbyte
      end function

Exceptions:

BITSWAP: None

DBITSWAP: Reserved Instruction.

Programming Notes:

The Release 6 BITSWAP instruction corresponds to the DSP Module BITREV instruction, except that the latter bitreverses the least-significant 16-bit halfword of the input register, zero extending the rest, while BITSWAP operates on 32-bits.

Encoding:

BLEZL

010110

rs

0

00000

offset

6

5

5

16

Format:

BLEZL rs, offset

MIPS32, removed in Release 6

Branch on Less Than or Equal to Zero Likely

Purpose:

Branch on Less Than or Equal to Zero Likely

To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] <= 0 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] <= 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BLEZ instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

BLEZ

000110

rs

0

00000

offset

6

5

5

16

Format:

BLEZ rs, offset

MIPS32

Branch on Less Than or Equal to Zero

Purpose:

Branch on Less Than or Equal to Zero

To test a GPR then do a PC-relative conditional branch.

Description:

 if GPR[rs] <= 0 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than or equal to zero (sign bit is 1 or value is zero), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] <= 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

Encoding:

REGIMM

000001

rs

BLTZALL

10010

offset

6

5

5

16

Format:

BLTZALL rs, offset

MIPS32, removed in Release 6

Branch on Less Than Zero and Link Likely

Purpose:

Branch on Less Than Zero and Link Likely

To test a GPR then do a PC-relative conditional procedure call; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] < 0 then procedure_call_likely

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Branch-and-link Restartability: GPR 31 must not be used for the source register rs, because such an instruction does

not have the same effect when reexecuted. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by reexecuting the branch when an exception occurs in the branch delay slot.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] < 0GPRLEN
      GPR[31] = PC + 8
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BLTZAL instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

REGIMM

000001

rs

BLTZAL

10000

offset

6

5

5

16

Format:

BLTZAL rs, offset

MIPS32, removed in Release 6

Branch on Less Than Zero and Link

Purpose:

Branch on Less Than Zero and Link

To test a GPR then do a PC-relative conditional procedure call.

Description:

 if GPR[rs] < 0 then procedure_call

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed.

Availability and Compatibility:

This instruction has been removed in Release 6.

The special case BLTZAL r0, offset, has been retained as NAL in Release 6.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Branch-and-link Restartability: GPR 31 must not be used for the source register rs, because such an instruction does

not have the same effect when re-executed. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the branch delay slot.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] < 0GPRLEN
      GPR[31] = PC + 8
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to addresses outside this range.

Encoding:

REGIMM

000001

rs

BLTZL

00010

offset

6

5

5

16

Format:

BLTZL rs, offset

MIPS32, removed in Release 6

Branch on Less Than Zero Likely

Purpose:

Branch on Less Than Zero Likely

To test a GPR then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] < 0 then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] < 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BLTZ instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

REGIMM

000001

rs

BLTZ

00000

offset

6

5

5

16

Format:

BLTZ rs, offset

MIPS32

Branch on Less Than Zero

Purpose:

Branch on Less Than Zero

To test a GPR then do a PC-relative conditional branch.

Description:

 if GPR[rs] < 0 then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs are less than zero (sign bit is 1), branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = GPR[rs] < 0GPRLEN
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump and link (JAL) or jump and link register (JALR) instructions for procedure calls to addresses outside this range.

Encoding:

BNEL

010101

rs

rt

offset

6

5

5

16

Format:

BNEL rs, rt, offset

MIPS32, removed in Release 6

Branch on Not Equal Likely

Purpose:

Branch on Not Equal Likely

To compare GPRs then do a PC-relative conditional branch; execute the delay slot only if the branch is taken.

Description:

 if GPR[rs] != GPR[rt] then branch_likely

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are not equal, branch to the effective target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = (GPR[rs] != GPR[rt])
I+1:  if condition then
         PC = PC + target_offset
      else
         NullifyCurrentInstruction()
      endif

Exceptions:

None

Implementation Note:

Some implementations always predict that the branch will be taken, and do not use nor do they update the branch internal processor branch prediction tables for this instruction. To maintain performance compatibility, future implementations are encouraged to do the same.

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

In Pre-Release 6 implementations, software is strongly encouraged to avoid the use of the Branch Likely instructions, as they will be removed from a future revision of the MIPS Architecture.

Some implementations always predict the branch will be taken, so there is a significant penalty if the branch is not taken. Software should only use this instruction when there is a very high probability (98% or more) that the branch will be taken. If the branch is not likely to be taken or if the probability of a taken branch is unknown, software is encouraged to use the BNE instruction instead.

Historical Information:

In the MIPS I architecture, this instruction signaled a Reserved Instruction exception.

Encoding:

BNE

000101

rs

rt

offset

6

5

5

16

Format:

BNE rs, rt, offset

MIPS32

Branch on Not Equal

Purpose:

Branch on Not Equal

To compare GPRs then do a PC-relative conditional branch

Description:

 if GPR[rs] != GPR[rt] then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

If the contents of GPR rs and GPR rt are not equal, branch to the effective target address after the instruction in the delay slot is executed.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
      condition = (GPR[rs] != GPR[rt])
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 KBytes. Use jump (J) or jump register

(JR) to branch to addresses outside this range.

Encoding:

BEQ

000100

0

00000

0

00000

offset

6

5

5

16

Format:

B offset 

MIPS32, Assembly Idiom

Unconditional Branch

Purpose:

Unconditional Branch

To do an unconditional branch.

Description: branch

B offset is the assembly idiom used to denote an unconditional branch. The actual instruction is interpreted by the hardware as BEQ r0, r0, offset.

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:    target_offset = sign_extend(offset || 02)
I+1:  PC = PC + target_offset

Exceptions:

None

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ± 128 Kbytes. Use jump (J) or jump register

(JR) or the Release 6 branch compact (BC) instructions to branch to addresses outside this range.

Encoding:

POP10

001000

BOVC rs >= rt

rs

rt

offset

POP30

011000

BNVC rs >= rt

rs

rt

offset

6

5

5

16

Format:

BOVC BNVC 

Detect overflow for add (signed 32 bits) and branch if overflow.

BOVC rs,rt,offset

MIPS32 Release 6

Detect overflow for add (signed 32 bits) and branch if overflow.

BNVC rs,rt,offset

MIPS32 Release 6

Detect overflow for add (signed 32 bits) and branch if no overflow.

Purpose:

Branch on Overflow, Compact; Branch on No Overflow, Compact

BOVC: Detect overflow for add (signed 32 bits) and branch if overflow.

BNVC: Detect overflow for add (signed 32 bits) and branch if no overflow.

Description:

branch if/if-not  NotWordValue(GPR[rs]+GPR[rt]) 

BOVC and BNVC are compact branches-they have no branch delay slots, but do have a forbidden slot.

A 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address.

On 64-bit processors, BOVC and BNVC detect signed 32-bit overflow on the input registers as well as the output.

This checking is performed even if 64-bit operations are not enabled.

The special case with rt=0 (for example, GPR[0]) is allowed. On MIPS64, this checks that the input value of rs is a well-formed signed 32-bit integer: BOVC rs,r0,offset branches if rs is not a 32-bit integer, and BNVC rs, r0 offset branches if rs is a 32-bit integer. On MIPS32, BOVC rs,r0 offset never branches, while BNVC rs,r0 offset always branches.

The special case of rs=0 and rt=0 is allowed. BOVC never branches, while BNVC always branches.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

If a control transfer instruction (CTI) is executed in the forbidden slot of a compact branch, Release 6 implementations are required to signal a Reserved Instruction exception, but only when the branch is not taken.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

See section A.4 on page 591 in Volume II for a complete overview of Release 6 instruction encodings. Brief notes related to these instructions:

Operation:

input_overflow = NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])
temp1 = sign_extend.32( GPR[rs]31..0 )
temp2 = sign_extend.32( GPR[rt]31..0 )
tempd = temp1 + temp2 // wider than 32-bit precision
sum_overflow = (tempd32 != tempd31)
BOVC: cond = sum_overflow or input_overflow
BNVC: cond = not( sum_overflow or input_overflow )
if cond then
   PC = ( PC+4 + sign_extend( offset << 2 ) )
endif

Exceptions:

None

Encoding:

SPECIAL

000000

code

BREAK

001101

6

20

6

Format:

BREAK 

MIPS32

Breakpoint

Purpose:

Breakpoint

To cause a Breakpoint exception

Description:

A breakpoint exception occurs, immediately and unconditionally transferring control to the exception handler. The

code field is available for use as software parameters, but is retrieved by the exception handler only by loading the

contents of the memory word containing the instruction.

Restrictions:

None

Operation:

SignalException(Breakpoint)

Exceptions:

Breakpoint

Encoding:

POP26

010110

BLEZC

00000

rt != 00000

offset

POP26

010110

BGEZC rs = rt

rs != 00000

rt != 00000

offset

POP26

010110

BGEC (BLEC) rs != rt

rs != 00000

rt != 00000

offset

POP27

010111

BGTZC

00000

rt != 00000

offset

POP27

010111

BLTZC rs = rt

rs != 00000

rt != 00000

offset

POP27

010111

BLTC (BGTC) rs != rt

rs != 00000

rt != 00000

offset

POP06

000110

BGEUC (BLEUC) rs != rt

rs != 00000

rt != 00000

offset

POP07

000111

BLTUC (BGTUC) rs != rt

rs != 00000

rt != 00000

offset

POP10

001000

BEQC rs < rt

rs != 00000

rt != 00000

offset

POP30

011000

BNEC rs < rt

rs != 00000

rt != 00000

offset

6

5

5

16

POP66

110110

BEQZC

rs != 00000

rs

offset

POP76

111110

BNEZC

rs != 00000

rs

offset

6

5

21

Format:

B<cond>C rs, rt, offset

Compact Compare-and-Branch Instructions

BEQC rs, rt, offset

MIPS32 Release 6

Equal/Not-Equal register-register compare and branch with 16-bit offset:

BNEC rs, rt, offset

MIPS32 Release 6

Equal/Not-Equal register-register compare and branch with 16-bit offset:

BLTC rs, rt, offset

MIPS32 Release 6

Signed register-register compare and branch with 16-bit offset:

BGEC rs, rt, offset

MIPS32 Release 6

Signed register-register compare and branch with 16-bit offset:

BLTUC rs, rt, offset

MIPS32 Release 6

Unsigned register-register compare and branch with 16-bit offset:

BGEUC rs, rt, offset

MIPS32 Release 6

Unsigned register-register compare and branch with 16-bit offset:

BGTC  rt, rs, offset

Assembly Idiom

Assembly idioms with reversed operands for signed/unsigned compare-and-branch:

BLEC  rt, rs, offset

Assembly Idiom

Assembly idioms with reversed operands for signed/unsigned compare-and-branch:

BGTUC rt, rs, offset

Assembly Idiom

Assembly idioms with reversed operands for signed/unsigned compare-and-branch:

BLEUC rt, rs, offset

Assembly Idiom

Assembly idioms with reversed operands for signed/unsigned compare-and-branch:

BLTZC rt, offset

MIPS32 Release 6

Signed Compare register to Zero and branch with 16-bit offset:

BLEZC rt, offset 

MIPS32 Release 6

Signed Compare register to Zero and branch with 16-bit offset:

BGEZC rt, offset

MIPS32 Release 6

Signed Compare register to Zero and branch with 16-bit offset:

BGTZC rt, offset 

MIPS32 Release 6

Signed Compare register to Zero and branch with 16-bit offset:

BEQZC rs, offset 

MIPS32 Release 6

Equal/Not-equal Compare register to Zero and branch with 21-bit offset:

BNEZC rs, offset 

MIPS32 Release 6

Equal/Not-equal Compare register to Zero and branch with 21-bit offset:

Purpose:

Compact Compare-and-Branch Instructions

Description:

 if condition(GPR[rs] and/or GPR[rt]) then compact branch (no delay slot)

The condition is evaluated. If the condition is true, the branch is taken.

An 18/23-bit signed offset (the 16/21-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address.

The offset is 16 bits for most compact branches, including BLTC, BLEC, BGEC, BGTC, BNEQC, BNEC, BLTUC,

BLEUC, BGEUC, BGTC, BLTZC, BLEZC, BGEZC, BGTZC. The offset is 21 bits for BEQZC and BNEZC.

Compact branches have no delay slot: the instruction after the branch is NOT executed if the branch is taken.

The conditions are as follows:

Equal/Not-equal register-register compare-and-branch with 16-bit offset:

BEQC: Compact branch if GPRs are equal

BNEC: Compact branch if GPRs are not equal

Signed register-register compare and branch with 16-bit offset:

BLTC: Compact branch if GPR rs is less than GPR rt

BGEC: Compact branch if GPR rs is greater than or equal to GPR rt

Unsigned register-register compare and branch with 16-bit offset:

BLTUC: Compact branch if GPR rs is less than GPR rt, unsigned

BGEUC: Compact branch if GPR rs is greater than or equal to GPR rt, unsigned

Assembly Idioms with Operands Reversed:

BLEC: Compact branch if GPR rt is less than or equal to GPR rs (alias for BGEC)

BGTC: Compact branch if GPR rt is greater than GPR rs (alias for BLTC)

BLEUC: Compact branch if GPR rt is less than or equal to GPR rt, unsigned (alias for BGEUC)

BGTUC: Compact branch if GPR rt is greater than GPR rs, unsigned (alias for BLTUC)

Compare register to zero and branch with 16-bit offset:

BLTZC: Compact branch if GPR rt is less than zero

BLEZC: Compact branch if GPR rt is less than or equal to zero

BGEZC: Compact branch if GPR rt is greater than or equal to zero

BGTZC: Compact branch if GPR rt is greater than zero

Compare register to zero and branch with 21-bit offset:

BEQZC: Compact branch if GPR rs is equal to zero

BNEZC: Compact branch if GPR rs is not equal to zero

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

If a control transfer instruction (CTI) is placed in the forbidden slot of a compact branch, Release 6 implementations are required to signal a Reserved Instruction exception, but only when the branch is not taken.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

Exceptions:

None

Operation:

target_offset = sign_extend( offset || 02 )
/* Register-register compare and branch, 16 bit offset: */
/* Equal / Not-Equal */
BEQC: cond = GPR[rs] = GPR[rt]
BNEC: cond = GPR[rs] != GPR[rt]
/* Signed */
BLTC: cond = GPR[rs] < GPR[rt]
BGEC: cond = GPR[rs] >= GPR[rt]
/* Unsigned: */
BLTUC: cond = unsigned(GPR[rs]) < unsigned(GPR[rt])
BGEUC: cond = unsigned(GPR[rs]) >= unsigned(GPR[rt])
/* Compare register to zero, small offset: */
BLTZC: cond = GPR[rt] < 0
BLEZC: cond = GPR[rt] <= 0
BGEZC: cond = GPR[rt] >= 0
BGTZC: cond = GPR[rt] > 0
/* Compare register to zero, large offset: */
BEQZC: cond = GPR[rs] = 0
BNEZC: cond = GPR[rs] != 0
if cond then
  PC = ( PC+4+ sign_extend( offset ) )
end if

Programming Notes:

Legacy software that performs incomplete instruction decode may incorrectly decode these new instructions, because of their very tight encoding. For example, a disassembler that looks only at the primary opcode field (instruction bits

31-26) to decode BLEZL without checking that the "rt" field is zero violates the pre-Release 6 architecture specification. Complete instruction decode allows reuse of pre-Release 6 BLEZL opcode for Release 6 conditional branches.

Encoding:

POP06

000110

BLEZALC

00000

rt != 00000

offset

POP06

000110

BGEZALC

rs = rt != 00000

rs

rt

offset

POP07

000111

BGTZALC

00000

rt != 00000

offset

POP07

000111

BLTZALC

rs = rt != 00000

rs

rt

offset

POP10

001000

BEQZALC

rs < rt

00000

rt != 00000

offset

POP30

011000

BNEZALC

rs < rt

00000

rt != 00000

offset

6

5

5

16

Format:

B{LE,GE,GT,LT,EQ,NE}ZALC 

Compact Zero-Compare and Branch-and-Link Instructions

BLEZALC rt, offset

MIPS32 Release 6

Compact branch-and-link if GPR rt is less than or equal to zero

BGEZALC rt, offset

MIPS32 Release 6

Compact branch-and-link if GPR rt is greater than or equal to zero

BGTZALC rt, offset

MIPS32 Release 6

Compact branch-and-link if GPR rt is greater than zero

BLTZALC rt, offset

MIPS32 Release 6

Compact branch-and-link if GPR rt is less than to zero

BEQZALC rt, offset

MIPS32 Release 6

Compact branch-and-link if GPR rt is equal to zero

BNEZALC rt, offset

MIPS32 Release 6

Compact branch-and-link if GPR rt is not equal to zero

Purpose:

Compact Zero-Compare and Branch-and-Link Instructions

BLEZALC: Compact branch-and-link if GPR rt is less than or equal to zero

BGEZALC: Compact branch-and-link if GPR rt is greater than or equal to zero

BGTZALC: Compact branch-and-link if GPR rt is greater than zero

BLTZALC: Compact branch-and-link if GPR rt is less than to zero

BEQZALC: Compact branch-and-link if GPR rt is equal to zero

BNEZALC: Compact branch-and-link if GPR rt is not equal to zero

Description:

if condition(GPR[rt]) then procedure_call branch (no delay slot)

The condition is evaluated. If the condition is true, the branch is taken.

Places the return address link in GPR 31. The return link is the address of the instruction immediately following the branch, where execution continues after a procedure call.

The return address link is unconditionally updated.

A 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), to form a PC-relative effective target address.

BLEZALC: the condition is true if and only if GPR rt is less than or equal to zero.

BGEZALC: the condition is true if and only if GPR rt is greater than or equal to zero.

BLTZALC: the condition is true if and only if GPR rt is less than zero.

BGTZALC: the condition is true if and only if GPR rt is greater than zero.

BEQZALC: the condition is true if and only if GPR rt is equal to zero.

BNEZALC: the condition is true if and only if GPR rt is not equal to zero.

Compact branches do not have delay slots. The instruction after a compact branch is only executed if the branch is not taken.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

If a control transfer instruction (CTI) is executed in the forbidden slot of a compact branch, Release 6 implementations are required to signal a Reserved Instruction exception, but only when the branch is not taken.

Branch-and-link Restartability: GPR 31 must not be used for the source registers, because such an instruction does

not have the same effect when reexecuted. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by reexecuting the branch when an exception occurs in the branch delay slot or forbidden slot.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

These instructions occupy primary opcode spaces originally allocated to other instructions. BLEZALC and

BGEZALC have the same primary opcode as BLEZ, and are distinguished by rs and rt register numbers. Similarly,

BGTZALC and BLTZALC have the same primary opcode as BGTZ, and are distinguished by register fields.

BEQZALC and BNEZALC reuse the primary opcodes ADDI and DADDI.

Exceptions:

None

Operation:

GPR[31] = PC+4
target_offset = sign_extend( offset || 02 )
BLTZALC: cond = GPR[rt] < 0
BLEZALC: cond = GPR[rt] <= 0
BGEZALC: cond = GPR[rt] >= 0
BGTZALC: cond = GPR[rt] > 0
BEQZALC: cond = GPR[rt] = 0
BNEZALC: cond = GPR[rt] != 0
if cond then
  PC = ( PC+4+ sign_extend( target_offset ) )
endif

Programming Notes:

Software that performs incomplete instruction decode may incorrectly decode these new instructions, because of their very tight encoding. For example, a disassembler might look only at the primary opcode field, instruction bits 31-26, to decode BLEZL without checking that the "rt" field is zero. Such software violated the pre-Release 6 architecture specification.

With the 16-bit offset shifted left 2 bits and sign extended, the conditional branch range is ± 128 KBytes. Other instructions such as pre-Release 6 JAL and JALR, or Release 6 JIALC and BALC have larger ranges. In particular,

BALC, with a 26-bit offset shifted by 2 bits, has a 28-bit range, ± 128 MBytes. Code sequences using AUIPC, DAHI,

DATI, and JIALC allow still greater PC-relative range.

Encoding:

COP1

010001

fmt

ft

fs

cc

0

A

0

FC

11

cond

6

5

5

5

3

1

1

2

4

Format:

C.cond.fmt 

Floating Point Compare

C.cond.S cc, fs, ft

MIPS32, removed in Release 6

Floating Point Compare

C.cond.D cc, fs, ft

MIPS32, removed in Release 6

Floating Point Compare

C.cond.PS cc, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Compare

Purpose:

Floating Point Compare

To compare FP values and record the Boolean result in a condition code.

Description:

 FPConditionCode(cc) = FPR[fs] compare_cond FPR[ft]

The value in FPR fs is compared to the value in FPR ft; the values are in format fmt. The comparison is exact and neither overflows nor underflows.

If the comparison specified by the cond field of the instruction is true for the operand values, the result is true; otherwise, the result is false. If no exception is taken, the result is written into condition code CC; true is 1 and false is 0.

In the cond field of the instruction: cond2..1 specify the nature of the comparison (equals, less than, and so on). cond0 specifies whether the comparison is ordered or unordered, that is, false or true if any operand is a NaN; cond3 indicates whether the instruction should signal an exception on QNaN inputs, or not (see Table 3.2).

C.cond.PS compares the upper and lower halves of FPR fs and FPR ft independently and writes the results into condition codes CC +1 and CC respectively. The CC number must be even. If the number is not even the operation of the instruction is UNPREDICTABLE.

If one of the values is an SNaN, or cond3 is set and at least one of the values is a QNaN, an Invalid Operation condition is raised and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written and an Invalid Operation exception is taken immediately. Otherwise, the Boolean result is written into condition code CC.

There are four mutually exclusive ordering relations for comparing floating point values; one relation is always true and the others are false. The familiar relations are greater than, less than, and equal. In addition, the IEEE floating point standard defines the relation unordered, which is true when at least one operand value is NaN; NaN compares unordered with everything, including itself. Comparisons ignore the sign of zero, so +0 equals -0.

The comparison condition is a logical predicate, or equation, of the ordering relations such as less than or equal,

equal, not less than, or unordered or equal. Compare distinguishes among the 16 comparison predicates. The Boolean result of the instruction is obtained by substituting the Boolean value of each ordering relation for the two FP values in the equation. If the equal relation is true, for example, then all four example predicates above yield a true

result. If the unordered relation is true then only the final predicate, unordered or equal, yields a true result.

Restrictions:

The fields fs and ft must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of C.cond.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU,.

The result of C.cond.PS is UNPREDICTABLE if the condition code number is odd.

Availability and Compatibility:

This instruction has been removed in Release 6 and has been replaced by the ‘CMP.cond fmt’ instruction. Refer to the

CMP.cond fmt instruction in this manual for more information. Release 6 does not support Paired Single (PS).

Operation:

if SNaN(ValueFPR(fs, fmt)) or SNaN(ValueFPR(ft, fmt)) or
   QNaN(ValueFPR(fs, fmt)) or QNaN(ValueFPR(ft, fmt)) then
   less = false
   equal = false
   unordered = true
   if (SNaN(ValueFPR(fs,fmt)) or SNaN(ValueFPR(ft,fmt))) or
   (cond3 and (QNaN(ValueFPR(fs,fmt)) or QNaN(ValueFPR(ft,fmt)))) then
      SignalException(InvalidOperation)
   endif
else
   less = ValueFPR(fs, fmt) <fmt ValueFPR(ft, fmt)
   equal = ValueFPR(fs, fmt) =fmt ValueFPR(ft, fmt)
   unordered = false
endif
condition = (cond2 and less) or (cond1 and equal)
      or (cond0 and unordered)
SetFPConditionCode(cc, condition)

For C.cond.PS, the pseudo code above is repeated for both halves of the operand registers, treating each half as an independent single-precision values. Exceptions on the two halves are logically ORed and reported together. The results of the lower half comparison are written to condition code CC; the results of the upper half comparison are written to condition code CC+1.

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation

Programming Notes:

FP computational instructions, including compare, that receive an operand value of Signaling NaN raise the Invalid

Operation condition. Comparisons that raise the Invalid Operation condition for Quiet NaNs in addition to SNaNs permit a simpler programming model if NaNs are errors. Using these compares, programs do not need explicit code to check for QNaNs causing the unordered relation. Instead, they take an exception and allow the exception handling system to deal with the error when it occurs. For example, consider a comparison in which we want to know if two numbers are equal, but for which unordered would be an error.

# comparisons using explicit tests for QNaN
   c.eq.d $f2,$f4   # check for equal
   nop
   bc1t  L2       # it is equal
   c.un.d $f2,$f4   # it is not equal, 
                   # but might be unordered
   bc1t  ERROR     # unordered goes off to an error handler
# not-equal-case code here
   ...
# equal-case code here
L2:
# -------------------------------------------------------------# comparison using comparisons that signal QNaN
   c.seq.d $f2,$f4 # check for equal
   nop
   bc1t  L2       # it is equal
   nop
# it is not unordered here
   ...
# not-equal-case code here
   ...
# equal-case code here

Encoding:

SPECIAL3

011111

base

op

offset

0

CACHEE

011011

6

5

5

9

1

6

Format:

CACHEE op, offset(base)

MIPS32

Perform Cache Operation EVA

Purpose:

Perform Cache Operation EVA

To perform the cache operation specified by op using a user mode virtual address while in kernel mode.

Description:

The 9-bit offset is sign-extended and added to the contents of the base register to form an effective address. The effective address is used in one of the following ways based on the operation to be performed and the type of cache as described in the following table.

A TLB Refill and TLB Invalid (both with cause code equal TLBL) exception can occur on any operation. For index operations (where the address is used to index the cache but need not match the cache tag) software should use unmapped addresses to avoid TLB exceptions. This instruction never causes TLB Modified exceptions nor TLB

Refill exceptions with a cause code of TLBS. This instruction never causes Execute-Inhibit nor Read-Inhibit exceptions.

The effective address may be an arbitrarily-aligned by address. The CACHEE instruction never causes an Address

Error Exception due to an non-aligned address.

A Cache Error exception may occur as a by-product of some operations performed by this instruction. For example, if a Writeback operation detects a cache or bus error during the processing of the operation, that error is reported via a

Cache Error exception. Similarly, a Bus Error Exception may occur if a bus operation invoked by this instruction is terminated in an error. However, cache error exceptions must not be triggered by an Index Load Tag or Index Store tag operation, as these operations are used for initialization and diagnostic purposes.

An Address Error Exception (with cause code equal AdEL) may occur if the effective address references a portion of the kernel address space which would normally result in such an exception. It is implementation dependent whether such an exception does occur.

It is implementation dependent whether a data watch is triggered by a cache instruction whose address matches the

Watch register address match conditions.

The CACHEE instruction and the memory transactions which are sourced by the CACHEE instruction, such as cache refill or cache writeback, obey the ordering and completion rules of the SYNC instruction.

Bits [17:16] of the instruction specify the cache on which to perform the operation, as follows:

Table 3.7 Encoding of Bits[17:16] of CACHEE Instruction

Code

Name

Cache

0b00

I

Primary Instruction

0b01

D

Primary Data or Unified Primary

0b10

T

Tertiary

0b11

S

Secondary

Bits [20:18] of the instruction specify the operation to perform. To provide software with a consistent base of cache operations, certain encodings must be supported on all processors. The remaining encodings are recommended

When implementing multiple level of caches and where the hardware maintains the smaller cache as a proper subset of a larger cache, it is recommended that the CACHEE instructions must first operate on the smaller, inner-level cache. For example, a Hit_Writeback _Invalidate operation targeting the Secondary cache, must first operate on the primary data cache first. If the CACHEE instruction implementation does not follow this policy then any software which flushes the caches must mimic this behavior. That is, the software sequences must first operate on the inner cache then operate on the outer cache. The software must place a SYNC instruction after the CACHEE instruction whenever there are possible writebacks from the inner cache to ensure that the writeback data is resident in the outer cache before operating on the outer cache. If neither the CACHEE instruction implementation nor the software cache flush sequence follow this policy, then the inclusion property of the caches can be broken, which might be a condition that the cache management hardware cannot properly deal with.

When implementing multiple level of caches without the inclusion property, you must use SYNC instruction after the

CACHEE instruction whenever writeback data has to be resident in the next level of memory hierarchy.

For multiprocessor implementations that maintain coherent caches, some of the Hit type of CACHEE instruction operations may optionally affect all coherent caches within the implementation. If the effective address uses a coherent Cache Coherency Attribute (CCA), then the operation is globalized, meaning it is broadcast to all of the coherent caches within the system. If the effective address does not use one of the coherent CCAs, there is no broadcast of the operation. If multiple levels of caches are to be affected by one CACHEE instruction, all of the affected cache levels must be processed in the same manner - either all affected cache levels use the globalized behavior or all affected cache levels use the non-globalized behavior.

The CACHEE instruction functions the same as the CACHE instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible . Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to 1.

Table 3.8 Encoding of Bits [20:18] of the CACHEE Instruction

Code

Caches

Name

Effective

Address

Operand

Type

Operation

Compliance

Implemented

0b000

I

Index Invalidate

Index

Set the state of the cache block at the specified index to invalid. This required encoding may be used by software to invalidate the entire instruction cache by stepping through all valid indices.

Required

D

Index Writeback Invalidate / Index

Invalidate

Index

For a write-back cache: If the state of the cache block at the specified index is valid and dirty, write the block back to the memory address specified by the cache tag. After that operation

Required

S, T

Index Writeback Invalidate / Index

Invalidate

Index

is completed, set the state of the cache block to invalid. If the block is valid but not dirty, set the state of the block to invalid. For a write-through cache: Set the state of the cache block at the specified index to invalid. This required encoding may be used by software to invalidate the entire data cache by stepping through all valid indices. Note that Index Store Tag should be used to initialize the cache at power up.

Required if S, T cache is implemented

0b001

All

Index Load Tag

Index

Read the tag for the cache block at the specified index into the TagLo and TagHi Coprocessor 0 registers. If the DataLo and DataHi registers are implemented, also read the data corresponding to the byte index into the DataLo and DataHi registers. This operation must not cause a Cache Error Exception. The granularity and alignment of the data read into the DataLo and DataHi registers is implementation-dependent, but is typically the result of an aligned access to the cache, ignoring the appropriate low-order bits of the byte index.

Recommended

0b010

All

Index Store Tag

Index

Write the tag for the cache block at the specified index from the TagLo and TagHi Coprocessor 0 registers. This operation must not cause a Cache Error Exception. This required encoding may be used by software to initialize the entire instruction or data caches by stepping through all valid indices. Doing so requires that the TagLo and TagHi registers associated with the cache be initialized first.

Required

0b011

All

Implementation

Dependent

Unspecified

Available for implementation-dependent operation.

Optional

0b100

I, D

Hit Invalidate

Address

If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the

Required (Instruction

Cache Encoding

Only), Recommended otherwise

S, T

Hit Invalidate

Address

instruction cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.

Optional, if

Hit_Invalidate_D is implemented, the S and T variants are recommended.

0b101

I

Fill

Address

Fill the cache from the specified address.

Recommended

D

Hit Writeback Invalidate / Hit Invalidate

Address

For a write-back cache: If the cache block contains the specified address and it is valid and dirty, write the contents back to memory. After

Required

S, T

Hit Writeback Invalidate / Hit Invalidate

Address

that operation is completed, set the state of the cache block to invalid. If the block is valid but not dirty, set the state of the block to invalid. For a write-through cache: If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the data cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.

Required if S, T cache is implemented

Restrictions:

The operation of this instruction is UNDEFINED for any operation/cache combination that is not implemented. In

Release 6, the instruction in this case should perform no operation.

The operation of this instruction is UNDEFINED if the operation requires an address, and that address is uncacheable. In Release 6, the instruction in this case should perform no operation.

The operation of the instruction is UNPREDICTABLE if the cache line that contains the CACHEE instruction is the target of an invalidate or a writeback invalidate.

If this instruction is used to lock all ways of a cache at a specific cache index, the behavior of that cache to subsequent cache misses to that cache index is UNDEFINED.

Any use of this instruction that can cause cacheline writebacks should be followed by a subsequent SYNC instruction to avoid hazards where the writeback data is not yet visible at the next level of the memory hierarchy.

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

This instruction does not produce an exception for a misaligned memory address, since it has no memory access size.

Operation:

vAddr = GPR[base] + sign_extend(offset)
(pAddr, uncached) = AddressTranslation(vAddr, DataReadReference)
CacheOp(op, vAddr, pAddr)

Exceptions:

TLB Refill Exception.

TLB Invalid Exception

Coprocessor Unusable Exception

Reserved Instruction

Address Error Exception

Cache Error Exception

Bus Error Exception

Programming Notes:

For cache operations that require an index, it is implementation dependent whether the effective address or the translated physical address is used as the cache index. Therefore, the index value should always be converted to a kseg0 address by ORing the index with 0x80000000 before being used by the cache instruction. For example, the following code sequence performs a data cache Index Store Tag operation using the index passed in GPR a0:

li    a1, 0x80000000         /* Base of kseg0 segment */
or    a0, a0, a1             /* Convert index to kseg0 address */
cache DCIndexStTag, 0(a1)    /* Perform the index store tag operation */

Encoding:

pre-Release 6

CACHE

101111

base

op

offset

6

5

5

16

Release 6

SPECIAL3

011111

base

op

offset

0

CACHE

100101

6

5

5

9

1

6

Format:

CACHE op, offset(base)

MIPS32

Perform Cache Operation

Purpose:

Perform Cache Operation

To perform the cache operation specified by op.

Description:

The 9-bit offset is sign-extended and added to the contents of the base register to form an effective address. The effective address is used in one of the following ways based on the operation to be performed and the type of cache as described in the following table.

Table 3.3 Usage of Effective Address

Operation

Requires an

Type of

Cache

Usage of Effective Address

Address

Virtual

The effective address is used to address the cache. An address translation may or may not be performed on the effective address (with the possibility that a TLB Refill or TLB Invalid exception might occur)

Address

Physical

The effective address is translated by the MMU to a physical address. The physical address is then used to address the cache

Index

N/A

The effective address is translated by the MMU to a physical address. It is implementation dependent whether the effective address or the translated physical address is used to index the cache. As such, an unmapped address (such as within kseg0) should always be used for cache operations that require an index. See the Programming Notes section below. Assuming that the total cache size in bytes is CS, the associativity is A, and the number of bytes per tag is BPT, the following calculations give the fields of the address which specify the way and the index:

OffsetBit = Log2(BPT)
 IndexBit = Log2(CS / A)
 WayBit = IndexBit + Ceiling(Log2(A))
 Way = AddrWayBit-1..IndexBit
 Index = AddrIndexBit-1..OffsetBit

For a direct-mapped cache, the Way calculation is ignored and the Index value fully specifies the cache tag. This is shown symbolically in the figure below.

Figure 3.4 Usage of Address Fields to Select Index and Way

A TLB Refill and TLB Invalid (both with cause code equal TLBL) exception can occur on any operation. For index operations (where the address is used to index the cache but need not match the cache tag), software must use unmapped addresses to avoid TLB exceptions. This instruction never causes TLB Modified exceptions nor TLB

Refill exceptions with a cause code of TLBS. This instruction never causes Execute-Inhibit nor Read-Inhibit exceptions.

The effective address may be an arbitrarily-aligned by address. The CACHE instruction never causes an Address

Error Exception due to an non-aligned address.

As a result, a Cache Error exception may occur because of some operations performed by this instruction. For example, if a Writeback operation detects a cache or bus error during the processing of the operation, that error is reported via a Cache Error exception. Also, a Bus Error Exception may occur if a bus operation invoked by this instruction is terminated in an error. However, cache error exceptions must not be triggered by an Index Load Tag or Index Store tag operation, as these operations are used for initialization and diagnostic purposes.

An Address Error Exception (with cause code equal AdEL) may occur if the effective address references a portion of the kernel address space which would normally result in such an exception. It is implementation dependent whether such an exception does occur.

It is implementation dependent whether a data watch is triggered by a cache instruction whose address matches the

Watch register address match conditions.

The CACHE instruction and the memory transactions which are sourced by the CACHE instruction, such as cache refill or cache writeback, obey the ordering and completion rules of the SYNC instruction.

Bits [17:16] of the instruction specify the cache on which to perform the operation, as follows:

Table 3.4 Encoding of Bits[17:16] of CACHE Instruction

Code

Name

Cache

0b00

I

Primary Instruction

0b01

D

Primary Data or Unified Primary

0b10

T

Tertiary

0b11

S

Secondary

Bits [20:18] of the instruction specify the operation to perform. To provide software with a consistent base of cache operations, certain encodings must be supported on all processors. The remaining encodings are recommended

When implementing multiple level of caches and where the hardware maintains the smaller cache as a proper subset of a larger cache (every address which is resident in the smaller cache is also resident in the larger cache; also known as the inclusion property). It is recommended that the CACHE instructions which operate on the larger, outer-level cache; must first operate on the smaller, inner-level cache. For example, a Hit_Writeback _Invalidate operation targeting the Secondary cache, must first operate on the primary data cache first. If the CACHE instruction implementation does not follow this policy then any software which flushes the caches must mimic this behavior. That is, the software sequences must first operate on the inner cache then operate on the outer cache. The software must place a

SYNC instruction after the CACHE instruction whenever there are possible writebacks from the inner cache to ensure that the writeback data is resident in the outer cache before operating on the outer cache. If neither the CACHE instruction implementation nor the software cache flush sequence follow this policy, then the inclusion property of the caches can be broken, which might be a condition that the cache management hardware cannot properly deal with.

When implementing multiple level of caches without the inclusion property, the use of a SYNC instruction after the

CACHE instruction is still needed whenever writeback data has to be resident in the next level of memory hierarchy.

For multiprocessor implementations that maintain coherent caches, some of the Hit type of CACHE instruction operations may optionally affect all coherent caches within the implementation. If the effective address uses a coherent

Cache Coherency Attribute (CCA), then the operation is globalized, meaning it is broadcast to all of the coherent caches within the system. If the effective address does not use one of the coherent CCAs, there is no broadcast of the operation. If multiple levels of caches are to be affected by one CACHE instruction, all of the affected cache levels must be processed in the same manner - either all affected cache levels use the globalized behavior or all affected cache levels use the non-globalized behavior.

Table 3.5 Encoding of Bits [20:18] of the CACHE Instruction

Code

Caches

Name

Effective

Address

Operand

Type

Operation

Compliance

Implemented

0b000

I

Index Invalidate

Index

Set the state of the cache block at the specified index to invalid. This required encoding may be used by software to invalidate the entire instruction cache by stepping through all valid indices.

Required

D

Index Writeback Invalidate / Index

Invalidate

Index

For a write-back cache: If the state of the cache block at the specified index is valid and dirty, write the block back to the memory address specified by the cache tag. After that operation

Required

S, T

Index Writeback Invalidate / Index

Invalidate

Index

is completed, set the state of the cache block to invalid. If the block is valid but not dirty, set the state of the block to invalid. For a write-through cache: Set the state of the cache block at the specified index to invalid. This required encoding may be used by software to invalidate the entire data cache by stepping through all valid indices. The Index Store Tag must be used to initialize the cache at power up.

Required if S, T cache is implemented

0b001

All

Index Load Tag

Index

Read the tag for the cache block at the specified index into the TagLo and TagHi Coprocessor 0 registers. If the DataLo and DataHi registers are implemented, also read the data corresponding to the byte index into the DataLo and DataHi registers. This operation must not cause a Cache Error Exception. The granularity and alignment of the data read into the DataLo and DataHi registers is implementation-dependent, but is typically the result of an aligned access to the cache, ignoring the appropriate low-order bits of the byte index.

Recommended

0b010

All

Index Store Tag

Index

Write the tag for the cache block at the specified index from the TagLo and TagHi Coprocessor 0 registers. This operation must not cause a Cache Error Exception. This required encoding may be used by software to initialize the entire instruction or data caches by stepping through all valid indices. Doing so requires that the TagLo and TagHi registers associated with the cache be initialized first.

Required

0b011

All

Implementation

Dependent

Unspecified

Available for implementation-dependent operation.

Optional

0b100

I, D

Hit Invalidate

Address

If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the

Required (Instruction

Cache Encoding

Only), Recommended otherwise

S, T

Hit Invalidate

Address

instruction cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.

Optional, if

Hit_Invalidate_D is implemented, the S and T variants are recommended.

0b101

I

Fill

Address

Fill the cache from the specified address.

Recommended

D

Hit Writeback Invalidate / Hit Invalidate

Address

For a write-back cache: If the cache block contains the specified address and it is valid and dirty, write the contents back to memory. After

Required

S, T

Hit Writeback Invalidate / Hit Invalidate

Address

that operation is completed, set the state of the cache block to invalid. If the block is valid but not dirty, set the state of the block to invalid. For a write-through cache: If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the data cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.

Required if S, T cache is implemented

Restrictions:

The operation of this instruction is UNDEFINED for any operation/cache combination that is not implemented. In

Release 6, the instruction in this case should perform no operation.

The operation of this instruction is UNDEFINED if the operation requires an address, and that address is uncacheable. In Release 6, the instruction in this case should perform no operation.

The operation of the instruction is UNPREDICTABLE if the cache line that contains the CACHE instruction is the target of an invalidate or a writeback invalidate.

If this instruction is used to lock all ways of a cache at a specific cache index, the behavior of that cache to subsequent cache misses to that cache index is UNDEFINED.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Any use of this instruction that can cause cacheline writebacks should be followed by a subsequent SYNC instruction to avoid hazards where the writeback data is not yet visible at the next level of the memory hierarchy.

This instruction does not produce an exception for a misaligned memory address, since it has no memory access size.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

vAddr = GPR[base] + sign_extend(offset)
(pAddr, uncached) = AddressTranslation(vAddr, DataReadReference)
CacheOp(op, vAddr, pAddr)

Exceptions:

TLB Refill Exception.

TLB Invalid Exception

Coprocessor Unusable Exception

Address Error Exception

Cache Error Exception

Bus Error Exception

Programming Notes:

Release 6 architecture implements a 9-bit offset, whereas all release levels lower than Release 6 implement a 16-bit offset.

For cache operations that require an index, it is implementation dependent whether the effective address or the translated physical address is used as the cache index. Therefore, the index value should always be converted to an unmapped address (such as an kseg0 address - by ORing the index with 0x80000000 before being used by the cache instruction). For example, the following code sequence performs a data cache Index Store Tag operation using the index passed in GPR a0:

li    a1, 0x80000000         /* Base of kseg0 segment */
or    a0, a0, a1             /* Convert index to kseg0 address */
cache DCIndexStTag, 0(a1)    /* Perform the index store tag operation */

Encoding:

COP1

010001

fmt

0

00000

fs

fd

CEIL.L

001010

6

5

5

5

5

6

Format:

CEIL.L.fmt 

Fixed Point Ceiling Convert to Long Fixed Point

CEIL.L.S   fd, fs

MIPS64, MIPS32 Release 2

Fixed Point Ceiling Convert to Long Fixed Point

CEIL.L.D   fd, fs

MIPS64, MIPS32 Release 2

Fixed Point Ceiling Convert to Long Fixed Point

Purpose:

Fixed Point Ceiling Convert to Long Fixed Point

To convert an FP value to 64-bit fixed point, rounding up.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed point format and rounding toward +<=

(rounding mode 2). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -263 to 263-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

263–1. On cores with FCSRNAN2008=1, the default result is:

263–1 when the input value is +inf or rounds to a number larger than 263–1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for long fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Operation:

StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact

Encoding:

COP1

010001

fmt

0

00000

fs

fd

CEIL.W

001110

6

5

5

5

5

6

Format:

CEIL.W.fmt 

Floating Point Ceiling Convert to Word Fixed Point

CEIL.W.S   fd, fs

MIPS32

Floating Point Ceiling Convert to Word Fixed Point

CEIL.W.D   fd, fs

MIPS32

Floating Point Ceiling Convert to Word Fixed Point

Purpose:

Floating Point Ceiling Convert to Word Fixed Point

To convert an FP value to 32-bit fixed point, rounding up

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed point format and rounding toward +<=

(rounding mode 2). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -231 to 231-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

231-1. On cores with FCSRNAN2008=1, the default result is:

231-1 when the input value is +inf or rounds to a number larger than 231-1

The fields fs and fd must specify valid FPRs; fs for type fmt and fd for word fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact

Encoding:

COP1

010001

CF

00010

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

CFC1 rt, fs

MIPS32

Move Control Word From Floating Point

Purpose:

Move Control Word From Floating Point

To copy a word from an FPU control register to a GPR.

Description:

 GPR[rt] = FP_Control[fs]

Copy the 32-bit word from FP (coprocessor 1) control register fs into GPR rt, sign-extending it to 64 bits.

The definition of this instruction has been extended in Release 5 to support user mode read and write of StatusFR under the control of Config5UFR. This optional feature is meant to facilitate transition from FR=0 to FR=1 floating-point register modes in order to obsolete FR=0 mode in a future architecture release. User code may set and clear StatusFR without kernel intervention, providing kernel explicitly provides permission.

This UFR facility is not supported in Release 6 because Release 6 only allows FR=1 mode. Accessing the UFR and

UNFR registers causes a Reserved Instruction exception in Release 6 because FIRUFRP is always 0.

The definition of this instruction has been extended in Release 6 to allow user code to read and modify the Config5FRE bit. Such modification is allowed when this bit is present (as indicated by FIRUFRP) and user mode

modification of the bit is enabled by the kernel (as indicated by Config5UFE). Setting Config5FRE to 1 causes all floating point instructions which are not compatible with FR=1 mode to take an Reserved Instruction exception. This makes it possible to run pre-Release 6 FR=0 floating point code on a Release 6 core which only supports FR=1 mode, provided the kernel has been set up to trap and emulate FR=0 behavior for these instructions. These instructions include floating-point arithmetic instructions that read/write single-precision registers, LWC1, SWC1, MTC1, and

MFC1 instructions.

The FRE facility uses COP1 register aliases FRE and NFRE to access Config5FRE.

Restrictions:

There are a few control registers defined for the floating point unit. Prior to Release 6, the result is UNPREDICTABLE if fs specifies a register that does not exist. In Release 6 and later, a Reserved Instruction exception occurs if fs specifies a register that does not exist.

The result is UNPREDICTABLE if fs specifies the UNFR or NFRE write-only control. Release 6 and later implementations are required to produce a Reserved Instruction exception; software must assume it is UNPREDICTABLE.

Operation:

   if fs = 0 then
   temp = FIR
elseif fs = 1 then /* read UFR (CP1 Register 1) */
   if FIRUFRP then
      if not Config5UFR then SignalException(ReservedInstruction) endif
      temp = StatusFR
   else
      if ConfigAR ≥ 2 SignalException(ReservedInstruction) /* Release 6 traps */ 
      endif
      temp = UNPREDICTABLE
   endif
elseif fs = 4 then /* read fs=4 UNFR not supported for reading - UFR suffices */
      if ConfigAR ≥ 2 SignalException(ReservedInstruction) /* Release 6 traps */ 
      endif
      temp = UNPREDICTABLE
elseif fs=5 then /* user read of FRE, if permitted */
   if ConfigAR <= 2 then temp = UNPREDICTABLE
   else
      if not Config5UFR then SignalException(ReservedInstruction) endif
      temp = 031 || Config5FRE 
   endif
elseif fs = 25 then /* FCCR */
   temp = 024 || FCSR31..25 || FCSR23
elseif fs = 26 then /* FEXR */
   temp = 014 || FCSR17..12 || 05 || FCSR6..2 || 02
elseif fs = 28 then /* FENR */
   temp = 020 || FCSR11.7 || 04 || FCSR24 || FCSR1..0
elseif fs = 31 then /* FCSR */
   temp = FCSR
else
   if Config2AR ≥ 2 SignalException(ReservedInstruction)
   /*Release 6 traps; includes NFRE*/
   endif
   temp = UNPREDICTABLE
endif
if Config2AR < 2 then
   GPR[rt] = sign_extend(temp)
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Historical Information:

For the MIPS I, II and III architectures, the contents of GPR rt are UNPREDICTABLE for the instruction immediately following CFC1.

MIPS V and MIPS32 introduced the three control registers that access portions of FCSR. These registers were not available in MIPS I, II, III, or IV.

MIPS32 Release 5 introduced the UFR and UNFR register aliases that allow user level access to StatusFR. Release 6 removes them.

Encoding:

COP2

010010

CF

00010

rt

Impl

6

5

5

16

Format:

CFC2 rt, Impl

MIPS32

Move Control Word From Coprocessor 2

Purpose:

Move Control Word From Coprocessor 2

To copy a word from a Coprocessor 2 control register to a GPR

Description:

 GPR[rt] = CP2CCR[Impl]

Copy the 32-bit word from the Coprocessor 2 control register denoted by the Impl field, sign-extending it to 64 bits.

The interpretation of the Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the architecture.

Restrictions:

The result is UNPREDICTABLE if Impl specifies a register that does not exist.

Operation:

temp = CP2CCR[Impl]
GPR[rt] = sign_extend(temp)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

fmt

00000

fs

fd

CLASS

011011

6

5

5

5

2

9

Format:

CLASS.fmt 

Scalar Floating-Point Class Mask

CLASS.S fd,fs

MIPS32 Release 6

Scalar Floating-Point Class Mask

CLASS.D fd,fs

MIPS32 Release 6

Scalar Floating-Point Class Mask

Purpose:

Scalar Floating-Point Class Mask

Scalar floating-point class shown as a bit mask for Zero, Negative, Infinite, Subnormal, Quiet NaN, or Signaling

NaN.

Description:

FPR[fd] = class(FPR[fs])

Stores in fd a bit mask reflecting the floating-point class of the floating point scalar value fs.

The mask has 10 bits as follows. Bits 0 and 1 indicate NaN values: signaling NaN (bit 0) and quiet NaN (bit 1). Bits

2, 3, 4, 5 classify negative values: infinity (bit 2), normal (bit 3), subnormal (bit 4), and zero (bit 5). Bits 6, 7, 8, 9 classify positive values: infinity (bit 6), normal (bit 7), subnormal (bit 8), and zero (bit 9).

This instruction corresponds to the class operation of the IEEE Standard for Floating-Point Arithmetic 754TM-2008.

This scalar FPU instruction also corresponds to the vector FCLASS.df instruction of MSA.

The input values and generated bit masks are not affected by the flush-subnormal-to-zero mode FCSR.FS.

The input operand is a scalar value in floating-point data format fmt. Bits beyond the width of fmt are ignored. The result is a 10-bit bitmask as described above, zero extended to fmt-width bits. Coprocessor register bits beyond fmtwidth bits are UNPREDICTABLE (e.g., for CLASS.S bits 32-63 are UNPREDICTABLE on a 64-bit FPU, while bits

32-128 bits are UNPREDICTABLE if the processor supports MSA).

Restrictions:

No data-dependent exceptions are possible.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

CLASS.fmt is defined only for formats S and D. Other formats must produce a Reserved Instruction exception

(unless used for a different instruction).

Operation:

if not IsCoprocessorEnabled(1) 
   then SignalException(CoprocessorUnusable, 1) endif
if not IsFloatingPointImplemented(fmt)) 
   then SignalException(ReservedInstruction) endif
fin = ValueFPR(fs,fmt)
masktmp = ClassFP(fin, fmt)
StoreFPR (fd, fmt, ftmp )
/* end of instruction */
function ClassFP(tt, ts, n)
/* Implementation defined class operation. */
endfunction ClassFP

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

pre-Release 6

SPECIAL2

011100

rs

rt

rd

0

00000

CLO

100001

6

5

5

5

5

6

Release 6

SPECIAL

000000

rs

00000

rd

00001

CLO

010001

6

5

5

5

5

6

Format:

CLO rd, rs

MIPS32

Count Leading Ones in Word

Purpose:

Count Leading Ones in Word

To count the number of leading ones in a word.

Description:

 GPR[rd] = count_leading_ones GPR[rs]

Bits 31..0 of GPR rs are scanned from most significant to least significant bit. The number of leading ones is counted and the result is written to GPR rd. If all of bits 31..0 were set in GPR rs, the result written to GPR rd is 32.

Restrictions:

Pre-Release 6: To be compliant with the MIPS32 and MIPS64 Architecture, software must place the same GPR number in both the rt and rd fields of the instruction. The operation of the instruction is UNPREDICTABLE if the rt and

rd fields of the instruction contain different values. Release 6's new instruction encoding does not contain an rt field.

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the results of the operation are

UNPREDICTABLE.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

if NotWordValue(GPR[rs]) then
   UNPREDICTABLE
endif
temp = 32
for i in 31 .. 0
   if GPR[rs]i = 0 then
      temp = 31 - i
      break
   endif
endfor
GPR[rd] = temp

Exceptions:

None

Programming Notes:

As shown in the instruction drawing above, the Release 6 architecture sets the 'rt' field to a value of 00000.

Encoding:

pre-Release 6

SPECIAL2

011100

rs

rt

rd

0

00000

CLZ

100000

6

5

5

5

5

6

Release 6

SPECIAL

000000

rs

00000

rd

00001

CLZ

010000

6

5

5

5

5

6

Format:

CLZ rd, rs

MIPS32

Count Leading Zeros in Word

Purpose:

Count Leading Zeros in Word

Count the number of leading zeros in a word.

Description:

 GPR[rd] = count_leading_zeros GPR[rs]

Bits 31..0 of GPR rs are scanned from most significant to least significant bit. The number of leading zeros is counted and the result is written to GPR rd. If no bits were set in GPR rs, the result written to GPR rdis 32.

Restrictions:

Pre-Release 6: To be compliant with the MIPS32 and MIPS64 Architecture, software must place the same GPR number in both the rt and rd fields of the instruction. The operation of the instruction is UNPREDICTABLE if the rt and

rd fields of the instruction contain different values. Release 6's new instruction encoding does not contain an rt field.

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the results of the operation are

UNPREDICTABLE.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

if NotWordValue(GPR[rs]) then
   UNPREDICTABLE
endif
temp = 32
for i in 31 .. 0
   if GPR[rs]i = 1 then
      temp = 31 - i
      break
   endif
endfor
GPR[rd] = temp

Exceptions:

None

Programming Notes:

Release 6 sets the 'rt' field to a value of 00000.

Encoding:

COP1

010001

CMP.condn.S

10100

ft

fs

fd

0

condn

COP1

010001

CMP.condn.D

10101

ft

fs

fd

0

condn

6

5

5

5

5

1

5

Format:

CMP.condn.fmt 

Floating Point Compare Setting Mask

CMP.condn.S fd, fs, ft

MIPS32 Release 6

Floating Point Compare Setting Mask

CMP.condn.D fd, fs, ft

MIPS32 Release 6

Floating Point Compare Setting Mask

Purpose:

Floating Point Compare Setting Mask

To compare FP values and record the result as a format-width mask of all 0s or all 1s in a floating point register

Description:

 FPR[fd] = FPR[fs] compare_cond FPR[ft]

The value in FPR fs is compared to the value in FPR ft.

The comparison is exact and neither overflows nor underflows.

If the comparison specified by the condn field of the instruction is true for the operand values, the result is true; otherwise, the result is false. If no exception is taken, the result is written into FPR fd; true is all 1s and false is all 0s, repeated the operand width of fmt. All other bits beyond the operand width fmt are UNPREDICTABLE. For example, a 32-bit single precision comparison writes a mask of 32 0s or 1s into bits 0 to 31 of FPR fd. It makes bits 32 to 63

UNPREDICTABLE if a 64-bit FPU without MSA is present. It makes bits 32 to 127 UNPREDICTABLE if MSA is present.

The values are in format fmt. These instructions, however, do not use an fmt field to determine the data type.

The condn field of the instruction specifies the nature of the comparison: equals, less than, and so on, unordered or ordered, signalling or quiet, as specified in Table 3.9 "Comparing CMP.condn.fmt, IEEE 754-2008, C.cond.fmt, and

MSA FP compares" on page 160.

Release 6: The condn field bits have specific purposes: cond4, and cond2..1 specify the nature of the comparison

(equals, less than, and so on); cond0 specifies whether the comparison is ordered or unordered, that is false or true if any operand is a NaN; cond3 indicates whether the instruction should signal an exception on QNaN inputs. However, in the future the MIPS ISA may be extended in ways that do not preserve these meanings.

All encodings of the condn field that are not specified (for example, items shaded in Table 3.9) are reserved in

Release 6 and produce a Reserved Instruction exception.

If one of the values is an SNaN, or if a signalling comparison is specified and at least one of the values is a QNaN, an

Invalid Operation condition is raised and the Invalid Operation flag is set in the FCSR. If the Invalid Operation

Enable bit is set in the FCSR, no result is written and an Invalid Operation exception is taken immediately. Otherwise,

the mask result is written into FPR fd.

There are four mutually exclusive ordering relations for comparing floating point values; one relation is always true and the others are false. The familiar relations are greater than, less than, and equal. In addition, the IEEE floating point standard defines the relation unordered, which is true when at least one operand value is NaN; NaN compares unordered with everything, including itself. Comparisons ignore the sign of zero, so +0 equals -0.

The comparison condition is a logical predicate, or equation, of the ordering relations such as less than or equal,

equal, not less than, or unordered or equal. Compare distinguishes among the 16 comparison predicates. The Boolean result of the instruction is obtained by substituting the Boolean value of each ordering relation for the two FP values in the equation. For example: If the equal relation is true, then all four example predicates above yield a true

result. If the unordered relation is true then only the final predicate, unordered or equal, yields a true result.

The predicates implemented are described in Table 3.9 "Comparing CMP.condn fmt, IEEE 754-2008, C.cond.fmt, and MSA FP compares" on page 160. Not all of the 16 IEEE predicates are implemented directly by hardware. For the directed comparisons (LT, LE, GT, GE) the missing predicates can be obtained by reversing the FPR register operands ft and fs. For example, the hardware implements the "Ordered Less Than" predicate LT(fs,ft); reversing the operands LT(ft,fs) produces the dual predicate "Unordered or Greater Than or Equal" UGE(fs,ft). Table 3.9 shows these mappings. Reversing inputs is ineffective for the symmetric predicates such as EQ; Release 6 implements these negative predicates directly, so that all mask values can be generated in a single instruction.

Table 3.9 compares CMP.condn.fmt to (1) the MIPS32 Pre-Release 6 C.cond.fmt instructions, and (2) the (MSA)

MIPS SIMD Architecture packed vector floating point comparison instructions. CMP.condn fmt provides exactly the same comparisons for FPU scalar values that MSA provides for packed vectors, with similar mnemonics.

CMP.condn fmt provides a superset of the MIPS32 Release 5 C.cond fmt comparisons.

In addition, Table 3.9 shows the corresponding IEEE 754-2008 comparison operations.

Restrictions:

Operation:

if SNaN(ValueFPR(fs, fmt)) or SNaN(ValueFPR(ft, fmt)) or
   QNaN(ValueFPR(fs, fmt)) or QNaN(ValueFPR(ft, fmt)) 
then
   less = false
   equal = false
   unordered = true
   if (SNaN(ValueFPR(fs,fmt)) or SNaN(ValueFPR(ft,fmt))) or
      (cond3 and (QNaN(ValueFPR(fs,fmt)) or QNaN(ValueFPR(ft,fmt)))) then
          SignalException(InvalidOperation)
   endif
   else
      less = ValueFPR(fs, fmt) <fmt ValueFPR(ft, fmt)
      equal = ValueFPR(fs, fmt) =fmt ValueFPR(ft, fmt)
      unordered = false
   endif
   condition = cond4 xor (
          (cond2 and less) 
          or (cond1 and equal)
          or (cond0 and unordered) )
   StoreFPR (fd, fmt, ExtendBit.fmt(condition))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation

Encoding:

COP2

010010

CO

1

cofun

6

1

25

Format:

COP2 func

MIPS32

Coprocessor Operation to Coprocessor 2

Purpose:

Coprocessor Operation to Coprocessor 2

To perform an operation to Coprocessor 2.

Description:

 CoprocessorOperation(2, cofun)

An implementation-dependent operation is performed to Coprocessor 2, with the cofun value passed as an argument.

The operation may specify and reference internal coprocessor registers, and may change the state of the coprocessor conditions, but does not modify state within the processor. Details of coprocessor operation and internal state are described in the documentation for each Coprocessor 2 implementation.

Restrictions:

Operation:

CoprocessorOperation(2, cofun)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL3

011111

rs

rt

00000

000

sz

CRC

001111

6

5

5

5

3

2

6

Format:

CRC32B, CRC32H, CRC32W, CRC32D 

Generate CRC with reversed polynomial 0xEDB88320

CRC32B rt, rs, rt

MIPS32 Release 6

Generate CRC with reversed polynomial 0xEDB88320

CRC32H rt, rs, rt

MIPS32 Release 6

Generate CRC with reversed polynomial 0xEDB88320

CRC32W rt, rs, rt

MIPS32 Release 6

Generate CRC with reversed polynomial 0xEDB88320

CRC32D rt, rs, rt

MIPS64 Release 6

Generate CRC with reversed polynomial 0xEDB88320

Purpose:

Generate CRC with reversed polynomial 0xEDB88320

Description:

GPR[rt] = CRC32(GRP[rs], GPR[rt])

CRC32B/H/W/D generates a 32-bit Cyclic Redundancy Check (CRC) value based on the reversed polynomial

0xEDB88320. The new 32-bit CRC value is generated with a cumulative 32-bit CRC value input as GPR[rt] and a byte or half-word or word or double-word message right-justified in GPR[rs]. The message size is encoded in field sz of the instruction.

The generated value overwrites the input CRC value in GPR[rt], after sign extension, as the original value is considered redundant once the cumulative CRC value is re-generated with the additional message. More importantly, source-destroying definition of the CRC instruction allows the instruction to be included in a loop without having to move the destination to the source for the next iteration of the instruction.

The CRC32B/H/W/D instruction does not pad the input message. It is software's responsibility to ensure the input message, whether byte, half-word, word or double-word is fully-defined, otherwise the result is UNPREDICTABLE and thus unusable.

The reversed polynomial is a 33-bit polynomial of degree 32. Since the coefficient of most significance is always 1, it is dropped from the 32-bit binary number, as per standard representation. The order of the remaining coefficients increases from right to left in the binary representation.

Since the CRC is processed more than a bit at a time, the order of bits in the data elements of size byte, half-word, word or double-word is important. The specification assumes support for an "lsb-first" (little-endian) standard, and thus coefficients of polynomial terms that represent the message must be ordered from right to left in order of deccreasing significance.

The specification of the CRC instruction assumes the following in regards to a message of arbitrary length whose 32bit CRC value is to be generated. The message itself is a polynomial represented by binary coefficients of each term of the polynomial.

The typical use of CRC is to generate a checksum to accompany a message that is transmitted electronically in order to detect transmission errors at the receiving end. If the message is considered to be a polynomial, the coefficient of the most-significant term is transmitted first followed by remaining bits in order of decreasing significance, followed by the 32-bit CRC. The specification for these CRC instruction is thus most appropriate for standards that transmit least significant bit of data first (little-endian), such as IEEE 802 Ethernet. The least-significant bit of data conveniently maps to the coefficient of the most-significant term of the message polynomial.

Restrictions:

No data-dependent exceptions are possible.

Operation:

if (Config5CRCP = 0) then
   SignalException(ReservedInstruction)
endif
if (sz = 0b00) then
   temp = CRC32(GPR[rt], GPR[rs], 1, 0xEDB88320)
else if (sz = 0b01) then
   temp = CRC32(GPR[rt], GPR[rs], 2, 0xEDB88320)
else if (sz = 0b10) then
   temp = CRC32(GPR[rt], GPR[rs], 4, 0xEDB88320)
else if (sz = 0b11) then
   if (Are64BitOperationsEnabled()) then
      temp = CRC32(GPR[rt], GPR[rs], 8, 0xEDB88320)
   else
      SignalException(ReservedInstruction)
   endif
endif
GPR[rt] = sign_extend(temp)
// Bit oriented definition of CRC32 function
function CRC32(value, message, numbytes, poly)
   # value - right-justified current 32-bit CRC value
   # message - right-justified byte/half-word/word/double-word message
   # numbytes - size of message in bytes: byte/half-word/word/double-word
   # poly - 32-bit reversed polynomial
   value = (value and 0xffffffff) xor {(64-(numbytes*8))'b0,message} 
   for (i=0; i<numbytes*8; i++)
      if (value and 0d1) then // check most significant coefficient
         value = (value >> 1) xor poly
      else
         value = (value >> 1) 
      endif
   endfor
   return value
endfunction

Exceptions:

Reserved Instruction Exception

Restriction:

These instructions are implemented in Release 6 only if CP0 Config5CRCP is set to 1.

Programing Notes:

When calculating CRC, it is recommended that the initial value of GPR[rt] be all ones, when the CRC instruction is the first in the sequence to be referenced. This allows the CRC to differentiate between actual leading zeroes in the message element, and zeros added by transmission errors. The initial all one's value makes no difference to the CRC calculation as long as both sender and receiver use the same assumption on the initial value, to generate and check respectively.

If the order of bits in bytes assumes most-significant bit first, then Release 6 BITSWAP can be used to reverse the order of bits in order to operate with these instructions. However BITSWAP would only apply to byte messages.

CRC32B/H/W/D instructions are interchangeable: a series of low-order CRC instructions can be reduced to a series of high-order CRC32H operations, to increase throughput of the overall CRC generation process. The process of doing this will add trailing zeroes to the message for which CRC is being generated, since the data element is now larger, however, this will not change the CRC value that is generated. It is the original message that must be transmitted along with the CRC, without the trailing zeroes.

In pseudo-assembly, the following sequence of byte CRC operations may be used to generate a cumulative CRC value. (Pseudo-assembly is used to clearly indicate terms which need to be modified for interchangeability.)

li $3, 0xFFFF_FFFF     // initialize CRC value
la $4, memaddr         // assume word-aligned for convenience
for (i=0; i < byte_cnt; i++)
lb $2, 0($4)        // read message bytes
crc32b $3, $2, $3
add $4, $4, 1       // increment byte memory address by 1
endfor

This is equivalent to the sequence of word CRC operations. The simple example assumes some multiple of 4 bytes are processed.

for (i=0; i < byte_cnt/4; i++)
lw $2, 0($4)        // read message words
crc32w $3, $2, $3
add $4, $4, 4       // increment word memory address by 4
endfor

The throughput is thus increased by a multiple of 4 as only a quarter of the byte oriented operations occur.

Encoding:

SPECIAL3

011111

rs

rt

00000

001

sz

CRC

001111

6

5

5

5

3

2

6

Format:

CRC32CB, CRC32CH, CRC32CW, CRC32CD 

Generate CRC with reversed polynomial 0x82F63B78

CRC32CB rt, rs, rt

MIPS32 Release 6

Generate CRC with reversed polynomial 0x82F63B78

CRC32CH rt, rs, rt

MIPS32 Release 6

Generate CRC with reversed polynomial 0x82F63B78

CRC32CW rt, rs, rt

MIPS32 Release 6

Generate CRC with reversed polynomial 0x82F63B78

CRC32CD rt, rs, rt

MIPS64 Release 6

Generate CRC with reversed polynomial 0x82F63B78

Purpose:

Generate CRC with reversed polynomial 0x82F63B78

Description:

GPR[rt] = CRC32C(GRP[rs], GPR[rt])

CRC32CB/H/W/D generates a 32-bit Cyclic Redundancy Check (CRC) value based on the reversed polynomial

0x82F63B78 (Castagnoli). The new 32-bit CRC value is generated with a cumulative 32-bit CRC value input as

GPR[rt] and a byte or half-word or word or double-word message right-justified in GPR[rs]. The message size is encoded in field sz of the instruction.

The generated value overwrites the input CRC value in GPR[rt], after sign extension, as the original value is considered redundant once the cumulative CRC value is re-generated with the additional message. More importantly, source-destroying definition of the CRC instruction allows the instruction to be included in a loop without having to move the destination to the source for the next iteration of the instruction.

The CRC32CB/H/W/D instruction does not pad the input message. It is software's responsibility to ensure the input message, whether byte, half-word, word or double-word is fully-defined, otherwise the result is UNPREDICTABLE and thus unusable.

The reversed polynomial is a 33-bit polynomial of degree 32. Since the coefficient of most significance is always 1, it is dropped from the 32-bit binary number, as per standard representation. The order of the remaining coefficients increases from right to left in the binary representation.

Since the CRC is processed more than a bit at a time, the order of bits in the data elements of size byte, half-word, word or double-word is important. The specification assumes support for an "lsb-first" (little-endian) standard, and thus coefficients of polynomial terms that represent the message must be ordered from right to left in order of deccreasing significance.

The specification of the CRC instruction assumes the following in regards to a message of arbitrary length whose 32bit CRC value is to be generated. The message itself is a polynomial represented by binary coefficients of each term of the polynomial.

The typical use of CRC is to generate a checksum to accompany a message that is transmitted electronically in order to detect transmission errors at the receiving end. If the message is considered to be a polynomial, the coefficient of the most-significant term is transmitted first followed by remaining bits in order of decreasing significance, followed by the 32-bit CRC. The specification for these CRC instruction is thus most appropriate for standards that transmit least significant bit of data first (little-endian), such as IEEE 802 Ethernet. The least-significant bit of data conveniently maps to the coefficient of the most-significant term of the message polynomial.

Restrictions:

No data-dependent exceptions are possible.

Operation:

if (Config5CRCP = 0) then
   SignalException(ReservedInstruction)
endif
if (sz = 0b00) then
   temp = CRC32(GPR[rt], GPR[rs], 1, 0x82F63B78)
else if (sz = 0b01) then
   temp = CRC32(GPR[rt], GPR[rs], 2, 0x82F63B78)
else if (sz = 0b10) then
   temp = CRC32(GPR[rt], GPR[rs], 4, 0x82F63B78)
else if (sz = 0b11) then
   if (Are64BitOperationsEnabled()) then
      temp = CRC32(GPR[rt], GPR[rs], 8, 0x82F63B78)
   else
      SignalException(ReservedInstruction)
   endif
endif
GPR[rt] = sign_extend(temp)
// Bit oriented definition of CRC32 function
function CRC32(value, message, numbytes, poly)
   # value - right-justified current 32-bit CRC value
   # message - right-justified byte/half-word/word/double-word message
   # numbytes - size of message in bytes: byte/half-word/word/double-word
   # poly - 32-bit reversed polynomial
   value = (value and 0xffffffff) xor {(64-(numbytes*8))'b0,message} 
   for (i=0; i<numbytes*8; i++)
      if (value and 0d1) then // check most significant coefficient
         value = (value >> 1) xor poly
      else
         value = (value >> 1) 
      endif
   endfor
   return value
endfunction

Exceptions:

Reserved Instruction Exception

Restriction:

These instructions are implemented in Release 6 only if CP0 Config5CRCP is set to 1.

Programing Notes:

When calculating CRC, it is recommended that the initial value of GPR[rt] be all ones, when the CRC instruction is the first in the sequence to be referenced. This allows the CRC to differentiate between actual leading zeroes in the message element, and zeros added by transmission errors. The initial all one's value makes no difference to the CRC calculation as long as both sender and receiver use the same assumption on the initial value, to generate and check respectively.

If the order of bits in bytes assumes most-significant bit first, then Release 6 BITSWAP can be used to reverse the order of bits in order to operate with these instructions. However BITSWAP would only apply to byte messages.

CRC32CB/H/W/D instructions are interchangeable: a series of low-order CRC instructions can be reduced to a series of high-order CRC32CH operations, to increase throughput of the overall CRC generation process. The process of doing this will add trailing zeroes to the message for which CRC is being generated, since the data element is now larger, however, this will not change the CRC value that is generated. It is the original message that must be transmitted along with the CRC, without the trailing zeroes.

In pseudo-assembly, the following sequence of byte CRC operations may be used to generate a cumulative CRC value. (Pseudo-assembly is used to clearly indicate terms which need to be modified for interchangeability.)

li $3, 0xFFFF_FFFF     // initialize CRC value
la $4, memaddr         // assume word-aligned for convenience
for (i=0; i < byte_cnt; i++)
lb $2, 0($4)        // read message bytes
crc32cb $3, $2, $3
add $4, $4, 1       // increment byte memory address by 1
endfor

This is equivalent to the sequence of word CRC operations. The simple example assumes some multiple of 4 bytes are processed.

for (i=0; i < byte_cnt/4; i++)
lw $2, 0($4)        // read message words
crc32cw $3, $2, $3
add $4, $4, 4       // increment word memory address by 4
endfor

The throughput is thus increased by a multiple of 4 as only a quarter of the byte oriented operations occur.

Encoding:

COP1

010001

CT

00110

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

CTC1 rt, fs

MIPS32

Move Control Word to Floating Point

Purpose:

Move Control Word to Floating Point

To copy a word from a GPR to an FPU control register.

Description:

 FP_Control[fs] = GPR[rt]

Copy the low word from GPR rt into the FP (coprocessor 1) control register indicated by fs.

Writing to the floating point Control/Status register, the FCSR, causes the appropriate exception if any Cause bit and its corresponding Enable bit are both set. The register is written before the exception occurs. Writing to FEXR to set a cause bit whose enable bit is already set, or writing to FENR to set an enable bit whose cause bit is already set causes the appropriate exception. The register is written before the exception occurs and the EPC register contains the address of the CTC1 instruction.

The definition of this instruction has been extended in Release 5 to support user mode read and write of StatusFR under the control of Config5UFR. This optional feature is meant to facilitate transition from FR=0 to FR=1 floating-point register modes in order to obsolete FR=0 mode in a future architecture release. User code may set and clear StatusFR without kernel intervention, providing kernel explicitly provides permission.

This UFR facility is not supported in Release 6 since Release 6 only allows FR=1 mode. Accessing the UFR and

UNFR registers causes a Reserved Instruction exception in Release 6 since FIRUFRP is always 0.

The definition of this instruction has been extended in Release 6 to allow user code to read and modify the Config5FRE bit. Such modification is allowed when this bit is present (as indicated by FIRUFRP) and user mode

modification of the bit is enabled by the kernel (as indicated by Config5UFE). Setting Config5FRE to 1 causes all floating point instructions which are not compatible with FR=1 mode to take an Reserved Instruction exception. This makes it possible to run pre-Release 6 FR=0 floating point code on a Release 6 core which only supports FR=1 mode, provided the kernel has been set up to trap and emulate FR=0 behavior for these instructions. These instructions include floating-point arithmetic instructions that read/write single-precision registers, LWC1, SWC1, MTC1, and

MFC1 instructions.

The FRE facility uses COP1 register aliases FRE and NFRE to access Config5FRE.

Restrictions:

There are a few control registers defined for the floating point unit. Prior to Release 6, the result is UNPREDICTABLE if fs specifies a register that does not exist. In Release 6 and later, a Reserved Instruction exception occurs if fs specifies a register that does not exist.

Furthermore, the result is UNPREDICTABLE if fd specifies the UFR, UNFR, FRE and NFRE aliases, with fs anything other than 00000, GPR[0]. Release 6 implementations and later are required to produce a Reserved Instruction exception; software must assume it is UNPREDICTABLE.

Operation:

temp = GPR[rt]31..0
if (fs = 1 or fs = 4) then
   /* clear UFR or UNFR(CP1 Register 1)*/
   if ConfigAR >= 2 SignalException(ReservedInstruction) /* Release 6 traps */ endif
   if not Config5UFR then SignalException(ReservedInstruction) endif
   if not (rt = 0 and FIRUFRP) then UNPREDICTABLE /*end of instruction*/ endif
   if fs = 1 then StatusFR = 0
   elseif fs = 4 then StatusFR = 1
   else /* cannot happen */
elseif fs=5 then /* user write of 1 to FRE, if permitted */
   if ConfigAR <= 2 then UNPREDICTABLE
   else
      if rt != 0 then SignalException(ReservedInstruction) endif
      if not Config5UFR then SignalException(ReservedInstruction) endif
      Config5UFR = 0
   endif
elseif fs=6 then /* user write of 0 to FRE, if permitted (NFRE alias) */
   if ConfigAR <= 2 then UNPREDICTABLE
   else
      if rt != 0 then SignalException(ReservedInstruction) endif
      if not Config5UFR then SignalException(ReservedInstruction) endif
      Config5UFR = 1
   endif
elseif fs = 25 then /* FCCR */
   if temp31..8 != 024 then
      UNPREDICTABLE
   else
      FCSR = temp7..1 || FCSR24 || temp0 || FCSR22..0
   endif
elseif fs = 26 then /* FEXR */
   if temp31..18 != 0 or temp11..7 != 0 or temp2..0 != 0then
      UNPREDICTABLE
   else
      FCSR = FCSR31..18 || temp17..12 || FCSR11..7 || 
      temp6..2 || FCSR1..0
   endif
elseif fs = 28 then /* FENR */
   if temp31..12 != 0 or temp6..3 != 0 then
      UNPREDICTABLE
   else
      FCSR = FCSR31..25 || temp2 || FCSR23..12 || temp11..7 
      || FCSR6..2 || temp1..0
   endif
elseif fs = 31 then /* FCSR */
   if (FCSRImpl field is not implemented) and(temp22..18 != 0) then 
      UNPREDICTABLE
   elseif (FCSRImpl field is implemented) and temp20..18 != 0 then
      UNPREDICTABLE
   else
      FCSR = temp
   endif
else
   if Config2AR >= 2 SignalException(ReservedInstruction) /* Release 6 traps */
   endif
   UNPREDICTABLE
endif
CheckFPException()

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Division-by-zero, Inexact, Overflow, Underflow

Historical Information:

For the MIPS I, II and III architectures, the contents of floating point control register fs are UNPREDICTABLE for the instruction immediately following CTC1.

MIPS V and MIPS32 introduced the three control registers that access portions of FCSR. These registers were not available in MIPS I, II, III, or IV.

MIPS32 Release 5 introduced the UFR and UNFR register aliases that allow user level access to StatusFR.

MIPS32 Release 6 introduced the FRE and NFRE register aliases that allow user to cause traps for FR=0 mode emulation.

Encoding:

COP2

010010

CT

00110

rt

Impl

6

5

5

16

Format:

CTC2 rt, Impl

MIPS32

Move Control Word to Coprocessor 2

Purpose:

Move Control Word to Coprocessor 2

To copy a word from a GPR to a Coprocessor 2 control register.

Description:

 CP2CCR[Impl] = GPR[rt]

Copy the low word from GPR rt into the Coprocessor 2 control register denoted by the Impl field. The interpretation of the Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the architecture.

Restrictions:

The result is UNPREDICTABLE if rd specifies a register that does not exist.

Operation:

temp = GPR[rt]31..0
CP2CCR[Impl] = temp

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

fmt

0

00000

fs

fd

CVT.D

100001

6

5

5

5

5

6

Format:

CVT.D.fmt 

Floating Point Convert to Double Floating Point

CVT.D.S fd, fs

MIPS32

Floating Point Convert to Double Floating Point

CVT.D.W fd, fs

MIPS32

Floating Point Convert to Double Floating Point

CVT.D.L fd, fs

MIPS64, MIPS32 Release 2

Floating Point Convert to Double Floating Point

Purpose:

Floating Point Convert to Double Floating Point

To convert an FP or fixed point value to double FP.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in double floating point format and rounded according to the current rounding mode in FCSR. The result is placed in FPR fd. If fmt is S or W, then the operation is always exact.

Restrictions:

The fields fs and fd must specify valid FPRs, fs for type fmt and fd for double floating point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

For CVT.D.L, the result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit

FPU register model.

Operation:

StoreFPR (fd, D, ConvertFmt(ValueFPR(fs, fmt), fmt, D))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact

Encoding:

COP1

010001

fmt

0

00000

fs

fd

CVT.L

100101

6

5

5

5

5

6

Format:

CVT.L.fmt 

Floating Point Convert to Long Fixed Point

CVT.L.S fd, fs

MIPS64, MIPS32 Release 2

Floating Point Convert to Long Fixed Point

CVT.L.D fd, fs

MIPS64, MIPS32 Release 2

Floating Point Convert to Long Fixed Point

Purpose:

Floating Point Convert to Long Fixed Point

To convert an FP value to a 64-bit fixed point.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

Convert the value in format fmt in FPR fs to long fixed point format and round according to the current rounding mode in FCSR. The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -263 to 263-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

263-1. On cores with FCSRNAN2008=1, the default result is:

263-1 when the input value is +inf or rounds to a number larger than 263-1

Restrictions:

The fields fs and fd must specify valid FPRs, fs for type fmt and fd for long fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Operation:

StoreFPR (fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact,

Encoding:

COP1

010001

fmt

10000

ft

fs

fd

CVT.PS

100110

6

5

5

5

5

6

Format:

CVT.PS.S fd, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Convert Pair to Paired Single

Purpose:

Floating Point Convert Pair to Paired Single

To convert two FP values to a paired single value.

Description:

 FPR[fd] = FPR[fs]31..0 || FPR[ft]31..0

The single-precision values in FPR fs and ft are written into FPR fd as a paired-single value. The value in FPR fs is written into the upper half, and the value in FPR ft is written into the lower half.

CVT.PS.S is similar to PLL.PS, except that it expects operands of format S instead of PS.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and ft must specify FPRs valid for operands of type S. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format S; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

StoreFPR(fd, S, ValueFPR(fs,S) || ValueFPR(ft,S))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation

Encoding:

COP1

010001

fmt

0

00000

fs

fd

CVT.S

100000

6

5

5

5

5

6

Format:

CVT.S.fmt 

Floating Point Convert to Single Floating Point

CVT.S.D fd, fs

MIPS32

Floating Point Convert to Single Floating Point

CVT.S.W fd, fs

MIPS32

Floating Point Convert to Single Floating Point

CVT.S.L fd, fs

MIPS64, MIPS32 Release 2

Floating Point Convert to Single Floating Point

Purpose:

Floating Point Convert to Single Floating Point

To convert an FP or fixed point value to single FP.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in single floating point format and rounded according to the current rounding mode in FCSR. The result is placed in FPR fd.

Restrictions:

The fields fs and fd must specify valid FPRs-fs for type fmt and fd for single floating point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

For CVT.S.L, the result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit

FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Operation:

StoreFPR(fd, S, ConvertFmt(ValueFPR(fs, fmt), fmt, S))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact, Overflow, Underflow

Encoding:

COP1

010001

fmt

10110

0

00000

fs

fd

CVT.S.PL

101000

6

5

5

5

5

6

Format:

CVT.S.PL fd, fs

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Convert Pair Lower to Single Floating Point

Purpose:

Floating Point Convert Pair Lower to Single Floating Point

To convert one half of a paired single FP value to single FP.

Description:

 FPR[fd] = FPR[fs]31..0

The lower paired single value in FPR fs, in format PS, is converted to a value in single floating point format. The result is placed in FPR fd. This instruction can be used to isolate the lower half of a paired single value.

The operation is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify valid FPRs-fs for type PS and fd for single floating point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format PS; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of CVT.S.PL is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

StoreFPR (fd, S, ConvertFmt(ValueFPR(fs, PS), PL, S))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Encoding:

COP1

010001

fmt

10110

0

00000

fs

fd

CVT.S.PU

100000

6

5

5

5

5

6

Format:

CVT.S.PU fd, fs

MIPS64, MIPS32 Release 2, , removed in Release 6

Floating Point Convert Pair Upper to Single Floating Point

Purpose:

Floating Point Convert Pair Upper to Single Floating Point

To convert one half of a paired single FP value to single FP

Description:

 FPR[fd] = FPR[fs]63..32

The upper paired single value in FPR fs, in format PS, is converted to a value in single floating point format. The result is placed in FPR fd. This instruction can be used to isolate the upper half of a paired single value.

The operation is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify valid FPRs-fs for type PS and fd for single floating point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format PS; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of CVT.S.PU is UNPREDICTABLE if the processor is executing the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU

Availability and Compatibility:

This instruction was removed in Release 6.

Operation:

StoreFPR (fd, S, ConvertFmt(ValueFPR(fs, PS), PU, S))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Encoding:

COP1

010001

fmt

0

00000

fs

fd

CVT.W

100100

6

5

5

5

5

6

Format:

CVT.W.fmt 

Floating Point Convert to Word Fixed Point

CVT.W.S fd, fs

MIPS32

Floating Point Convert to Word Fixed Point

CVT.W.D fd, fs

MIPS32

Floating Point Convert to Word Fixed Point

Purpose:

Floating Point Convert to Word Fixed Point

To convert an FP value to 32-bit fixed point.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed point format and rounded according to the current rounding mode in FCSR. The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -231 to 231-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

263-1. On cores with FCSRNAN2008=1, the default result is:

263-1 when the input value is +inf or rounds to a number larger than 263-1

Restrictions:

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for word fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact

Encoding:

DADDI

011000

rs

rt

immediate

6

5

5

16

Format:

DADDI rt, rs, immediate

MIPS64, removed in Release 6

Doubleword Add Immediate

Purpose:

Doubleword Add Immediate

To add a constant to a 64-bit integer. If overflow occurs, then trap.

Description:

 GPR[rt] = GPR[rs] + immediate

The 16-bit signed immediate is added to the 64-bit value in GPR rs to produce a 64-bit result. If the addition results in

64-bit 2's complement arithmetic overflow, then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 64-bit result is placed into GPR rt.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

The encoding is reused for other instructions introduced by Release 6.

Operation:

temp = (GPR[rs]63||GPR[rs]) + sign_extend(immediate)
if (temp64 != temp63) then
   SignalException(IntegerOverflow)
else
   GPR[rt] = temp63..0
endif

Exceptions:

Integer Overflow, Reserved Instruction

Programming Notes:

DADDIU performs the same arithmetic operation but does not trap on overflow.

Encoding:

DADDIU

011001

rs

rt

immediate

6

5

5

16

Format:

DADDIU rt, rs, immediate

MIPS64

Doubleword Add Immediate Unsigned

Purpose:

Doubleword Add Immediate Unsigned

To add a constant to a 64-bit integer

Description:

 GPR[rt] = GPR[rs] + sign_extend(immediate)

The 16-bit signed immediate is added to the 64-bit value in GPR rs and the 64-bit arithmetic result is placed into

GPR rt.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

Operation:

GPR[rt] = GPR[rs] + sign_extend(immediate)

Exceptions:

Reserved Instruction

Programming Notes:

The term "unsigned" in the instruction name is a misnomer; this operation is 64-bit modulo arithmetic that does not trap on overflow. It is appropriate for unsigned arithmetic such as address arithmetic, or integer arithmetic environments that ignore overflow, such as C language arithmetic.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

DADD

101100

6

5

5

5

5

6

Format:

DADD rd, rs, rt

MIPS64

Doubleword Add

Purpose:

Doubleword Add

To add 64-bit integers. If overflow occurs, then trap.

Description:

 GPR[rd] = GPR[rs] + GPR[rt]

The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs to produce a 64-bit result. If the addition results in 64-bit 2's complement arithmetic overflow, then the destination register is not modified and an Integer

Overflow exception occurs. If it does not overflow, the 64-bit result is placed into GPR rd.

Restrictions:

Operation:

temp = (GPR[rs]63||GPR[rs]) + (GPR[rt]63||GPR[rt])
if (temp64 != temp63) then
   SignalException(IntegerOverflow)
else
   GPR[rd] = temp63..0
endif

Exceptions:

Integer Overflow, Reserved Instruction

Programming Notes:

DADDU performs the same arithmetic operation but does not trap on overflow.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

DADDU

101101

6

5

5

5

5

6

Format:

DADDU rd, rs, rt

MIPS64

Doubleword Add Unsigned

Purpose:

Doubleword Add Unsigned

To add 64-bit integers

Description:

 GPR[rd] = GPR[rs] + GPR[rt]

The 64-bit doubleword value in GPR rt is added to the 64-bit value in GPR rs and the 64-bit arithmetic result is placed into GPR rd.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

Operation:

GPR[rd] = GPR[rs] + GPR[rt]

Exceptions:

Reserved Instruction

Programming Notes:

The term "unsigned" in the instruction name is a misnomer; this operation is 64-bit modulo arithmetic that does not trap on overflow. It is appropriate for unsigned arithmetic such as address arithmetic, or integer arithmetic environments that ignore overflow, such as C language arithmetic.

Encoding:

MIPS64 pre-Release 6

SPECIAL2

011100

rs

rt

rd

0

00000

DCLO

100101

6

5

5

5

5

6

MIPS64 Release 6

SPECIAL

000000

rs

0

00000

rd

00001

DCLO

010011

6

5

5

5

5

6

Format:

DCLO rd, rs

MIPS64

Count Leading Ones in Doubleword

Purpose:

Count Leading Ones in Doubleword

To count the number of leading ones in a doubleword

Description:

 GPR[rd] = count_leading_ones GPR[rs]

The 64-bit word in GPR rs is scanned from most-significant to least-significant bit. The number of leading ones is counted and the result is written to GPR rd. If all 64 bits were set in GPR rs, the result written to GPR rd is 64.

Restrictions:

Pre-Release 6: To be compliant with the MIPS32 and MIPS64 Architecture, software must place the same GPR number in both the rt and rd fields of the instruction. The operation of the instruction is UNPREDICTABLE if the rt and

rd fields of the instruction contain different values. Release 6's new instruction encoding does not contain anrt field;

Release 6 implementations are required to signal a Reserved Instruction exception if the rt field is nonzero.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

temp = 64
for i in 63.. 0
   if GPR[rs]i = 0 then
      temp = 63 - i
      break
   endif
endfor
GPR[rd] = temp

Exceptions:

None

Encoding:

MIPS64 pre-Release 6

SPECIAL2

011100

rs

rt

rd

0

00000

DCLZ

100100

6

5

5

5

5

6

MIPS64 Release 6

SPECIAL

000000

rs

0

00000

rd

00001

DCLZ

010010

6

5

5

5

5

6

Format:

DCLZ rd, rs

MIPS64

Count Leading Zeros in Doubleword

Purpose:

Count Leading Zeros in Doubleword

To count the number of leading zeros in a doubleword

Description:

 GPR[rd] = count_leading_zeros GPR[rs]

The 64-bit word in GPR rs is scanned from most significant to least significant bit. The number of leading zeros is counted and the result is written to GPR rd. If no bits were set in GPR rs, the result written to GPR rd is 64.

Restrictions:

Pre-Release 6: To be compliant with the MIPS32 and MIPS64 Architecture, software must place the same GPR number in both the rt and rd fields of the instruction. The operation of the instruction is UNPREDICTABLE if the rt and

rd fields of the instruction contain different values. Release 6's new instruction encoding does not contain an rt field.

Release 6 implementations are required to signal a Reserved Instruction exception if the rt field is nonzero.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

temp = 64
for i in 63.. 0
   if GPR[rs]i = 1 then
      temp = 63 - i
      break
   endif
endfor
GPR[rd] = temp

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

DDIV

011110

6

5

5

10

6

Format:

DDIV rs, rt

MIPS64, removed in Release 6

Doubleword Divide

Purpose:

Doubleword Divide

To divide 64-bit signed integers.

Description:

 (LO, HI) = GPR[rs] / GPR[rt]

The 64-bit doubleword in GPR rs is divided by the 64-bit doubleword in GPR rt, treating both operands as signed values. The 64-bit quotient is placed into special register LO and the 64-bit remainder is placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If the divisor in GPR rt is zero, the arithmetic result value is UNPREDICTABLE.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

   LO = GPR[rs] div GPR[rt]
   HI = GPR[rs] mod GPR[rt]

Exceptions:

Reserved Instruction

Programming Notes:

See "Programming Notes" for the DIV instruction.

Historical Perspective:

In MIPS III, if either of the two instructions preceding the divide is an MFHI or MFLO, the result of the MFHI or

MFLO is UNPREDICTABLE. Reads of the HI or LO special register must be separated from subsequent instructions that write to them by two or more instructions. This restriction was removed in MIPS IV and MIPS32 and all subsequent levels of the architecture.

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

DDIVU

011111

6

5

5

10

6

Format:

DDIVU rs, rt

MIPS64, removed in Release 6

Doubleword Divide Unsigned

Purpose:

Doubleword Divide Unsigned

To divide 64-bit unsigned integers.

Description:

 (LO, HI) = GPR[rs] / GPR[rt]

The 64-bit doubleword in GPR rs is divided by the 64-bit doubleword in GPR rt, treating both operands as unsigned values. The 64-bit quotient is placed into special register LO and the 64-bit remainder is placed into special register

HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If the divisor in GPR rt is zero, the arithmetic result value is UNPREDICTABLE.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

q = (0 || GPR[rs]) div (0 || GPR[rt])
r = (0 || GPR[rs]) mod (0 || GPR[rt])
LO = q63..0
HI = r63..0

Exceptions:

Reserved Instruction

Programming Notes:

See "Programming Notes" for the DIV instruction.

Historical Perspective:

In MIPS III, if either of the two instructions preceding the divide is an MFHI or MFLO, the result of the MFHI or

MFLO is UNPREDICTABLE. Reads of the HI or LO special register must be separated from subsequent instructions that write to them by two or more instructions. This restriction was removed in MIPS IV and MIPS32 and all subsequent levels of the architecture.

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

DERET

011111

6

1

19

6

Format:

DERET 

EJTAG

Debug Exception Return

Purpose:

Debug Exception Return

To Return from a debug exception.

Description:

DERET clears execution and instruction hazards, returns from Debug Mode and resumes non-debug execution at the instruction whose address is contained in the DEPC register. DERET does not execute the next instruction (i.e. it has no delay slot).

Restrictions:

A DERET placed between an LL and SC instruction does not cause the SC to fail.

If the DEPC register with the return address for the DERET was modified by an MTC0 or a DMTC0 instruction, a

CP0 hazard exists that must be removed via software insertion of the appropriate number of SSNOP instructions (for implementations of Release 1 of the Architecture) or by an EHB, or other execution hazard clearing instruction (for implementations of Release 2 of the Architecture).

DERET implements a software barrier that resolves all execution and instruction hazards created by Coprocessor 0 state changes (for Release 2 implementations, refer to the SYNCI instruction for additional information on resolving instruction hazards created by writing the instruction stream). The effects of this barrier are seen starting with the instruction fetch and decode of the instruction at the PC to which the DERET returns.

This instruction is legal only if the processor is executing in Debug Mode.

Pre-Release 6: The operation of the processor is UNDEFINED if a DERET is executed in the delay slot of a branch or jump instruction.

Release 6 implementations are required to signal a Reserved Instruction exception if DERET is encountered in the delay slot or forbidden slot of a branch or jump instruction.

Operation:

DebugDM = 0
DebugIEXI = 0
if IsMIPS16Implemented() | (Config3ISA > 0) then
   PC = DEPC63..1 || 0
   ISAMode = DEPC0
else
   PC = DEPC
endif
ClearHazards()

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL3

011111

rs

rt

msbdminus32

(size-1-32)

lsb

(pos)

DEXTM

000001

6

5

5

5

5

6

Format:

DEXTM rt, rs, pos, size

MIPS64 Release 2

Doubleword Extract Bit Field Middle

Purpose:

Doubleword Extract Bit Field Middle

To extract a bit field from GPR rs and store it right-justified into GPR rt.

Description:

 GPR[rt] = ExtractField(GPR[rs], msbd, lsb)

The bit field starting at bit pos and extending for size bits is extracted from GPR rs and stored zero-extended and right-justified in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msbdminus32 (the most significant bit of the destination field in GPR rt, minus 32), in instruction bits 15..11, and lsb (least significant bit of the source field in GPR rs), in instruction bits 10..6, as follows:

msbdminus32 = size-1-32
lsb = pos
msbd = msbdminus32 + 32
msb = lsb+msbd

For this instruction, the values of pos and size must satisfy all of the following relations:

0 <= pos < 32
32 < size <= 64
32 < pos+size <= 64

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The operation is UNPREDICTABLE if (lsb + msbd + 1) > 64.

Operation:

msbd = msbdminus32 + 32
if ((lsb + msbd + 1) > 64) then
   UNPREDICTABLE
endif
GPR[rt] = 063-(msbd+1) || GPR[rs]msbd+lsb..pos

Exceptions:

Reserved Instruction

Programming Notes

The assembler will accept any value of pos and size that satisfies the relationship 0 < pos+size <= 64 and emit DEXT,

DEXTM, or DEXTU as appropriate to the values. Programmers should always specify the DEXT mnemonic and let the assembler select the instruction to use.

Encoding:

SPECIAL3

011111

rs

rt

msbd

(size-1)

lsb

(pos)

DEXT

000011

6

5

5

5

5

6

Format:

DEXT rt, rs, pos, size

MIPS64 Release 2

Doubleword Extract Bit Field

Purpose:

Doubleword Extract Bit Field

To extract a bit field from GPR rs and store it right-justified into GPR rt.

Description:

 GPR[rt] = ExtractField(GPR[rs], msbd, lsb)

The bit field starting at bit pos and extending for size bits is extracted from GPR rs and stored zero-extended and right-justified in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msbd (the most significant bit of the destination field in GPR rt), in instruction bits 15..11, and lsb

(least significant bit of the source field in GPR rs), in instruction bits 10..6, as follows:

msbd = size-1
lsb = pos
msb = lsb+msbd

For this instruction, the values of pos and size must satisfy all of the following relations:

0 <= pos < 32
0 < size <= 32
0 < pos+size <= 63

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

Because of the limits on the values of msbd and lsb, there is no UNPREDICTABLE case for this instruction.

Operation:

GPR[rt] = 063-(msbd+1) || GPR[rs]msbd+lsb..lsb

Exceptions:

Reserved Instruction

Programming Notes

The assembler will accept any value of pos and size that satisfies the relationship 0 < pos+size <= 64 and emit DEXT,

DEXTM, or DEXTU as appropriate to the values. Programmers should always specify the DEXT mnemonic and let the assembler select the instruction to use.

   

Encoding:

SPECIAL3

011111

rs

rt

msbd

(size-1)

lsbminus32

(pos-32)

DEXTU

000010

6

5

5

5

5

6

Format:

DEXTU rt, rs, pos, size

MIPS64 Release 2

Doubleword Extract Bit Field Upper

Purpose:

Doubleword Extract Bit Field Upper

To extract a bit field from GPR rs and store it right-justified into GPR rt.

Description:

 GPR[rt] = ExtractField(GPR[rs], msbd, lsb)

The bit field starting at bit pos and extending for size bits is extracted from GPR rs and stored zero-extended and right-justified in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msbd (the most significant bit of the destination field in GPR rt), in instruction bits 15..11, and

lsbminus32 (least significant bit of the source field in GPR rs, minus32), in instruction bits 10..6, as follows:

msbd = size-1
lsbminus32 = pos-32
lsb = lsbminus32 + 32
msb = lsb+msbd

For this instruction, the values of pos and size must satisfy all of the following relations:

32 <= pos < 64
0 < size <= 32
32 < pos+size <= 64

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The operation is UNPREDICTABLE if (lsb + msbd + 1) > 64.

Operation:

lsb = lsbminus32 + 32
if ((lsb + msbd + 1) > 64) then
   UNPREDICTABLE
endif
GPR[rt] = 063-(msbd+1) || GPR[rs]msbd+lsb..pos

Exceptions:

Reserved Instruction

Programming Notes

The assembler will accept any value of pos and size that satisfies the relationship 0 < pos+size <= 64 and emit DEXT,

DEXTM, or DEXTU as appropriate to the values. Programmers should always specify the DEXT mnemonic and let the assembler select the instruction to use.

Encoding:

SPECIAL3

011111

rs

rt

msbminus32

(pos+size-33)

lsb

(pos)

DINSM

000101

6

5

5

5

5

6

Format:

DINSM rt, rs, pos, size

MIPS64 Release 2

Doubleword Insert Bit Field Middle

Purpose:

Doubleword Insert Bit Field Middle

To merge a right-justified bit field from GPR rs into a specified position in GPR rt.

Description:

 GPR[rt] = InsertField(GPR[rt], GPR[rs], msb, lsb)

The right-most size bits from GPR rs are inserted into the value from GPR rt starting at bit position pos. The result is placed back in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msbminus32 (the most significant bit of the field, minus 32), in instruction bits 15..11, and lsb (least significant bit of the field), in instruction bits 10..6, as follows:

msbminus32 = pos+size-1-32
lsb = pos
msb = msbminus32 + 32

For this instruction, the values of pos and size must satisfy all of the following relations:

0 <= pos < 32
2 <= size <= 64
32 < pos+size <= 64

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

Because of the instruction format, lsb can never be greater than msb, so there is no UNPREDICATABLE case for this instruction.

Operation:

msb = msbminus32 + 32
GPR[rt] = GPR[rt]63..msb+1 || GPR[rs]msb-lsb..0 || GPR[rt]lsb-1..0

Exceptions:

Reserved Instruction

Programming Notes

The assembler will accept any value of pos and size that satisfies the relationship 0 < pos+size <= 64 and emit DINS,

DINSM, or DINSU as appropriate to the values. Programmers should always specify the DINS mnemonic and let the assembler select the instruction to use.

Encoding:

SPECIAL3

011111

rs

rt

msb

(pos+size-1)

lsb

(pos)

DINS

000111

6

5

5

5

5

6

Format:

DINS rt, rs, pos, size

MIPS64 Release 2

Doubleword Insert Bit Field

Purpose:

Doubleword Insert Bit Field

To merge a right-justified bit field from GPR rs into a specified position in GPR rt.

Description:

 GPR[rt] = InsertField(GPR[rt], GPR[rs], msb, lsb)

The right-most size bits from GPR rs are merged into the value from GPR rt starting at bit position pos. The result is placed back in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msb (the most significant bit of the field), in instruction bits 15..11, and lsb (least significant bit of the field), in instruction bits 10..6, as follows:

msb = pos+size-1
lsb = pos

For this instruction, the values of pos and size must satisfy all of the following relations:

0 <= pos < 32
0 < size <= 32
0 < pos+size <= 32

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The operation is UNPREDICTABLE if lsb > msb.

Operation:

if (lsb > msb) then
      UNPREDICTABLE
   endif
   GPR[rt] = GPR[rt]63..msb+1 || GPR[rs]msb-lsb..0 || GPR[rt]lsb-1..0

Exceptions:

Reserved Instruction

Programming Notes

The assembler will accept any value of pos and size that satisfies the relationship 0 < pos+size <= 64 and emit DINS,

DINSM, or DINSU as appropriate to the values. Programmers should always specify the DINS mnemonic and let the assembler select the instruction to use.

Encoding:

SPECIAL3

011111

rs

rt

msbminus32

(pos+size-33)

lsbminus32

(pos-32)

DINSU

000110

6

5

5

5

5

6

Format:

DINSU rt, rs, pos, size

MIPS64 Release 2

Doubleword Insert Bit Field Upper

Purpose:

Doubleword Insert Bit Field Upper

To merge a right-justified bit field from GPR rs into a specified position in GPR rt.

Description:

 GPR[rt] = InsertField(GPR[rt], GPR[rs], msb, lsb)

The right-most size bits from GPR rs are inserted into the value from GPR rt starting at bit position pos. The result is placed back in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msbminus32 (the most significant bit of the field, minus 32), in instruction bits 15..11, and lsbminus32

(least significant bit of the field, minus 32), in instruction bits 10..6, as follows:

msbminus32 = pos+size-1-32
lsbminus32 = pos-32
msb = msbminus32 + 32
lsb = lsbminus32 + 32

For this instruction, the values of pos and size must satisfy all of the following relations:

32 <= pos < 64
1 <= size <= 32
32 < pos+size <= 64

Restrictions:

In implementations pre-Release 2 of the architecture, the instruction resulted in a Reserved Instruction exception.

The operation is UNPREDICTABLE if lsb > msb.

Operation:

lsb = lsbminus32 + 32
msb = msbminus32 + 32
if (lsb > msb) then
   UNPREDICTABLE
endif
GPR[rt] = GPR[rt]63..msb+1 || GPR[rs]msb-lsb..0 || GPR[rt]lsb-1..0

Exceptions:

Reserved Instruction

Programming Notes

The assembler accepts any value of pos and size that satisfies the relationship 0 < pos+size <= 64 and emit DINS,

DINSM, or DINSU as appropriate to the values. Programmers should always specify the DINS mnemonic and let the assembler select the instruction to use.

Encoding:

COP0

0100 00

MFMC0

01 011

rt

12

0110 0

0

000 00

sc

0

0

0 0

0

000

6

5

5

5

5

1

2

3

Format:

DI rt

MIPS32 Release 2

Disable Interrupts

Purpose:

Disable Interrupts

To return the previous value of the Status register and disable interrupts. If DI is specified without an argument, GPR r0 is implied, which discards the previous value of the Status register.

Description:

 GPR[rt] = Status; StatusIE = 0

The current value of the Status register is sign-extended and loaded into general register rt. The Interrupt Enable (IE) bit in the Status register is then cleared.

Restrictions:

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

This operation specification is for the general interrupt enable/disable operation, with the sc field as a variable. The individual instructions DI and EI have a specific value for the sc field.

data = Status
GPR[rt] = sign_extend(data)
StatusIE = 0

Exceptions:

Coprocessor Unusable

Reserved Instruction (Release 1 implementations)

Programming Notes:

The effects of this instruction are identical to those accomplished by the sequence of reading Status into a GPR, clearing the IE bit, and writing the result back to Status. Unlike the multiple instruction sequence, however, the DI instruction cannot be aborted in the middle by an interrupt or exception.

This instruction creates an execution hazard between the change to the Status register and the point where the change to the interrupt enable takes effect. This hazard is cleared by the EHB, JALR.HB, JR.HB, or ERET instructions. Software must not assume that a fixed latency will clear the execution hazard.

Encoding:

COP1

010001

fmt

ft

fs

fd

DIV

000011

6

5

5

5

5

6

Format:

DIV.fmt 

Floating Point Divide

DIV.S fd, fs, ft

MIPS32

Floating Point Divide

DIV.D fd, fs, ft

MIPS32

Floating Point Divide

Purpose:

Floating Point Divide

To divide FP values.

Description:

 FPR[fd] = FPR[fs] / FPR[ft]

The value in FPR fs is divided by the value in FPR ft. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

Operation:

StoreFPR (fd, fmt, ValueFPR(fs, fmt) / ValueFPR(ft, fmt))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Invalid Operation, Unimplemented Operation, Division-by-zero, Overflow, Underflow

Encoding:

SPECIAL

000000

rs

rt

rd

DIV

00010

SOP32

011010

SPECIAL

000000

rs

rt

rd

MOD

00011

SOP32

011010

SPECIAL

000000

rs

rt

rd

DIVU

00010

SOP33

011011

SPECIAL

000000

rs

rt

rd

MODU

00011

SOP33

011011

SPECIAL

000000

rs

rt

rd

DDIV

00010

SOP36

011110

SPECIAL

000000

rs

rt

rd

DMOD

00011

SOP36

011110

SPECIAL

000000

rs

rt

rd

DDIVU

00010

SOP37

011111

SPECIAL

000000

rs

rt

rd

DMODU

00011

SOP37

011111

6

5

5

5

5

6

Format:

DIV MOD DIVU MODU DDIV DMOD DDIVU DMODU 

Divide Words Signed

DIV rd,rs,rt

MIPS32 Release 6

Divide Words Signed

MOD rd,rs,rt

MIPS32 Release 6

Modulo Words Signed

DIVU rd,rs,rt

MIPS32 Release 6

Divide Words Unsigned

MODU rd,rs,rt

MIPS32 Release 6

Modulo Words Unsigned

DDIV rd,rs,rt

MIPS64 Release 6

Divide Doublewords Signed

DMOD rd,rs,rt

MIPS64 Release 6

Modulo Doublewords Signed

DDIVU rd,rs,rt

MIPS64 Release 6

Divide Doublewords Unsigned

DMODU rd,rs,rt

MIPS64 Release 6

Modulo Doublewords Unsigned

Purpose:

Divide Integers (with result to GPR)

DIV: Divide Words Signed

MOD: Modulo Words Signed

DIVU: Divide Words Unsigned

MODU: Modulo Words Unsigned

DDIV: Divide Doublewords Signed

DMOD: Modulo Doublewords Signed

DDIVU: Divide Doublewords Unsigned

DMODU: Modulo Doublewords Unsigned

Description:

DIV:  GPR[rd] = sign_extend.32( divide.signed( GPR[rs], GPR[rt] )
MOD:  GPR[rd] = sign_extend.32( modulo.signed( GPR[rs], GPR[rt] )
DIVU:  GPR[rd] = sign_extend.32( divide.unsigned( GPR[rs], GPR[rt] )
MODU:  GPR[rd] = sign_extend.32( modulo.unsigned( GPR[rs], GPR[rt] )
DDIV:  GPR[rd] = divide.signed( GPR[rs], GPR[rt] )
DMOD:  GPR[rd] = modulo.signed( GPR[rs], GPR[rt] )
DDIVU: GPR[rd] = divide.unsigned( GPR[rs], GPR[rt] )
DMODU: GPR[rd] = modulo.unsigned( GPR[rs], GPR[rt] )

The Release 6 divide and modulo instructions divide the operands in GPR rs and GPR rt, and place the quotient or remainder in GPR rd.

For each of the div/mod operator pairs DIV/M OD, DIVU/MODU, DDIV/DMOD, DDIVU/DMODU the results satisfy the equation (A div

B)*B + (A mod B) = A, where (A mod B) has same sign as the dividend A, and
abs(A mod B) < abs(B). This equation uniquely defines the results. 

NOTE: if the divisor B=0, this equation cannot be satisfied, and the result is UNPREDICTABLE. This is commonly called "truncated division".

DIV performs a signed 32-bit integer division, and places the 32-bit quotient result in the destination register.

MOD performs a signed 32-bit integer division, and places the 32-bit remainder result in the destination register. The remainder result has the same sign as the dividend.

DIVU performs an unsigned 32-bit integer division, and places the 32-bit quotient result in the destination register.

MODU performs an unsigned 32-bit integer division, and places the 32-bit remainder result in the destination register.

DDIV performs a signed 64-bit integer division, and places the 64-bit quotient result in the destination register.

DMOD performs a signed 64-bit integer division, and places the 64-bit remainder result in the destination register.

The remainder result has the same sign as the dividend.

DDIVU performs an unsigned 64-bit integer division, and places the 64-bit quotient result in the destination register.

DMODU performs an unsigned 64-bit integer division, and places the 64-bit remainder result in the destination register.

Restrictions:

If the divisor in GPR rt is zero, the result value is UNPREDICTABLE.

On a 64-bit CPU, the 32-bit signed divide (DIV) and modulo (MOD) instructions are UNPREDICTABLE if inputs are not signed extended 32-bit integers.

Special provision is made for the inputs to unsigned 32-bit divide and modulo on a 64-bit CPU. Since many 32-bit instructions sign extend 32 bits to 64 even for unsigned computation, properly sign extended numbers must be accepted as input, and truncated to 32 bits, clearing bits 32-63. However, it is also desirable to accept zero extended

32-bit integers, with bits 32-63 all 0.

On a 64-bit CPU, DIVU and MODU are UNPREDICTABLE if their inputs are not zero or sign extended 32-bit integers.

On a 64-bit CPU, the 32-bit divide and modulo instructions, both signed and unsigned, sign extend the result as if it is a 32-bit signed integer.

DDIV, DMOD, DDIVU, DMODU: Reserved Instruction exception if 64-bit instructions are not enabled.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

Release 6 divide instructions have the same opcode mnemonic as the pre-Release 6 divide instructions (DIV, DIVU,

DDIV, DDIVU). The instruction encodings are different, as are the instruction semantics: the Release 6 instruction produces only the quotient, whereas the pre-Release 6 instruction produces quotient and remainder in HI/LO registers respectively, and separate modulo instructions are required to obtain the remainder.

The assembly syntax distinguishes the Release 6 from the pre-Release 6 divide instructions. For example, Release 6

"DIVrd,rs,rt" specifies 3 register operands, versus pre-Release 6 "DIV rs,rt", which has only two register arguments, with the HI/LO registers implied.

Some assemblers accept the pseudo-instruction syntax

"DIVrd,rs,rt" and expand it to do "DIV rs,rt;MFHI rd". Phrases such as "DIV with GPR output" and

"DIV with HI/LO output" may be used when disambiguation is necessary.

Pre-Release 6 divide instructions that produce quotient and remainder in the HI/LO registers produce a Reserved

Instruction exception on Release 6. In the future, the instruction encoding may be reused for other instructions.

Programming Notes:

Because the divide and modulo instructions are defined to not trap if dividing by zero, it is safe to emit code that checks for zero-divide after the divide or modulo instruction.

Operation

DDIV, DMOD, DDIVU, DMODU:
 if not Are64bitOperationsEnabled then SignalException(ReservedInstruction) endif 
if NotWordValue(GPR[rs]) then UNPREDICTABLE endif
if NotWordValue(GPR[rt]) then UNPREDICTABLE endif
/* recommended implementation: ignore bits 32-63 for DIV, MOD, DIVU, MODU */
DIV, MOD: 
 s1 = signed_word(GPR[rs])
  s2 = signed_word(GPR[rt])
DIVU, MODU: 
  s1 = unsigned_word(GPR[rs])
  s2 = unsigned_word(GPR[rt])
DDIV, DMOD: 
  s1 = signed_doubleword(GPR[rs])
  s2 = signed_doubleword(GPR[rt])
DDIVU, DMODU: 
  s1 = unsigned_doubleword(GPR[rs])
  s2 = unsigned_doubleword(GPR[rt])
DIV, DIVU, DDIV, DDIVU:
  quotient = s1 div s2
MOD, MODU, DMOD, DMODU:
  remainder = s1 mod s2
DIV:  GPR[rd] = sign_extend.32( quotient )
MOD:  GPR[rd] = sign_extend.32( remainder )
DIVU:  GPR[rd] = sign_extend.32( quotient )
MODU:  GPR[rd] = sign_extend.32( remainder )
DDIV:  GPR[rd] = quotient
DMOD:  GPR[rd] = remainder 
DDIVU: GPR[rd] = quotient 
DMODU: GPR[rd] = remainder 
/* end of instruction */
where
   function zero_or_sign_extended.32(val)
      if value63..32 = (value31)32 then return true
      if value63..32 = (0)32 then return true
      return false
   end function

Exceptions:

DIV, MOD, DIVU, MODU: No arithmetic exceptions occur. Division by zero produces an UNPREDICTABLE result.

DDIV, DMOD, DDIVU, DMODU: Reserved Instruction.

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

DIV

011010

6

5

5

10

6

Format:

DIV rs, rt

MIPS32, removed in Release 6

Divide Word

Purpose:

Divide Word

To divide a 32-bit signed integers.

Description:

 (HI, LO) = GPR[rs] / GPR[rt]

The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands as signed values.

The 32-bit quotient is sign-extended and placed into special register LO and the 32-bit remainder is sign-extended and placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

If the divisor in GPR rt is zero, the arithmetic result value is UNPREDICTABLE.

Availability and Compatibility:

DIV has been removed in Release 6 and has been replaced by DIV and MOD instructions that produce only quotient and remainder, respectively. Refer to the Release 6 introduced 'DIV' and 'MOD' instructions in this manual for more information. This instruction remains current for all release levels lower than Release 6 of the MIPS architecture.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then 
   UNPREDICTABLE
endif
q  = GPR[rs]31..0 div GPR[rt]31..0
LO = sign_extend(q31..0)
r  = GPR[rs]31..0 mod GPR[rt]31..0
HI = sign_extend(r31..0)

Exceptions:

None

Programming Notes:

No arithmetic exception occurs under any circumstances. If divide-by-zero or overflow conditions are detected and some action taken, then the divide instruction is followed by additional instructions to check for a zero divisor and/or for overflow. If the divide is asynchronous then the zero-divisor check can execute in parallel with the divide. The action taken on either divide-by-zero or overflow is either a convention within the program itself, or within the system software. A possibility is to take a BREAK exception with a code field value to signal the problem to the system software.

As an example, the C programming language in a UNIX® environment expects division by zero to either terminate the program or execute a program-specified signal handler. C does not expect overflow to cause any exceptional condition. If the C compiler uses a divide instruction, it also emits code to test for a zero divisor and execute a BREAK instruction to inform the operating system if a zero is detected.

By default, most compilers for the MIPS architecture emits additional instructions to check for the divide-by-zero and overflow cases when this instruction is used. In many compilers, the assembler mnemonic "DIV r0, rs, rt" can be used to prevent these additional test instructions to be emitted.

In some processors the integer divide operation may proceed asynchronously and allow other CPU instructions to execute before it is complete. An attempt to read LO or HI before the results are written interlocks until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the divide so that other instructions can execute in parallel.

Historical Perspective:

In MIPS 1 through MIPS III, if either of the two instructions preceding the divide is an MFHI or MFLO, the result of the MFHI or MFLO is UNPREDICTABLE. Reads of the HI or LO special register must be separated from subsequent instructions that write to them by two or more instructions. This restriction was removed in MIPS IV and

MIPS32 and all subsequent levels of the architecture.

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

DIVU

011011

6

5

5

10

6

Format:

DIVU rs, rt

MIPS32, removed in Release 6

Divide Unsigned Word

Purpose:

Divide Unsigned Word

To divide 32-bit unsigned integers

Description:

 (HI, LO) = GPR[rs] / GPR[rt]

The 32-bit word value in GPR rs is divided by the 32-bit value in GPR rt, treating both operands as unsigned values.

The 32-bit quotient is sign-extended and placed into special register LO and the 32-bit remainder is sign-extended and placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

If either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

If the divisor in GPR rt is zero, the arithmetic result value is UNPREDICTABLE.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then
   UNPREDICTABLE 
endif
q  = (0 || GPR[rs]31..0) div (0 || GPR[rt]31..0)
r  = (0 || GPR[rs]31..0) mod (0 || GPR[rt]31..0)
LO = sign_extend(q31..0)
HI = sign_extend(r31..0)

Exceptions:

None

Programming Notes:

Pre-Release 6 instruction DIV has been removed in Release 6 and has been replaced by DIV and MOD instructions that produce only quotient and remainder, respectively. Refer to the Release 6 introduced 'DIV' and 'MOD' instructions in this manual for more information. This instruction remains current for all release levels lower than Release 6 of the MIPS architecture.

See "Programming Notes" for the DIV instruction.

Historical Perspective:

In MIPS 1 through MIPS III, if either of the two instructions preceding the divide is an MFHI or MFLO, the result of the MFHI or MFLO is UNPREDICTABLE. Reads of the HI or LO special register must be separated from subsequent instructions that write to them by two or more instructions. This restriction was removed in MIPS IV and

MIPS32 and all subsequent levels of the architecture.

Encoding:

COP0

010000

DMF

00001

rt

rd

0

0000 0000

sel

6

5

5

5

8

3

Format:

DMFC0 rt, rd, sel

MIPS64

Doubleword Move from Coprocessor 0

Purpose:

Doubleword Move from Coprocessor 0

To move the contents of a coprocessor 0 register to a general purpose register (GPR).

Description:

 GPR[rt] = CPR[0,rd,sel]

The contents of the coprocessor 0 register are loaded into GPR rt. Note that not all coprocessor 0 registers support the

sel field. In those instances, the sel field must be zero.

Restrictions:

The results are UNDEFINED if coprocessor 0 does not contain a register as specified by rd and sel, or if the coprocessor 0 register specified by rd and sel is a 32-bit register.

Operation:

// 'Width' returns width (32/64) of data returned by CPR
if ((Width(CPR[0,rd,sel])  = 32) and (ConfigAR>=2) ) then
   dataword = CPR[0,rd,sel] 
   GPR[rt] = { x0000_0000 || dataword}
elseif ((Width(CPR[0,rd,sel])  = 32) and (ConfigAR<2)) then 
   UNDEFINED
else
   datadoubleword = CPR[0,rd,sel] 
   GPR[rt] = datadoubleword
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

DMF

00001

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

DMFC1 rt,fs

MIPS64

Doubleword Move from Floating Point

Purpose:

Doubleword Move from Floating Point

To move a doubleword from an FPR to a GPR.

Description:

 GPR[rt] = FPR[fs]

The contents of FPR fs are loaded into GPR rt..

Restrictions:

Operation:

datadoubleword = ValueFPR(fs, UNINTERPRETED_DOUBLEWORD)
GPR[rt] = datadoubleword

Exceptions:

Coprocessor Unusable, Reserved Instruction

Historical Information:

For MIPS III, the contents of GPR rt are undefined for the instruction immediately following DMFC1.

Encoding:

COP2

010010

DMF

00001

rt

Impl

6

5

5

16

Format:

DMFC2, rt, rd, sel

MIPS64

Doubleword Move from Coprocessor 2

Purpose:

Doubleword Move from Coprocessor 2

To move a doubleword from a coprocessor 2 register to a GPR.

Description:

 GPR[rt] = CP2CPR[Impl]

The contents of the coprocessor 2 register denoted by the Impl field is loaded into GPR rt. The interpretation of the

Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the architecture.

Restrictions:

The results are UNPREDICTABLE if Impl specifies a coprocessor 2 register that does not exist, or if the coprocessor 2 register specified by rd and sel is a 32-bit register.

Operation:

datadoubleword = CP2CPR[Impl]
GPR[rt] = datadoubleword

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP0

010000

DMT

00101

rt

rd

0

0000 0000

sel

6

5

5

5

8

3

MIPS64

Format:

DMTC0 rt, rd, sel

MIPS64

Doubleword Move to Coprocessor 0

Purpose:

Doubleword Move to Coprocessor 0

To move a doubleword from a GPR to a coprocessor 0 register.

Description:

 CPR[0,rd,sel] = GPR[rt]

The contents of GPR rt are loaded into the coprocessor 0 register specified in the rd and sel fields. Not all coprocessor

0 registers support the sel field. In those instances, the sel field must be zero.

Restrictions:

The results are UNDEFINED if coprocessor 0 does not contain a register as specified by rd and sel, or if the coprocessor 0 register specified by rd and sel is a 32-bit register.

Operation:

// 'Width' returns width (32/64) of data returned by CPR.
if ((Width(CPR[0,rd,sel])  = 32) and (ConfigAR>=2) ) then 
   dataword <- GPR[rt]31:0
   CPR[0,rd,sel] <- dataword
elseif ((Width(CPR[0,rd,sel])  = 32) and (ConfigAR<2)) then 
   UNDEFINED
else
   datadoubleword <- GPR[rt]
   CPR[0,rd,sel] <- datadoubleword
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

DMT

00101

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

DMTC1 rt, fs

MIPS64

Doubleword Move to Floating Point

Purpose:

Doubleword Move to Floating Point

To copy a doubleword from a GPR to an FPR.

Description:

 FPR[fs] = GPR[rt]

The doubleword contents of GPR rt are placed into FPR fs.

Restrictions:

Operation:

datadoubleword = GPR[rt]
StoreFPR(fs, UNINTERPRETED_DOUBLEWORD, datadoubleword)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Historical Information:

For MIPS III, the contents of FPR fs are undefined for the instruction immediately following DMTC1.

Encoding:

COP2

010010

DMT

00101

rt

Impl

6

5

5

16

MIPS64

Format:

DMTC2 rt, Impl, sel

MIPS64

Doubleword Move to Coprocessor 2

Purpose:

Doubleword Move to Coprocessor 2

To move a doubleword from a GPR to a coprocessor 2 register.

Description:

 CP2CPR[Impl] = GPR[rt]

The contents GPR rt are loaded into the coprocessor 2 register denoted by the Impl field. The interpretation of the

Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the architecture.

Restrictions:

The results are UNPREDICTABLE if Impl specifies a coprocessor 2 register that does not exist, or if the coprocessor 2 register specified by rd and sel is a 32-bit register.

Operation:

datadoubleword = GPR[rt]
CP2CPR[Impl]= datadoubleword

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

DMULT

011100

6

5

5

10

6

Format:

DMULT rs, rt

MIPS64, removed in Release 6

Doubleword Multiply

Purpose:

Doubleword Multiply

To multiply 64-bit signed integers.

Description:

 (LO, HI) = GPR[rs] <= GPR[rt]

The 64-bit doubleword value in GPR rt is multiplied by the 64-bit value in GPR rs, treating both operands as signed values, to produce a 128-bit result. The low-order 64-bit doubleword of the result is placed into special register LO, and the high-order 64-bit doubleword is placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

prod = GPR[rs] <= GPR[rt]
LO = prod63..0
HI = prod127..64

Exceptions:

Reserved Instruction

Programming Notes:

In some processors the integer multiply operation may proceed asynchronously and allow other CPU instructions to execute before it is complete. An attempt to read LO or HI before the results are written interlocks until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

Historical Perspective:

In MIPS III, if either of the two instructions preceding the divide is an MFHI or MFLO, the result of the MFHI or

MFLO is UNPREDICTABLE. Reads of the HI or LO special register must be separated from subsequent instructions that write to them by two or more instructions. This restriction was removed in MIPS IV and all subsequent levels of the architecture.

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

DMULTU

011101

6

5

5

10

6

Format:

DMULTU rs, rt

MIPS64, removed in Release 6

Doubleword Multiply Unsigned

Purpose:

Doubleword Multiply Unsigned

To multiply 64-bit unsigned integers.

Description:

 (LO, HI) = GPR[rs] <= GPR[rt]

The 64-bit doubleword value in GPR rt is multiplied by the 64-bit value in GPR rs, treating both operands as unsigned values, to produce a 128-bit result. The low-order 64-bit doubleword of the result is placed into special register LO, and the high-order 64-bit doubleword is placed into special register HI. No arithmetic exception occurs under any circumstances.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

prod = (0||GPR[rs]) <= (0||GPR[rt])
LO = prod63..0
HI = prod127..64

Exceptions:

Reserved Instruction

Programming Notes:

In some processors the integer multiply operation may proceed asynchronously and allow other CPU instructions to execute before it is complete. An attempt to read LO or HI before the results are written interlocks until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

Historical Perspective:

In MIPS III, if either of the two instructions preceding the divide is an MFHI or MFLO, the result of the MFHI or

MFLO is UNPREDICTABLE. Reads of the HI or LO special register must be separated from subsequent instructions that write to them by two or more instructions. This restriction was removed in MIPS IV and all subsequent levels of the architecture.

Encoding:

SPECIAL

000000

0000

R

1

rt

rd

saminus32

DSLR32

111110

6

4

1

5

5

5

6

Format:

DROTR32 rd, rt, sa

MIPS64 Release 2

Doubleword Rotate Right Plus 32

Purpose:

Doubleword Rotate Right Plus 32

To execute a logical right-rotate of a doubleword by a fixed amount-32 to 63 bits

Description:

 GPR[rd] = GPR[rt] <= (right) (saminus32+32)

The 64-bit doubleword contents of GPR rt are rotated right; the result is placed in GPR rd. The bit-rotate amount in the range 32 to 63 is specified by saminus32+32.

Restrictions:

Operation:

s        = 1 || sa   /* 32+saminus32 */
GPR[rd]   = GPR[rt]s-1..0 || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0000

R

1

rt

rd

sa

DSRL

111010

6

4

1

5

5

5

6

Format:

DROTR rd, rt, sa

MIPS64 Release 2

Doubleword Rotate Right

Purpose:

Doubleword Rotate Right

To execute a logical right-rotate of a doubleword by a fixed amount-0 to 31 bits.

Description:

 GPR[rd] = GPR[rt] <= (right) sa

The doubleword contents of GPR rt are rotated right; the result is placed in GPR rd. The bit-rotate amount in the range 0 to 31 is specified by sa.

Restrictions:

Operation:

s        = 0 || sa
GPR[rd]   = GPR[rt]s-1..0 || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

rd

0000

R

1

DSRLV

010110

6

5

5

5

4

1

6

Format:

DROTRV rd, rt, rs

MIPS64 Release 2

Doubleword Rotate Right Variable

Purpose:

Doubleword Rotate Right Variable

To execute a logical right-rotate of a doubleword by a variable number of bits

Description:

 GPR[rd] = GPR[rt] <= (right) GPR[rs]

The 64-bit doubleword contents of GPR rt are rotated right; the result is placed in GPR rd. The bit-rotate amount in the range 0 to 63 is specified by the low-order 6 bits in GPR rs.

Restrictions:

Operation:

s        = GPR[rs]5..0
GPR[rd]   = GPR[rt]s-1..0  || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL3

011111

0

00000

rt

rd

DSBH

00010

DBSHFL

100100

6

5

5

5

5

6

Format:

DSBH rd, rt

MIPS64 Release 2

Doubleword Swap Bytes Within Halfwords

Purpose:

Doubleword Swap Bytes Within Halfwords

To swap the bytes within each halfword of GPR rt and store the value into GPR rd.

Description:

 GPR[rd] = SwapBytesWithinHalfwords(GPR[rt])

Within each halfword of GPR rt the bytes are swapped and stored in GPR rd.

Restrictions:

In implementations Release 1 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

GPR[rd] = GPR[t]55.48 || GPR[t]63..56 || GPR[t]39..32 || GPR[t]47..40 ||
          GPR[t]23..16 || GPR[t]31..24 || GPR[t]7..0 || GPR[t]15..8

Exceptions:

Reserved Instruction

Programming Notes:

The DSBH and DSHD instructions can be used to convert doubleword data of one endianness to the other endianness.

For example:

ld    t0, 0(a1)           /* Read doubleword value */
dsbh  t0, t0              /* Convert endiannes of the halfwords */
dshd  t0, t0              /* Swap the halfwords within the doublewords */

Encoding:

SPECIAL3

011111

0

00000

rt

rd

DSHD

00101

DBSHFL

100100

6

5

5

5

5

6

Format:

DSHD rd, rt

MIPS64 Release 2

Doubleword Swap Halfwords Within Doublewords

Purpose:

Doubleword Swap Halfwords Within Doublewords

To swap the halfwords of GPR rt and store the value into GPR rd.

Description:

 GPR[rd] = SwapHalfwordsWithinDoublewords(GPR[rt])

The halfwords of GPR rt are swapped and stored in GPR rd.

Restrictions:

In implementations of Release 1 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

GPR[rd] = GPR[rt]15..0 || GPR[rt]31..16 || GPR[rt]47..32 || GPR[rt]63..48

Exceptions:

Reserved Instruction

Programming Notes:

The DSBH and DSHD instructions can be used to convert doubleword data of one endianness to the other endianness.

For example:

ld    t0, 0(a1)           /* Read doubleword value */
dsbh  t0, t0              /* Convert endiannes of the halfwords */
dshd  t0, t0              /* Swap the halfwords within the doublewords */

Encoding:

SPECIAL

000000

0

00000

rt

rd

sa

DSLL32

111100

6

5

5

5

5

6

Format:

DSLL32 rd, rt, sa

MIPS64

Doubleword Shift Left Logical Plus 32

Purpose:

Doubleword Shift Left Logical Plus 32

To execute a left-shift of a doubleword by a fixed amount-32 to 63 bits

Description:

 GPR[rd] = GPR[rt] << (sa+32)

The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in

GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.

Restrictions:

Operation:

s        = 1 || sa   /* 32+sa */
GPR[rd]   = GPR[rt](63-s)..0 || 0s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0

00000

rt

rd

sa

DSLL

111000

6

5

5

5

5

6

Format:

DSLL rd, rt, sa

MIPS64

Doubleword Shift Left Logical

Purpose:

Doubleword Shift Left Logical

To execute a left-shift of a doubleword by a fixed amount-0 to 31 bits

Description:

 GPR[rd] = GPR[rt] << sa

The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in

GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.

Restrictions:

Operation:

s     = 0 || sa
GPR[rd] = GPR[rt](63-s)..0 || 0s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

DSLLV

010100

6

5

5

5

5

6

Format:

DSLLV rd, rt, rs

MIPS64

Doubleword Shift Left Logical Variable

Purpose:

Doubleword Shift Left Logical Variable

To execute a left-shift of a doubleword by a variable number of bits.

Description:

 GPR[rd] = GPR[rt] << GPR[rs]

The 64-bit doubleword contents of GPR rt are shifted left, inserting zeros into the emptied bits; the result is placed in

GPR rd. The bit-shift amount in the range 0 to 63 is specified by the low-order 6 bits in GPR rs.

Restrictions:

Operation:

s        = GPR[rs]5..0
GPR[rd]   = GPR[rt](63-s)..0 || 0s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0

00000

rt

rd

sa

DSRA32

111111

6

5

5

5

5

6

Format:

DSRA32 rd, rt, sa

MIPS64

Doubleword Shift Right Arithmetic Plus 32

Purpose:

Doubleword Shift Right Arithmetic Plus 32

To execute an arithmetic right-shift of a doubleword by a fixed amount-32 to 63 bits

Description:

 GPR[rd] = GPR[rt] >> (sa+32)   (arithmetic)

The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the emptied bits; the result is placed in GPR rd. The bit-shift amount in the range 32 to 63 is specified by sa+32.

Restrictions:

Operation:

s        = 1 || sa   /* 32+sa */
GPR[rd]   = (GPR[rt]63)s || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0

00000

rt

rd

sa

DSRA

111011

6

5

5

5

5

6

Format:

DSRA rd, rt, sa

MIPS64

Doubleword Shift Right Arithmetic

Purpose:

Doubleword Shift Right Arithmetic

To execute an arithmetic right-shift of a doubleword by a fixed amount-0 to 31 bits.

Description:

 GPR[rd] = GPR[rt] >> sa   (arithmetic)

The 64-bit doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the emptied bits; the result is placed in GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.

Restrictions:

Operation:

s        = 0 || sa
GPR[rd]   = (GPR[rt]63)s || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

DSRAV

010111

6

5

5

5

5

6

Format:

DSRAV rd, rt, rs 

MIPS64

Doubleword Shift Right Arithmetic Variable

Purpose:

Doubleword Shift Right Arithmetic Variable

To execute an arithmetic right-shift of a doubleword by a variable number of bits.

Description:

 GPR[rd] = GPR[rt] >> GPR[rs]   (arithmetic)

The doubleword contents of GPR rt are shifted right, duplicating the sign bit (63) into the emptied bits; the result is placed in GPR rd. The bit-shift amount in the range 0 to 63 is specified by the low-order 6 bits in GPR rs.

Restrictions:

Operation:

s        = GPR[rs]5..0
GPR[rd]   = (GPR[rt]63)s || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0000

R

0

rt

rd

saminus32

DSRL32

111110

6

1

1

5

5

5

6

Format:

DSRL32 rd, rt, sa 

MIPS64

Doubleword Shift Right Logical Plus 32

Purpose:

Doubleword Shift Right Logical Plus 32

To execute a logical right-shift of a doubleword by a fixed amount-32 to 63 bits

Description:

 GPR[rd] = GPR[rt] >> (saminus32+32)    (logical)

The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit-shift amount in the range 32 to 63 is specified by saminus32+32.

Restrictions:

Operation:

s        = 1 || sa   /* 32+saminus32 */
GPR[rd]   = 0s || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0000

R

0

rt

rd

sa

DSRL

111010

6

4

1

5

5

5

6

Format:

DSRL rd, rt, sa

MIPS64

Doubleword Shift Right Logical

Purpose:

Doubleword Shift Right Logical

To execute a logical right-shift of a doubleword by a fixed amount-0 to 31 bits.

Description:

 GPR[rd] = GPR[rt] >> sa   (logical)

The doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; the result is placed in

GPR rd. The bit-shift amount in the range 0 to 31 is specified by sa.

Restrictions:

Operation:

s        = 0 || sa
GPR[rd]   = 0s || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

rd

0000

R

0

DSRLV

010110

6

5

5

5

4

1

6

Format:

DSRLV rd, rt, rs

MIPS64

Doubleword Shift Right Logical Variable

Purpose:

Doubleword Shift Right Logical Variable

To execute a logical right-shift of a doubleword by a variable number of bits

Description:

 GPR[rd] = GPR[rt] >> GPR[rs]   (logical)

The 64-bit doubleword contents of GPR rt are shifted right, inserting zeros into the emptied bits; the result is placed in GPR rd. The bit-shift amount in the range 0 to 63 is specified by the low-order 6 bits in GPR rs.

Restrictions:

Operation:

s        = GPR[rs]5..0
GPR[rd]   = 0s  || GPR[rt]63..s

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

DSUB

101110

6

5

5

5

5

6

Format:

DSUB rd, rs, rt

MIPS64

Doubleword Subtract

Purpose:

Doubleword Subtract

To subtract 64-bit integers; trap on overflow

Description:

 GPR[rd] = GPR[rs] - GPR[rt]

The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs to produce a 64-bit result. If the subtraction results in 64-bit 2's complement arithmetic overflow, then the destination register is not modified and an

Integer Overflow exception occurs. If it does not overflow, the 64-bit result is placed into GPR rd.

Restrictions:

Operation:

temp = (GPR[rs]63||GPR[rs]) - (GPR[rt]63||GPR[rt])
if (temp64 != temp63) then
   SignalException(IntegerOverflow)
else
   GPR[rd] = temp63..0
endif

Exceptions:

Integer Overflow, Reserved Instruction

Programming Notes:

DSUBU performs the same arithmetic operation but does not trap on overflow.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

DSUBU

101111

6

5

5

5

5

6

Format:

DSUBU rd, rs, rt

MIPS64

Doubleword Subtract Unsigned

Purpose:

Doubleword Subtract Unsigned

To subtract 64-bit integers

Description:

 GPR[rd] = GPR[rs] - GPR[rt]

The 64-bit doubleword value in GPR rt is subtracted from the 64-bit value in GPR rs and the 64-bit arithmetic result is placed into GPR rd.

No Integer Overflow exception occurs under any circumstances.

Restrictions:

Operation: 64-bit processors

GPR[rd] = GPR[rs] - GPR[rt]

Exceptions:

Reserved Instruction

Programming Notes:

The term "unsigned" in the instruction name is a misnomer; this operation is 64-bit modulo arithmetic that does not trap on overflow. It is appropriate for unsigned arithmetic, such as address arithmetic, or integer arithmetic environments that ignore overflow, such as C language arithmetic.

Encoding:

COP0

010000

MFMC0

01011

rt

0

00000

0

00000

sc

1

0

00

4

100

6

5

5

5

5

1

2

3

Format:

DVP rt

MIPS32 Release 6

Disable Virtual Processor

Purpose:

Disable Virtual Processor

To disable all virtual processors in a physical core other than the virtual processor that issued the instruction.

Description:

GPR[rt] = VPControl ; VPControlDIS = 1

Disabling a virtual processor means that instruction fetch is terminated, and all outstanding instructions for the affected virtual processor(s) must be complete before the DVP itself is allowed to retire. Any outstanding events such as hardware instruction or data prefetch, or page-table walks must also be terminated.

The DVP instruction has implicit SYNC(stype=0) semantics but with respect to the other virtual processors in the physical core.

After all other virtual processors have been disabled, VPControlDIS is set. Prior to modification and if rt is nonzero, sign-extended VPControl is written to GPR[rt].If DVP is specified without rt, then rtmust be 0.

DVP may also take effect on a virtual processor that has executed a WAIT or a PAUSE instruction. If a virtual processor has executed a WAIT instruction, then it cannot resume execution on an interrupt until an EVP has been executed.

If the EVP is executed before the interrupt arrives, then the virtual processor resumes in a state as if the DVP had not been executed, that is, it waits for the interrupt.

If a virtual processor has executed a PAUSE instruction, then it cannot resume execution until an EVP has been executed, even if LLbit is cleared. If an EVP is executed before the LLbit is cleared, then the virtual processor resumes in a state as if the DVP has not been executed, that is, it waits for the LLbit to clear.

The execution of a DVP must be followed by the execution of an EVP. The execution of an EVP causes execution to resume immediately-where applicable-on all other virtual processors, as if the DVP had not been executed. The execution is completely restorable after the EVP. If an event occurs in between the DVP and EVP that renders state of the virtual processor UNPREDICTABLE (such as power-gating), then the effect of EVP is UNPREDICTABLE.

DVP may only take effect if VPControlDIS=0. Otherwise it is treated as a NOP instruction.

If a virtual processor is disabled due to a DVP, then interrupts are also disabled for the virtual processor, that is, logically StatusIE=0. StatusIE for the target virtual processors though is not cleared though as software cannot access state on the virtual processors that have been disabled. Similarly, deferred exceptions will not cause a disabled virtual processor to be re-enabled for execution, at least until execution is re-enabled by the EVP instruction. The virtual processor that executes the DVP, however, continues to be interruptible.

In an implementation, the ability of a virtual processor to execute instructions may also be under control external to the physical core which contains the virtual processor. If disabled by DVP, a virtual processor must not resume fetch in response to the assertion of this external signal to enable fetch. Conversely, if fetch is disabled by such external control, then execution of EVP will not cause fetch to resume at a target virtual processor for which the control is deasserted.

This instruction never executes speculatively. It must be the oldest unretired instruction to take effect.

This instruction is only available in Release 6 implementations. For implementations that do not support multithreading (Config5VP=0), this instruction must be treated as a NOP instruction.

Restrictions:

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

In implementations prior to Release 6 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

The pseudo-code below assumes that the DVP is executed by virtual processor 0, while the target virtual processor is numbered 'n', where n is each of all remaining virtual processors.

          
         if (VPControlDIS = 0)
         // Pseudo-code in italics provides recommended action wrt other VPs 
         disable_fetch(VPn) {
             if PAUSE(VPn) retires prior or at disable event
             then VPn execution is not resumed if LLbit is cleared prior to EVP
         }
         disable_interrupt(VPn) {
             if WAIT(VPn) retires prior or at disable event
             then interrupts are ignored by VPn until EVP
         }
         // DVP0 not retired until instructions for VPn completed
         while (VPn outstanding instruction)
             DVP0 unretired
         endwhile
         endif
data = VPControl
GPR[rt] = sign_extend(data)
VPControlDIS = 1

Exceptions:

Coprocessor Unusable

Reserved Instruction (pre-Release 6 implementations)

Programming Notes:

DVP may disable execution in the target virtual processor regardless of the operating mode - kernel, supervisor, user.

Kernel software may also be in a critical region, or in a high-priority interrupt handler when the disable occurs. Since the instruction is itself privileged, such events are considered acceptable.

Before executing an EVP in a DVP/EVP pair, software should first read VPControlDIS, returned by DVP, to determine whether the virtual processors are already disabled. If so, the DVP/EVP sequence should be abandoned. This step allows software to safely nest DVP/EVP pairs.

Privileged software may use DVP/EVP to disable virtual processors on a core, such as for the purpose of doing a cache flush without interference from other processes in a system with multiple virtual processors or physical cores.

DVP (and EVP) may be used in other cases such as for power-savings or changing state that is applicable to all virtual processors in a core, such as virtual processor scheduling priority, as described below:

   ll t0 0(a0)
   dvp   // disable all other virtual processors
   pause  // wait for LLbit to clear
   evp   // enable all othe virtual processors
   ll t0 0(a0)
   dvp   // disable all other virtual processors
   <change core-wide state>
   evp   // enable all othe virtual processors

Encoding:

SPECIAL

000000

0

00000

0

00000

0

00000

3

00011

SLL

000000

6

5

5

5

5

6

Format:

EHB 

Assembly Idiom MIPS32 Release 2

Execution Hazard Barrier

Purpose:

Execution Hazard Barrier

To stop instruction execution until all execution hazards have been cleared.

Description:

EHB is used to denote execution hazard barrier. The actual instruction is interpreted by the hardware as SLL r0, r0, 3.

This instruction alters the instruction issue behavior on a pipelined processor by stopping execution until all execution hazards have been cleared. Other than those that might be created as a consequence of setting StatusCU0, there are no execution hazards visible to an unprivileged program running in User Mode. All execution hazards created by previous instructions are cleared for instructions executed immediately following the EHB, even if the EHB is executed in the delay slot of a branch or jump. The EHB instruction does not clear instruction hazards-such hazards are cleared by the JALR.HB, JR.HB, and ERET instructions.

Restrictions:

None

Operation:

ClearExecutionHazards()

Exceptions:

None

Programming Notes:

In Release 2 implementations, this instruction resolves all execution hazards. On a superscalar processor, EHB alters the instruction issue behavior in a manner identical to SSNOP. For backward compatibility with Release 1 implementations, the last of a sequence of SSNOPs can be replaced by an EHB. In Release 1 implementations, the EHB will be treated as an SSNOP, thereby preserving the semantics of the sequence. In Release 2 implementations, replacing the final SSNOP with an EHB should have no performance effect because a properly sized sequence of SSNOPs will have already cleared the hazard. As EHB becomes the standard in MIPS implementations, the previous SSNOPs can be removed, leaving only the EHB.

Encoding:

COP0

0100 00

MFMC0

01 011

rt

12

0110 0

0

000 00

sc

1

0

0 0

0

000

6

5

5

5

5

1

2

3

Format:

EI rt

MIPS32 Release 2

Enable Interrupts

Purpose:

Enable Interrupts

To return the previous value of the Status register and enable interrupts. If EI is specified without an argument, GPR r0 is implied, which discards the previous value of the Status register.

Description:

 GPR[rt] = Status; StatusIE = 1

The current value of the Status register is sign-extended and loaded into general register rt. The Interrupt Enable (IE) bit in the Status register is then set.

Restrictions:

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

This operation specification is for the general interrupt enable/disable operation, with the sc field as a variable. The individual instructions DI and EI have a specific value for the sc field.

data = Status
GPR[rt] = sign_extend(data)
StatusIE = 1

Exceptions:

Coprocessor Unusable

Reserved Instruction (Release 1 implementations)

Programming Notes:

The effects of this instruction are identical to those accomplished by the sequence of reading Status into a GPR, setting the IE bit, and writing the result back to Status. Unlike the multiple instruction sequence, however, the EI instruction cannot be aborted in the middle by an interrupt or exception.

This instruction creates an execution hazard between the change to the Status register and the point where the change to the interrupt enable takes effect. This hazard is cleared by the EHB, JALR.HB, JR.HB, or ERET instructions. Software must not assume that a fixed latency will clear the execution hazard.

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

ERET

011000

6

1

19

6

Format:

ERET 

MIPS32

Exception Return

Purpose:

Exception Return

To return from interrupt, exception, or error trap.

Description:

ERET clears execution and instruction hazards, conditionally restores SRSCtlCSS from SRSCtlPSS in a Release 2 implementation, and returns to the interrupted instruction at the completion of interrupt, exception, or error processing. ERET does not execute the next instruction (that is, it has no delay slot).

Restrictions:

Pre-Release 6: The operation of the processor is UNDEFINED if an ERET is executed in the delay slot of a branch or jump instruction.

Release 6: Implementations are required to signal a Reserved Instruction exception if ERET is encountered in the delay slot or forbidden slot of a branch or jump instruction.

An ERET placed between an LL and SC instruction will always cause the SC to fail.

ERET implements a software barrier that resolves all execution and instruction hazards created by Coprocessor 0 state changes (for Release 2 implementations, refer to the SYNCI instruction for additional information on resolving instruction hazards created by writing the instruction stream). The effects of this barrier are seen starting with the instruction fetch and decode of the instruction at the PC to which the ERET returns.

In a Release 2 implementation, ERET does not restore SRSCtlCSS from SRSCtlPSS if StatusBEV = 1, or if StatusERL

= 1 because any exception that sets StatusERL to 1 (Reset, Soft Reset, NMI, or cache error) does not save SRSCtlCSS in SRSCtlPSS. If software sets StatusERL to 1, it must be aware of the operation of an ERET that may be subsequently executed.

Operation:

if StatusERL = 1 then
   temp = ErrorEPC
   StatusERL = 0
else
   temp = EPC
   StatusEXL = 0
   if (ArchitectureRevision() >= 2) and (SRSCtlHSS > 0) and (StatusBEV = 0) then
      SRSCtlCSS = SRSCtlPSS
   endif
endif
if IsMIPS16Implemented() | (Config3ISA > 0) then
   PC = temp63..1 || 0
   ISAMode = temp0
else
   PC = temp
endif
LLbit = 0
ClearHazards()

Exceptions:

Coprocessor Unusable Exception

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 000

1

ERET

011000

6

1

18

1

6

Format:

ERETNC 

MIPS32 Release 5

Exception Return No Clear

Purpose:

Exception Return No Clear

To return from interrupt, exception, or error trap without clearing the LLbit.

Description:

ERETNC clears execution and instruction hazards, conditionally restores SRSCtlCSS from SRSCtlPSS when implemented, and returns to the interrupted instruction at the completion of interrupt, exception, or error processing.

ERETNC does not execute the next instruction (i.e., it has no delay slot).

ERETNC is identical to ERET except that an ERETNC will not clear the LLbit that is set by execution of an LL instruction, and thus when placed between an LL and SC sequence, will never cause the SC to fail.

An ERET must continue to be used by default in interrupt and exception processing handlers. The handler may have accessed a synchronizable block of memory common to code that is atomically accessing the memory, and where the code caused the exception or was interrupted. Similarly, a process context-swap must also continue to use an ERET in order to avoid a possible false success on execution of SC in the restored context.

Multiprocessor systems with non-coherent cores (i.e., without hardware coherence snooping) should also continue to use ERET, because it is the responsibility of software to maintain data coherence in the system.

An ERETNC is useful in cases where interrupt/exception handlers and kernel code involved in a process contextswap can guarantee no interference in accessing synchronizable memory across different contexts. ERETNC can also be used in an OS-level debugger to single-step through code for debug purposes, avoiding the false clearing of the

LLbit and thus failure of an LL and SC sequence in single-stepped code.

Software can detect the presence of ERETNC by reading Config5LLB.

Restrictions:

Release 6 implementations are required to signal a Reserved Instruction exception if ERETNC is executed in the delay slot or Release 6 forbidden slot of a branch or jump instruction.

ERETNC implements a software barrier that resolves all execution and instruction hazards created by Coprocessor 0 state changes. (For Release 2 implementations, refer to the SYNCI instruction for additional information on resolving instruction hazards created by writing the instruction stream.) The effects of this barrier are seen starting with the instruction fetch and decode of the instruction in the PC to which the ERETNC returns.

Operation:

if StatusERL = 1 then
   temp = ErrorEPC
   StatusERL = 0
else
   temp = EPC
   StatusEXL = 0
   if (ArchitectureRevision() >= 2) and (SRSCtlHSS > 0) and (StatusBEV = 0) then
      SRSCtlCSS = SRSCtlPSS
   endif
endif
if IsMIPS16Implemented() | (Config3ISA > 0) then
   PC = temp63..1 || 0
   ISAMode = temp0
else
   PC = temp
endif
ClearHazards()

Exceptions:

Coprocessor Unusable Exception

Encoding:

COP0

010000

MFMC0

01011

rt

0

00000

0

00000

sc

0

0

00

4

100

6

5

5

5

5

1

2

3

Format:

EVP rt

MIPS32 Release 6

Enable Virtual Processor

Purpose:

Enable Virtual Processor

To enable all virtual processors in a physical core other than the virtual processor that issued the instruction.

Description:

GPR[rt] = VPControl ; VPControlDIS = 0

Enabling a virtual processor means that instruction fetch is resumed.

After all other virtual processors have been enabled, VPControlDIS is cleared. Prior to modification, if rt is nonzero, sign-extended VPControl is written to GPR[rt].If EVP is specified without rt, then rtmust be 0.

See the DVP instruction to understand the application of EVP in the context of WAIT/PAUSE/external-control

("DVP" on page 242).

The execution of a DVP must be followed by the execution of an EVP. The execution of an EVP causes execution to resume immediately, where applicable, on all other virtual processors, as if the DVP had not been executed, that is, execution is completely restorable after the EVP. On the other hand, if an event occurs in between the DVP and EVP that renders state of the virtual processor UNPREDICTABLE (such as power-gating), then the effect of EVP is

UNPREDICTABLE.

EVP may only take effect if VPControlDIS=1. Otherwise it is treated as a NOP

This instruction never executes speculatively. It must be the oldest unretired instruction to take effect.

This instruction is only available in Release 6 implementations. For implementations that do not support multithreading (Config5VP=0), this instruction must be treated as a NOP instruction.

Restrictions:

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

In implementations prior to Release 6 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

The pseudo-code below assumes that the EVP is executed by virtual processor 0, while the target virtual processor is numbered 'n', where n is each of all remaining virtual processors.

                  if (VPControlDIS = 1)
         // Pseudo-code in italics provides recommended action wrt other VPs 
         enable_fetch(VPn) {
             if PAUSE(VPn) retires prior or at disable event
             then VPn execution is not resumed if LLbit is cleared prior to EVP
         }
         enable_interrupt(VPn) {
             if WAIT(VPn) retires prior or at disable event
             then interrupts are ignored by VPn until EVP
         }
         endif
data = VPControl
GPR[rt] = sign_extend(data)
VPControlDIS = 0

Exceptions:

Coprocessor Unusable

Reserved Instruction (pre-Release 6 implementations)

Programming Notes:

Before executing an EVP in a DVP/EVP pair, software should first read VPControlDIS, returned by DVP, to determine whether the virtual processors are already disabled. If so, the DVP/EVP sequence should be abandoned. This step allows software to safely nest DVP/EVP pairs.

Privileged software may use DVP/EVP to disable virtual processors on a core, such as for the purpose of doing a cache flush without interference from other processes in a system with multiple virtual processors or physical cores.

DVP (and EVP) may be used in other cases such as for power-savings or changing state that is applicable to all virtual processors in a core, such as virtual processor scheduling priority, as described below:

   ll t0 0(a0)
   dvp   // disable all other virtual processors
   pause  // wait for LLbit to clear
   evp   // enable all othe virtual processors
   ll t0 0(a0)
   dvp   // disable all other virtual processors
   <change core-wide state>
   evp   // enable all othe virtual processors

Encoding:

SPECIAL3

011111

rs

rt

msbd

(size-1)

lsb

(pos)

EXT

000000

6

5

5

5

5

6

Format:

EXT rt, rs, pos, size

MIPS32 Release 2

Extract Bit Field

Purpose:

Extract Bit Field

To extract a bit field from GPR rs and store it right-justified into GPR rt.

Description:

 GPR[rt] = ExtractField(GPR[rs], msbd, lsb)

The bit field starting at bit pos and extending for size bits is extracted from GPR rs and stored zero-extended and right-justified in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msbd (the most significant bit of the destination field in GPR rt), in instruction bits 15..11, and lsb

(least significant bit of the source field in GPR rs), in instruction bits 10..6, as follows:

msbd = size-1
lsb = pos

The values of pos and size must satisfy all of the following relations:

0 <= pos < 32
0 < size <= 32
0 < pos+size <= 32

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The operation is UNPREDICTABLE if lsb+msbd > 31.

If GPR rs does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if ((lsb + msbd) > 31) or (NotWordValue(GPR[rs])) then
   UNPREDICTABLE
endif
temp = sign_extend(032-(msbd+1) || GPR[rs]msbd+lsb..lsb)
GPR[rt] = temp

Exceptions:

Reserved Instruction

Encoding:

COP1

010001

fmt

0

00000

fs

fd

FLOOR.L

001011

6

5

5

5

5

6

Format:

FLOOR.L.fmt 

Floating Point Floor Convert to Long Fixed Point

FLOOR.L.S fd, fs

MIPS64, MIPS32 Release 2

Floating Point Floor Convert to Long Fixed Point

FLOOR.L.D fd, fs

MIPS64, MIPS32 Release 2

Floating Point Floor Convert to Long Fixed Point

Purpose:

Floating Point Floor Convert to Long Fixed Point

To convert an FP value to 64-bit fixed point, rounding down

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed point format and rounded toward >=

(rounding mode 3). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -263 to 263-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

263-1. On cores with FCSRNAN2008=1, the default result is:

263-1 when the input value is +inf or rounds to a number larger than 263-1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for long fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Operation:

StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact

Encoding:

COP1

010001

fmt

0

00000

fs

fd

FLOOR.W

001111

6

5

5

5

5

6

Format:

FLOOR.W.fmt 

Floating Point Floor Convert to Word Fixed Point

FLOOR.W.S   fd, fs

MIPS32

Floating Point Floor Convert to Word Fixed Point

FLOOR.W.D   fd, fs

MIPS32

Floating Point Floor Convert to Word Fixed Point

Purpose:

Floating Point Floor Convert to Word Fixed Point

To convert an FP value to 32-bit fixed point, rounding down

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed point format and rounded toward ->=

(rounding mode 3). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -231 to 231-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

231-1. On cores with FCSRNAN2008=1, the default result is:

231-1 when the input value is +inf or rounds to a number larger than 231-1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for word fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Unimplemented Operation, Inexact

Encoding:

SPECIAL3

011111

rs

0

0000000000000

GINVI

00

GINV

111101

6

5

13

2

6

Format:

GINVI rs

MIPS32 Release 6

Global Invalidate Instruction Cache

Purpose:

Global Invalidate Instruction Cache

In a multi-processor system, fully invalidate all remote primary instruction-caches, or a specified single cache. The local primary instruction cache is also fully invalidated in the case where all the remote caches are to be invalidated.

Description:

 Invalidate_All_Primary_Instruction_Caches(null or rs)

Fully invalidate all remote primary instruction caches, or a specified single cache, whether local or remote. The local primary instruction cache is also fully invalidated in the case where all remote caches are to be invalidated.

If rs field of the opcode is 0, then all caches are to be invalidated. 'rs' should be specified as 0 in the assembly syntax for this case. If rs field of the opcode is not 0, then a single cache that is specified by an implementation dependent number of lower bits of GPR[rs] is invalidated, which may be the local cache itself.

Software based invalidation of the primary instruction cache is required in a system if coherency of the cache is not maintained in hardware. While typically limited to the primary cache, the scope of the invalidation within a processor is however implementation dependent - it should apply to all instruction caches within the cache hierarchy that required software coherence maintenance.

In legacy systems, it is software's responsibility to keep the instruction cache state consistent through SYNCI instructions. This instruction provides a method for bulk invalidating the instruction caches in lieu of SYNCI.

The instruction's action is considered complete when the both the local and remote cache invalidations are complete,

that is, the data in the cache is no longer available to the related instruction stream. Whether these invalidations are complete can only be determined by the completion of a SYNC (stype=0x14) that follows the invalidate instruction(s). With the completion of the SYNC operation, all global invalidations preceding the SYNC in the program are

considered globally visible.

Whether the SYNC(stype=0x14) or the global invalidate itself cause synchronization of the instruction stream to new state/context is implementation dependent.

A processor may send a global invalidate instruction remotely only when any preceding global invalidate for the program has reached a global ordering point.

The GINVI has no instruction or execution hazard barrier semantics in itself.

If the implementation allows a cache line to be locked, i.e., not replaceable during a fill, GINVI will not invalidate the line. A cache line can be locked through the optional CACHE "Fetch and Lock" instruction.

See Programming Notes for programming constraints.

Restrictions:

If an implementation does not support the instruction, a Reserved Instruction exception is caused.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

In a single processor SOC, this instruction acts on the local instruction cache only.

Operation:

Local:
   if (Config5GI != 2'b1x)
      SignalException(ReservedInstruction) // if not implemented
      break
   endif
   if IsCoprocessorEnabled(0) then
      // Fully invalidate local instruction cache, if selected.
      // Send invalidation message to other cores, if required.
   else
      SignalException(CoprocessorUnusable, 0)
   endif
Remote:
      // Fully invalidate remote instruction cache.

Exceptions:

Reserved Instruction, Coprocessor Unusable

Programming Notes:

For the local processor, the instruction stream is synchronized by an instruction hazard barrier such as JR.HB.

The instruction stream in the remote processor is synchronized with respect to the execution of GINVI once the

SYNC operation following GINVI completes.

The following sequence is recommended for use of GINVI.

ginvi     /* fully-invalidate all caches*/
sync 0x14 /* Enforce completion - all instruction streams synchronized. */
jr.hb ra  /* Clear instruction hazards*/

Implementation Notes:

None.

Encoding:

SPECIAL3

011111

rs

00000

000000

type

GINVT

10

GINV

111101

6

5

5

6

2

2

6

Format:

GINVT rs, type

MIPS32 Release 6

Global Invalidate TLB

Purpose:

Global Invalidate TLB

In a multi-processor system, invalidate translations of remote TLBs and local TLB.

Description:

 Invalidate_TLB(GPR[rs], MemoryMapID)

Invalidate a TLB in multiple ways - entire TLB, by Virtual Address and MemoryMapID, by MemoryMapID or Virtual Address. The Virtual Address is obtained from GPR[rs]. The MemoryMapID is derived from CP0

MemoryMapID. The virtual address is associated with a specific Memory Map identified by MemoryMapID.

The virtual address within GPR[rs] is aligned to twice the size of the minimum page size of 4KB i.e., it is equivalent to EntryHiVPN2: bit 13 of the virtual address is aligned to bit 13 of GPR[rs]. If the virtual address is not required, such as in the case of invalidate All or by MemoryMapID, then 'rs' should be specified as 0 in the assembly syntax but is otherwise ignored.

The MemoryMapID is a replacement for EntryHiASID. The MemoryMapID is an implementation-dependent number of bits that must be larger than the existing EntryHiASID (10-bits including EntryHiASIDX ). The purpose of a larger tag is to be able to uniquely identify processes in the system. A 16-bit MemoryMapID for example will identify 64K Memory Maps, while the current 8-bit ASID only identifies 256, and is thus subject to frequent recycling.

An implementation with MemoryMapID is designed to be backward compatible with software that uses EntryHiASID. See CP0 MemoryMapID.

Table 3.1 specifies the different types of invalidates supported as a function of the "type" field of the instruction.

Table 3.1 Types of Global TLB Invalidates

Encoding of "type" field

Definition

00

Invalidate entire TLB

01

Invalidate by VA (MemoryMapID is globalized)

10

Invalidate by MemoryMapID

11

Invalidate by VA and MemoryMapID.

With reference to Table 3.1, if the Global bit in a TLB entry is set, then MemoryMapID comparison is ignored by the operation.

The instruction is considered complete when the local and remote invalidations are complete. Whether these invalidations are complete can only be determined by the completion of a SYNC (stype=0x14) that follows the invalidate

instruction(s). With the completion of the SYNC operation, all invalidations of this type preceding the SYNC in the program are considered globally visible.

Whether the SYNC(stype=0x14) or the global invalidate itself cause synchronization of the instruction stream to new state/context is implementation dependent.

A GINVT based invalidation is complete, whether local or remote, when the following has occurred: the TLB is invalidated of matching entries, and all instructions in the instruction stream after the point of completion can only access the new context.

A processor may send a global invalidate instruction remotely only when any preceding global invalidate for the program has reached a global ordering point.

GINVT has no instruction or execution hazard barrier semantics in itself.

A GINVT operation that is specified to invalidate all entries will only invalidate non-wired entries. Other GINVT operations will invalidate wired entries on a match.

Restrictions:

If an implementation does not support the instruction, or use of MemoryMapID is disabled, then a Reserved Instruction exception is caused.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable exception is signaled.

In a single processor SOC, this instruction acts on the local TLB only.

Operation:

Local:
   if (Config5GI != 2'b11) then
      SignalException(ReservedInstruction, 0)
      break
   endif
   if IsCoprocessorEnabled(0) then
      if (Config5MI = 1)
         // generate control from instruction encoding.
         invAll = (ginvt[type] = 0b00)
         invVA = (ginvt[type] = 0b01)
         invMMid = (ginvt[type] = 0b10)
         invVAMMid = (ginvt[type] = 0b11)
         // generate data; how data is driven when unsupported is imp-dep.
         // Format of GPR[rs] equals CP0 EntryHi.
         InvMsgVPN2 = GPR[rs]VPN2msb..13            // VPN2msb is imp-dep in MIPS64
         InvMsgR = GPR[rs]63:62                   // R same as CP0 EntryLo.R
         InvMsgMMid = MemoryMapID                // imp-dep # of bits
         // Broadcast invalidation message to other cores.
         InvalidateTLB(InvMsgVPN2,InvMsgR,InvMsgMMid,invAll,invVAMMid,invMMid,invVA)
      else // if not implemented, MMid disabled
         SignalException(ReservedInstruction)
      endif
   else
      SignalException(CoprocessorUnusable, 0)
   endif
Remote:
      // Repeat in all remote TLBs
      InvalidateTLB(InvMsgVPN2,InvMsgR,InvMsgMMid,invAll,invVAMMid,invMMid,invVA)
function InvalidateTLB(InvMsgVPN2,InvMsgR,InvMsgMMid,invAll,invVAMMid,invMMid,invVA)
      // "Mask" is equivalent to CP0 PageMask.
      // "G" is equivalent to the Global bit in CP0 EntryLo0/1.
      // "R" is equivalent to the R bit in CP0 EntryHi.
      for i in 0..TLBEntries-1 
          // Wired entries are excluded.
          VAMatch = (((TLB[i]VPN2 and not TLB[i]Mask) = (InvMsgVPN2 and not 
TLB[i]Mask)) and (TLB[i]R = InvMsgR))
          MMidMatch = (TLB[i]MMid  =  InvMsgMMid)
          if ((invAll and (i>CP0.Wired.Wired)) or // do not invalidate Wired
          (VAMatch and ((TLB[i]G = 1) or MMidMatch) and invVAMMid) or
         (VAMatch and invVA) or
         (MMidMatch and (TLB[i]G != 1) and invMMid)) then
             TLB[i]HW_Valid = 0       // where HW_Valid is the entry valid bit
         endif
      endfor
endfunction

Exceptions:

Reserved Instruction, Coprocessor Unusable

Programming Notes:

Since CP0 MemoryMapID sources the value of MemoryMapID of the currently running process, the kernel must save/ restore MemoryMapID appropriately before it modifies it for the invalidation operation. Between the save and restore, it must utilize unmapped addresses.

An MTC0 that modifies MemoryMapID must be followed by an EHB to make this value visible to a subsequent

GINVT. Where multiple GINVTs are used prior to a single SYNC (stype=0x14), each may use a different value of

MemoryMapID.

For the local processor, the instruction stream is synchronized to the new translation context (where applicable) by an instruction hazard barrier such as JR.HB.

The instruction stream in the remote processor is synchronized with respect to the execution of GINVT once the

SYNC operation completes.

The following sequence is recommended for use of GINVT.

mtc0 0, C0_PWCtl /* disable Page Walker,where applicable;implementation-dependent*/
ehb             /* Clear execution hazards to prevent speculative walks*/
ginvt r1, type  /* Invalidate TLB(s) */
sync 0x14       /* Enforce completion */
jr.hb ra        /* Clear instruction hazards */

Whether the hardware page table walker, if implemented, needs to be disabled as shown above, is implementation dependent. It is recommended that hardware take the steps to locally disable the hardware page table walker to maintain TLB consistency, as it would for remote TLBs.

Software must take into account a system that may have potentially varying widths of MemoryMapID . While not recommended, different processors may have different implemented or programmed widths. Further, the interface between processors may support yet another width. If this is the case, then software responsible for global invalidates should be run on the processor with maximum width. Software must zero-fill any bits that are unused by a target.

Software should also be able to rely on the implementation zero-filling bits where widths increase across any interface.

If an intermediate interface between source and target truncates the width of MemoryMapID, then software could address this limitation through various means: It could restrict the use of MemoryMapID to the interface width, it could program MemoryMapID with the expectation that over-invalidation may occur, or it should default to legacy means of invalidating the caches to prevent unreliable system behavior.

Encoding:

SPECIAL3

011111

rs

rt

msb

(pos+size-1)

lsb

(pos)

INS

000100

6

5

5

5

5

6

Format:

INS rt, rs, pos, size

MIPS32 Release 2

Insert Bit Field

Purpose:

Insert Bit Field

To merge a right-justified bit field from GPR rs into a specified field in GPR rt.

Description:

 GPR[rt] = InsertField(GPR[rt], GPR[rs], msb, lsb)

The right-most size bits from GPR rs are merged into the value from GPR rt starting at bit position pos. The result is placed back in GPR rt. The assembly language arguments pos and size are converted by the assembler to the instruction fields msb (the most significant bit of the field), in instruction bits 15..11, and lsb (least significant bit of the field), in instruction bits 10..6, as follows:

msb = pos+size-1
lsb = pos

The values of pos and size must satisfy all of the following relations:

0 <= pos < 32
0 < size <= 32
0 < pos+size <= 32

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The operation is UNPREDICTABLE if lsb > msb.

If either GPR rs or GPR rt does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

   if (lsb > msb) or (NotWordValue(GPR[rs])) or (NotWordValue(GPR[rt]))) then
      UNPREDICTABLE
   endif
GPR[rt] = sign_extend(GPR[rt]31..msb+1 || GPR[rs]msb-lsb..0 || GPR[rt]lsb-1..0)

Exceptions:

Reserved Instruction

Encoding:

pre-Release 6:

SPECIAL

000000

rs

0

00000

rd

1

Any other

legal hint

value

JALR

001001

6

5

5

5

1

4

6

Release 6:

SPECIAL

000000

rs

0

00000

rd

rd != 00000

1

Any other

legal hint

value

JALR

001001

6

5

5

5

1

4

6

Format:

JALR.HB rd, rs

MIPS32 Release 2

Jump and Link Register with Hazard Barrier

Purpose:

Jump and Link Register with Hazard Barrier

To execute a procedure call to an instruction address in a register and clear all execution and instruction hazards

Description:

 GPR[rd] = return_addr, PC = GPR[rs], clear execution and instruction hazards

Place the return address link in GPR rd. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

For processors that do not implement the MIPS16e or microMIPS ISA:

For processors that do implement the MIPS16e or microMIPS ISA:

Error exception will occur when the target instruction is fetched.

In both cases, execute the instruction that follows the jump, in the branch delay slot, before executing the jump itself.

JALR.HB implements a software barrier that resolves all execution and instruction hazards created by Coprocessor 0 state changes (for Release 2 implementations, refer to the SYNCI instruction for additional information on resolving instruction hazards created by writing the instruction stream). The effects of this barrier are seen starting with the instruction fetch and decode of the instruction at the PC to which the JALR.HB instruction jumps. An equivalent barrier is also implemented by the ERET instruction, but that instruction is only available if access to Coprocessor 0 is enabled, whereas JALR.HB is legal in all operating modes.

This instruction clears both execution and instruction hazards. Refer to the EHB instruction description for the method of clearing execution hazards alone.

JALR.HB uses bit 10 of the instruction (the upper bit of the hint field) to denote the hazard barrier operation.

Restrictions:

JALR.HB does not clear hazards created by any instruction that is executed in the delay slot of the JALR.HB. Only hazards created by instructions executed before the JALR.HB are cleared by the JALR.HB.

After modifying an instruction stream mapping or writing to the instruction stream, execution of those instructions has UNPREDICTABLE behavior until the instruction hazard has been cleared with JALR.HB, JR.HB, ERET, or

DERET. Further, the operation is UNPREDICTABLE if the mapping of the current instruction stream is modified.

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Jump-and-Link Restartability: Register specifiers rs and rd must not be equal, because such an instruction does not

have the same effect when re-executed. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the delay slot.

Restrictions Related to Multiple Instruction Sets: This instruction can change the active instruction set, if more than

one instruction set is implemented.

If only one instruction set is implemented, then the effective target address must obey the alignment rules of the instruction set. If multiple instruction sets are implemented, the effective target address must obey the alignment rules of the intended instruction set of the target address as specified by the bit 0 or GPR rs.

For processors that do not implement the microMIPS32/64 ISA, the effective target address in GPR rs must be naturally-aligned. For processors that do not implement the MIPS16 ASE nor microMIPS32/64 ISA, if either of the two least-significant bits are not zero, an Address Error exception occurs when the branch target is subsequently fetched as an instruction.

For processors that do implement the MIPS16 ASE or microMIPS32/64 ISA, if bit 0 is zero and bit 1 is one, an

Address Error exception occurs when the jump target is subsequently fetched as an instruction.

Availability and Compatibility:

Release 6 maps JR and JR.HB to JALR and JALR.HB with rd = 0:

Pre-Release 6, JR.HB and JALR.HB were distinct instructions, both with primary opcode SPECIAL, but with distinct function codes.

Release 6: JR.HB is defined to be JALR.HB with the destination register specifier rd set to 0. The primary opcode and function field are the same for JR.HB and JALR.HB. The pre-Release 6 instruction encoding for JR.HB is removed in Release 6.

Release 6 assemblers should accept the JR and JR.HB mnemonics, mapping them to the Release 6 instruction encodings.

Operation:

I: temp = GPR[rs]
   GPR[rd] = PC + 8
I+1:if (Config3ISA = 0) and (Config1CA = 0) then 
      PC  = temp
   else
      PC = tempGPRLEN-1..1 || 0
      ISAMode = temp0
   endif
   ClearHazards()

Exceptions:

None

Programming Notes:

This branch-and-link instruction can select a register for the return link; other link instructions use GPR 31. The default register for GPR rd, if omitted in the assembly language instruction, is GPR 31.

Release 6 JR.HB rs is implemented as JALR.HBr0,rs. For example, as JALR.HB with the destination set to the zero register, r0.

This instruction implements the final step in clearing execution and instruction hazards before execution continues. A hazard is created when a Coprocessor 0 or TLB write affects execution or the mapping of the instruction stream, or after a write to the instruction stream. When such a situation exists, software must explicitly indicate to hardware that the hazard should be cleared. Execution hazards alone can be cleared with the EHB instruction. Instruction hazards can only be cleared with a JR.HB, JALR.HB, or ERET instruction. These instructions cause hardware to clear the hazard before the instruction at the target of the jump is fetched. Note that because these instructions are encoded as jumps, the process of clearing an instruction hazard can often be included as part of a call (JALR) or return (JR) sequence, by simply replacing the original instructions with the HB equivalent.

Example: Clearing hazards due to an ASID change

/*
 * Code used to modify ASID and call a routine with the new
 * mapping established.
 *
 * a0 = New ASID to establish
 * a1 = Address of the routine to call
 */
   mfc0  v0, C0_EntryHi      /* Read current ASID */
   li    v1, ~M_EntryHiASID  /* Get negative mask for field */
   and   v0, v0, v1          /* Clear out current ASID value */
   or    v0, v0, a0          /* OR in new ASID value */
   mtc0  v0, C0_EntryHi      /* Rewrite EntryHi with new ASID */
   jalr.hb a1                /* Call routine, clearing the hazard */

Encoding:

pre-Release 6

SPECIAL

000000

rs

0

00000

rd

hint

JALR

001001

6

5

5

5

5

6

Release 6

SPECIAL

000000

rs

0

00000

rd

rd != 00000

hint

JALR

001001

6

5

5

5

5

6

Format:

JALR rd, rs

MIPS32

Jump and Link Register

Purpose:

Jump and Link Register

To execute a procedure call to an instruction address in a register

Description:

 GPR[rd] = return_addr, PC = GPR[rs]

Place the return address link in GPR rd. The return link is the address of the second instruction following the branch, where execution continues after a procedure call.

For processors that do not implement the MIPS16e or microMIPS ISA:

For processors that do implement the MIPS16e or microMIPS ISA:

Error exception will occur when the target instruction is fetched.

In both cases, execute the instruction that follows the jump, in the branch delay slot, before executing the jump itself.

In Release 1 of the architecture, the only defined hint field value is 0, which sets default handling of JALR. In

Release 2 of the architecture, bit 10 of the hint field is used to encode a hazard barrier. See the JALR.HB instruction description for additional information.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Jump-and-Link Restartability: Register specifiers rs and rd must not be equal, because such an instruction does not

have the same effect when re-executed. The result of executing such an instruction is UNPREDICTABLE. This restriction permits an exception handler to resume execution by re-executing the branch when an exception occurs in the delay slot.

Restrictions Related to Multiple Instruction Sets: This instruction can change the active instruction set, if more than

one instruction set is implemented.

If only one instruction set is implemented, then the effective target address must obey the alignment rules of the instruction set. If multiple instruction sets are implemented, the effective target address must obey the alignment rules of the intended instruction set of the target address as specified by the bit 0 or GPR rs.

For processors that do not implement the microMIPS32/64 ISA, the effective target address in GPR rs must be naturally-aligned. For processors that do not implement the MIPS16e ASE nor microMIPS32/64 ISA, if either of the two least-significant bits are not zero, an Address Error exception occurs when the branch target is subsequently fetched as an instruction.

For processors that do implement the MIPS16e ASE or microMIPS32/64 ISA, if target ISAMode bit is zero (GPR rs bit 0) and bit 1 is one, an Address Error exception occurs when the jump target is subsequently fetched as an instruction.

Availability and Compatibility:

Release 6 maps JR and JR.HB to JALR and JALR.HB with rd = 0:

Pre-Release 6, JR and JALR were distinct instructions, both with primary opcode SPECIAL, but with distinct function codes.

Release 6: JR is defined to be JALR with the destination register specifier rd set to 0. The primary opcode and function field are the same for JR and JALR. The pre-Release 6 instruction encoding for JR is removed in Release 6.

Release 6 assemblers should accept the JR and JR.HB mnemonics, mapping them to the Release 6 instruction encodings.

Operation:

I: temp = GPR[rs]
   GPR[rd] = PC + 8
I+1:if (Config3ISA = 0) and (Config1CA = 0) then 
      PC = temp
   else
      PC = tempGPRLEN-1..1 || 0
      ISAMode = temp0
   endif

Exceptions:

None

Programming Notes:

This jump-and-link register instruction can select a register for the return link; other link instructions use GPR 31.

The default register for GPR rd, if omitted in the assembly language instruction, is GPR 31.

Encoding:

JAL

000011

instr_index

6

26

Format:

JAL target 

MIPS32

Jump and Link

Purpose:

Jump and Link

To execute a procedure call within the current 256MB-aligned region.

Description:

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, at which location execution continues after a procedure call.

This is a PC-region branch (not PC-relative); the effective target address is in the "current" 256MB-aligned region.

The low 28 bits of the target address is the instr_index field shifted left 2bits. The remaining upper bits are the corresponding bits of the address of the instruction in the delay slot (not the branch itself).

Jump to the effective target address. Execute the instruction that follows the jump, in the branch delay slot, before executing the jump itself.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I:   GPR[31] = PC + 8
I+1:  PC = PCGPRLEN-1..28 || instr_index || 02

Exceptions:

None

Programming Notes:

Forming the branch target address by catenating PC and index bits rather than adding a signed offset to the PC is an advantage if all program code addresses fit into a 256MB region aligned on a 256MB boundary. It allows a branch from anywhere in the region to anywhere in the region, an action not allowed by a signed relative offset.

This definition creates the following boundary case: When the branch instruction is in the last word of a 256MB region, it can branch only to the following 256MB region containing the branch delay slot.

The Jump-and-Link instruction has been deprecated in Release 6. Use BALC instead.

Encoding:

JALX

011101

instr_index

6

26

Format:

JALX target

MIPS32 with (microMIPS or MIPS16e), removed in Release 6

Jump and Link Exchange

Purpose:

Jump and Link Exchange

To execute a procedure call within the current 256 MB-aligned region and change the ISA Mode from MIPS64 to microMIPS64 or MIPS16e.

Description:

Place the return address link in GPR 31. The return link is the address of the second instruction following the branch, at which location execution continues after a procedure call. The value stored in GPR 31 bit 0 reflects the current value of the ISA Mode bit.

This is a PC-region branch (not PC-relative); the effective target address is in the "current" 256 MB-aligned region.

The low 28 bits of the target address is the instr_index field shifted left 2 bits. The remaining upper bits are the corresponding bits of the address of the instruction in the delay slot (not the branch itself).

Jump to the effective target address, toggling the ISA Mode bit. Execute the instruction that follows the jump, in the branch delay slot, before executing the jump itself.

Restrictions:

This instruction only supports 32-bit aligned branch target addresses.

Control Transfer Instructions (CTIs) should not be placed in branch delay slots. CTIs include all branches and jumps,

NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Availability and Compatibility:

If the microMIPS base architecture is not implemented and the MIPS16e ASE is not implemented, a Reserved

Instruction exception is initiated.

The JALX instruction has been removed in Release 6 and reused its opcode for DAUI. Pre-Release 6 code using

JALX cannot run on Release 6 by trap-and-emulate. Equivalent functionality is provided by the JIALC instruction added by Release 6.

Operation:

I:    GPR[31] = PC + 8
I+1:  PC = PCGPRLEN-1..28 || instr_index || 02
      ISAMode = (not ISAMode)

Exceptions:

None

Programming Notes:

Forming the branch target address by concatenating PC and index bits rather than adding a signed offset to the PC is an advantage if all program code addresses fit into a 256 MB region aligned on a 256 MB boundary. It allows a branch from anywhere in the region to anywhere in the region, an action not allowed by a signed relative offset.

This definition creates the following boundary case: When the branch instruction is in the last word of a 256 MB region, it can branch only to the following 256 MB region containing the branch delay slot.

Encoding:

POP76

111110

JIALC

00000

rt

offset

6

5

5

16

Format:

JIALC rt, offset

MIPS32 Release 6

Jump Indexed and Link, Compact

Purpose:

Jump Indexed and Link, Compact

Description:

GPR[31] = PC+4, PC =( GPR[rt] + sign_extend( offset ) )

The jump target is formed by sign extending the offset field of the instruction and adding it to the contents of GPR rt.

The offset is NOT shifted, that is, each bit of the offset is added to the corresponding bit of the GPR.

Places the return address link in GPR 31. The return link is the address of the following instruction, where execution continues after a procedure call returns.

For processors that do not implement the MIPS16e or microMIPS ISA:

For processors that do implement the MIPS16e or microMIPS ISA:

Compact jumps do not have delay slots. The instruction after the jump is NOT executed when the jump is executed.

Restrictions:

This instruction is an unconditional, always taken, compact jump, and hence has neither a delay slot nor a forbidden slot. The instruction after the jump is not executed when the jump is executed.

The register specifier may be set to the link register $31, because compact jumps do not have the restartability issues of jumps with delay slots. However, this is not common programming practice.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Release 6 instructions JIALC and BNEZC differ only in the rs field, instruction bits 21-25. JIALC and BNEZC occupy the same encoding as pre-Release 6 instruction encoding SDC2, which is recoded in Release 6.

Exceptions:

None

Operation:

temp = GPR[rt] + sign_extend(offset)
GPR[31] = PC + 4
if (Config3ISA = 0) and (Config1CA = 0) then 
      PC = temp 
   else
      PC = (tempGPRLEN-1..1 || 0)
      ISAMode = temp0
   endif

Programming Notes:

JIALC does NOT shift the offset before adding it the register. This can be used to eliminate tags in the least significant bits that would otherwise produce misalignment. It also allows JIALC to be used as a substitute for the JALX instruction, removed in Release 6, where the lower bits of the target PC, formed by the addition of GPR[rt] and the unshifted offset, specify the target ISAmode.

Encoding:

POP66

110110

JIC

00000

rt

offset

6

5

5

16

Format:

JIC rt, offset 

MIPS32 Release 6

Jump Indexed, Compact

Purpose:

Jump Indexed, Compact

Description:

PC =( GPR[rt] + sign_extend( offset ) )

The branch target is formed by sign extending the offset field of the instruction and adding it to the contents of GPR rt.

The offset is NOT shifted, that is, each bit of the offset is added to the corresponding bit of the GPR.

For processors that do not implement the MIPS16e or microMIPS ISA:

For processors that do implement the MIPS16e or microMIPS ISA:

Compact jumps do not have a delay slot. The instruction after the jump is NOT executed when the jump is executed.

Restrictions:

This instruction is an unconditional, always taken, compact jump, and hence has neither a delay slot nor a forbidden slot. The instruction after the jump is not executed when the jump is executed.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Release 6 instructions JIC and BEQZC differ only in the rs field. JIC and BEQZC occupy the same encoding as preRelease 6 instruction LDC2, which is recoded in Release 6.

Exceptions:

None

Operation:

temp = GPR[rt] + sign_extend(offset)
if (Config3ISA = 0) and (Config1CA = 0) then 
      PC = temp 
   else
      PC = (tempGPRLEN-1..1 || 0)
      ISAMode = temp0
   endif

Programming Notes:

JIC does NOT shift the offset before adding it the register. This can be used to eliminate tags in the least significant bits that would otherwise produce misalignment. It also allows JIALC to be used as a substitute for the JALX instruction, removed in Release 6, where the lower bits of the target PC, formed by the addition of GPR[rt] and the unshifted offset, specify the target ISAmode.

Encoding:

pre-Release 6:

SPECIAL

000000

rs

0

00 0000 0000

1

Any other

legal hint

value

JR

001000

6

5

10

1

4

6

Release 6:

SPECIAL

000000

rs

0

00000

0

00000

1

Any other

legal hint

value

JALR

001001

6

5

5

5

1

4

6

Format:

JR.HB rs

MIPS32 Release 2

Jump Register with Hazard Barrier

Purpose:

Jump Register with Hazard Barrier

To execute a branch to an instruction address in a register and clear all execution and instruction hazards.

Description:

 PC = GPR[rs], clear execution and instruction hazards

Jump to the effective target address in GPR rs. Execute the instruction following the jump, in the branch delay slot, before jumping.

For processors that do not implement the MIPS16e or microMIPS ISA:

For processors that do implement the MIPS16e or microMIPS ISA:

Error exception will occur when the target instruction is fetched.

JR.HB implements a software barrier that resolves all execution and instruction hazards created by Coprocessor 0 state changes (for Release 2 implementations, refer to the SYNCI instruction for additional information on resolving instruction hazards created by writing the instruction stream). The effects of this barrier are seen starting with the instruction fetch and decode of the instruction at the PC to which the JR.HB instruction jumps. An equivalent barrier is also implemented by the ERET instruction, but that instruction is only available if access to Coprocessor 0 is enabled, whereas JR.HB is legal in all operating modes.

This instruction clears both execution and instruction hazards. Refer to the EHB instruction description for the method of clearing execution hazards alone.

JR.HB uses bit 10 of the instruction (the upper bit of the hint field) to denote the hazard barrier operation.

Restrictions:

JR.HB does not clear hazards created by any instruction that is executed in the delay slot of the JR.HB. Only hazards created by instructions executed before the JR.HB are cleared by the JR.HB.

After modifying an instruction stream mapping or writing to the instruction stream, execution of those instructions has UNPREDICTABLE behavior until the hazard has been cleared with JALR.HB, JR.HB, ERET, or DERET. Further, the operation is UNPREDICTABLE if the mapping of the current instruction stream is modified.

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Restrictions Related to Multiple Instruction Sets: This instruction can change the active instruction set, if more than

one instruction set is implemented.

If only one instruction set is implemented, then the effective target address must obey the alignment rules of the instruction set. If multiple instruction sets are implemented, the effective target address must obey the alignment rules of the intended instruction set of the target address as specified by the bit 0 or GPR rs.

For processors that do not implement the microMIPS ISA, the effective target address in GPR rs must be naturallyaligned. For processors that do not implement the MIPS16 ASE or microMIPS ISA, if either of the two least-significant bits are not zero, an Address Error exception occurs when the branch target is subsequently fetched as an instruction.

For processors that do implement the MIPS16 ASE or microMIPS ISA, if bit 0 is zero and bit 1 is one, an Address

Error exception occurs when the jump target is subsequently fetched as an instruction.

Availability and Compatibility:

Release 6 maps JR and JR.HB to JALR and JALR.HB with rd = 0:

Pre-Release 6, JR.HB and JALR.HB were distinct instructions, both with primary opcode SPECIAL, but with distinct function codes.

Release 6: JR.HB is defined to be JALR.HB with the destination register specifier rd set to 0. The primary opcode and function field are the same for JR.HB and JALR.HB. The pre-Release 6 instruction encoding for JR.HB is removed in

Release 6.

Release 6 assemblers should accept the JR and JR.HB mnemonics, mapping them to the Release 6 instruction encodings.

Operation:

I: temp = GPR[rs]
I+1:if (Config3ISA = 0) and (Config1CA = 0) then
      PC = temp
   else
      PC = tempGPRLEN-1..1 || 0
      ISAMode = temp0
   endif
   ClearHazards()

Exceptions:

None

Programming Notes:

This instruction implements the final step in clearing execution and instruction hazards before execution continues. A hazard is created when a Coprocessor 0 or TLB write affects execution or the mapping of the instruction stream, or after a write to the instruction stream. When such a situation exists, software must explicitly indicate to hardware that the hazard should be cleared. Execution hazards alone can be cleared with the EHB instruction. Instruction hazards can only be cleared with a JR.HB, JALR.HB, or ERET instruction. These instructions cause hardware to clear the hazard before the instruction at the target of the jump is fetched. Note that because these instructions are encoded as jumps, the process of clearing an instruction hazard can often be included as part of a call (JALR) or return (JR) sequence, by simply replacing the original instructions with the HB equivalent.

Example: Clearing hazards due to an ASID change

/*
 * Routine called to modify ASID and return with the new
 * mapping established.
 *
 * a0 = New ASID to establish
 */
   mfc0  v0, C0_EntryHi      /* Read current ASID */
   li    v1, ~M_EntryHiASID  /* Get negative mask for field */
   and   v0, v0, v1          /* Clear out current ASID value */
   or    v0, v0, a0          /* OR in new ASID value */
   mtc0  v0, C0_EntryHi      /* Rewrite EntryHi with new ASID */
   jr.hb ra                 /* Return, clearing the hazard */
   nop

Example: Making a write to the instruction stream visible

/*
 * Routine called after new instructions are written to
 * make them visible and return with the hazards cleared.
 */
   {Synchronize the caches - see the SYNCI and CACHE instructions}
   sync                      /* Force memory synchronization */
   jr.hb ra                 /* Return, clearing the hazard */
   nop

Example: Clearing instruction hazards in-line

   la    AT, 10f
   jr.hb AT                 /* Jump to next instruction, clearing */
   nop                       /*   hazards */
10:

Encoding:

pre-Release 6:

SPECIAL

000000

rs

0

00 0000 0000

hint

JR

001000

6

5

10

5

6

Release 6:

SPECIAL

000000

rs

0

00000

00000

hint

JALR

001001

6

5

5

5

5

6

Format:

JR rs

MIPS32

Jump Register

Purpose:

Jump Register

To execute a branch to an instruction address in a register

Description:

 PC = GPR[rs]

Jump to the effective target address in GPR rs. Execute the instruction following the jump, in the branch delay slot, before jumping.

For processors that do not implement the MIPS16e or microMIPS ISA:

For processors that do implement the MIPS16e or microMIPS ISA:

Error exception will occur when the target instruction is fetched.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Restrictions Related to Multiple Instruction Sets: This instruction can change the active instruction set, if more than

one instruction set is implemented.

If only one instruction set is implemented, then the effective target address must obey the alignment rules of the instruction set. If multiple instruction sets are implemented, the effective target address must obey the alignment rules of the intended instruction set of the target address as specified by the bit 0 or GPR rs.

For processors that do not implement the microMIPS ISA, the effective target address in GPR rs must be naturallyaligned. For processors that do not implement the MIPS16e ASE or microMIPS ISA, if either of the two least-significant bits are not zero, an Address Error exception occurs when the branch target is subsequently fetched as an instruction.

For processors that do implement the MIPS16e ASE or microMIPS ISA, if bit 0 is zero and bit 1 is one, an Address

Error exception occurs when the jump target is subsequently fetched as an instruction.

In release 1 of the architecture, the only defined hint field value is 0, which sets default handling of JR. In Release 2 of the architecture, bit 10 of the hint field is used to encode an instruction hazard barrier. See the JR.HB instruction description for additional information.

Availability and Compatibility:

Release 6 maps JR and JR.HB to JALR and JALR.HB with rd = 0:

Pre-Release 6, JR and JALR were distinct instructions, both with primary opcode SPECIAL, but with distinct function codes.

Release 6: JR is defined to be JALR with the destination register specifier rd set to 0. The primary opcode and function field are the same for JR and JALR. The pre-Release 6 instruction encoding for JR is removed in Release 6.

Release 6 assemblers should accept the JR and JR.HB mnemonics, mapping them to the Release 6 instruction encodings.

Operation:

I: temp = GPR[rs]
I+1:if (Config3ISA = 0) and (Config1CA = 0) then
      PC = temp
   else
      PC = tempGPRLEN-1..1 || 0
      ISAMode = temp0
   endif

Exceptions:

None

Programming Notes:

Software should use the value 31 for the rs field of the instruction word on return from a JAL, JALR, or BGEZAL, and should use a value other than 31 for remaining uses of JR.

Encoding:

J

000010

instr_index

6

26

Format:

J target

MIPS32

Jump

Purpose:

Jump

To branch within the current 256 MB-aligned region.

Description:

This is a PC-region branch (not PC-relative); the effective target address is in the "current" 256 MB-aligned region.

The low 28 bits of the target address is the instr_index field shifted left 2bits. The remaining upper bits are the corresponding bits of the address of the instruction in the delay slot (not the branch itself).

Jump to the effective target address. Execute the instruction that follows the jump, in the branch delay slot, before executing the jump itself.

Restrictions:

Control Transfer Instructions (CTIs) should not be placed in branch delay slots or Release 6 forbidden slots. CTIs

include all branches and jumps, NAL, ERET, ERETNC, DERET, WAIT, and PAUSE.

Pre-Release 6: Processor operation is UNPREDICTABLE if a control transfer instruction (CTI) is placed in the delay slot of a branch or jump.

Release 6: If a control transfer instruction (CTI) is executed in the delay slot of a branch or jump, Release 6 implementations are required to signal a Reserved Instruction exception.

Operation:

I: 
I+1: PC = PCGPRLEN-1..28 || instr_index || 02

Exceptions:

None

Programming Notes:

Forming the branch target address by catenating PC and index bits rather than adding a signed offset to the PC is an advantage if all program code addresses fit into a 256MB region aligned on a 256MB boundary. It allows a branch from anywhere in the region to anywhere in the region, an action not allowed by a signed relative offset.

This definition creates the following boundary case: When the jump instruction is in the last word of a 256MB region, it can branch only to the following 256MB region containing the branch delay slot.

The Jump instruction has been deprecated in Release 6. Use BC instead.

Encoding:

SPECIAL3

011111

base

rt

offset

0

LBE

101100

6

5

5

9

1

6

Format:

LBE rt, offset(base)

MIPS32

Load Byte EVA

Purpose:

Load Byte EVA

To load a byte as a signed value from user mode virtual address space when executing in kernel mode.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 8-bit byte at the memory location specified by the effective address are fetched, sign-extended, and placed in GPR rt. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The LBE instruction functions the same as the LB instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode and executing in kernel mode. Memory segments using UUSK or MUSK access modes are also accessible. Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Restrictions:

Only usable when access to Coprocessor0 is enabled and accessing an address within a segment configured using

UUSK, MUSK or MUSUK access mode.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
memdoubleword = LoadMemory (CCA, BYTE, pAddr, vAddr, DATA)
byte = vAddr2..0 xor BigEndianCPU3
GPR[rt] = sign_extend(memdoubleword7+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid

Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

LB

100000

base

rt

offset

6

5

5

16

Format:

LB rt, offset(base)

MIPS32

Load Byte

Purpose:

Load Byte

To load a byte from memory as a signed value.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 8-bit byte at the memory location specified by the effective address are fetched, sign-extended, and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

None

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
memdoubleword = LoadMemory (CCA, BYTE, pAddr, vAddr, DATA)
byte = vAddr2..0 xor BigEndianCPU3
GPR[rt] = sign_extend(memdoubleword7+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Watch

Encoding:

SPECIAL3

011111

base

rt

offset

0

LBUE

101000

6

5

5

9

1

6

Format:

LBUE rt, offset(base)

MIPS32

Load Byte Unsigned EVA

Purpose:

Load Byte Unsigned EVA

To load a byte as an unsigned value from user mode virtual address space when executing in kernel mode.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 8-bit byte at the memory location specified by the effective address are fetched, zero-extended, and placed in GPR rt. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The LBUE instruction functions the same as the LBU instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Restrictions:

Only usable when access to Coprocessor0 is enabled and accessing an address within a segment configured using

UUSK, MUSK or MUSUK access mode.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
memdoubleword = LoadMemory (CCA, BYTE, pAddr, vAddr, DATA)
byte = vAddr2..0 xor BigEndianCPU3
GPR[rt] = zero_extend(memdoubleword7+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

LBU

100100

base

rt

offset

6

5

5

16

Format:

LBU rt, offset(base)

MIPS32

Load Byte Unsigned

Purpose:

Load Byte Unsigned

To load a byte from memory as an unsigned value

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 8-bit byte at the memory location specified by the effective address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

None

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
memdoubleword = LoadMemory (CCA, BYTE, pAddr, vAddr, DATA)
byte = vAddr2..0 xor BigEndianCPU3
GPR[rt] = zero_extend(memdoubleword7+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Watch

Encoding:

LDC1

110101

base

ft

offset

6

5

5

16

Format:

LDC1 ft, offset(base)

MIPS32

Load Doubleword to Floating Point

Purpose:

Load Doubleword to Floating Point

To load a doubleword from memory to an FPR.

Description:

 FPR[ft] = memory[GPR[base] + offset]

The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed in FPR ft. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress2..0 != 0 (not doubleword-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword = LoadMemory(CCA, DOUBLEWORD, pAddr, vAddr, DATA)
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
memlsw = LoadMemory(CCA, WORD, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
memmsw = LoadMemory(CCA, WORD, pAddr, vAddr+4, DATA)
memdoubleword = memmsw || memlsw
StoreFPR(ft, UNINTERPRETED_DOUBLEWORD, memdoubleword)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, Address Error, Watch

Encoding:

pre-Release 6

LDC2

110110

base

rt

offset

6

5

5

16

Release 6

COP2

010010

LDC2

01110

rt

base

offset

6

5

5

5

11

Format:

LDC2 rt, offset(base)

MIPS32

Load Doubleword to Coprocessor 2

Purpose:

Load Doubleword to Coprocessor 2

To load a doubleword from memory to a Coprocessor 2 register.

Description:

 CPR[2,rt,0] = memory[GPR[base] + offset]

The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed in Coprocessor 2 register rt. The signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress2..0 != 0 (not doubleword-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword = LoadMemory(CCA, DOUBLEWORD, pAddr, vAddr, DATA)
CPR[2,rt,0] = memdoubleword
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
memlsw = LoadMemory(CCA, WORD, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
memmsw = LoadMemory(CCA, WORD, pAddr, vAddr+4, DATA)
=memlsw
memmsw

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, Address Error, Watch

Programming Notes:

There is no Load Linked Word Unsigned operation corresponding to Load Word Unsigned.

Release 6 implements a 9-bit offset, whereas all release levels lower than Release 6 implement a 16-bit offset.

Programming Notes:

As shown in the instruction drawing above, Release 6 implements an 11-bit offset, whereas all release levels lower than Release 6 of the MIPS architecture implement a 16-bit offset.

Encoding:

LDL

011010

base

rt

offset

6

5

5

16

Format:

LDL rt, offset(base)

MIPS64, removed in Release 6

Load Doubleword Left

Purpose:

Load Doubleword Left

To load the most-significant part of a doubleword from an unaligned memory address

Description:

 GPR[rt] = GPR[rt] MERGE memory[GPR[base] + offset]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 8 consecutive bytes forming a doubleword (DW) in memory, starting at an arbitrary byte boundary.

A part of DW, the most-significant 1 to 8 bytes, is in the aligned doubleword containing EffAddr. This part of DW is loaded appropriately into the most-significant (left) part of GPR rt, leaving the remainder of GPR rt unchanged.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
if BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
byte = vAddr2..0 xor BigEndianCPU3
memdoubleword = LoadMemory (CCA, byte, pAddr, vAddr, DATA)
GPR[rt] = memdoublworde7+8*byte..0 || GPR[rt]55–8*byte..0

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Reserved Instruction, Watch

Encoding:

PCREL

111011

rs

LDPC

110

offset

6

5

3

18

Format:

LDPC rs, offset

MIPS64 Release 6

Load Doubleword PC-relative

Purpose:

Load Doubleword PC-relative

To load a doubleword from memory, using a PC-relative address.

Description:

GPR[rs] = memory[ (PC&~0x7) + sign_extend( offset << 3) ]

The bit offset is shifted left by 3 bits, sign-extended, and added to the address of the aligned doubleword containing the LDPC instruction.

The contents of the 64-bit doubleword at the memory location specified by the effective address are fetched, and placed in GPR rs.

Restrictions:

LDPC is naturally aligned, by specification.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Operation

vAddr = ( (PC&~0x7)+ sign_extend(offset) )
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword = LoadMemory (CCA, DOUBLEWORD, pAddr, vAddr, DATA)
GPR[rs] = memdoubleword

Exceptions:

TLB Refill, TLB Invalid, TLB Read Inhibit, Bus Error, Address Error, Watch, Reserved Instruction

Programming Note

The Release 6 PC-relative loads (LWPC, LWUPC, LDPC) are considered data references.

For the purposes of watchpoints (provided by the CP0 WatchHi and WatchLo registers) and EJTAG breakpoints, the

PC-relative reference is considered to be a data reference, rather than an instruction reference. That is, the watchpoint or breakpoint is triggered only if enabled for data references.

Encoding:

LDR

011011

base

rt

offset

6

5

5

16

Format:

LDR rt, offset(base)

MIPS64, removed in Release 6

Load Doubleword Right

Purpose:

Load Doubleword Right

To load the least-significant part of a doubleword from an unaligned memory address

Description:

 GPR[rt] = GPR[rt] MERGE memory[GPR[base] + offset]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 8 consecutive bytes forming a doubleword (DW) in memory, starting at an arbitrary byte boundary.

A part of DW, the most-significant 1 to 8 bytes, is in the aligned doubleword containing EffAddr. This part of DW is loaded appropriately into the most-significant (left) part of GPR rt, leaving the remainder of GPR rt unchanged.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
if BigEndianMem = 1 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
byte = vAddr2..0 xor BigEndianCPU3
memdoubleword = LoadMemory (CCA, byte, pAddr, vAddr, DATA)
GPR[rt] = GPR[rt]63..64-8*byte || memdoublworde63..8*byte

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Reserved Instruction, Watch

Encoding:

LD

110111

base

rt

offset

6

5

5

16

MIPS64

Format:

LD rt, offset(base)

Load Doubleword

Purpose:

Load Doubleword

To load a doubleword from memory

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If any of the 3 least-significant bits of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword = LoadMemory (CCA, DOUBLEWORD, pAddr, vAddr, DATA)
GPR[rt] = memdoubleword

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Reserved Instruction, Watch

Encoding:

COP1X

010011

base

index

0

00000

fd

LDXC1

000001

6

5

5

5

5

6

Format:

LDXC1 fd, index(base)

MIPS64,MIPS32 Release 2, removed in Release 6

Load Doubleword Indexed to Floating Point

Purpose:

Load Doubleword Indexed to Floating Point

To load a doubleword from memory to an FPR (GPR+GPR addressing)

Description:

 FPR[fd] = memory[GPR[base] + GPR[index]]

The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed in FPR fd. The contents of GPR index and GPR base are added to form the effective address.

Restrictions:

An Address Error exception occurs if EffectiveAddress2..0 != 0 (not doubleword-aligned). Availability and Compatibility:

This instruction has been removed in Release 6.

Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32 Release 1. Required in

MIPS32 Release 2 and all subsequent versions of MIPS32. When required, required whenever FPU is present, whether a 32-bit or 64-bit FPU, whether in 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1, StatusFR=0 or 1).

Operation:

vAddr = GPR[base] + GPR[index]
if vAddr2..0 != 03 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword = LoadMemory(CCA, DOUBLEWORD, pAddr, vAddr, DATA)
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
memlsw = LoadMemory(CCA, WORD, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
memmsw = LoadMemory(CCA, WORD, pAddr, vAddr+4, DATA)
memdoubleword = memmsw || memlsw
StoreFPR(fd, UNINTERPRETED_DOUBLEWORD, memdoubleword)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Reserved Instruction, Coprocessor Unusable, Watch

Encoding:

SPECIAL3

011111

base

rt

offset

0

LHE

101101

6

5

5

9

1

6

Format:

LHE rt, offset(base)

MIPS32

Load Halfword EVA

Purpose:

Load Halfword EVA

To load a halfword as a signed value from user mode virtual address space when executing in kernel mode.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 16-bit halfword at the memory location specified by the aligned effective address are fetched, sign-extended, and placed in GPR rt. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The LHE instruction functions the same as the LH instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Restrictions:

Only usable when access to Coprocessor0 is enabled and accessing an address within a segment configured using

UUSK, MUSK or MUSUK access mode.

Pre-Release 6: The effective address must be naturally-aligned. If the least-significant bit of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian2 || 0))
memdoubleword = LoadMemory (CCA, HALFWORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU2 || 0)
GPR[rt] = sign_extend(memdoubleword15+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error

Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

LH

100001

base

rt

offset

6

5

5

16

Format:

LH rt, offset(base)

MIPS32

Load Halfword

Purpose:

Load Halfword

To load a halfword from memory as a signed value

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 16-bit halfword at the memory location specified by the aligned effective address are fetched, sign-extended, and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If the least-significant bit of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian2 || 0))
memdoubleword = LoadMemory (CCA, HALFWORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU2 || 0)
GPR[rt] = sign_extend(memdoubleword15+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch

Encoding:

SPECIAL3

011111

base

rt

offset

0

LHUE

101001

6

5

5

9

1

6

Format:

LHUE rt, offset(base)

MIPS32

Load Halfword Unsigned EVA

Purpose:

Load Halfword Unsigned EVA

To load a halfword as an unsigned value from user mode virtual address space when executing in kernel mode.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 16-bit halfword at the memory location specified by the aligned effective address are fetched, zero-extended, and placed in GPR rt. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The LHUE instruction functions the same as the LHU instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Restrictions:

Only usable when access to Coprocessor0 is enabled and accessing an address within a segment configured using

UUSK, MUSK or MUSUK access mode.

Pre-Release 6: The effective address must be naturally-aligned. If the least-significant bit of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian2 || 0))
memdoubleword = LoadMemory (CCA, HALFWORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU2 || 0)
GPR[rt] = zero_extend(memdoubleword15+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

LHU

100101

base

rt

offset

6

5

5

16

Format:

LHU rt, offset(base)

MIPS32

Load Halfword Unsigned

Purpose:

Load Halfword Unsigned

To load a halfword from memory as an unsigned value

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 16-bit halfword at the memory location specified by the aligned effective address are fetched, zero-extended, and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If the least-significant bit of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian2 || 0))
memdoubleword = LoadMemory (CCA, HALFWORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU2 || 0)
GPR[rt] = zero_extend(memdoubleword15+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Watch

Encoding:

SPECIAL3

011111

base

rt

rd

0

0000

1

LLD

110111

6

5

5

5

4

1

6

Format:

LLDP rt, rd, (base)

MIPS64 Release 6

Load Linked DoubleWord Paired

Purpose:

Load Linked DoubleWord Paired

To load two double-words from memory for an atomic read-modify-write, writing a double-word each to two registers.

Description:

 GPR[rd] = memory[GPR[base]]127..64, GPR[rt] = memory[GPR[base]]63..0

The LLDP and SCDP instructions provide primitives to implement a paired double-word atomic read-modify-write

(RMW) operation at a synchronizable memory location.

The paired double-word at the memory location specified by the quad-word aligned effective address is read in a single atomic memory operation. The least significant double-word is written into GPR rt. The most significant doubleword is written into GPR rd.

A paired double-word read or write occurs as a pair of double-word reads or writes that is quad-word atomic.

The instruction has no offset. The effective address is equal to the contents of GPR base.

The execution of LLDP begins a RMW sequence on the current processor. There can be only one active RMW sequence per processor. When an LLDP is executed it starts an active RMW sequence replacing any other sequence that was active. The RMW sequence is completed by a subsequent SCDP instruction that either completes the RMW sequence atomically and succeeds, or does not and fails.

Successful execution of the LLDP results in setting LLbit and writing CP0 LLAddr, where LLbit is the least-significant bit of LLAddr. LLAddr contains the data-type aligned address of the operation, in this case a quad-word aligned address.

Executing LLDP on one processor does not cause an action that, by itself, causes a store conditional instruction type for the same block to fail on another processor.

An execution of LLDP does not have to be followed by execution of SCDP; a program is free to abandon the RMW sequence without attempting a write.

Restrictions:

The addressed location must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result is UNPREDICTABLE. Which storage is synchronizable is a function of both CPU and system implementations. See the documentation of the SC instruction for the formal definition.

The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.

Providing misaligned support is not a requirement for this instruction.

Availability and Compatibility

This instruction is introduced by Release 6. It is only present if Config5XNP=0.

Operation:

vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
// PAIREDDOUBLEWORD: two double-word data-type that is quad-word atomic
memquadword = LoadMemory (CCA, PAIREDDOUBLEWORD, pAddr, vAddr, DATA)
GPR[rt] = memquadword63..0
GPR[rd] = memquadword127..64
LLAddr = pAddr // quad-word aligned i.e., pAddr3..0 are 0, or not supported.
LLbit = 1

Exceptions:

TLB Refill, TLB Invalid, Reserved Instruction, Address Error, Watch

Programming Notes:

An LLDP instruction for which the two destination registers are the same but non-zero is UNPREDICTABLE. An

LLDP with two zero destination registers followed by a SCDP can be used to accomplish a quad-word atomic write.

Encoding:

MIPS64 pre-Release 6

LLD

110100

base

rt

offset

6

5

5

16

MIPS64 Release 6

SPECIAL3

011111

base

rt

offset

0

LLD

110111

6

5

5

9

1

6

Format:

LLD rt, offset(base)

MIPS64

Load Linked Doubleword

Purpose:

Load Linked Doubleword

To load a doubleword from memory for an atomic read-modify-write

Description:

 GPR[rt] = memory[GPR[base] + offset]

The LLD and SCD instructions provide primitives to implement atomic read-modify-write (RMW) operations for synchronizable memory locations.

The contents of the 64-bit doubleword at the memory location specified by the aligned effective address are fetched and placed into GPR rt. The signed offset is added to the contents of GPR base to form an effective address.

This begins a RMW sequence on the current processor. There can be only one active RMW sequence per processor.

When an LLD is executed it starts the active RMW sequence and replaces any other sequence that was active. The

RMW sequence is completed by a subsequent SCD instruction that either completes the RMW sequence atomically and succeeds, or does not complete and fails.

Executing LLD on one processor does not cause an action that, by itself, would cause an SCD for the same block to fail on another processor.

An execution of LLD does not have to be followed by execution of SCD; a program is free to abandon the RMW sequence without attempting a write.

Restrictions:

The addressed location must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result in UNPREDICTABLE. Which storage is synchronizable is a function of both CPU and system implementations. See the documentation of the SCD instruction for the formal definition.

The effective address must be naturally-aligned. If any of the 3 least-significant bits of the effective address is nonzero, an Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Availability and Compatibility:

This instruction has been reallocated an opcode in Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr2..0 != 03 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword = LoadMemory (CCA, DOUBLEWORD, pAddr, vAddr, DATA)
GPR[rt] = memdoubleword
LLbit = 1

Exceptions:

TLB Refill, TLB Invalid, Address Error, Reserved Instruction, Watch

Encoding:

SPECIAL3

011111

base

rt

offset

0

LLE

101110

6

5

5

9

1

6

Format:

LLE rt, offset(base)

MIPS32

Load Linked Word EVA

Purpose:

Load Linked Word EVA

To load a word from a user mode virtual address when executing in kernel mode for an atomic read-modify-write

Description:

 GPR[rt] = memory[GPR[base] + offset]

The LLE and SCE instructions provide the primitives to implement atomic read-modify-write (RMW) operations for synchronizable memory locations using user mode virtual addresses while executing in kernel mode.

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, signextended to the GPR register length, and written into GPR rt. The 9-bit signed offset is added to the contents of

GPR base to form an effective address.

This begins a RMW sequence on the current processor. There can be only one active RMW sequence per processor.

When an LLE is executed it starts an active RMW sequence replacing any other sequence that was active. The RMW sequence is completed by a subsequent SCE instruction that either completes the RMW sequence atomically and succeeds, or does not and fails.

Executing LLE on one processor does not cause an action that, by itself, causes an SCE for the same block to fail on another processor.

An execution of LLE does not have to be followed by execution of SCE; a program is free to abandon the RMW sequence without attempting a write.

The LLE instruction functions the same as the LL instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Segmentation Control for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Restrictions:

The addressed location must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result is UNPREDICTABLE. Which storage is synchronizable is a function of both CPU and system implementations. See the documentation of the SCE instruction for the formal definition.

The effective address must be naturally-aligned. If either of the 2 least-significant bits of the effective address is nonzero, an Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr1..0 != 02 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU || 02)
GPR[rt] = sign_extend(memdoubleword31+8*byte..8*byte)
LLbit = 1

Exceptions:

TLB Refill, TLB Invalid, Address Error, Reserved Instruction, Watch, Coprocessor Unusable

Programming Notes:

There is no Load Linked Word Unsigned operation corresponding to Load Word Unsigned.

Encoding:

pre-Release 6

LL

110000

base

rt

offset

6

5

5

16

Release 6

SPECIAL3

011111

base

rt

offset

0

LL

110110

6

5

5

9

1

6

Format:

LL rt, offset(base)

MIPS32

Load Linked Word

Purpose:

Load Linked Word

To load a word from memory for an atomic read-modify-write

Description:

 GPR[rt] = memory[GPR[base] + offset]

The LL and SC instructions provide the primitives to implement atomic read-modify-write (RMW) operations for synchronizable memory locations.

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, signextended to the GPR register length, and written into GPR rt. The signed offset is added to the contents of GPR base to form an effective address.

This begins a RMW sequence on the current processor. There can be only one active RMW sequence per processor.

When an LL is executed it starts an active RMW sequence replacing any other sequence that was active. The RMW sequence is completed by a subsequent SC instruction that either completes the RMW sequence atomically and succeeds, or does not and fails.

Executing LL on one processor does not cause an action that, by itself, causes an SC for the same block to fail on another processor.

An execution of LL does not have to be followed by execution of SC; a program is free to abandon the RMW sequence without attempting a write.

Restrictions:

The addressed location must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result is UNPREDICTABLE. Which storage is synchronizable is a function of both CPU and system implementations. See the documentation of the SC instruction for the formal definition.

The effective address must be naturally-aligned. If either of the 2 least-significant bits of the effective address is nonzero, an Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Availability and Compatibility:

This instruction has been reallocated an opcode in Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr1..0 != 02 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU || 02)
GPR[rt] = sign_extend(memdoubleword31+8*byte..8*byte)
LLbit = 1

Exceptions:

TLB Refill, TLB Invalid, Address Error, Watch

Programming Notes:

There is no Load Linked Word Unsigned operation corresponding to Load Word Unsigned.

Release 6 implements a 9-bit offset, whereas all release levels lower than Release 6 implement a 16-bit offset.

Encoding:

SPECIAL3

011111

base

rt

rd

0

0000

1

LLE

101110

6

5

5

5

4

1

6

Format:

LLWPE rt, rd, (base)

MIPS32 Release 6

Load Linked Word Paired EVA

Purpose:

Load Linked Word Paired EVA

To load two words from memory for an atomic read-modify-write, writing a word each to two registers. The load occurs in kernel mode from user virtual address space.

Description:

 GPR[rd] = memory[GPR[base]]63..32, GPR[rt] = memory[GPR[base]]31..0

The LLWPE and SCWPE instructions provide primitives to implement a paired word atomic read-modify-write

(RMW) operation at a synchronizable memory location.

The 64-bit paired word at the memory location specified by the double-word aligned effective address is read. The least significant word, sign-extended to the GPR register length, is written into GPR rt. The most significant word, sign-extended to the GPR register length, is written into GPR rd.

A paired word read or write occurs as a pair of word reads or writes that is double-word atomic.

The instruction has no offset. The effective address is equal to the contents of GPR base.

The execution of LLWPE begins a RMW sequence on the current processor. There can be only one active RMW sequence per processor. When an LLWPE is executed it starts an active RMW sequence replacing any other sequence that was active. The RMW sequence is completed by a subsequent SCWPE instruction that either completes the

RMW sequence atomically and succeeds, or does not and fails.

Successful execution of the LLWPE results in setting LLbit and writing CP0 LLAddr, where LLbit is the least-significant bit of LLAddr. LLAddr contains the data-type aligned address of the operation, in this case a double-word aligned address.

The LLWPE instruction functions the same as the LLWP instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible. Refer to Volume III, Segmentation Control for additional information.

Executing LLWPE on one processor does not cause an action that, by itself, causes a store conditional instruction type for the same block to fail on another processor.

An execution of LLWPE does not have to be followed by execution of SCWPE; a program is free to abandon the

RMW sequence without attempting a write.

Restrictions:

The addressed location must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result is UNPREDICTABLE. Which storage is synchronizable is a function of both CPU and system implementations. See the documentation of the SC instruction for the formal definition.

The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.

Providing misaligned support is not a requirement for this instruction.

Availability and Compatibility

This instruction is introduced by Release 6. It is only present if Config5XNP=0 and Config5EVA=1.

Operation:

vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
// PAIREDWORD: two word data-type that is double-word atomic
memdoubleword = LoadMemory (CCA, PAIREDWORD, pAddr, vAddr, DATA)
GPR[rt] = sign_extend(memdoubleword31..0)
GPR[rd] = sign_extend(memdoubleword63..32)
LLAddr = pAddr // double-word aligned i.e., pAddr2..0 are 0, or not supported.
LLbit = 1

Exceptions:

TLB Refill, TLB Invalid, Reserved Instruction, Address Error, Watch, Coprocessor Unusable.

Programming Notes:

An LLWPE instruction for which the two destination registers are the same but non-zero is UNPREDICTABLE. An

LLWPE with two zero destination registers followed by a SCWPE can be used to accomplish a double-word atomic write.

Encoding:

SPECIAL3

011111

base

rt

rd

0

0000

1

LL

110110

6

5

5

5

4

1

6

Format:

LLWP rt, rd, (base)

MIPS32 Release 6

Load Linked Word Paired

Purpose:

Load Linked Word Paired

To load two words from memory for an atomic read-modify-write, writing a word each to two registers.

Description:

 GPR[rd] = memory[GPR[base]]63..32, GPR[rt] = memory[GPR[base]]31..0

The LLWP and SCWP instructions provide primitives to implement a paired word atomic read-modify-write (RMW) operation at a synchronizable memory location.

The 64-bit paired word, as a concatenation of two words, at the memory location specified by the double-word aligned effective address is read. The least significant word, sign-extended to the GPR register length, is written into

GPR rt,and the most significant word, sign-extended to the GPR register length, is written into GPR rd.

A paired word read or write occurs as a pair of word reads or writes that is double-word atomic.

The instruction has no offset. The effective address is equal to the contents of GPR base.

The execution of LLWP begins a RMW sequence on the current processor. There can be only one active RMW sequence per processor. When an LLWP is executed it starts an active RMW sequence replacing any other sequence that was active. The RMW sequence is completed by a subsequent SCWP instruction that either completes the RMW sequence atomically and succeeds, or does not and fails.

Successful execution of the LLWP results in setting LLbit and writing CP0 LLAddr, where LLbit is the least-significant bit of LLAddr. LLAddr contains the data-type aligned address of the operation, in this case a double-word.

Executing LLWP on one processor does not cause an action that, by itself, causes a store conditional instruction type for the same block to fail on another processor.

An execution of LLWP does not have to be followed by execution of SCWP; a program is free to abandon the RMW sequence without attempting a write.

Restrictions:

The addressed location must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result is UNPREDICTABLE. Which storage is synchronizable is a function of both CPU and system implementations. See the documentation of the SC instruction for the formal definition.

The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.

Providing misaligned support is not a requirement for this instruction.

Availability and Compatibility

This instruction is introduced by Release 6. It is only present if Config5XNP=0.

Operation:

vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
// PAIREDWORD: two word data-type that is double-word atomic
memdoubleword = LoadMemory (CCA, PAIREDWORD, pAddr, vAddr, DATA)
GPR[rt] = sign_extend(memdoubleword31..0)
GPR[rd] = sign_extend(memdoubleword63..32)
LLAddr = pAddr // double-word aligned i.e., pAddr2..0 are 0, or not supported.
LLbit = 1

Exceptions:

TLB Refill, TLB Invalid, Reserved Instruction, Address Error, Watch

Programming Notes:

An LLWP instruction for which the two destination registers are the same but non-zero is UNPREDICTABLE. An

LLWP with two zero destination registers followed by a SCWP can be used to accomplish a double-word atomic write.

Encoding:

SPECIAL

000000

rs

rt

rd

000

sa

LSA

000101

SPECIAL

000000

rs

rt

rd

000

sa

DLSA

010101

6

5

5

5

3

2

6

Format:

LSA DLSA 

Load Scaled Address, Doubleword Load Scaled Address

LSA rd,rs,rt,sa 

MIPS32 Release 6

Load Scaled Address, Doubleword Load Scaled Address

DLSA rd,rs,rt,sa

MIPS64 Release 6

Load Scaled Address, Doubleword Load Scaled Address

Purpose:

Load Scaled Address, Doubleword Load Scaled Address

Description:

LSA:  GPR[rd] = sign_extend.32( (GPR[rs] << (sa+1)) + GPR[rt] )
DLSA: GPR[rd] = (GPR[rs] << (sa+1)) + GPR[rt]

LSA adds two values derived from registers rs and rt, with a scaling shift on rs. The scaling shift is formed by adding 1 to the 2-bit sa field, which is interpreted as unsigned. The scaling left shift varies from 1 to 5, corresponding to multiplicative scaling values of <=2, <=4, <=8, <=16, bytes, or 16, 32, 64, or 128 bits.

LSA is a MIPS32 compatible instruction, sign extending its result from bit 31 to bit 63.

DLSA is a MIPS64 compatible instruction, performing the scaled index calculation fully 64-bits wide.

Restrictions:

LSA: None

DLSA: Reserved Instruction exception if 64-bit instructions are not enabled.

Availability and Compatibility:

LSA instruction is introduced by and required as of Release 6.

DLSA instruction is introduced by and required as of Release 6.

Operation

LSA:  GPR[rd] = sign_extend.32( GPR[rs] << (sa+1) + GPR[rt] )
DLSA: GPR[rd] = GPR[rs] << (sa+1) + GPR[rt]

Exceptions:

LSA: None

DLSA: Reserved Instruction

Encoding:

Pre-Release 6

LUI

001111

0

00000

rt

immediate

6

5

5

16

Release 6

AUI

001111

00000

rt

immediate

6

5

5

16

Format:

LUI rt, immediate 

MIPS32, Assembly Idiom Release 6

Load Upper Immediate

Purpose:

Load Upper Immediate

To load a constant into the upper half of a word

Description:

 GPR[rt] = sign_extend(immediate || 016)

The 16-bit immediate is shifted left 16 bits and concatenated with 16 bits of low-order zeros. The 32-bit result is signextended and placed into GPR rt.

Restrictions:

None.

Operation:

GPR[rt] = sign_extend(immediate || 016)

Exceptions:

None

Programming Notes:

In Release 6, LUI is an assembly idiom of AUI with rs=0.

Encoding:

COP1X

010011

base

index

0

00000

fd

LUXC1

000101

6

5

5

5

5

6

Format:

LUXC1 fd, index(base)

MIPS64, MIPS32 Release 2, removed in Release 6

Load Doubleword Indexed Unaligned to Floating Point

Purpose:

Load Doubleword Indexed Unaligned to Floating Point

To load a doubleword from memory to an FPR (GPR+GPR addressing), ignoring alignment

Description:

 FPR[fd] = memory[(GPR[base] + GPR[index])PSIZE-1..3]

The contents of the 64-bit doubleword at the memory location specified by the effective address are fetched and placed into the low word of FPR fd. The contents of GPR index and GPR base are added to form the effective address. The effective address is doubleword-aligned; EffectiveAddress2..0 are ignored.

Restrictions:

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

vAddr = (GPR[base]+GPR[index])63..3 || 03
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memdoubleword  LoadMemory(CCA, DOUBLEWORD, pAddr, vAddr, DATA)
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
memlsw = LoadMemory(CCA, WORD, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
memmsw = LoadMemory(CCA, WORD, pAddr, vAddr+4, DATA)
memdoubleword = memmsw || memlsw
StoreFPR(ft, UNINTERPRETED_DOUBLEWORD, memdoubleword)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, Watch

Encoding:

LWC1

110001

base

ft

offset

6

5

5

16

Format:

LWC1 ft, offset(base) 

MIPS32

Load Word to Floating Point

Purpose:

Load Word to Floating Point

To load a word from memory to an FPR

Description:

 FPR[ft] = memory[GPR[base] + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched and placed into the low word of FPR ft. If FPRs are 64 bits wide, bits 63..32 of FPR ft become UNPREDICTABLE. The

16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress1..0 != 0 (not word-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory(CCA, WORD, pAddr, vAddr, DATA)
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
StoreFPR(ft, UNINTERPRETED_WORD, memdoubleword31+8*bytesel..8*bytesel)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Reserved Instruction, Coprocessor Unusable, Watch

Encoding:

pre-Release 6

LWC2

110010

base

rt

offset

6

5

5

16

Release 6

COP2

010010

LWC2

01010

rt

base

offset

6

5

5

5

11

Format:

LWC2 rt, offset(base)

MIPS32

Load Word to Coprocessor 2

Purpose:

Load Word to Coprocessor 2

To load a word from memory to a COP2 register.

Description:

 CPR[2,rt,0] = memory[GPR[base] + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched and placed into the low word of COP2 (Coprocessor 2) general register rt. The signed offset is added to the contents of

GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if +EffectiveAddress1..0 != 0 (not word-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Availability and Compatibility

This instruction has been recoded for Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory(CCA, DOUBLEWORD, pAddr, vAddr, DATA)
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
CPR[2,rt,0] = sign_extend(memdoubleword31+8*bytesel..8*bytesel)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Reserved Instruction, Coprocessor Unusable, Watch

Programming Notes:

Release 6 implements an 11-bit offset, whereas all release levels lower than Release 6 implement a 16-bit offset.

Encoding:

SPECIAL3

011111

base

rt

offset

0

LWE

101111

6

5

5

9

1

6

Format:

LWE rt, offset(base)

MIPS32

Load Word EVA

Purpose:

Load Word EVA

To load a word from user mode virtual address space when executing in kernel mode.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, signextended to the GPR register length if necessary, and placed in GPR rt. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The LWE instruction functions the same as the LW instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

Pre-Release 6: The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU || 02)
GPR[rt] = sign_extend(memdoubleword31+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

SPECIAL3

011111

base

rt

offset

0

LWLE

011001

6

5

5

9

1

6

Format:

LWLE rt, offset(base)

MIPS32, removed in Release 6

Load Word Left EVA

Purpose:

Load Word Left EVA

To load the most-significant part of a word as a signed value from an unaligned user mode virtual address while executing in kernel mode.

Description:

 GPR[rt] = GPR[rt] MERGE memory[GPR[base] + offset]

The 9-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

The most-significant 1 to 4 bytes of W is in the aligned word containing the EffAddr. This part of W is loaded into the most-significant (left) part of the word in GPR rt. The remaining least-significant part of the word in GPR rt is unchanged.

For 64-bit GPR rt registers, the destination word is the low-order word of the register. The loaded value is treated as a signed value; the word sign bit (bit 31) is always loaded from memory and the new sign bit value is copied into bits

63..32.

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
if BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
byte = 0 || (vAddr1..0 xor BigEndianCPU2)
word = vAddr2 xor BigEndianCPU
memdoubleword = LoadMemory (CCA, byte, pAddr, vAddr, DATA)
temp = memdoubleword31+32*word-8*byte..32*word || GPR[rt]23-8*byte..0
GPR[rt] = (temp31)32 || temp

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Programming Notes:

The architecture provides no direct support for treating unaligned words as unsigned values, that is, zeroing bits

63..32 of the destination register when bit 31 is loaded.

Historical Information:

In the MIPS I architecture, the LWL and LWR instructions were exceptions to the load-delay scheduling restriction.

A LWL or LWR instruction which was immediately followed by another LWL or LWR instruction, and used the same destination register would correctly merge the 1 to 4 loaded bytes with the data loaded by the previous instruction. All such restrictions were removed from the architecture in MIPS II.

Encoding:

LWL

100010

base

rt

offset

6

5

5

16

Format:

LWL rt, offset(base)

MIPS32, removed in Release 6

Load Word Left

Purpose:

Load Word Left

To load the most-significant part of a word as a signed value from an unaligned memory address

Description:

 GPR[rt] = GPR[rt] MERGE memory[GPR[base] + offset]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

The most-significant 1 to 4 bytes of W is in the aligned word containing the EffAddr. This part of W is loaded into the most-significant (left) part of the word in GPR rt. The remaining least-significant part of the word in GPR rt is unchanged.

For 64-bit GPR rt registers, the destination word is the low-order word of the register. The loaded value is treated as a signed value; the word sign bit (bit 31) is always loaded from memory and the new sign bit value is copied into bits

63..32.

Restrictions:

None

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
if BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
byte = 0 || (vAddr1..0 xor BigEndianCPU2)
word = vAddr2 xor BigEndianCPU
memdoubleword = LoadMemory (CCA, byte, pAddr, vAddr, DATA)
temp = memdoubleword31+32*word-8*byte..32*word || GPR[rt]23-8*byte..0
GPR[rt] = (temp31)32 || temp

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch

Programming Notes:

The architecture provides no direct support for treating unaligned words as unsigned values, that is, zeroing bits

63..32 of the destination register when bit 31 is loaded.

Historical Information:

In the MIPS I architecture, the LWL and LWR instructions were exceptions to the load-delay scheduling restriction.

A LWL or LWR instruction which was immediately followed by another LWL or LWR instruction, and used the same destination register would correctly merge the 1 to 4 loaded bytes with the data loaded by the previous instruction. All such restrictions were removed from the architecture in MIPS II.

Encoding:

PCREL

111011

rs

LWPC

01

offset

6

5

2

19

Format:

LWPC rs, offset

MIPS32 Release 6

Load Word PC-relative

Purpose:

Load Word PC-relative

To load a word from memory as a signed value, using a PC-relative address.

Description:

GPR[rs] = memory[ PC  + sign_extend( offset << 2 ) ]

The offset is shifted left by 2 bits, sign-extended, and added to the address of the LWPC instruction.

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, signextended to the GPR register length if necessary, and placed in GPR rs.

Restrictions:

LWPC is naturally aligned, by specification.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Operation

vAddr = ( PC + sign_extend(offset)<<2)
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
GPR[rs] = sign_extend(memword)

Exceptions:

TLB Refill, TLB Invalid, TLB Read Inhibit, Bus Error, Address Error, Watch

Programming Note

The Release 6 PC-relative loads (LWPC, LWUPC, LDPC) are considered data references.

For the purposes of watchpoints (provided by the CP0 WatchHi and WatchLo registers) and EJTAG breakpoints, the

PC-relative reference is considered to be a data reference rather than an instruction reference. That is, the watchpoint or breakpoint is triggered only if enabled for data references.

Encoding:

SPECIAL3

011111

base

rt

offset

0

LWRE

011010

6

5

5

9

1

6

Format:

LWRE rt, offset(base)

MIPS32, removed in Release 6

Load Word Right EVA

Purpose:

Load Word Right EVA

To load the least-significant part of a word from an unaligned user mode virtual memory address as a signed value while executing in kernel mode.

Description:

 GPR[rt] = GPR[rt] MERGE memory[GPR[base] + offset] 

The 9-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the least-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

A part of W (the least-significant 1 to 4 bytes) is in the aligned word containing EffAddr. This part of W is loaded into the least-significant (right) part of the word in GPR rt. The remaining most-significant part of the word in GPR rt is unchanged.

If GPR rt is a 64-bit register, the destination word is the low-order word of the register. The loaded value is treated as a signed value; if the word sign bit (bit 31) is loaded (that is, when all 4 bytes are loaded), then the new sign bit value is copied into bits 63..32. If bit 31 is not loaded, the value of bits 63..32 is implementation dependent; the value is either unchanged or a copy of the current value of bit 31.

Executing both LWRE and LWLE, in either order, delivers a sign-extended word value in the destination register.

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
if BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
byte = vAddr1..0 xor BigEndianCPU2
word = vAddr2 xor BigEndianCPU
memdoubleword = LoadMemory (CCA, byte, pAddr, vAddr, DATA)
temp = GPR[rt]31..32-8*byte || memdoubleword31+32*word..32*word+8*byte
if byte = 4 then
   utemp = (temp31)32     /* loaded bit 31, must sign extend */
else
             /* one of the following two behaviors: */
   utemp = GPR[rt]63..32   /* leave what was there alone */ 
   utemp = (GPR[rt]31)32  /* sign-extend bit 31 */
endif
GPR[rt] = utemp || temp

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Programming Notes:

The architecture provides no direct support for treating unaligned words as unsigned values, that is, zeroing bits

63..32 of the destination register when bit 31 is loaded.

Historical Information:

In the MIPS I architecture, the LWL and LWR instructions were exceptions to the load-delay scheduling restriction.

A LWL or LWR instruction which was immediately followed by another LWL or LWR instruction, and used the same destination register would correctly merge the 1 to 4 loaded bytes with the data loaded by the previous instruction. All such restrictions were removed from the architecture in MIPS II.

Encoding:

LWR

100110

base

rt

offset

6

5

5

16

Format:

LWR rt, offset(base)

MIPS32, removed in Release 6

Load Word Right

Purpose:

Load Word Right

To load the least-significant part of a word from an unaligned memory address as a signed value

Description:

 GPR[rt] = GPR[rt] MERGE memory[GPR[base] + offset] 

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the least-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

A part of W (the least-significant 1 to 4 bytes) is in the aligned word containing EffAddr. This part of W is loaded into the least-significant (right) part of the word in GPR rt. The remaining most-significant part of the word in GPR rt is unchanged.

If GPR rt is a 64-bit register, the destination word is the low-order word of the register. The loaded value is treated as a signed value; if the word sign bit (bit 31) is loaded (that is, when all 4 bytes are loaded), then the new sign bit value is copied into bits 63..32. If bit 31 is not loaded, the value of bits 63..32 is implementation dependent; the value is either unchanged or a copy of the current value of bit 31.

Executing both LWR and LWL, in either order, delivers a sign-extended word value in the destination register.

Restrictions:

None

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
if BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
byte = vAddr1..0 xor BigEndianCPU2
word = vAddr2 xor BigEndianCPU
memdoubleword = LoadMemory (CCA, byte, pAddr, vAddr, DATA)
temp = GPR[rt]31..32-8*byte || memdoubleword31+32*word..32*word+8*byte
if byte = 4 then
   utemp = (temp31)32     /* loaded bit 31, must sign extend */
else
             /* one of the following two behaviors: */
   utemp = GPR[rt]63..32   /* leave what was there alone */ 
   utemp = (GPR[rt]31)32  /* sign-extend bit 31 */
endif
GPR[rt] = utemp || temp

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch

Programming Notes:

The architecture provides no direct support for treating unaligned words as unsigned values, that is, zeroing bits

63..32 of the destination register when bit 31 is loaded.

Historical Information:

In the MIPS I architecture, the LWL and LWR instructions were exceptions to the load-delay scheduling restriction.

A LWL or LWR instruction which was immediately followed by another LWL or LWR instruction, and used the same destination register would correctly merge the 1 to 4 loaded bytes with the data loaded by the previous instruction. All such restrictions were removed from the architecture in MIPS II.

Encoding:

LW

100011

base

rt

offset

6

5

5

16

Format:

LW rt, offset(base)

MIPS32

Load Word

Purpose:

Load Word

To load a word from memory as a signed value

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, signextended to the GPR register length if necessary, and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU || 02)
GPR[rt] = sign_extend(memdoubleword31+8*byte..8*byte)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch

Encoding:

PCREL

111011

rs

LWUPC

10

offset

6

5

2

19

Format:

LWUPC rs, offset

MIPS64 Release 6

Load Word Unsigned PC-relative

Purpose:

Load Word Unsigned PC-relative

To load a word from memory as an unsigned value, using a PC-relative address.

Description:

GPR[rs] = memory[ PC + sign_extend( offset << 2 ) ]

The 19-bit offset is shifted left by 2 bits, sign-extended, and added to the address of the LWUPC instruction.

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, zeroextended to the GPR register length if necessary, and placed in GPR rs.

Restrictions:

LWUPC is naturally aligned, by specification.

Availability and Compatibility:

This instruction is introduced by and required as of MIPS64 Release 6.

Operation

vAddr = (  PC + sign_extend(offset)<< 2)
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
memword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
GPR[rs] = zero_extend(memword)

Exceptions:

TLB Refill, TLB Invalid, TLB Read Inhibit, Bus Error, Address Error, Watch

Programming Note

The Release 6 PC-relative loads (LWPC, LWUPC, LDPC) are considered data references.

For the purposes of watchpoints (provided by the CP0 WatchHi and WatchLo registers) and EJTAG breakpoints, the

PC-relative reference is considered to be a data reference rather than an instruction reference. That is, the watchpoint or breakpoint is triggered only if enabled for data references.

Encoding:

LWU

100111

base

rt

offset

6

5

5

16

Format:

LWU rt, offset(base)

MIPS64

Load Word Unsigned

Purpose:

Load Word Unsigned

To load a word from memory as an unsigned value.

Description:

 GPR[rt] = memory[GPR[base] + offset]

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched, zeroextended, and placed in GPR rt. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseoducode is not completely adapted for Release 6 misalignment support because the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory (CCA, WORD, pAddr, vAddr, DATA)
byte = vAddr2..0 xor (BigEndianCPU || 02)
GPR[rt] = 032 || memdoubleword31+8*byte..8*byte

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Reserved Instruction, Watch

Encoding:

COP1X

010011

base

index

0

00000

fd

LWXC1

000000

6

5

5

5

5

6

Format:

LWXC1 fd, index(base)

MIPS64, MIPS32 Release 2, removed in Release 6

Load Word Indexed to Floating Point

Purpose:

Load Word Indexed to Floating Point

To load a word from memory to an FPR (GPR+GPR addressing).

Description:

 FPR[fd] = memory[GPR[base] + GPR[index]]

The contents of the 32-bit word at the memory location specified by the aligned effective address are fetched and placed into the low word of FPR fd. If FPRs are 64 bits wide, bits 63..32 of FPR fs become UNPREDICTABLE. The contents of GPR index and GPR base are added to form the effective address.

Restrictions:

An Address Error exception occurs if EffectiveAddress1..0 != 0 (not word-aligned).

Availability and Compatibility:

This instruction has been removed in Release 6.

Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32 Release 1. Required in

MIPS32 Release 2 and all subsequent versions of MIPS32. When required, required whenever FPU is present, whether a 32-bit or 64-bit FPU, whether in 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1, StatusFR=0 or 1).

Operation:

vAddr = GPR[base] + GPR[index]
if vAddr1..0 != 02 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, LOAD)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
memdoubleword = LoadMemory(CCA, WORD, pAddr, vAddr, DATA)
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
StoreFPR(fd, UNINTERPRETED_WORD,
          memdoubleword31+8*bytesel..8*bytesel)

Exceptions:

TLB Refill, TLB Invalid, Address Error, Reserved Instruction, Coprocessor Unusable, Watch

Encoding:

COP1X

010011

fr

ft

fs

fd

MADD

100

fmt

6

5

5

5

5

3

3

Format:

MADD.fmt 

Floating Point Multiply Add

MADD.S fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Multiply Add

MADD.D fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Multiply Add

MADD.PS fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Multiply Add

Purpose:

Floating Point Multiply Add

To perform a combined multiply-then-add of FP values.

Description:

 FPR[fd] = (FPR[fs] x FPR[ft]) + FPR[fr]

The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product.

The intermediate product is rounded according to the current rounding mode in FCSR. The value in FPR fr is added to the product. The result sum is calculated to infinite precision, rounded according to the current rounding mode in

FCSR, and placed into FPR fd. The operands and result are values in format fmt. The results and flags are as if separate floating-point multiply and add instructions were executed.

MADD.PS multiplies then adds the upper and lower halves of FPR fr, FPR fs, and FPR ft independently, and ORs together any generated exceptional conditions.

The Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of MADD.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

MADD.S and MADD.D: Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32

Release 1. Required in MIPS32 Release 2 and all subsequent versions of MIPS32. When required, these instructions are to be implemented if an FPU is present either in a 32-bit or 64-bit FPU or in a 32-bit or 64-bit FP Register Mode

(FIRF64=0 or 1, StatusFR=0 or 1).

This instruction has been removed in Release 6 and has been replaced by the fused multiply-add instruction. Refer to the fused multiply-add instruction 'MADDF.fmt' in this manual for more information. Release 6 does not support

Paired Single (PS).

Operation:

vfr = ValueFPR(fr, fmt)
vfs = ValueFPR(fs, fmt)
vft = ValueFPR(ft, fmt)
StoreFPR(fd, fmt, (vfs xfmt vft) +fmt vfr)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

COP1

010001

fmt

ft

fs

fd

MADDF

011000

COP1

010001

fmt

ft

fs

fd

MSUBF

011001

6

5

5

5

5

3

Format:

MADDF.fmt MSUBF.fmt 

To perform a fused multiply-add of FP values.

MADDF.S fd, fs, ft

MIPS32 Release 6

Floating Point Fused Multiply Add, Floating Point Fused Multiply Subtract

MADDF.D fd, fs, ft

MIPS32 Release 6

Floating Point Fused Multiply Add, Floating Point Fused Multiply Subtract

MSUBF.S fd, fs, ft

MIPS32 Release 6

Floating Point Fused Multiply Add, Floating Point Fused Multiply Subtract

MSUBF.D fd, fs, ft

MIPS32 Release 6

Floating Point Fused Multiply Add, Floating Point Fused Multiply Subtract

Purpose:

Floating Point Fused Multiply Add, Floating Point Fused Multiply Subtract

MADDF.fmt: To perform a fused multiply-add of FP values.

MSUBF.fmt: To perform a fused multiply-subtract of FP values.

Description:

 
MADDF.fmt: FPR[fd] = FPR[fd] + (FPR[fs] <= FPR[ft])
MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] <= FPR[ft])

The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The intermediate product is calculated to infinite precision. The product is added to the value in FPR fd. The result sum is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.

(For MSUBF fmt, the product is subtracted from the value in FPR fd.)

Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

None

Availability and Compatibility:

MADDF.fmt and MSUBF.fmt are required in Release 6.

MADDF.fmt and MSUBF.fmt are not available in architectures pre-Release 6.

The fused multiply add instructions, MADDF.fmt and MSUBF fmt, replace pre-Release 6 instructions such as

MADD fmt, MSUB.fmt, NMADD.fmt, and NMSUB.fmt. The replaced instructions were unfused multiply-add, with an intermediate rounding.

Release 6 MSUBF fmt, fd=fd-fs<=ft, corresponds more closely to pre-Release 6 NMADD fmt, fd=fr-fs<=ft, than to pre-Release 6 MSUB.fmt, fd=fs<=ft-fr.

FPU scalar MADDF fmt corresponds to MSA vector MADD.df.

FPU scalar MSUBF fmt corresponds to MSA vector MSUB.df.

Operation:

if not IsCoprocessorEnabled(1) 
   then SignalException(CoprocessorUnusable, 1) endif
if not IsFloatingPointImplemented(fmt)) 
   then SignalException(ReservedInstruction) endif
vfr = ValueFPR(fr, fmt)
vfs = ValueFPR(fs, fmt)
vfd = ValueFPR(fd, fmt)
MADDF.fmt: vinf = vfd +inf (vfs *inf vft)
MADDF.fmt: vinf = vfd -inf (vfs *inf vft)
StoreFPR(fd, fmt, vinf)

Special Considerations:

The fused multiply-add computation is performed in infinite precision, and signals Inexact, Overflow, or Underflow if and only if the final result differs from the infinite precision result in the appropriate manner.

Like most FPU computational instructions, if the flush-subnormals-to-zero mode, FCSR.FS=1, then subnormals are flushed before beginning the fused-multiply-add computation, and Inexact may be signaled.

I.e. Inexact may be signaled both by input flushing and/or by the fused-multiply-add: the conditions or ORed.

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

SPECIAL2

011100

rs

rt

0

0000

0

00000

MADD

000000

6

5

5

5

5

6

Format:

MADD rs, rt

MIPS32, removed in Release 6

Multiply and Add Word to Hi, Lo

Purpose:

Multiply and Add Word to Hi, Lo

To multiply two words and add the result to Hi, Lo.

Description:

 (HI,LO) = (HI,LO) + (GPR[rs] x GPR[rt])

The 32-bit word value in GPR rs is multiplied by the 32-bit word value in GPR rt, treating both operands as signed values, to produce a 64-bit result. The product is added to the 64-bit concatenated values of HI31..0 and LO31..0. The most significant 32 bits of the result are sign-extended and written into HI and the least significant 32 bits are signextended and written into LO. No arithmetic exception occurs under any circumstances.

Restrictions:

If GPRs rs or rt do not contain sign-extended 32-bit values (bits 63..31 equal), then the results of the operation are

UNPREDICTABLE.

This instruction does not provide the capability of writing directly to a target GPR.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = (HI31..0 || LO31..0) + (GPR[rs]31..0 x GPR[rt]31..0)
HI = sign_extend(temp63..32)
LO = sign_extend(temp31..0)

Exceptions:

None

Programming Notes:

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Encoding:

SPECIAL2

011100

rs

rt

0

00000

0

00000

MADDU

000001

6

5

5

5

5

6

Format:

MADDU rs, rt

MIPS32, removed in Release 6

Multiply and Add Unsigned Word to Hi,Lo

Purpose:

Multiply and Add Unsigned Word to Hi,Lo

To multiply two unsigned words and add the result to HI, LO.

Description:

 (HI,LO) = (HI,LO) + (GPR[rs] x GPR[rt])

The 32-bit word value in GPR rs is multiplied by the 32-bit word value in GPR rt, treating both operands as unsigned values, to produce a 64-bit result. The product is added to the 64-bit concatenated values of HI31..0 and LO31..0. The most significant 32 bits of the result are sign-extended and written into HI and the least significant 32 bits are signextended and written into LO. No arithmetic exception occurs under any circumstances.

Restrictions:

If GPRs rs or rt do not contain sign-extended 32-bit values (bits 63..31 equal), then the results of the operation are

UNPREDICTABLE.

This instruction does not provide the capability of writing directly to a target GPR.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = (HI31..0 || LO31..0) + ((032 || GPR[rs]31..0) x (032 || GPR[rt]31..0))
HI = sign_extend(temp63..32)
LO = sign_extend(temp31..0)

Exceptions:

None

Programming Notes:

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Encoding:

COP1

010001

fmt

ft

fs

fd

MAX

011110

6

5

5

5

5

6

COP1

010001

fmt

ft

fs

fd

MAXA

011111

6

5

5

5

5

6

COP1

010001

fmt

ft

fs

fd

MIN

011100

6

5

5

5

5

6

COP1

010001

fmt

ft

fs

fd

MINA

011101

6

5

5

5

5

6

Format:

MAX.fmt MIN.fmt MAXA.fmt MINA.fmt 

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MAX.S fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MAX.D fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MAXA.S fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MAXA.D fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MIN.S fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MIN.D fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MINA.S fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

MINA.D fd,fs,ft

MIPS32 Release 6

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

Purpose:

Scalar Floating-Point Max/Min/maxNumMag/minNumMag

Scalar Floating-Point Maximum

Scalar Floating-Point Minimum

Scalar Floating-Point argument with Maximum Absolute Value

Scalar Floating-Point argument with Minimum Absolute Value

Description:

MAX.fmt:  FPR[fd]= maxNum(FPR[fs],FPR[ft])
MIN.fmt:  FPR[fd]= minNum(FPR[fs],FPR[ft])
MAXA.fmt: FPR[fd]= maxNumMag(FPR[fs],FPR[ft])
MINA.fmt: FPR[fd]= minNumMag(FPR[fs],FPR[ft])

MAX.fmt writes the maximum value of the inputs fs and ft to the destination fd.

MIN.fmt writes the minimum value of the inputs fs and ft to the destination fd.

MAXA fmt takes input arguments fs and ft and writes the argument with the maximum absolute value to the destination fd.

MINA fmt takes input arguments fs and ft and writes the argument with the minimum absolute value to the destination fd.

The instructions MAX.fmt/MIN fmt/MAXA fmt/MINA.fmt correspond to the IEEE 754-2008 operations maxNum/ minNum/maxNumMag/minNumMag.

Numbers are preferred to NaNs: if one input is a NaN, but not both, the value of the numeric input is returned. If both are NaNs, the NaN in fs is returned.1

The scalar FPU instructions MAX fmt/MIN.fmt/MAXA fmt/MINA fmt correspond to the MSA instructions

FMAX.df/FMIN.df/FMAXA.df/FMINA.df.

Restrictions:

Data-dependent exceptions are possible as specified by the IEEE Standard for Floating-Point Arithmetic 754TM2008. See also the section "Special Cases", below.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

Operation:

if not IsCoprocessorEnabled(1) 
   then SignalException(CoprocessorUnusable, 1) endif
if not IsFloatingPointImplemented(fmt)
   then SignalException(ReservedInstruction) endif
v1 = ValueFPR(fs,fmt)
v2 = ValueFPR(ft,fmt)
if SNaN(v1) or SNaN(v2) then
   then SignalException(InvalidOperand)  endif 
if NaN(v1) and NaN(v2)then
   ftmp = v1
elseif NaN(v1) then
   ftmp = v2
elseif NaN(v2) then
   ftmp = v1
else
   case instruction of
   FMAX.fmt:  ftmp = MaxFP.fmt(ValueFPR(fs,fmt),ValueFPR(ft,fmt))
   FMIN.fmt:  ftmp = MinFP.fmt(ValueFPR(fs,fmt),ValueFPR(ft,fmt))
   FMAXA.fmt: ftmp = MaxAbsoluteFP.fmt(ValueFPR(fs,fmt),ValueFPR(ft,fmt))
   FMINA.fmt: ftmp = MinAbsoluteFP.fmt(ValueFPR(fs,fmt),ValueFPR(ft,fmt))
   end case
endif
StoreFPR (fd, fmt, ftmp)
/* end of instruction */
function MaxFP(tt, ts, n)
   /* Returns the largest argument. */
endfunction MaxFP
function MinFP(tt, ts, n)
   /* Returns the smallest argument. */
endfunction MaxFP
function MaxAbsoluteFP(tt, ts, n)
   /* Returns the argument with largest absolute value.
      For equal absolute values, returns the largest argument.*/
endfunction MaxAbsoluteFP
function MinAbsoluteFP(tt, ts, n)
   /* Returns the argument with smallest absolute value.
      For equal absolute values, returns the smallest argument.*/
endfunction MinAbsoluteFP
function NaN(tt, ts, n)
   /* Returns true if the value is a NaN */
   return SNaN(value) or QNaN(value)
endfunction MinAbsoluteFP

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation

Encoding:

COP0

010000

MF

00000

rt

rd

0

00000000

sel

6

5

5

5

8

3

Format:

MFC0 rt, rd, sel

MIPS32

Move from Coprocessor 0

Purpose:

Move from Coprocessor 0

To move the contents of a coprocessor 0 register to a general register.

Description:

 GPR[rt] = CPR[0,rd,sel]

The contents of the coprocessor 0 register specified by the combination of rd and sel are sign-extended and loaded into general register rt. Not all coprocessor 0 registers support the sel field. In those instances, the sel field must be zero.

When the coprocessor 0 register specified is the EntryLo0 or the EntryLo1 register, the RI/XI fields are moved to bits 31:30 of the destination register. This feature supports MIPS32 backward compatibility on a MIPS64 system.

Restrictions:

Pre-Release 6: The results are UNDEFINED if coprocessor 0 does not contain a register as specified by rd and sel.

Release 6: Reading a reserved register or a register that is not implemented for the current core configuration returns

0.

Operation:

reg = rd
if IsCoprocessorRegisterImplemented(0, reg, sel) then
   data = CPR[0, reg, sel]
   if (reg,sel = EntryLo1 or reg,sel = EntryLo0) then
      GPR[rt]29..0 = data29..0
      GPR[rt]31 = data63
      GPR[rt]30 = data62
      GPR[rt]63..32 = sign_extend(data63)
   else
      GPR[rt] = sign_extend(data)
   endif
else
   if ArchitectureRevision() >= 6 then
      GPR[rt] = 0
   else
      UNDEFINED
   endif
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

MF

00000

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

MFC1 rt, fs

MIPS32

Move Word From Floating Point

Purpose:

Move Word From Floating Point

To copy a word from an FPU (CP1) general register to a GPR.

Description:

 GPR[rt] = FPR[fs]

The contents of FPR fs are sign-extended and loaded into general register rt.

Restrictions:

Operation:

data = ValueFPR(fs, UNINTERPRETED_WORD)31..0
GPR[rt] = sign_extend(data)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Historical Information:

For MIPS I, MIPS II, and MIPS III the contents of GPR rt are UNPREDICTABLE for the instruction immediately following MFC1.

Encoding:

COP2

010010

MF

00000

rt

Impl

6

5

5

16

Format:

MFC2, rt, Impl, sel

MIPS32

Move Word From Coprocessor 2

Purpose:

Move Word From Coprocessor 2

To copy a word from a COP2 general register to a GPR.

Description:

 GPR[rt] = CP2CPR[Impl]

The contents of the coprocessor 2 register denoted by the Impl field are sign-extended and placed into general register

rt. The interpretation of the Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the

architecture.

Restrictions:

The results are UNPREDICTABLE if the Impl field specifies a coprocessor 2 register that does not exist.

Operation:

data = CP2CPR[Impl]31..0
GPR[rt] = sign_extend(data)

Exceptions:

Coprocessor Unusable

Encoding:

COP0

010000

MFH

00010

rt

rd

0

00000000

sel

6

5

5

5

8

3

Format:

MFHC0 rt, rd, sel

MIPS32 Release 5

Move from High Coprocessor 0

Purpose:

Move from High Coprocessor 0

To move the contents of the upper 32 bits of a Coprocessor 0 register, extended by 32-bits, to a general register.

Description:

 GPR[rt] = CPR[0,rd,sel][63:32]

The contents of the Coprocessor 0 register specified by the combination of rd and sel are sign-extended and loaded into general register rt. Not all Coprocessor 0 registers support the sel field, and in those instances, the sel field must be zero.

When the Coprocessor 0 register specified is the EntryLo0 or the EntryLo1 register, MFHC0 must undo the effects of

MTHC0, that is, bits 31:30 of the register must be returned as bits 1:0 of the GPR, and bits 32 and those of greater significance must be left-shifted by two and written to bits 31:2 of the GPR. This is because the RI and XI bits are repositioned on the write from GPR to EntryLo0 or the EntryLo1.

Restrictions:

Pre-Release 6: The results are UNDEFINED if Coprocessor 0 does not contain a register as specified by rd and sel, or the register exists but is not extended by 32-bits, or the register is extended for XPA, but XPA is not supported or enabled.

Release 6: Reading the high part of a register that is reserved, not implemented for the current core configuration, or that is not extended beyond 32 bits returns 0.

Availability and Compatibility:

This feature supports MIPS32 backward-compatibility on MIPS64 implementations.

Operation:

PABITS is the total number of physical address bits implemented. PABITS is defined in the descriptions of EntryLo0 and EntryLo1.

if Config5MVH = 0 then SignalException(ReservedInstruction) endif
reg = rd
if IsCoprocessorRegisterImplemented(0, reg, sel) and 
   IsCoprocessorRegisterExtended(0, reg, sel) then
   data = CPR[0, reg, sel]
   if (reg,sel = EntryLo1 or reg,sel = EntryLo0) then
       if (Config3LPA = 1 and PageGrainELPA = 1) then // PABITS > 36
           GPR[rt]31..0 = data61..30
           GPR[rt]63..32 = (data61)32 // sign-extend
      else
         GPR[rt] = 0
       endif
   else
       GPR[rt] = sign_extend(data63..32)
   endif
else
   if ArchitectureRevision() >= 6 then
       GPR[rt] = 0
   else
      UNDEFINED
   endif
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

MFH

00011

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

MFHC1 rt, fs

MIPS32 Release 2

Move Word From High Half of Floating Point Register

Purpose:

Move Word From High Half of Floating Point Register

To copy a word from the high half of an FPU (CP1) general register to a GPR.

Description:

 GPR[rt] = sign_extend(FPR[fs]63..32)

The contents of the high word of FPR fs are sign-extended and loaded into general register rt. This instruction is primarily intended to support 64-bit floating point units on a 32-bit CPU, but the semantics of the instruction are defined for all cases.

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The results are UNPREDICTABLE if StatusFR = 0 and fs is odd.

Operation:

data = ValueFPR(fs, UNINTERPRETED_DOUBLEWORD)63..32
GPR[rt] = sign_extend(data)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP2

010010

MFH

00011

rt

Impl

6

5

5

16

Format:

MFHC2, rt, rd, sel

MIPS32 Release 2

Move Word From High Half of Coprocessor 2 Register

Purpose:

Move Word From High Half of Coprocessor 2 Register

To copy a word from the high half of a COP2 general register to a GPR.

Description:

 GPR[rt] = sign_extend(CP2CPR[Impl]63..32)

The contents of the high word of the coprocessor 2 register denoted by the Impl field are sign-extended and placed into GPR rt. The interpretation of the Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the architecture.

Restrictions:

The results are UNPREDICTABLE if the Impl field specifies a coprocessor 2 register that does not exist, or if that register is not 64 bits wide.

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

data = CP2CPR[Impl]63..32
GPR[rt] = sign_extend(data)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL

000000

0

00 0000 0000

rd

0

00000

MFHI

010000

6

10

5

5

6

Format:

MFHI  rd

MIPS32, removed in Release 6

Move From HI Register

Purpose:

Move From HI Register

To copy the special purpose HI register to a GPR.

Description:

 GPR[rd] = HI

The contents of special register HI are loaded into GPR rd.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

GPR[rd] = HI

Exceptions:

None

Historical Information:

In the MIPS I, II, and III architectures, the two instructions which follow the MFHI must not modify the HI register.

If this restriction is violated, the result of the MFHI is UNPREDICTABLE. This restriction was removed in MIPS

IV and MIPS32, and all subsequent levels of the architecture.

Encoding:

SPECIAL

000000

0

00 0000 0000

rd

0

00000

MFLO

010010

6

10

5

5

6

Format:

MFLO rd 

MIPS32, removed in Release 6

Move From LO Register

Purpose:

Move From LO Register

To copy the special purpose LO register to a GPR.

Description:

 GPR[rd] = LO

The contents of special register LO are loaded into GPR rd.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

GPR[rd] = LO

Exceptions:

None

Historical Information:

In the MIPS I, II, and III architectures, the two instructions which follow the MFLO must not modify the HI register.

If this restriction is violated, the result of the MFLO is UNPREDICTABLE. This restriction was removed in MIPS

IV and MIPS32, and all subsequent levels of the architecture.

Encoding:

COP1

010001

fmt

0

00000

fs

fd

MOV

000110

6

5

5

5

5

6

Format:

MOV.fmt 

Floating Point Move

MOV.S fd, fs

MIPS32

Floating Point Move

MOV.D fd, fs

MIPS32

Floating Point Move

MOV.PS fd, fs

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Move

Purpose:

Floating Point Move

To move an FP value between FPRs.

Description:

 FPR[fd] = FPR[fs]

The value in FPR fs is placed into FPR fd. The source and destination are values in format fmt. In paired-single format, both the halves of the pair are copied to fd.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of MOV.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

MOV.PS has been removed in Release 6.

Operation:

StoreFPR(fd, fmt, ValueFPR(fs, fmt))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

COP1

010001

fmt

cc

0

0

tf

0

fs

fd

MOVCF

010001

6

5

3

1

1

5

5

6

Format:

MOVF.fmt 

Floating Point Move Conditional on Floating Point False

MOVF.S fd, fs, cc

MIPS32, removed in Release 6

Floating Point Move Conditional on Floating Point False

MOVF.D fd, fs, cc

MIPS32, removed in Release 6

Floating Point Move Conditional on Floating Point False

MOVF.PS fd, fs, cc

MIPS64 removed in Release 6

Floating Point Move Conditional on Floating Point False

Purpose:

Floating Point Move Conditional on Floating Point False

To test an FP condition code then conditionally move an FP value.

Description:

 if FPConditionCode(cc) = 0 then FPR[fd] = FPR[fs]

If the floating point condition code specified by CC is zero, then the value in FPR fs is placed into FPR fd. The source and destination are values in format fmt.

If the condition code is not zero, then FPR fs is not copied and FPR fd retains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes UNPREDICTABLE.

MOVF.PS merges the lower half of FPR fs into the lower half of FPR fd if condition code CC is zero, and independently merges the upper half of FPR fs into the upper half of FPR fd if condition code CC+1 is zero. The CC field must be even; if it is odd, the result of this operation is UNPREDICTABLE.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE. The operand must be a value in format fmt. If it is not, the result is UNPREDITABLE and the value of the operand FPR becomes UNPREDICTABLE.

The result of MOVF.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6 and has been replaced by the 'SEL.fmt' instruction. Refer to the

SEL fmt instruction in this manual for more information. Release 6 does not support Paired Single (PS).

Operation:

if fmt != PS
   if FPConditionCode(cc) = 0 then
      StoreFPR(fd, fmt, ValueFPR(fs, fmt))
   else
      StoreFPR(fd, fmt, ValueFPR(fd, fmt))
   endif
else
   mask = 0
   if FPConditionCode(cc+0) = 0 then mask = mask or 0xF0 endif
   if FPConditionCode(cc+1) = 0 then mask = mask or 0x0F endif
   StoreFPR(fd, PS, ByteMerge(mask, fd, fs))
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

SPECIAL

000000

rs

cc

0

0

tf

0

rd

0

00000

MOVCI

000001

6

5

3

1

1

5

5

6

Format:

MOVF rd, rs, cc

MIPS32, removed in Release 6

Move Conditional on Floating Point False

Purpose:

Move Conditional on Floating Point False

To test an FP condition code then conditionally move a GPR.

Description:

 if FPConditionCode(cc) = 0 then GPR[rd] = GPR[rs]

If the floating point condition code specified by CC is zero, then the contents of GPR rs are placed into GPR rd.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if FPConditionCode(cc) = 0 then
   GPR[rd] = GPR[rs]
endif

Exceptions:

Reserved Instruction, Coprocessor Unusable

Encoding:

COP1

010001

fmt

rt

fs

fd

MOVN

010011

6

5

5

5

5

6

Format:

MOVN.fmt 

Floating Point Move Conditional on Not Zero

MOVN.S fd, fs, rt

MIPS32, removed in Release 6

Floating Point Move Conditional on Not Zero

MOVN.D fd, fs, rt

MIPS32, removed in Release 6

Floating Point Move Conditional on Not Zero

MOVN.PS fd, fs, rt

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Move Conditional on Not Zero

Purpose:

Floating Point Move Conditional on Not Zero

To test a GPR then conditionally move an FP value.

Description:

 if GPR[rt] != 0 then FPR[fd] = FPR[fs]

If the value in GPR rt is not equal to zero, then the value in FPR fs is placed in FPR fd. The source and destination are values in format fmt.

If GPR rt contains zero, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes UNPREDICTABLE.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of MOVN.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6 and has been replaced by the 'SELNEZ fmt' instruction. Refer to the

SELNEZ.fmt instruction in this manual for more information. Release 6 does not support Paired Single (PS).

Operation:

if GPR[rt] != 0 then
   StoreFPR(fd, fmt, ValueFPR(fs, fmt))
else
   StoreFPR(fd, fmt, ValueFPR(fd, fmt))
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

MOVN

001011

6

5

5

5

5

6

Format:

MOVN rd, rs, rt

MIPS32, removed in Release 6

Move Conditional on Not Zero

Purpose:

Move Conditional on Not Zero

To conditionally move a GPR after testing a GPR value.

Description:

 if GPR[rt] != 0 then GPR[rd] = GPR[rs]

If the value in GPR rt is not equal to zero, then the contents of GPR rs are placed into GPR rd.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6 and has been replaced by the 'SELNEZ' instruction. Refer to the

SELNEZ instruction in this manual for more information.

Operation:

if GPR[rt] != 0 then
   GPR[rd] = GPR[rs]
endif

Exceptions:

None

Programming Notes:

The non-zero value tested might be the condition true result from the SLT, SLTI, SLTU, and SLTIU comparison instructions or a boolean value read from memory.

Encoding:

COP1

010001

fmt

cc

0

0

tf

1

fs

fd

MOVCF

010001

6

5

3

1

1

5

5

6

Format:

MOVT.fmt 

Floating Point Move Conditional on Floating Point True

MOVT.S fd, fs, cc

MIPS32, removed in Release 6

Floating Point Move Conditional on Floating Point True

MOVT.D fd, fs, cc

MIPS32, removed in Release 6

Floating Point Move Conditional on Floating Point True

MOVT.PS fd, fs, cc

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Move Conditional on Floating Point True

Purpose:

Floating Point Move Conditional on Floating Point True

To test an FP condition code then conditionally move an FP value.

Description:

 if FPConditionCode(cc) = 1 then FPR[fd] = FPR[fs]

If the floating point condition code specified by CC is one, then the value in FPR fs is placed into FPR fd. The source and destination are values in format fmt.

If the condition code is not one, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes UNPREDICTABLE.

MOVT.PS merges the lower half of FPR fs into the lower half of FPR fd if condition code CC is one, and independently merges the upper half of FPR fs into the upper half of FPR fd if condition code CC+1 is one. The CC field should be even; if it is odd, the result of this operation is UNPREDICTABLE.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE. The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand FPR becomes UNPREDICTABLE.

The result of MOVT.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility

This instruction has been removed in Release 6 and has been replaced by the 'SEL.fmt' instruction. Refer to the

SEL fmt instruction in this manual for more information. Release 6 does not support Paired Single (PS).

Operation:

if fmt != PS
   if FPConditionCode(cc) = 1 then
      StoreFPR(fd, fmt, ValueFPR(fs, fmt))
   else
      StoreFPR(fd, fmt, ValueFPR(fd, fmt))
   endif
else
   mask = 0
   if FPConditionCode(cc+0) = 0 then mask = mask or 0xF0 endif
   if FPConditionCode(cc+1) = 0 then mask = mask or 0x0F endif
   StoreFPR(fd, PS, ByteMerge(mask, fd, fs))
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

SPECIAL

000000

rs

cc

0

0

tf

1

rd

0

00000

MOVCI

000001

6

5

3

1

1

5

5

6

Format:

MOVT rd, rs, cc

MIPS32, removed in Release 6

Move Conditional on Floating Point True

Purpose:

Move Conditional on Floating Point True

To test an FP condition code then conditionally move a GPR.

Description:

 if FPConditionCode(cc) = 1 then GPR[rd] = GPR[rs]

If the floating point condition code specified by CC is one, then the contents of GPR rs are placed into GPR rd.

Restrictions:

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if FPConditionCode(cc) = 1 then
   GPR[rd] = GPR[rs]
endif

Exceptions:

Reserved Instruction, Coprocessor Unusable

Encoding:

COP1

010001

fmt

rt

fs

fd

MOVZ

010010

6

5

5

5

5

6

Format:

MOVZ.fmt 

Floating Point Move Conditional on Zero

MOVZ.S fd, fs, rt

MIPS32, removed in Release 6

Floating Point Move Conditional on Zero

MOVZ.D fd, fs, rt

MIPS32, removed in Release 6

Floating Point Move Conditional on Zero

MOVZ.PS fd, fs, rt

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Move Conditional on Zero

Purpose:

Floating Point Move Conditional on Zero

To test a GPR then conditionally move an FP value.

Description:

 if GPR[rt] = 0 then FPR[fd] = FPR[fs]

If the value in GPR rt is equal to zero then the value in FPR fs is placed in FPR fd. The source and destination are values in format fmt.

If GPR rt is not zero, then FPR fs is not copied and FPR fd contains its previous value in format fmt. If fd did not contain a value either in format fmt or previously unused data from a load or move-to operation that could be interpreted in format fmt, then the value of fd becomes UNPREDICTABLE.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of MOVZ.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6 and has been replaced by the 'SELEQZ fmt' instruction. Refer to the

SELEQZ.fmt instruction in this manual for more information. Release 6 does not support Paired Single (PS).

Operation:

if GPR[rt] = 0 then
   StoreFPR(fd, fmt, ValueFPR(fs, fmt))
else
   StoreFPR(fd, fmt, ValueFPR(fd, fmt))
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

MOVZ

001010

6

5

5

5

5

6

Format:

MOVZ rd, rs, rt 

MIPS32, removed in Release 6

Move Conditional on Zero

Purpose:

Move Conditional on Zero

To conditionally move a GPR after testing a GPR value.

Description:

 if GPR[rt] = 0 then GPR[rd] = GPR[rs]

If the value in GPR rt is equal to zero, then the contents of GPR rs are placed into GPR rd.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6 and has been replaced by the 'SELEQZ' instruction. Refer to the

SELEQZ instruction in this manual for more information.

Operation:

if GPR[rt] = 0 then
   GPR[rd] = GPR[rs]
endif

Exceptions:

None

Programming Notes:

The zero value tested might be the condition false result from the SLT, SLTI, SLTU, and SLTIU comparison instructions or a boolean value read from memory.

Encoding:

COP1X

010011

fr

ft

fs

fd

MSUB

101

fmt

6

5

5

5

5

3

3

Format:

MSUB.fmt 

Floating Point Multiply Subtract

MSUB.S fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Multiply Subtract

MSUB.D fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Multiply Subtract

MSUB.PS fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Multiply Subtract

Purpose:

Floating Point Multiply Subtract

To perform a combined multiply-then-subtract of FP values.

Description:

 FPR[fd] = (FPR[fs] x FPR[ft]) - FPR[fr]

The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The intermediate product is rounded according to the current rounding mode in FCSR. The subtraction result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt. The results and flags are as if separate floating-point multiply and subtract instructions were executed.

MSUB.PS multiplies then subtracts the upper and lower halves of FPR fr, FPR fs, and FPR ft independently, and ORs together any generated exceptional conditions.

The Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of MSUB.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model.

It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

MSUB.S and MSUB.D: Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32

Release 1. Required in MIPS32 Release 2 and all subsequent versions of MIPS32. When required, these instructions are to be implemented if an FPU is present, either in a 32-bit or 64-bit FPU or in a 32-bit or 64-bit FP Register Mode

(FIRF64=0 or 1, StatusFR=0 or 1).

This instruction has been removed in Release 6 and has been replaced by the fused multiply-subtract instruction.

Refer to the fused multiply-subtract instruction 'MSUBF.fmt' in this manual for more information. Release 6 does not support Paired Single (PS).

Operation:

vfr = ValueFPR(fr, fmt)
vfs = ValueFPR(fs, fmt)
vft = ValueFPR(ft, fmt)
StoreFPR(fd, fmt, (vfs xfmt vft) -fmt vfr))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

SPECIAL2

011100

rs

rt

0

00000

0

00000

MSUB

000100

6

5

5

5

5

6

Format:

MSUB rs, rt

MIPS32, removed in Release 6

Multiply and Subtract Word to Hi, Lo

Purpose:

Multiply and Subtract Word to Hi, Lo

To multiply two words and subtract the result from HI, LO.

Description:

 (HI,LO) = (HI,LO) - (GPR[rs] x GPR[rt])

The 32-bit word value in GPR rs is multiplied by the 32-bit value in GPR rt, treating both operands as signed values, to produce a 64-bit result. The product is subtracted from the 64-bit concatenated values of HI31..0 and LO31..0. The most significant 32 bits of the result are sign-extended and written into HI and the least significant 32 bits are signextended and written into LO. No arithmetic exception occurs under any circumstances.

Restrictions:

No restrictions in any architecture releases except Release 6.

If GPRs rs or rt do not contain sign-extended 32-bit values (bits 63..31 equal), then the results of the operation are

UNPREDICTABLE.

This instruction does not provide the capability of writing directly to a target GPR.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = (HI31..0 || LO31..0) - (GPR[rs]31..0 x GPR[rt]31..0)
HI = sign_extend(temp63..32)
LO = sign_extend(temp31..0)

Exceptions:

None

Programming Notes:

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Encoding:

SPECIAL2

011100

rs

rt

0

00000

0

00000

MSUBU

000101

6

5

5

5

5

6

Format:

MSUBU rs, rt

MIPS32, removed in Release 6

Multiply and Subtract Word to Hi,Lo

Purpose:

Multiply and Subtract Word to Hi,Lo

To multiply two words and subtract the result from HI, LO.

Description:

 (HI,LO) = (HI,LO) - (GPR[rs] x GPR[rt])

The 32-bit word value in GPR rs is multiplied by the 32-bit word value in GPR rt, treating both operands as unsigned values, to produce a 64-bit result. The product is subtracted from the 64-bit concatenated values of HI31..0 and

LO31..0. The most significant 32 bits of the result are sign-extended and written into HI and the least significant 32

bits are sign-extended and written into LO. No arithmetic exception occurs under any circumstances.

Restrictions:

If GPRs rs or rt do not contain sign-extended 32-bit values (bits 63..31 equal), then the results of the operation are

UNPREDICTABLE.

This instruction does not provide the capability of writing directly to a target GPR.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = (HI31..0 || LO31..0) - ((032 || GPR[rs]31..0) <= (032 || GPR[rt]31..0))
HI = sign_extend(temp63..32)
LO = sign_extend(temp31..0)

Exceptions:

None

Programming Notes:

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Encoding:

COP0

010000

MT

00100

rt

rd

0

0000 000

sel

6

5

5

5

8

3

Format:

MTC0 rt, rd, sel

MIPS32

Move to Coprocessor 0

Purpose:

Move to Coprocessor 0

To move the contents of a general register to a coprocessor 0 register.

Description:

 CPR[0, rd, sel] = GPR[rt]

The contents of general register rt are loaded into the coprocessor 0 register specified by the combination of rd and

sel. Not all coprocessor 0 registers support the sel field. In those instances, the sel field must be set to zero.

When the CP0 destination register specified is the EntryLo0 or the EntryLo1 register, bits 31:30 appear in the RI/XI fields of the destination register. This feature supports MIPS32 backward compatibility on a MIPS64 implementation.

In Release 5, for a 32-bit processor, the MTC0 instruction writes all zeroes to the high-order bits of selected CP0 registers that have been extended beyond 32 bits. This is required for compatibility with legacy software that does not use MTHC0, yet has hardware support for extended CP0 registers (such as for Extended Physical Addressing (XPA)).

Because MTC0 overwrites the result of MTHC0, software must first read the high-order bits before writing the loworder bits, then write the high-order bits back either modified or unmodified. For initialization of an extended register, software may first write the low-order bits, then the high-order bits, without first reading the high-order bits.

Restrictions:

Pre-Release 6: The results are UNDEFINED if coprocessor 0 does not contain a register as specified by rd and sel.

Release 6: Writes to a register that is reserved or not defined for the current core configuration are ignored.

Operation:

data = GPR[rt]
reg = rd
if IsCoprocessorRegisterImplemented (0, reg, sel) then
   if (reg,sel = EntryLo1 or EntryLo0) then
      CPR[0,reg,sel]29..0 = data29..0
      CPR[0,reg,sel]63 = data31
      CPR[0,reg,sel]62 = data30
      CPR[0,reg,sel]61:30 = 032
   elseif (Width(CPR[0,reg,sel]) = 64) then
      CPR[0,reg,sel] = data
   else
      CPR[0,reg,sel] = data31..0
      if (Config5XPA = 1) then
         // The most-significant bit may vary by register. Only supported
         // bits should be written 0. Extended LLAddr is not written with 0s,
         // as it is a read-only register. BadVAddr is not written with 0s, as
         // it is read-only
         if (Config3LPA = 1) then
             if (reg,sel = EntryLo0 or EntryLo1) then CPR[0,reg,sel]63:32 = 032
             endif
             if (reg,sel = MAAR) then CPR[0,reg,sel]63:32 = 032 endif
                // TagLo is zeroed only if the implementation-dependent bits
                // are writeable
             if (reg,sel = TagLo) then CPR[0,reg,sel]63:32 = 032 endif
             if (Config3VZ = 1) then 
                if (reg,sel = EntryHi) then CPR[0,reg,sel]63:32 = 032 endif
             endif
          endif
      endif
   endif
else
   if ArchitectureRevision() >= 6 then
   // nop (no exceptions, coprocessor state not modified)
   else
      UNDEFINED
   endif
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

MT

00100

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

MTC1 rt, fs

MIPS32

Move Word to Floating Point

Purpose:

Move Word to Floating Point

To copy a word from a GPR to an FPU (CP1) general register.

Description:

 FPR[fs] = GPR[rt]

The low word in GPR rt is placed into the low word of FPR fs. If FPRs are 64 bits wide, bits 63..32 of FPR fs become

UNPREDICTABLE.

Restrictions:

Operation:

data = GPR[rt]31..0
StoreFPR(fs, UNINTERPRETED_WORD, data)

Exceptions:

Coprocessor Unusable

Historical Information:

For MIPS I, MIPS II, and MIPS III the value of FPR fs is UNPREDICTABLE for the instruction immediately following MTC1.

Encoding:

COP2

010010

MT

00100

rt

Impl

6

5

5

16

Format:

MTC2 rt, Impl, sel

MIPS32

Move Word to Coprocessor 2

Purpose:

Move Word to Coprocessor 2

To copy a word from a GPR to a COP2 general register.

Description:

 CP2CPR[Impl] = GPR[rt]

The low word in GPR rt is placed into the low word of a Coprocessor 2 general register denoted by the Impl field. If

Coprocessor 2 general registers are 64 bits wide; bits 63..32 of the register denoted by the Impl field become

UNPREDICTABLE. The interpretation of the Impl field is left entirely to the Coprocessor 2 implementation and is

not specified by the architecture.

Restrictions:

The results are UNPREDICTABLE if the Impl field specifies a Coprocessor 2 register that does not exist. Operation:

data = GPR[rt]31..0
CP2CPR[Impl] = data

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP0

010000

MTH

00110

rt

rd

0

0000 0000

sel

6

5

5

5

8

3

Format:

MTHC0 rt, rd, sel

MIPS32 Release 5

Move to High Coprocessor 0

Purpose:

Move to High Coprocessor 0

To copy a word from a GPR to the upper 32 bits of a CP0 general register that has been extended by 32 bits.

Description:

 CPR[0, rd, sel][63:32] = GPR[rt]

The contents of general register rt are loaded into the Coprocessor 0 register specified by the combination of rd and

sel. Not all Coprocessor 0 registers support the sel field; the sel field must be set to zero.

When the Coprocessor 0 destination register specified is the EntryLo0 or EntryLo1 register, bits 1:0 of the GPR appear at bits 31:30 of EntryLo0 or EntryLo1. This is to compensate for RI and XI, which were shifted to bits 63:62 by MTC0 to EntryLo0 or EntryLo1. If RI/XI are not supported, the shift must still occur, but an MFHC0 instruction returns 0s for these two fields. The GPR is right-shifted by two to vacate the lower two bits, and two 0s are shifted in from the left.

The result is written to the upper 32 bits of MIPS64 EntryLo0 or EntryLo1, excluding RI/XI, which were placed in bits

63:62, that is, the write must appear atomic, as if both MTC0 and MTHC0 occurred together.

This feature supports MIPS32 backward compatibility of MIPS64 systems.

Restrictions:

Pre-Release 6: The results are UNDEFINED if Coprocessor 0 does not contain a register as specified by rd and sel, or if the register exists but is not extended by 32 bits, or the register is extended for XPA, but XPA is not supported or enabled.

Release 6: A write to the high part of a register that is reserved, not implemented for the current core, or that is not extended beyond 32 bits is ignored.

In a 64-bit processor, the MTHC0 instruction writes only the lower 32 bits of register rt into the upper 32 bits of the

Coprocessor register specified by rd and sel if the register is extended as defined by IsCoprocessorRegisterExtended(). The registers extended by Release 5 are those required for the XPA feature. Release 6 extends WatchHi to support MemoryMapID. These registers are identical to the same registers in the MIPS64 Architecture, other than

EntryLo0 and EntryLo1.

Operation:

if Config5MVH = 0 then SignalException(ReservedInstruction) endif
data = GPR[rt]
reg = rd
if IsCoprocessorRegisterImplemented (0, reg, sel) and
   IsCoprocessorRegisterExtended (0, reg, sel) then
      if (reg,sel = EntryLo1 or reg,sel = EntryLo0) then
         if (Config3LPA = 1 and PageGrainELPA = 1) then // PABITS > 36
             CPR[0,reg,sel]31..30 = data1..0
             CPR[0,reg,sel]61:32 = data31..2 and ((1<<(PABITS-36))-1)
             CPR[0,reg,sel]63:62 = 02
      else
         CPR[0, reg, sel][63:32] = data31..0
      endif
else
   if ArchitectureRevision() >= 6 then
      // nop (no exceptions, coprocessor state not modified)
   else
      UNDEFINED
   endif
endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

MTH

00111

rt

fs

0

000 0000 0000

6

5

5

5

11

Format:

MTHC1 rt, fs

MIPS32 Release 2

Move Word to High Half of Floating Point Register

Purpose:

Move Word to High Half of Floating Point Register

To copy a word from a GPR to the high half of an FPU (CP1) general register.

Description:

 FPR[fs]63..32 = GPR[rt]31..0

The low word in GPR rt is placed into the high word of FPR fs. This instruction is primarily intended to support 64bit floating point units on a 32-bit CPU, but the semantics of the instruction are defined for all cases.

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

The results are UNPREDICTABLE if StatusFR = 0 and fs is odd.

Operation:

newdata = GPR[rt]31..0
      olddata = ValueFPR(fs, UNINTERPRETED_DOUBLEWORD)31..0
StoreFPR(fs, UNINTERPRETED_DOUBLEWORD, newdata || olddata)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Programming Notes

When paired with MTC1 to write a value to a 64-bit FPR, the MTC1 must be executed first, followed by the MTHC1.

This is because of the semantic definition of MTC1, which is not aware that software is using an MTHC1 instruction to complete the operation, and sets the upper half of the 64-bit FPR to an UNPREDICTABLE value.

Encoding:

COP2

010010

MTH

00111

rt

Impl

6

5

5

16

Format:

MTHC2 rt, Impl, sel

MIPS32 Release 2

Move Word to High Half of Coprocessor 2 Register

Purpose:

Move Word to High Half of Coprocessor 2 Register

To copy a word from a GPR to the high half of a COP2 general register.

Description:

 CP2CPR[Impl]63..32 = GPR[rt]31..0

The low word in GPR rt is placed into the high word of coprocessor 2 general register denoted by the Impl field. The interpretation of the Impl field is left entirely to the Coprocessor 2 implementation and is not specified by the architecture.

Restrictions:

The results are UNPREDICTABLE if the Impl field specifies a coprocessor 2 register that does not exist, or if that register is not 64 bits wide.

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

data = GPR[rt]31..0
CP2CPR[Impl] = data || CPR[2,rd,sel]31..0 

Exceptions:

Coprocessor Unusable, Reserved Instruction

Programming Notes

When paired with MTC2 to write a value to a 64-bit CPR, the MTC2 must be executed first, followed by the

MTHC2. This is because of the semantic definition of MTC2, which is not aware that software is using an MTHC2 instruction to complete the operation, and sets the upper half of the 64-bit CPR to an UNPREDICTABLE value.

Encoding:

SPECIAL

000000

rs

0

000 0000 0000 0000

MTHI

010001

6

5

15

6

Format:

MTHI rs

MIPS32, removed in Release 6

Move to HI Register

Purpose:

Move to HI Register

To copy a GPR to the special purpose HI register.

Description:

 HI = GPR[rs]

The contents of GPR rs are loaded into special register HI.

Restrictions:

A computed result written to the HI/LO pair by DIV, DIVU, DDIV, DDIVU, DMULT, DMULTU, MULT, or MULTU must be read by MFHI or MFLO before a new result can be written into either HI or LO.

If an MTHI instruction is executed following one of these arithmetic instructions, but before an MFLO or MFHI instruction, the contents of LO are UNPREDICTABLE. The following example shows this illegal situation:

MULT  r2,r4 # start operation that will eventually write to HI,LO
...          # code not containing mfhi or mflo
MTHI  r6
...          # code not containing mflo
MFLO  r3    # this mflo would get an UNPREDICTABLE value

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

HI = GPR[rs]

Exceptions:

None

Historical Information:

In MIPS I-III, if either of the two preceding instructions is MFHI, the result of that MFHI is UNPREDICTABLE.

Reads of the HI or LO special register must be separated from any subsequent instructions that write to them by two or more instructions. In MIPS IV and later, including MIPS32 and MIPS64, this restriction does not exist.

Encoding:

SPECIAL

000000

rs

0

000 0000 0000 0000

MTLO

010011

6

5

15

6

Format:

MTLO rs

MIPS32, removed in Release 6

Move to LO Register

Purpose:

Move to LO Register

To copy a GPR to the special purpose LO register.

Description:

 LO = GPR[rs]

The contents of GPR rs are loaded into special register LO.

Restrictions:

A computed result written to the HI/LO pair by DIV, DIVU, DDIV, DDIVU, DMULT, DMULTU, MULT, or MULTU must be read by MFHI or MFLO before a new result can be written into either HI or LO.

If an MTLO instruction is executed following one of these arithmetic instructions, but before an MFLO or MFHI instruction, the contents of HI are UNPREDICTABLE. The following example shows this illegal situation:

MULT  r2,r4 # start operation that will eventually write to HI,LO
...          # code not containing mfhi or mflo
MTLO  r6
...          # code not containing mfhi
MFHI  r3    # this mfhi would get an UNPREDICTABLE value

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

LO = GPR[rs]

Exceptions:

None

Historical Information:

In MIPS I-III, if either of the two preceding instructions is MFHI, the result of that MFHI is UNPREDICTABLE.

Reads of the HI or LO special register must be separated from any subsequent instructions that write to them by two or more instructions. In MIPS IV and later, including MIPS32 and MIPS64, this restriction does not exist.

Encoding:

COP1

010001

fmt

ft

fs

fd

MUL

000010

6

5

5

5

5

6

Format:

MUL.fmt 

Floating Point Multiply

MUL.S fd, fs, ft

MIPS32

Floating Point Multiply

MUL.D fd, fs, ft

MIPS32

Floating Point Multiply

MUL.PS fd, fs, ft

MIPS64, MIPS32 Release 3, removed in Release 6

Floating Point Multiply

Purpose:

Floating Point Multiply

To multiply FP values.

Description:

 FPR[fd] = FPR[fs] x FPR[ft]

The value in FPR fs is multiplied by the value in FPR ft. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt.

MUL.PS multiplies the upper and lower halves of FPR fs and FPR ft independently, and ORs together any generated exceptional conditions.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of MUL.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

MUL.PS has been removed in Release 6.

Operation:

StoreFPR (fd, fmt, ValueFPR(fs, fmt) <=fmt ValueFPR(ft, fmt))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

SPECIAL

000000

rs

rt

rd

MUL

00010

SOP30

011000

SPECIAL

000000

rs

rt

rd

MUH

00011

SOP30

011000

SPECIAL

000000

rs

rt

rd

MULU

00010

SOP31

011001

SPECIAL

000000

rs

rt

rd

MUHU

00011

SOP31

011001

SPECIAL

000000

rs

rt

rd

DMUL

00010

SOP34

011100

SPECIAL

000000

rs

rt

rd

DMUH

00011

SOP34

011100

SPECIAL

000000

rs

rt

rd

DMULU

00010

SOP35

011101

SPECIAL

000000

rs

rt

rd

DMUHU

00011

SOP35

011101

6

5

5

5

5

6

Format:

MUL MUH MULU MUHU DMUL DMUH DMULU DMUHU 

Multiply Words Signed, Low Word

MUL rd,rs,rt

MIPS32 Release 6

Multiply Words Signed, Low Word

MUH rd,rs,rt

MIPS32 Release 6

Multiply Words Signed, High Word

MULU rd,rs,rt

MIPS32 Release 6

Multiply Words Unsigned, Low Word

MUHU rd,rs,rt

MIPS32 Release 6

Multiply Words Unsigned, High Word

DMUL rd,rs,rt

MIPS64 Release 6

Multiply Doublewords Signed, Low Doubleword

DMUH rd,rs,rt

MIPS64 Release 6

Multiply Doublewords Signed, High Doubleword

DMULU rd,rs,rt

MIPS64 Release 6

Multiply Doublewords Unsigned, Low Doubleword

DMUHU rd,rs,rt

MIPS64 Release 6

Multiply Doublewords Unsigned, High Doubleword

Purpose:

Multiply Integers (with result to GPR)

MUL: Multiply Words Signed, Low Word

MUH: Multiply Words Signed, High Word

MULU: Multiply Words Unsigned, Low Word

MUHU: Multiply Words Unsigned, High Word

DMUL: Multiply Doublewords Signed, Low Doubleword

DMUH: Multiply Doublewords Signed, High Doubleword

DMULU: Multiply Doublewords Unsigned, Low Doubleword

DMUHU: Multiply Doublewords Unsigned, High Doubleword

Description:

MUL:  GPR[rd] = sign_extend.32( lo_word( multiply.signed( GPR[rs] <= GPR[rt] ) ) )
MUH:  GPR[rd] = sign_extend.32( hi_word( multiply.signed( GPR[rs] <= GPR[rt] ) ) )
MULU:  GPR[rd] = sign_extend.32( lo_word( multiply.unsigned( GPR[rs] <= GPR[rt] ) ) )
MUHU:  GPR[rd] = sign_extend.32( hi_word( multiply.unsigned( GPR[rs] <= GPR[rt] ) ) )
DMUL:  GPR[rd] = lo_doubleword( multiply.signed( GPR[rs] <= GPR[rt] ) )
DMUH:  GPR[rd] = hi_doubleword( multiply.signed( GPR[rs] <= GPR[rt] ) )
DMULU: GPR[rd] = lo_doubleword( multiply.unsigned( GPR[rs] <= GPR[rt] ) )
DMUHU: GPR[rd] = hi_doubleword( multiply.unsigned( GPR[rs] <= GPR[rt] ) )

The Release 6 multiply instructions multiply the operands in GPR[rs] and GPR[rd], and place the specified high or low part of the result, of the same width, in GPR[rd].

MUL performs a signed 32-bit integer multiplication, and places the low 32 bits of the result in the destination register.

MUH performs a signed 32-bit integer multiplication, and places the high 32 bits of the result in the destination register.

MULU performs an unsigned 32-bit integer multiplication, and places the low 32 bits of the result in the destination register.

MUHU performs an unsigned 32-bit integer multiplication, and places the high 32 bits of the result in the destination register.

DMUL performs a signed 64-bit integer multiplication, and places the low 64 bits of the result in the destination register.

DMUH performs a signed 64-bit integer multiplication, and places the high 64 bits of the result in the destination register.

DMULU performs an unsigned 64-bit integer multiplication, and places the low 64 bits of the result in the destination register.

DMUHU performs an unsigned 64-bit integer multiplication, and places the high 64 bits of the result in the destination register.

Restrictions:

On a 64-bit CPU, MUH is UNPREDICTABLE if its inputs are not signed extended 32-bit integers.

MUL behaves correctly even if its inputs are not sign extended 32-bit integers. Bits 32-63 of its inputs do not affect the result.

On a 64-bit CPU, MUHU is UNPREDICTABLE if its inputs are not zero or sign extended 32-bit integers.

MULU behaves correctly even if its inputs are not zero or sign extended 32-bit integers. Bits 32-63 of its inputs do not affect the result.

On a 64-bit CPU, the 32-bit multiplications, both signed and unsigned, sign extend the result as if it is a 32-bit signed integer.

DMUL DMUH DMULU DMUHU: Reserved Instruction exception if 64-bit instructions are not enabled.

Availability and Compatibility:

These instructions are introduced by and required as of Release 6.

Programming Notes:

The low half of the integer multiplication result is identical for signed and unsigned. Nevertheless, there are distinct instructions MUL MULU DMUL DMULU. Implementations may choose to optimize a multiply that produces the low half followed by a multiply that produces the upper half. Programmers are recommended to use matching lower and upper half multiplications.

The Release 6 MUL instruction has the same opcode mnemonic as the pre-Release 6 MUL instruction. The semantics of these instructions are almost identical: both produce the low 32-bits of the 32<=32=64 product; but the pre-Release

6 MUL is unpredictable if its inputs are not properly sign extended 32-bit values on a 64 bit machine, and is defined to render the HI and LO registers unpredictable, whereas the Release 6 version ignores bits 32-63 of the input, and there are no HI/LO registers in Release 6 to be affected.

Operation:

MUH: if NotWordValue(GPR[rs]) then UNPREDICTABLE endif
MUH: if NotWordValue(GPR[rt]) then UNPREDICTABLE endif
MUHU: if NotWordValue(GPR[rs]) then UNPREDICTABLE endif
MUHU: if NotWordValue(GPR[rt]) then UNPREDICTABLE endif
/* recommended implementation: ignore bits 32-63 for MUL, MUH, MULU, MUHU */
MUL, MUH: 
 s1 = signed_word(GPR[rs])
  s2 = signed_word(GPR[rt])
MULU, MUHU: 
  s1 = unsigned_word(GPR[rs])
  s2 = unsigned_word(GPR[rt])
DMUL, DMUH: 
  s1 = signed_doubleword(GPR[rs])
  s2 = signed_doubleword(GPR[rt])
DMULU, DMUHU: 
  s1 = unsigned_doubleword(GPR[rs])
  s2 = unsigned_doubleword(GPR[rt])
product = s1 <= s2    /* product is twice the width of sources */
MUL:  GPR[rd] = sign_extend.32( lo_word( product ) )
MUH:  GPR[rd] = sign_extend.32( hi_word( product ) )
MULU:  GPR[rd] = sign_extend.32( lo_word( product ) )
MUHU:  GPR[rd] = sign_extend.32( hi_word( product ) )
DMUL:  GPR[rd] = lo_doubleword( product )
DMUH:  GPR[rd] = hi_doubleword( product )
DMULU: GPR[rd] = lo_doubleword( product )
DMUHU: GPR[rd] = hi_doubleword( product )

Exceptions:

MUL MUH MULU MUHU: None

DMUL DMUH DMULU DMUHU: Reserved Instruction

Encoding:

SPECIAL2

011100

rs

rt

rd

0

00000

MUL

000010

6

5

5

5

5

6

Format:

MUL rd, rs, rt

MIPS32, removed in Release 6

Multiply Word to GPR

Purpose:

Multiply Word to GPR

To multiply two words and write the result to a GPR.

Description:

 GPR[rd] = GPR[rs] x GPR[rt]

The 32-bit word value in GPR rs is multiplied by the 32-bit value in GPR rt, treating both operands as signed values, to produce a 64-bit result. The least significant 32 bits of the product are sign-extended and written to GPR rd. The contents of HI and LO are UNPREDICTABLE after the operation. No arithmetic exception occurs under any circumstances.

Restrictions:

On 64-bit processors, if either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Note that this instruction does not provide the capability of writing the result to the HI and LO registers.

Availability and Compatibility:

The pre-Release 6 MUL instruction has been removed in Release 6. It has been replaced by a similar instruction of the same mnemonic, MUL, but different encoding, which is a member of a family of single-width multiply instructions. Refer to the 'MUL' and 'MUH' instructions in this manual for more information.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then
   UNPREDICTABLE 
endif
temp = GPR[rs] x GPR[rt]
GPR[rd] = sign_extend(temp31..0)
HI = UNPREDICTABLE
LO = UNPREDICTABLE

Exceptions:

None

Programming Notes:

In some processors the integer multiply operation may proceed asynchronously and allow other CPU instructions to execute before it is complete. An attempt to read GPR rd before the results are written interlocks until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

MULT

011000

6

5

5

10

6

Format:

MULT rs, rt 

MIPS32, removed in Release 6

Multiply Word

Purpose:

Multiply Word

To multiply 32-bit signed integers.

Description:

 (HI, LO) = GPR[rs] x GPR[rt]

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as signed values, to produce a 64-bit result. The low-order 32-bit word of the result is sign-extended and placed into special register

LO, and the high-order 32-bit word is sign-extended and placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

On 64-bit processors, if either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Availability and Compatibility:

The MULT instruction has been removed in Release 6. It has been replaced by the Multiply Low (MUL) and Multiply

High (MUH) instructions, whose output is written to a single GPR. Refer to the 'MUL' and 'MUH' instructions in this manual for more information.

Operation:

if (NotWordValue(GPR[rs]) or NotWordValue(GPR[rt])) then
   UNPREDICTABLE 
endif
   prod = GPR[rs]31..0 x GPR[rt]31..0
   LO = sign_extend(prod31..0)
   HI = sign_extend(prod63..32)

Exceptions:

None

Programming Notes:

In some processors the integer multiply operation may proceed asynchronously and allow other CPU instructions to execute before it is complete. An attempt to read LO or HI before the results are written interlocks until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Implementation Note:

Encoding:

SPECIAL

000000

rs

rt

0

00 0000 0000

MULTU

011001

6

5

5

10

6

Format:

MULTU rs, rt

MIPS32, removed in Release 6

Multiply Unsigned Word

Purpose:

Multiply Unsigned Word

To multiply 32-bit unsigned integers.

Description:

 (HI, LO) = GPR[rs] x GPR[rt]

The 32-bit word value in GPR rt is multiplied by the 32-bit value in GPR rs, treating both operands as unsigned values, to produce a 64-bit result. The low-order 32-bit word of the result is sign-extended and placed into special register LO, and the high-order 32-bit word is sign-extended and placed into special register HI.

No arithmetic exception occurs under any circumstances.

Restrictions:

On 64-bit processors, if either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Availability and Compatibility:

The MULTU instruction has been removed in Release 6. It has been replaced by the Multiply Low (MULU) and Multiply High (MUHU) instructions, whose output is written to a single GPR. Refer to the 'MULU' and 'MUHU' instructions in this manual for more information.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE 
endif
   prod = (0 || GPR[rs]31..0) x (0 || GPR[rt]31..0)
   LO = sign_extend(prod31..0)
   HI = sign_extend(prod63..32)

Exceptions:

None

Programming Notes:

In some processors the integer multiply operation may proceed asynchronously and allow other CPU instructions to execute before it is complete. An attempt to read LO or HI before the results are written interlocks until the results are ready. Asynchronous execution does not affect the program result, but offers an opportunity for performance improvement by scheduling the multiply so that other instructions can execute in parallel.

Programs that require overflow detection must check for it explicitly.

Where the size of the operands are known, software should place the shorter operand in GPR rt. This may reduce the latency of the instruction on those processors which implement data-dependent instruction latencies.

Encoding:

REGIMM

000001

0

00000

NAL

10000

offset

6

5

5

16

Format:

NAL 

Assembly Idiom MIPS32 pre-Release 6, MIPS32 Release 6

No-op and Link

Purpose:

No-op and Link

Description:

GPR[31]= PC+8

NAL is an instruction used to read the PC.

NAL was originally an alias for pre-Release 6 instruction BLTZAL. The condition is false, so the 16-bit target offset field is ignored, but the link register, GPR 31, is unconditionally written with the address of the instruction past the delay slot.

Restrictions:

NAL is considered to be a not-taken branch, with a delay slot, and may not be followed by instructions not allowed in delay slots. Nor is NAL allowed in a delay slot or forbidden slot.

Availability and Compatibility:

This is a deprecated instruction in Release 6. It is strongly recommended not to use this deprecated instructions because it will be removed from a future revision of the MIPS Architecture.

The pre-Release 6 instruction BLTZAL when rs is not GPR[0], is removed in Release 6, and is required to signal a

Reserved Instruction exception. Release 6 adds BLTZALC, the equivalent compact conditional branch and link, with no delay slot.

This instruction, NAL, is introduced by and required as of Release 6, the mnemonic NAL becomes distinguished from the BLTZAL instruction removed in Release 6. The NAL instruction encoding, however, works on all implementations, both pre-Release 6, where it was a special case of BLEZAL, and Release 6, where it is an instruction in its own right.

NAL is provided only for compatibility with pre-Release 6 software. It is recommended that you use ADDIUPC to generate a PC-relative address.

Exceptions:

None

Operation:

GPR[31] = PC + 8

Encoding:

COP1

010001

fmt

0

00000

fs

fd

NEG

000111

6

5

5

5

5

6

Format:

NEG.fmt 

Floating Point Negate

NEG.S fd, fs

MIPS32

Floating Point Negate

NEG.D fd, fs

MIPS32

Floating Point Negate

NEG.PS fd, fs

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negate

Purpose:

Floating Point Negate

To negate an FP value.

Description:

 FPR[fd] = -FPR[fs]

The value in FPR fs is negated and placed into FPR fd. The value is negated by changing the sign bit value. The operand and result are values in format fmt. NEG.PS negates the upper and lower halves of FPR fs independently, and ORs together any generated exceptional conditions.

If FIRHas2008=0 or FCSRABS2008=0 then this operation is arithmetic. For this case, any NaN operand signals invalid operation.

If FCSRABS2008=1 then this operation is non-arithmetic. For this case, both regular floating point numbers and NAN values are treated alike, only the sign bit is affected by this instruction. No IEEE 754 exception can be generated for this case, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE. The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand FPR becomes UNPREDICTABLE.

The result of NEG.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

NEG.PS has been removed in Release 6.

Operation:

StoreFPR(fd, fmt, Negate(ValueFPR(fs, fmt)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation

Encoding:

COP1X

010011

fr

ft

fs

fd

NMADD

110

fmt

6

5

5

5

5

3

3

Format:

NMADD.fmt 

Floating Point Negative Multiply Add

NMADD.S fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negative Multiply Add

NMADD.D fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negative Multiply Add

NMADD.PS fd, fr, fs, ft 

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negative Multiply Add

Purpose:

Floating Point Negative Multiply Add

To negate a combined multiply-then-add of FP values.

Description:

 FPR[fd] = - ((FPR[fs] x FPR[ft]) + FPR[fr])

The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The intermediate product is rounded according to the current rounding mode in FCSR. The value in FPR fr is added to the product.

The result sum is calculated to infinite precision, rounded according to the current rounding mode in FCSR, negated by changing the sign bit, and placed into FPR fd. The operands and result are values in format fmt. The results and flags are as if separate floating-point multiply and add and negate instructions were executed.

NMADD.PS applies the operation to the upper and lower halves of FPR fr, FPR fs, and FPR ft independently, and

ORs together any generated exceptional conditions.

The Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of NMADD.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

NMADD.S and NMADD.D: Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32

Release 1. Required by MIPS32 Release 2 and subsequent versions of MIPS32. When required, these instructions are to be implemented if an FPU is present, either in a 32-bit or 64-bit FPU or in a 32-bit or 64-bit FP Register Mode

(FIRF64=0 or 1, StatusFR=0 or 1).

Operation:

vfr = ValueFPR(fr, fmt)
vfs = ValueFPR(fs, fmt)
vft = ValueFPR(ft, fmt)
StoreFPR(fd, fmt, -(vfr +fmt (vfs xfmt vft)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

COP1X

010011

fr

ft

fs

fd

NMSUB

111

fmt

6

5

5

5

5

3

3

Format:

NMSUB.fmt 

Floating Point Negative Multiply Subtract

NMSUB.S fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negative Multiply Subtract

NMSUB.D fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negative Multiply Subtract

NMSUB.PS fd, fr, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Negative Multiply Subtract

Purpose:

Floating Point Negative Multiply Subtract

To negate a combined multiply-then-subtract of FP values.

Description:

 FPR[fd] = ((FPR[fs] x FPR[ft]) - FPR[fr])

The value in FPR fs is multiplied by the value in FPR ft to produce an intermediate product. The intermediate product is rounded according to the current rounding mode in FCSR. The value in FPR fr is subtracted from the product.

The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, negated by changing the sign bit, and placed into FPR fd. The operands and result are values in format fmt. The results and flags are as if separate floating-point multiply and subtract and negate instructions were executed.

NMSUB.PS applies the operation to the upper and lower halves of FPR fr, FPR fs, and FPR ft independently, and

ORs together any generated exceptional conditions.

The Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fr, fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of NMSUB.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0 and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

NMSUB.S and NMSUB.D: Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32

Release 1. Required in MIPS32 Release 2 and all subsequent versions of MIPS32. When required, these instructions are to be implemented if an FPU is present, either in a 32-bit or 64-bit FPU or in a 32-bit or 64-bit FP Register Mode

(FIRF64=0 or 1, StatusFR=0 or 1).

Operation:

vfr = ValueFPR(fr, fmt)
vfs = ValueFPR(fs, fmt)
vft = ValueFPR(ft, fmt)
StoreFPR(fd, fmt, -((vfs xfmt vft) -fmt vfr))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

SPECIAL

000000

0

00000

0

00000

0

00000

0

00000

SLL

000000

6

5

5

5

5

6

Format:

NOP 

Assembly Idiom

No Operation

Purpose:

No Operation

To perform no operation.

Description:

NOP is the assembly idiom used to denote no operation. The actual instruction is interpreted by the hardware as SLL r0, r0, 0.

Restrictions:

None

Operations:

None

Exceptions:

None

Programming Notes:

The zero instruction word, which represents SLL, r0, r0, 0, is the preferred NOP for software to use to fill branch and jump delay slots and to pad out alignment sequences.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

NOR

100111

6

5

5

5

5

6

Format:

NOR rd, rs, rt

MIPS32

Not Or

Purpose:

Not Or

To do a bitwise logical NOT OR.

Description:

 GPR[rd] = GPR[rs] nor GPR[rt]

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical NOR operation. The result is placed into GPR rd.

Restrictions:

None

Operation:

GPR[rd] = GPR[rs] nor GPR[rt]

Exceptions:

None

Encoding:

ORI

001101

rs

rt

immediate

6

5

5

16

Format:

ORI rt, rs, immediate

MIPS32

Or Immediate

Purpose:

Or Immediate

To do a bitwise logical OR with a constant.

Description:

 GPR[rt] = GPR[rs] or immediate

The 16-bit immediate is zero-extended to the left and combined with the contents of GPR rs in a bitwise logical OR operation. The result is placed into GPR rt.

Restrictions:

None

Operations:

GPR[rt] = GPR[rs] or zero_extend(immediate)

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

OR

100101

6

5

5

5

5

6

Format:

OR rd, rs, rt

MIPS32

Or

Purpose:

Or

To do a bitwise logical OR.

Description:

 GPR[rd] = GPR[rs] or GPR[rt]

The contents of GPR rs are combined with the contents of GPR rt in a bitwise logical OR operation. The result is placed into GPR rd.

Restrictions:

None

Operations:

GPR[rd] = GPR[rs] or GPR[rt]

Exceptions:

None

Encoding:

SPECIAL

000000

0

00000

0

00000

0

00000

5

00101

SLL

000000

6

5

5

5

5

6

Format:

PAUSE 

MIPS32 Release 2/MT Module

Wait for the LLBit to clear.

Purpose:

Wait for the LLBit to clear.

Description:

Locks implemented using the LL/SC (or LLD/SCD) instructions are a common method of synchronization between threads of control. A lock implementation does a load-linked instruction and checks the value returned to determine whether the software lock is set. If it is, the code branches back to retry the load-linked instruction, implementing an active busy-wait sequence. The PAUSE instruction is intended to be placed into the busy-wait sequence to block the instruction stream until such time as the load-linked instruction has a chance to succeed in obtaining the software lock.

The PAUSE instruction is implementation-dependent, but it usually involves descheduling the instruction stream until the LLBit is zero.

In either case, it is assumed that the instruction stream which gives up the software lock does so via a write to the lock variable, which causes the processor to clear the LLBit as seen by this thread of execution.

The encoding of the instruction is such that it is backward compatible with all previous implementations of the architecture. The PAUSE instruction can therefore be placed into existing lock sequences and treated as a NOP by the processor, even if the processor does not implement the PAUSE instruction.

Restrictions:

Pre-Release 6: The operation of the processor is UNPREDICTABLE if a PAUSE instruction is executed placed in the delay slot of a branch or jump instruction.

Release 6: Implementations are required to signal a Reserved Instruction exception if PAUSE is encountered in the delay slot or forbidden slot of a branch or jump instruction.

Operations:

if LLBit != 0 then
   EPC = PC + 4                 /* Resume at the following instruction */
   DescheduleInstructionStream()
endif

Exceptions:

None

Programming Notes:

The PAUSE instruction is intended to be inserted into the instruction stream after an LL instruction has set the LLBit and found the software lock set. The program may wait forever if a PAUSE instruction is executed and there is no possibility that the LLBit will ever be cleared.

An example use of the PAUSE instruction is shown below:

acquire_lock:
   ll    t0, 0(a0)              /* Read software lock, set hardware lock */
   bnezc t0, acquire_lock_retry: /* Branch if software lock is taken; */
                                 /* Release 6 branch */
   addiu t0, t0, 1              /* Set the software lock */
   sc    t0, 0(a0)              /* Try to store the software lock */
   bnezc t0, 10f                /* Branch if lock acquired successfully */
   sync
acquire_lock_retry:
   pause                         /* Wait for LLBIT to clear before retry */
   bc    acquire_lock           /* and retry the operation; Release 6 branch */
10:
   Critical region code
release_lock:
   sync
   sw    zero, 0(a0)            /* Release software lock, clearing LLBIT */
                                 /* for any PAUSEd waiters */

Encoding:

COP1

010001

fmt

10110

ft

fs

fd

PLL

101100

6

5

5

5

5

6

Format:

PLL.PS fd, fs, ft 

MIPS64, MIPS32 Release 2, removed in Release 6

Pair Lower Lower

Purpose:

Pair Lower Lower

To merge a pair of paired single values with realignment.

Description:

 FPR[fd] = lower(FPR[fs]) || lower(FPR[ft])

A new paired-single value is formed by catenating the lower single of FPR fs (bits 31..0) and the lower single of FPR

ft (bits 31..0).

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is

UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

StoreFPR(fd, PS, ValueFPR(fs, PS)31..0 || ValueFPR(ft, PS)31..0)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

fmt

10110

ft

fs

fd

PLU

101101

6

5

5

5

5

6

Format:

PLU.PS fd, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Pair Lower Upper

Purpose:

Pair Lower Upper

To merge a pair of paired single values with realignment.

Description:

 FPR[fd] = lower(FPR[fs]) || upper(FPR[ft])

A new paired-single value is formed by catenating the lower single of FPR fs (bits 31..0) and the upper single of FPR

ft (bits 63..32).

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is

UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

StoreFPR(fd, PS, ValueFPR(fs, PS)31..0 || ValueFPR(ft, PS)63..32)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL3

011111

base

hint

offset

0

PREFE

100011

6

5

5

9

1

6

Format:

PREFE hint,offset(base)

MIPS32

Prefetch EVA

Purpose:

Prefetch EVA

To move data between user mode virtual address space memory and cache while operating in kernel mode.

Description:

 prefetch_memory(GPR[base] + offset)

PREFE adds the 9-bit signed offset to the contents of GPR base to form an effective byte address. The hint field supplies information about the way that the data is expected to be used.

PREFE enables the processor to take some action, causing data to be moved to or from the cache, to improve program performance. The action taken for a specific PREFE instruction is both system and context dependent. Any action, including doing nothing, is permitted as long as it does not change architecturally visible state or alter the meaning of a program. Implementations are expected either to do nothing, or to take an action that increases the performance of the program. The PrepareForStore function is unique in that it may modify the architecturally visible state.

PREFE does not cause addressing-related exceptions, including TLB exceptions. If the address specified would cause an addressing exception, the exception condition is ignored and no data movement occurs.However even if no data is moved, some action that is not architecturally visible, such as writeback of a dirty cache line, can take place.

It is implementation dependent whether a Bus Error or Cache Error exception is reported if such an error is detected as a byproduct of the action taken by the PREFE instruction.

PREFE neither generates a memory operation nor modifies the state of a cache line for a location with an uncached memory access type, whether this type is specified by the address segment (for example, kseg1), the programmed cacheability and coherency attribute of a segment (for example, the use of the K0, KU, or K23 fields in the Config register), or the per-page cacheability and coherency attribute provided by the TLB.

If PREFE results in a memory operation, the memory access type and cacheability & coherency attribute used for the operation are determined by the memory access type and cacheability & coherency attribute of the effective address, just as it would be if the memory operation had been caused by a load or store to the effective address.

For a cached location, the expected and useful action for the processor is to prefetch a block of data that includes the effective address. The size of the block and the level of the memory hierarchy it is fetched into are implementation specific.

In coherent multiprocessor implementations, if the effective address uses a coherent Cacheability and Coherency

Attribute (CCA), then the instruction causes a coherent memory transaction to occur. This means a prefetch issued on one processor can cause data to be evicted from the cache in another processor.

The PREFE instruction and the memory transactions which are sourced by the PREFE instruction, such as cache refill or cache writeback, obey the ordering and completion rules of the SYNC instruction.

The PREFE instruction functions in exactly the same fashion as the PREF instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible. Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to one.

Table 5.3 Values of hint Field for PREFE Instruction

Value

Name

Data Use and Desired Prefetch Action

0

load

Use: Prefetched data is expected to be read (not modified). Action: Fetch data as if for a load.

1

store

Use: Prefetched data is expected to be stored or modified. Action: Fetch data as if for a store.

2

L1 LRU hint

Pre-Release 6: Reserved for Architecture. Release 6: Implementation dependent. This hint code marks the line as LRU in the L1 cache and thus preferred for next eviction. Implementations can choose to writeback and/or invalidate as long as no architectural state is modified.

3

Reserved for Implementation

Pre-Release 6: Reserved for Architecture. Release 6: Available for implementation-dependent use.

4

load_streamed

Use: Prefetched data is expected to be read (not modified) but not reused extensively; it "streams" through cache. Action: Fetch data as if for a load and place it in the cache so that it does not displace data prefetched as "retained."

5

store_streamed

Use: Prefetched data is expected to be stored or modified but not reused extensively; it "streams" through cache. Action: Fetch data as if for a store and place it in the cache so that it does not displace data prefetched as "retained."

6

load_retained

Use: Prefetched data is expected to be read (not modified) and reused extensively; it should be "retained" in the cache. Action: Fetch data as if for a load and place it in the cache so that it is not displaced by data prefetched as "streamed."

7

store_retained

Use: Prefetched data is expected to be stored or modified and reused extensively; it should be "retained" in the cache. Action: Fetch data as if for a store and place it in the cache so that it is not displaced by data prefetched as "streamed."

8-15

L2 operation

Pre-Release 6: Reserved for Architecture. Release 6: Hint codes 8 - 15 are treated the same as hint codes 0 - 7 respectively, but operate on the L2 cache.

16-23

L3 operation

Pre-Release 6: Reserved for Architecture. Release 6: Hint codes 16 - 23 are treated the same as hint codes 0 - 7 respectively, but operate on the L3 cache.

24

Reserved for Architecture

Pre-Release 6: Unassigned by the Architecture - available for implementationdependent use. Release 6: This hint code is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

25

writeback_invalidate (also known as "nudge") Reserved for Architecture in Release 6

Pre-Release 6: Use-Data is no longer expected to be used. Action-For a writeback cache, schedule a writeback of any dirty data. At the completion of the writeback, mark the state of any cache lines written back as invalid. If the cache line is not dirty, it is implementation dependent whether the state of the cache line is marked invalid or left unchanged. If the cache line is locked, no action is taken. Release 6: This hint code is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

26-29

Reserved for Architecture

Pre-Release 6: Unassigned by the Architecture - available for implementationdependent use. Release 6: These hint codes are not implemented in the Release 6 architecture and generate a Reserved Instruction exception (RI).

30

PrepareForStore Reserved for Architecture in Release 6

Pre-Release 6: Use-Prepare the cache for writing an entire line, without the overhead involved in filling the line from memory. Action-If the reference hits in the cache, no action is taken. If the reference misses in the cache, a line is selected for replacement, any valid and dirty victim is written back to memory, the entire line is filled with zero data, and the state of the line is marked as valid and dirty. Programming Note: Because the cache line is filled with zero data on a cache miss, software must not assume that this action, in and of itself, can be used as a fast bzero-type function. Release 6: This hint code is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

31

Reserved for Architecture

Pre-Release 6: Unassigned by the Architecture - available for implementationdependent use. Release 6: This hint code is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

This instruction does not produce an exception for a misaligned memory address, since it has no memory access size.

Operation:

vAddr = GGPR[base] + sign_extend(offset)
(pAddr, CCA) = AddressTranslation(vAddr, DATA, LOAD)
Prefetch(CCA, pAddr, vAddr, DATA, hint)

Exceptions:

Bus Error, Cache Error, Address Error, Reserved Instruction, Coprocessor Usable

Prefetch does not take any TLB-related or address-related exceptions under any circumstances.

Programming Notes:

In the Release 6 architecture, hint codes 0:23 behave as a NOP and never signal a Reserved Instruction exception

(RI). Hint codes 24:31 are not implemented (treated as reserved) and always signal a Reserved Instruction exception

(RI).

Prefetch cannot move data to or from a mapped location unless the translation for that location is present in the TLB.

Locations in memory pages that have not been accessed recently may not have translations in the TLB, so prefetch may not be effective for such locations.

Prefetch does not cause addressing exceptions. A prefetch may be used using an address pointer before the validity of the pointer is determined without worrying about an addressing exception.

It is implementation dependent whether a Bus Error or Cache Error exception is reported if such an error is detected as a byproduct of the action taken by the PREFE instruction. Typically, this only occurs in systems which have highreliability requirements.

Prefetch operations have no effect on cache lines that were previously locked with the CACHE instruction.

Hint field encodings whose function is described as "streamed" or "retained" convey usage intent from software to

hardware. Software should not assume that hardware will always prefetch data in an optimal way. If data is to be truly retained, software should use the Cache instruction to lock data into the cache.

Encoding:

pre-Release 6

PREF

110011

base

hint

offset

6

5

5

16

Release 6

SPECIAL3

011111

base

hint

offset

0

PREF

110101

6

5

5

9

1

6

Format:

PREF hint,offset(base)

MIPS32

Prefetch

Purpose:

Prefetch

To move data between memory and cache.

Description:

 prefetch_memory(GPR[base] + offset)

PREF adds the signed offset to the contents of GPR base to form an effective byte address. The hint field supplies information about the way that the data is expected to be used.

PREF enables the processor to take some action, typically causing data to be moved to or from the cache, to improve program performance. The action taken for a specific PREF instruction is both system and context dependent. Any action, including doing nothing, is permitted as long as it does not change architecturally visible state or alter the meaning of a program. Implementations are expected either to do nothing, or to take an action that increases the performance of the program. The PrepareForStore function is unique in that it may modify the architecturally visible state.

PREF does not cause addressing-related exceptions, including TLB exceptions. If the address specified would cause an addressing exception, the exception condition is ignored and no data movement occurs.However even if no data is moved, some action that is not architecturally visible, such as writeback of a dirty cache line, can take place.

It is implementation dependent whether a Bus Error or Cache Error exception is reported if such an error is detected as a byproduct of the action taken by the PREF instruction.

PREF neither generates a memory operation nor modifies the state of a cache line for a location with an uncached memory access type, whether this type is specified by the address segment (e.g., kseg1), the programmed cacheability and coherency attribute of a segment (e.g., the use of the K0, KU, or K23 fields in the Config register), or the perpage cacheability and coherency attribute provided by the TLB.

If PREF results in a memory operation, the memory access type and cacheability&coherency attribute used for the operation are determined by the memory access type and cacheability&coherency attribute of the effective address, just as it would be if the memory operation had been caused by a load or store to the effective address.

For a cached location, the expected and useful action for the processor is to prefetch a block of data that includes the effective address. The size of the block and the level of the memory hierarchy it is fetched into are implementation specific.

In coherent multiprocessor implementations, if the effective address uses a coherent Cacheability and Coherency

Attribute (CCA), then the instruction causes a coherent memory transaction to occur. This means a prefetch issued on one processor can cause data to be evicted from the cache in another processor.

The PREF instruction and the memory transactions which are sourced by the PREF instruction, such as cache refill or cache writeback, obey the ordering and completion rules of the SYNC instruction.

Table 5.2 Values of hint Field for PREF Instruction

Value

Name

Data Use and Desired Prefetch Action

0

load

Use: Prefetched data is expected to be read (not modified). Action: Fetch data as if for a load.

1

store

Use: Prefetched data is expected to be stored or modified. Action: Fetch data as if for a store.

2

L1 LRU hint

Pre-Release 6: Reserved for Architecture. Release 6: Implementation dependent. This hint code marks the line as LRU in the L1 cache and thus preferred for next eviction. Implementations can choose to writeback and/or invalidate as long as no architectural state is modified.

3

Reserved for Implementation

Pre-Release 6: Reserved for Architecture. Release 6: Available for implementation-dependent use.

4

load_streamed

Use: Prefetched data is expected to be read (not modified) but not reused extensively; it "streams" through cache. Action: Fetch data as if for a load and place it in the cache so that it does not displace data prefetched as "retained."

5

store_streamed

Use: Prefetched data is expected to be stored or modified but not reused extensively; it "streams" through cache. Action: Fetch data as if for a store and place it in the cache so that it does not displace data prefetched as "retained."

6

load_retained

Use: Prefetched data is expected to be read (not modified) and reused extensively; it should be "retained" in the cache. Action: Fetch data as if for a load and place it in the cache so that it is not displaced by data prefetched as "streamed."

7

store_retained

Use: Prefetched data is expected to be stored or modified and reused extensively; it should be "retained" in the cache. Action: Fetch data as if for a store and place it in the cache so that it is not displaced by data prefetched as "streamed."

8-15

L2 operation

Pre-Release 6: Reserved for Architecture. Release 6: In the Release 6 architecture, hint codes 8 - 15 are treated the same as hint codes 0 - 7 respectively, but operate on the L2 cache.

16-23

L3 operation

Pre-Release 6: Reserved for Architecture. Release 6: In the Release 6 architecture, hint codes 16 - 23 are treated the same as hint codes 0 - 7 respectively, but operate on the L3 cache.

24

Reserved for Architecture

Pre-Release 6: Unassigned by the Architecture - available for implementationdependent use. Release 6: This hint code is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

25

writeback_invalidate (also known as "nudge") Reserved for Architecture in Release 6

Pre-Release 6: Use-Data is no longer expected to be used. Action-For a writeback cache, schedule a writeback of any dirty data. At the completion of the writeback, mark the state of any cache lines written back as invalid. If the cache line is not dirty, it is implementation dependent whether the state of the cache line is marked invalid or left unchanged. If the cache line is locked, no action is taken. Release 6: This hint code is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

26-29

Reserved for Architecture

Pre-Release 6: Unassigned by the Architecture-available for implementation-dependent use. Release 6: These hints are not implemented in the Release 6 architecture and generate a Reserved Instruction exception (RI).

30

PrepareForStore Reserved for Architecture in Release 6

Pre-Release 6: Use-Prepare the cache for writing an entire line, without the overhead involved in filling the line from memory. Action-If the reference hits in the cache, no action is taken. If the reference misses in the cache, a line is selected for replacement, any valid and dirty victim is written back to memory, the entire line is filled with zero data, and the state of the line is marked as valid and dirty. Programming Note: Because the cache line is filled with zero data on a cache miss, software must not assume that this action, in and of itself, can be used as a fast bzero-type function. Release 6: This hint is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

31

Reserved for Architecture

Pre-Release 6: Unassigned by the Architecture-available for implementation-dependent use. Release 6: This hint is not implemented in the Release 6 architecture and generates a Reserved Instruction exception (RI).

Restrictions:

None

This instruction does not produce an exception for a misaligned memory address, since it has no memory access size.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

vAddr = GPR[base] + sign_extend(offset)
(pAddr, CCA) = AddressTranslation(vAddr, DATA, LOAD)
Prefetch(CCA, pAddr, vAddr, DATA, hint)

Exceptions:

Bus Error, Cache Error

Prefetch does not take any TLB-related or address-related exceptions under any circumstances.

Programming Notes:

In the Release 6 architecture, hint codes 2:3, 10:11, 18:19 behave as a NOP if not implemented. Hint codes 24:31 are not implemented (treated as reserved) and always signal a Reserved Instruction exception (RI).

As shown in the instruction drawing above, Release 6 implements a 9-bit offset, whereas all release levels lower than

Release 6 of the MIPS architecture implement a 16-bit offset.

Prefetch cannot move data to or from a mapped location unless the translation for that location is present in the TLB.

Locations in memory pages that have not been accessed recently may not have translations in the TLB, so prefetch may not be effective for such locations.

Prefetch does not cause addressing exceptions. A prefetch may be used using an address pointer before the validity of the pointer is determined without worrying about an addressing exception.

It is implementation dependent whether a Bus Error or Cache Error exception is reported if such an error is detected as a byproduct of the action taken by the PREF instruction. Typically, this only occurs in systems which have highreliability requirements.

Prefetch operations have no effect on cache lines that were previously locked with the CACHE instruction.

Hint field encodings whose function is described as "streamed" or "retained" convey usage intent from software to

hardware. Software should not assume that hardware will always prefetch data in an optimal way. If data is to be truly retained, software should use the Cache instruction to lock data into the cache.

Encoding:

COP1X

010011

base

index

hint

0

00000

PREFX

001111

6

5

5

5

5

6

Format:

PREFX hint, index(base)

MIPS64, MIPS32 Release 2, removed in Release 6

Prefetch Indexed

Purpose:

Prefetch Indexed

To move data between memory and cache.

Description:

 prefetch_memory[GPR[base] + GPR[index]]

PREFX adds the contents of GPR index to the contents of GPR base to form an effective byte address. The hint field supplies information about the way the data is expected to be used.

The only functional difference between the PREF and PREFX instructions is the addressing mode implemented by the two. Refer to the PREF instruction for all other details, including the encoding of the hint field.

Restrictions:

Availability and Compatibility:

Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32 Release 1. Required by

MIPS32 Release 2 and subsequent versions of MIPS32. When required, required whenever FPU is present, whether a

32-bit or 64-bit FPU, whether in 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1, StatusFR=0 or 1).

This instruction has been removed in Release 6.

Operation:

vAddr = GPR[base] + GPR[index]
(pAddr, CCA) = AddressTranslation(vAddr, DATA, LOAD)
Prefetch(CCA, pAddr, vAddr, DATA, hint)

Exceptions:

Coprocessor Unusable, Reserved Instruction, Bus Error, Cache Error

Programming Notes:

The PREFX instruction is only available on processors that implement floating point and should never by generated by compilers in situations other than those in which the corresponding load and store indexed floating point instructions are generated.

Refer to the corresponding section in the PREF instruction description.

Encoding:

COP1

010001

fmt

10110

ft

fs

fd

PUL

101110

6

5

5

5

5

6

Format:

PUL.PS fd, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Pair Upper Lower

Purpose:

Pair Upper Lower

To merge a pair of paired single values with realignment.

Description:

 FPR[fd] = upper(FPR[fs]) || lower(FPR[ft])

A new paired-single value is formed by catenating the upper single of FPR fs (bits 63..32) and the lower single of

FPR ft (bits 31..0).

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is

UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

StoreFPR(fd, PS, ValueFPR(fs, PS)63..32 || ValueFPR(ft, PS)31..0)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

COP1

010001

fmt

10110

ft

fs

fd

PUU

101111

6

5

5

5

5

6

Format:

PUU.PS fd, fs, ft 

MIPS64, MIPS32 Release 2,, removed in Release 6

Pair Upper Upper

Purpose:

Pair Upper Upper

To merge a pair of paired single values with realignment.

Description:

 FPR[fd] = upper(FPR[fs]) || upper(FPR[ft])

A new paired-single value is formed by catenating the upper single of FPR fs (bits 63..32) and the upper single of

FPR ft (bits 63..32).

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is

UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

StoreFPR(fd, PS, ValueFPR(fs, PS)63..32 || ValueFPR(ft, PS)63..32)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL3

011111

0

00000

rt

rd

0

00

sel

RDHWR

111011

6

5

5

5

2

3

6

Format:

RDHWR rt,rd,sel

MIPS32 Release 2

Read Hardware Register

Purpose:

Read Hardware Register

To move the contents of a hardware register to a general purpose register (GPR) if that operation is enabled by privileged software.

The purpose of this instruction is to give user mode access to specific information that is otherwise only visible in kernel mode.

In Release 6, a sel field has been added to allow a register with multiple instances to be read selectively. Specifically it is used for PerfCtr.

Description:

 GPR[rt] = HWR[rd]; GPR[rt] = HWR[rd, sel]

If access is allowed to the specified hardware register, the contents of the register specified by rd (optionally sel in

Release 6) is sign-extended and loaded into general register rt. Access control for each register is selected by the bits in the coprocessor 0 HWREna register.

The available hardware registers, and the encoding of the rd field for each, are shown in Table 5.4.

Table 5.4 RDHWR Register Numbers

Register

Number

(rd Value)

Mnemonic

Description

0

CPUNum

Number of the CPU on which the program is currently running. This register provides read access to the coprocessor 0 EBaseCPUNum field.

1

SYNCI_Step

Address step size to be used with the SYNCI instruction, or zero if no caches need be synchronized. See that instruction's description for the use of this value.

2

CC

High-resolution cycle counter. This register provides read access to the coprocessor 0 Count Register.

3

CCRes

Resolution of the CC register. This value denotes the number of cycles between update of the register. For example:

CCRes ValueMeaning

1CC register increments every CPU cycle 2CC register increments every second CPU cycle 3CC register increments every third CPU cycle etc.

4

PerfCtr

Performance Counter Pair. Even sel selects the Control register, while odd sel selects the Counter register in the pair. The value of sel corresponds to the value of sel used by MFC0 to read the CP0 register.

5

XNP

Indicates support for the Release 6 Paired LL/SC family of instructions. If set to 1, the LL/SC family of instructions is not present, otherwise, it is present in the implementation. In absence of hardware support for double-width or extended atomics, user software may emulate the instruction's behavior through other means. See Config5XNP.

6-28

These registers numbers are reserved for future architecture use. Access results in a Reserved Instruction Exception.

29

ULR

User Local Register. This register provides read access to the coprocessor 0 UserLocal register, if it is implemented. In some operating environments, the UserLocal register is a pointer to a thread-specific storage block.

30-31

These register numbers are reserved for implementation-dependent use. If they are not implemented, access results in a Reserved Instruction Exception.

Restrictions:

In implementations of Release 1 of the Architecture, this instruction resulted in a Reserved Instruction Exception.

Access to the specified hardware register is enabled if Coprocessor 0 is enabled, or if the corresponding bit is set in the HWREna register. If access is not allowed or the register is not implemented, a Reserved Instruction Exception is signaled.

In Release 6, when the 3-bit sel is undefined for use with a specific register number, then a Reserved Instruction

Exception is signaled.

Availability and Compatibility:

This instructions has been recoded for Release 6. The instruction supports a sel field in Release 6.

Operation:

   if ((rs!=4) and (sel==0))
case rd
      0: temp = sign_extend(EBaseCPUNum)
   1: temp = sign_extend(SYNCI_StepSize())
   2: temp = sign_extend(Count)
   3: temp = sign_extend(CountResolution())
      if (>=2) // #5 - Release 6
         5: temp = sign_extend(0x00000001 && Config5XNP)//zero-extend really
      endif
   29: temp = sign_extend_if_32bit_op(UserLocal)
          endif 
   30: temp = sign_extend_if_32bit_op(Implementation-Dependent-Value)
   31: temp = sign_extend_if_32bit_op(Implementation-Dependent-Value)
   otherwise: SignalException(ReservedInstruction)
endcase
   elseif ((rs==4) and (>=2) and (sel==defined)// #4 - Release 6
      temp = sign_extend_if_32bit_op(PerfCtr[sel])
   else 
   endif
GPR[rt] = temp
function sign_extend_if_32bit_op(value)
   if (width(value) = 64) and Are64BitOperationsEnabled() then
      sign_extend_if_32bit_op = value
   else
      sign_extend_if_32bit_op = sign_extend(value)
   endif
end sign_extend_if_32bit_op

Exceptions:

Reserved Instruction

For a register that does not require sel, the compiler must support an assembly syntax without sel that is 'RDHWR rt, rd'. Another valid syntax is for sel to be 0 to map to pre-Release 6 register numbers which do not require use of sel that is, 'RDHWR rt, rd, 0'.

Encoding:

COP0

0100 00

RDPGPR

01 010

rt

rd

0

000 0000 0000

6

5

5

5

11

Format:

RDPGPR rd, rt

MIPS32 Release 2

Read GPR from Previous Shadow Set

Purpose:

Read GPR from Previous Shadow Set

To move the contents of a GPR from the previous shadow set to a current GPR.

Description:

 GPR[rd] = SGPR[SRSCtlPSS, rt]

The contents of the shadow GPR register specified by SRSCtlPSS (signifying the previous shadow set number) and rt

(specifying the register number within that set) is moved to the current GPR rd.

Restrictions:

In implementations prior to Release 2 of the Architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

GPR[rd] = SGPR[SRSCtlPSS, rt]

Exceptions:

Coprocessor Unusable

Reserved Instruction

Encoding:

COP1

010001

fmt

0

00000

fs

fd

RECIP

010101

6

5

5

5

5

6

Format:

RECIP.fmt 

Reciprocal Approximation

RECIP.S   fd, fs

MIPS64, MIPS32 Release 2

Reciprocal Approximation

RECIP.D   fd, fs

MIPS64, MIPS32 Release 2

Reciprocal Approximation

Purpose:

Reciprocal Approximation

To approximate the reciprocal of an FP value (quickly).

Description:

 FPR[fd] = 1.0 / FPR[fs]

The reciprocal of the value in FPR fs is approximated and placed into FPR fd. The operand and result are values in format fmt.

The numeric accuracy of this operation is implementation dependent. It does not meet the accuracy specified by the

IEEE 754 Floating Point standard. The computed result differs from the both the exact result and the IEEE-mandated representation of the exact result by no more than one unit in the least-significant place (ULP).

It is implementation dependent whether the result is affected by the current rounding mode in FCSR.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Availability and Compatibility:

RECIP.S and RECIP.D: Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32

Release 1. Required in MIPS32 Release 2 and all subsequent versions of MIPS32. When required, required whenever

FPU is present, whether a 32-bit or 64-bit FPU, whether in 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1,

StatusFR=0 or 1).

Operation:

StoreFPR(fd, fmt, 1.0 / valueFPR(fs, fmt))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Division-by-zero, Unimplemented Op, Invalid Op, Overflow, Underflow

Encoding:

COP1

010001

fmt

00000

fs

fd

RINT

011010

6

5

5

5

5

6

Format:

RINT.fmt 

MIPS32 Release 6

Floating-Point Round to Integral

RINT.S fd,fs

MIPS32 Release 6

Floating-Point Round to Integral

RINT.D fd,fs

MIPS32 Release 6

Floating-Point Round to Integral

Purpose:

Floating-Point Round to Integral

Scalar floating-point round to integral floating point value.

Description:

FPR[fd] =round_int(FPR[fs])

The scalar floating-point value in the register fs is rounded to an integral valued floating-point number in the same format based on the rounding mode bits RM in the FPU Control and Status Register FCSR. The result is written to fd.

The operands and results are values in floating-point data format fmt.

The RINT.fmt instruction corresponds to the roundToIntegralExact operation in the IEEE Standard for FloatingPoint Arithmetic 754TM-2008. The Inexact exception is signaled if the result does not have the same numerical value as the input operand.

The floating point scalar instruction RINT.fmt corresponds to the MSA vector instruction FRINT.df. I.e. RINT.S corresponds to FRINT.W, and RINT.D corresponds to FRINT.D.

Restrictions:

Data-dependent exceptions are possible as specified by the IEEE Standard for Floating-Point Arithmetic 754TM2008.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Operation:

RINT fmt:

if not IsCoprocessorEnabled(1) 
   then SignalException(CoprocessorUnusable, 1) endif
if not IsFloatingPointImplemented(fmt)) 
   then SignalException(ReservedInstruction) endif
fin = ValueFPR(fs,fmt)
ftmp =RoundIntFP(fin, fmt)
if( fin != ftmp ) SignalFPException(InExact)
StoreFPR (fd, fmt, ftmp )
function RoundIntFP(tt, n)
   /* Round to integer operation, using rounding mode FCSR.RM*/
endfunction RoundIntFP

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Inexact, Overflow, Underflow

Encoding:

SPECIAL

000000

0000

R

1

rt

rd

sa

SRL

000010

6

4

1

5

5

5

6

Format:

ROTR rd, rt, sa 

SmartMIPS Crypto, MIPS32 Release 2

Rotate Word Right

Purpose:

Rotate Word Right

To execute a logical right-rotate of a word by a fixed number of bits.

Description:

 GPR[rd] = GPR[rt] <=(right) sa

The contents of the low-order 32-bit word of GPR rt are rotated right; the word result is sign-extended and placed in

GPR rd. The bit-rotate amount is specified by sa.

Restrictions:

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) or
   ((ArchitectureRevision() < 2) and (Config3SM = 0)) then
   UNPREDICTABLE 
endif
s = sa
temp = GPR[rt]s-1..0 || GPR[rt]31..s
GPR[rd] = sign_extend(temp)

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

rs

rt

rd

0000

R

1

SRLV

000110

6

5

5

5

4

1

6

Format:

ROTRV rd, rt, rs

SmartMIPS Crypto, MIPS32 Release 2

Rotate Word Right Variable

Purpose:

Rotate Word Right Variable

To execute a logical right-rotate of a word by a variable number of bits.

Description:

 GPR[rd] = GPR[rt] <=(right) GPR[rs]

The contents of the low-order 32-bit word of GPR rt are rotated right; the word result is sign-extended and placed in

GPR rd. The bit-rotate amount is specified by the low-order 5 bits of GPR rs.

Restrictions:

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) or
   ((ArchitectureRevision() < 2) and (Config3SM = 0)) then
   UNPREDICTABLE 
endif
s = GPR[rs]4..0
temp = GPR[rt]s-1..0 || GPR[rt]31..s
GPR[rd] = sign_extend(temp)

Exceptions:

Reserved Instruction

Encoding:

COP1

010001

fmt

0

00000

fs

fd

ROUND.L

001000

6

5

5

5

5

6

Format:

ROUND.L.fmt 

Floating Point Round to Long Fixed Point

ROUND.L.S   fd, fs

MIPS64, MIPS32 Release 2

Floating Point Round to Long Fixed Point

ROUND.L.D   fd, fs

MIPS64, MIPS32 Release 2

Floating Point Round to Long Fixed Point

Purpose:

Floating Point Round to Long Fixed Point

To convert an FP value to 64-bit fixed point, rounding to nearest.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 64-bit long fixed point format and rounded to nearest/ even (rounding mode 0). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -263 to 263-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The Invalid Operation flag is set in the FCSR.

If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

263-1. On cores with FCSRNAN2008=1, the default result is:

263-1 when the input value is +inf or rounds to a number larger than 263-1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for long fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. It is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Operation:

StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation

Encoding:

COP1

010001

fmt

0

00000

fs

fd

ROUND.W

001100

6

5

5

5

5

6

Format:

ROUND.W.fmt 

Floating Point Round to Word Fixed Point

ROUND.W.S   fd, fs

MIPS32

Floating Point Round to Word Fixed Point

ROUND.W.D   fd, fs

MIPS32

Floating Point Round to Word Fixed Point

Purpose:

Floating Point Round to Word Fixed Point

To convert an FP value to 32-bit fixed point, rounding to nearest.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed point format rounding to nearest/even

(rounding mode 0). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -231 to 231-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. The Invalid Operation flag is set in the FCSR.

If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is

231-1. On cores with FCSRNAN2008=1, the default result is:

231-1 when the input value is +inf or rounds to a number larger than 231-1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for word fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Unimplemented Operation, Invalid Operation

Encoding:

COP1

010001

fmt

0

00000

fs

fd

RSQRT

010110

6

5

5

5

5

6

Format:

RSQRT.fmt 

Reciprocal Square Root Approximation

RSQRT.S   fd, fs

MIPS64, MIPS32 Release 2

Reciprocal Square Root Approximation

RSQRT.D   fd, fs

MIPS64, MIPS32 Release 2

Reciprocal Square Root Approximation

Purpose:

Reciprocal Square Root Approximation

To approximate the reciprocal of the square root of an FP value (quickly).

Description:

 FPR[fd] = 1.0 / sqrt(FPR[fs])

The reciprocal of the positive square root of the value in FPR fs is approximated and placed into FPR fd. The operand and result are values in format fmt.

The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the

IEEE 754 Floating Point standard. The computed result differs from both the exact result and the IEEE-mandated representation of the exact result by no more than two units in the least-significant place (ULP).

The effect of the current FCSR rounding mode on the result is implementation dependent.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Availability and Compatibility:

RSQRT.S and RSQRT.D: Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32

Release 1. Required in MIPS32 Release 2 and all subsequent versions of MIPS32. When required, required whenever

FPU is present, whether a 32-bit or 64-bit FPU, whether in 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1,

StatusFR=0 or 1).

Operation:

StoreFPR(fd, fmt, 1.0 / SquareRoot(valueFPR(fs, fmt)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Division-by-zero, Unimplemented Operation, Invalid Operation, Overflow, Underflow

Encoding:

SPECIAL3

011111

base

rt

offset

0

SBE

011100

6

5

5

9

1

6

Format:

SBE rt, offset(base)

MIPS32

Store Byte EVA

Purpose:

Store Byte EVA

To store a byte to user mode virtual address space when executing in kernel mode.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The least-significant 8-bit byte of GPR rt is stored in memory at the location specified by the effective address. The

9-bit signed offset is added to the contents of GPR base to form the effective address.

The SBE instruction functions the same as the SB instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to 1.

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
bytesel = vAddr2..0 xor BigEndianCPU3
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
StoreMemory (CCA, BYTE, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable,

Encoding:

SB

101000

base

rt

offset

6

5

5

16

Format:

SB rt, offset(base)

MIPS32

Store Byte

Purpose:

Store Byte

To store a byte to memory.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The least-significant 8-bit byte of GPR rt is stored in memory at the location specified by the effective address. The

16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

None

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
bytesel = vAddr2..0 xor BigEndianCPU3
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
StoreMemory (CCA, BYTE, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Watch

Encoding:

SPECIAL3

011111

base

rt

rd

0

0000

1

SCD

100111

6

5

5

5

4

1

6

Format:

SCDP rt, rd, (base)

MIPS32 Release 6

Store Conditional DoubleWord Paired

Purpose:

Store Conditional DoubleWord Paired

Conditionally store a paired double-word to memory to complete an atomic read-modify-write

Description:

 if atomic_update then memory[GPR[base]]= {GPR[rd],GPR[rt]}, GPR[rt] = 1 
else GPR[rt] = 0

The LLDP and SCDP instructions provide primitives to implement a paired double-word atomic read-modify-write

(RMW) operation at a synchronizable memory location.

Release 6 (with Config5ULS =1) formalizes support for uncached LLDP and SCDP sequences. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

A paired double-word is conditionally written to memory in a single atomic memory operation. GPR rd is the mostsignificant double-word and GPR rt is the least-significant double-word of the quad-word in memory. The write occurs to a quad-word aligned effective address from GPR base.

A paired double-word read or write occurs as a pair of double-word reads or writes that is quad-word atomic.

The instruction has no offset. The effective address is equal to the contents of GPR base.

The SCDP completes the RMW sequence begun by the preceding LLDP instruction executed on the processor. To complete the RMW sequence atomically, the following occur:

Otherwise, memory is not modified and a 0, indicating failure, is written into GPR rt.

Though legal programming requires LLDP to start the atomic read-modify-write sequence and SCDP to end the same sequence, whether the SCDP completes is only dependent on the state of LLbit and LLAddr, which are set by a preceding load-linked instruction of any type. Software must assume that pairing load-linked and store-conditional instructions in an inconsistent manner causes UNPREDICTABLE behavior.

The SCDP must always compare its quad-word aligned address against that of the preceding LLDP. The SCDP will fail if the address does not match that of the preceding LLDP.

Events that occur between the execution of load-linked and store-conditional instruction types that must cause the sequence to fail are given in the legacy SCD instruction definition, except the block of synchronizable memory is a

quadword, not doubleword.

Additional events that occur between the execution of load-linked and store-conditional instruction types that may cause success of the sequence to be UNPREDICTABLE are defined in the SCD instruction definition.

A load that executes on the processor executing the LLDP/SCDP sequence to the block of synchronizable physical memory containing the paired double-word, will not cause the SCDP to fail.

Effect of CACHE operations, both local and remote, on a paired double-word atomic operation are defined in the SC instruction definition.

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LL/SC semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location. Requirements for Uniprocessor, MP and I/O atomicity are given in the SC definition.

The definition for SCDP is extended for uncached memory types in a manner identical to SC. The extension is defined in the SC instruction description.

Restrictions:

Load-Linked and Store-Conditional instruction types require that the addressed location must have a memory access type of cached noncoherent or cached coherent, that is the processor must have a cache. If it does not, the result is

UNPREDICTABLE. Release 6 (with Config5ULS =1) extends support to uncached types.

The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.

Providing misaligned support is not a requirement for this instruction.

Availability and Compatibility

This instruction is introduced by Release 6. It is only present if Config5XNP=0.

Operation:

vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
dataquadword = {GPR[rd],GPR[rt]}
if (LLbit && (pAddr == LLAddr))then // quadword aligned monitor
   // PAIREDDOUBLEWORD: two double-word data-type that is quad-word atomic
   StoreMemory (CCA, PAIREDDOUBLEWORD, dataquadword, pAddr, vAddr, DATA)
   GPR[rt] = 063 || 1’b1
else
   GPR[rt] = 064
endif
LLbit = 0

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Reserved Instruction, Address Error, Watch

Programming Notes:

LLDP and SCDP are used to atomically update memory locations, as shown below.

L1:
   LLDP  T2, T3, (T0)# load T2 and T3
   BOVC  T2, 1, U32# check whether least-significant double-word may overflow
   ADDI  T2, T2, 1 # increment lower - only
   SCDP  T2, T3, (T0)  # store T2 and T3
   BEQC  T2, 0, L1 # if not atomic (0), try again
U32:
   ADDI  T2, T2, 1 # increment lower
   ADDI  T3, T3, 1 # increment upper
   SCDP  T2, T3, (T0)
   BEQC  T2, 0, L1 # if not atomic (0), try again

Exceptions between the LLDP and SCDP cause SC to fail, so persistent exceptions must be avoided. Some examples of these are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.

LLDP and SCDP function on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.

Encoding:

pre-Release 6

SCD

111100

base

rt

offset

6

5

5

16

Release 6

SPECIAL3

011111

base

rt

offset

0

SCD

100111

6

5

5

9

1

6

Format:

SCD rt, offset(base)

MIPS64

Store Conditional Doubleword

Purpose:

Store Conditional Doubleword

To store a doubleword to memory to complete an atomic read-modify-write.

Description:

if atomic_update then memory[GPR[base] + offset] = GPR[rt], GPR[rt] = 1 else 
GPR[rt] = 0

The LLD and SCD instructions provide primitives to implement atomic read-modify-write (RMW) operations for synchronizable memory locations.

Release 6 (with Config5ULS =1) formalizes support for uncached LLD and SCD sequences, whereas the preRelease 6 LLD and SCD description applies to cached (coherent/non-coherent) memory types. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

The 64-bit doubleword in GPR rt is conditionally stored in memory at the location specified by the aligned effective address. The signed offset is added to the contents of GPR base to form an effective address.

The SCD completes the RMW sequence begun by the preceding LLD instruction executed on the processor. If SCD completes the RMW sequence atomically, the following occurs:

The 64-bit doubleword of GPR rt is stored into memory at the location specified by the aligned effective address.

If either of the following events occurs between the execution of LLD and SCD, the SCD fails:

If either of the following events occurs between the execution of LLD and SCD, the SCD may succeed or it may fail; the success or failure is not predictable. Portable programs should not cause the following events:

The following two conditions must be true or the result of the SCD is UNPREDICTABLE:

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LL/SC semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location:

Uniprocessor atomicity: To provide atomic RMW on a single processor, all accesses to the location must be

MP atomicity: To provide atomic RMW among multiple processors, all accesses to the location must be made

I/O System: To provide atomic RMW with a coherent I/O system, all accesses to the location must be made with

Release 6 (with Config5ULS =1) formally defines support for uncached LLD and SCD with the following constraints.

It is implementation dependent as to what form the monitor takes. It is however differentiated from cached LLD and SCD which rely on a coherence protocol to make the determination as to whether the sequence succeeds.

As emphasized above, it is not recommended that software mix memory access types during LLD and SCD sequences. That is all memory accesses must be of the same type, otherwise this may result in UNPREDICTABLE behavior.

Conditions that cause UNPREDICTABLE behavior for legacy cached LLD and SCD sequences may also cause such behavior for uncached sequences.

A PAUSE instruction is no-op'd when it is preceded by an uncached LLD.

The semantics of an uncached LLD/SCD atomic operation applies to any uncached CCA including UCA (UnCached

Accelerated). An implementation that supports UCA must guarantee that SCD does not participate in store gathering and that it ends any gathering initiated by stores preceding the SCD in program order when the SCD address coincides with a gathering address.

Restrictions:

The addressed location must have a memory access type of cached non coherent or cached coherent; if it does not, the result is UNPREDICTABLE. Release 6 (with Config5ULS =1) extends support to uncached types.

The effective address must be naturally-aligned. If any of the 3 least-significant bits of the address is non-zero, an

Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr2..0 != 03 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
datadoubleword = GPR[rt]
if LLbit then
   StoreMemory (CCA, DOUBLEWORD, datadoubleword, pAddr, vAddr, DATA)
endif
GPR[rt] = 063 || LLbit

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Reserved Instruction, Watch

Programming Notes:

LLD and SCD are used to atomically update memory locations, as shown below.

L1:
   LLD   T1, (T0)  # load counter
   ADDI  T2, T1, 1 # increment
   SCD   T2, (T0)  # try to store,
                   #  checking for atomicity
   BEQ   T2, 0, L1 # if not atomic (0), try again
   NOP             # branch-delay slot

Exceptions between the LLD and SCD cause SCD to fail, so persistent exceptions must be avoided. Examples are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.

LLD and SCD function on a single processor for cached non coherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.

Encoding:

SPECIAL3

011111

base

rt

offset

0

SCE

011110

6

5

5

9

1

6

Format:

SCE rt, offset(base)

MIPS32

Store Conditional Word EVA

Purpose:

Store Conditional Word EVA

To store a word to user mode virtual memory while operating in kernel mode to complete an atomic read-modifywrite.

Description:

 if atomic_update then memory[GPR[base] + offset] = GPR[rt], GPR[rt] = 1 else 
GPR[rt] = 0

The LL and SC instructions provide primitives to implement atomic read-modify-write (RMW) operations for synchronizable memory locations.

Release 6 (with Config5ULS =1) formalizes support for uncached LLE and SCE sequences. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

The least-significant 32-bit word in GPR rt is conditionally stored in memory at the location specified by the aligned effective address. The 9-bit signed offset is added to the contents of GPR base to form an effective address.

The SCE completes the RMW sequence begun by the preceding LLE instruction executed on the processor. To complete the RMW sequence atomically, the following occurs:

Otherwise, memory is not modified and a 0, indicating failure, is written into GPR rt.

If either of the following events occurs between the execution of LL and SC, the SC fails:

If either of the following events occurs between the execution of LLE and SCE, the SCE may succeed or it may fail; the success or failure is not predictable. Portable programs should not cause one of these events.

The following conditions must be true or the result of the SCE is UNPREDICTABLE:

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LLE/SCE semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location:

made with memory access type of either cached non coherent or cached coherent. All accesses must be to one or the other access type, and they may not be mixed.

with a memory access type of cached coherent.

a memory access type of cached coherent. If the I/O system does not use coherent memory operations, then atomic RMW cannot be provided with respect to the I/O reads and writes.

The SCE instruction functions the same as the SC instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to 1.

The definition for SCE is extended for uncached memory types in a manner identical to SC. The extension is defined in the SC instruction description.

Restrictions:

The addressed location must have a memory access type of cached non coherent or cached coherent; if it does not, the result is UNPREDICTABLE. Release 6 (with Config5ULS =1) extends support to uncached types.

The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an

Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr1..0 != 02 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
if LLbit then
   StoreMemory (CCA, WORD, datadoubleword, pAddr, vAddr, DATA)
endif
GPR[rt] = 063 || LLbit

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Programming Notes:

LLE and SCE are used to atomically update memory locations, as shown below.

L1:
   LLE   T1, (T0)  # load counter
   ADDI  T2, T1, 1 # increment
   SCE   T2, (T0)  # try to store, checking for atomicity
   BEQ   T2, 0, L1 # if not atomic (0), try again
   NOP             # branch-delay slot

Exceptions between the LLE and SCE cause SCE to fail, so persistent exceptions must be avoided. Examples are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.

LLE and SCE function on a single processor for cached non coherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.

Encoding:

pre-Release 6

SC

111000

base

rt

offset

6

5

5

16

Release 6

SPECIAL3

011111

base

rt

offset

0

SC

100110

6

5

5

9

1

6

Format:

SC rt, offset(base)

MIPS32

Store Conditional Word

Purpose:

Store Conditional Word

To store a word to memory to complete an atomic read-modify-write

Description:

 if atomic_update then memory[GPR[base] + offset] = GPR[rt], GPR[rt] = 1 else GPR[rt] = 0

The LL and SC instructions provide primitives to implement atomic read-modify-write (RMW) operations on synchronizable memory locations. In Release 5, the behavior of SC is modified when Config5LLB=1.

Release 6 (with Config5ULS =1) formalizes support for uncached LL and SC sequences, whereas the pre-Release 6

LL and SC description applies to cached (coherent/non-coherent) memory types. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

The least-significant 32-bit word in GPR rt is conditionally stored in memory at the location specified by the aligned effective address. The signed offset is added to the contents of GPR base to form an effective address.

The SC completes the RMW sequence begun by the preceding LL instruction executed on the processor. To complete the RMW sequence atomically, the following occur:

The least-significant 32-bit word of GPR rt is stored to memory at the location specified by the aligned effective

A one, indicating success, is written into GPR rt.

If either of the following events occurs between the execution of LL and SC, the SC fails:

Furthermore, an SC must always compare its address against that of the LL. An SC will fail if the aligned address of the SC does not match that of the preceding LL.

A load that executes on the processor executing the LL/SC sequence to the block of synchronizable physical memory containing the word, will not cause the SC to fail (if Config5LLB=1; else such a load may cause the SC to fail).

If any of the events listed below occurs between the execution of LL and SC, the SC may fail where it could have succeeded, i.e., success is not predictable. Portable programs should not cause any of these events.

CACHE operations that are local to the processor executing the LL/SC sequence will result in unpredictable behaviour of the SC if executed between the LL and SC, that is, they may cause the SC to fail where it could have succeeded. Non-local CACHE operations (address-type with coherent CCA) may cause an SC to fail on either the local processor or on the remote processor in multiprocessor or multi-threaded systems. This definition of the effects of

CACHE operations is mandated if Config5LLB=1. If Config5LLB=0, then CACHE effects are implementation-dependent.

The following conditions must be true or the result of the SC is not predictable-the SC may fail or succeed (if

Config5LLB=1, then either success or failure is mandated, else the result is UNPREDICTABLE):

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LL/SC semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location:

made with memory access type of either cached noncoherent or cached coherent. All accesses must be to one or the other access type, and they may not be mixed.

with a memory access type of cached coherent.

a memory access type of cached coherent. If the I/O system does not use coherent memory operations, then atomic RMW cannot be provided with respect to the I/O reads and writes.

Release 6 (with Config5ULS =1) formally defines support for uncached LL and SC with the following constraints.

It is implementation dependent as to what form the monitor takes. It is however differentiated from cached LL and SC which rely on a coherence protocol to make the determination as to whether the sequence succeeds.

As emphasized above, it is not recommended that software mix memory access types during LL and SC sequences.

That is all memory accesses must be of the same type, otherwise this may result in UNPREDICTABLE behavior.

Conditions that cause UNPREDICTABLE behavior for legacy cached LL and SC sequences may also cause such behavior for uncached sequences.

A PAUSE instruction is no-op'd when it is preceded by an uncached LL.

The semantics of an uncached LL/SC atomic operation applies to any uncached CCA including UCA (UnCached

Accelerated). An implementation that supports UCA must guarantee that SC does not participate in store gathering and that it ends any gathering initiated by stores preceding the SC in program order when the SC address coincides with a gathering address.

Restrictions:

The addressed location must have a memory access type of cached noncoherent or cached coherent; if it does not, the result is UNPREDICTABLE. Release 6 (with Config5ULS =1) extends support to uncached types.

The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an

Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Availability and Compatibility

This instruction has been recoded for Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr1..0 != 02 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
if LLbit then
   StoreMemory (CCA, WORD, datadoubleword, pAddr, vAddr, DATA)
endif
GPR[rt] = 063 || LLbit
LLbit = 0 // if Config5LLB=1, SC always clears LLbit regardless of address match.

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Programming Notes:

LL and SC are used to atomically update memory locations, as shown below.

L1:
   LL    T1, (T0)  # load counter
   ADDI  T2, T1, 1 # increment
   SC    T2, (T0)  # try to store, checking for atomicity
   BEQ   T2, 0, L1 # if not atomic (0), try again
   NOP             # branch-delay slot

Exceptions between the LL and SC cause SC to fail, so persistent exceptions must be avoided. Some examples of these are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.

LL and SC function on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.

As shown in the instruction drawing above, Release 6 implements a 9-bit offset, whereas all release levels lower than

Release 6 of the MIPS architecture implement a 16-bit offset.

Encoding:

SPECIAL3

011111

base

rt

rd

0

0000

1

SCE

011110

6

5

5

5

4

1

6

Format:

SCWPE rt, rd, (base)

MIPS32 Release 6

Store Conditional Word Paired EVA

Purpose:

Store Conditional Word Paired EVA

Conditionally store a paired word to memory to complete an atomic read-modify-write. The store occurs in kernel mode to user virtual address space.

Description:

 if atomic_update then memory[GPR[base]]= {GPR[rd],GPR[rt]}, GPR[rt] = 1 else GPR[rt] = 0

The LLWPE and SCWPE instructions provide primitives to implement a paired word atomic read-modify-write

(RMW) operation at a synchronizable memory location.

Release 6 (with Config5ULS =1) formalizes support for uncached LLWPE and SCWPE sequences. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

A paired word is formed from the concatentation of GPR rd and GPR rt. GPR rd is the most-significant word of the double-word, and GPR rt is the least-significant word of the double-word. The paired word is conditionally stored in memory at the location specified by the double-word aligned effective address from GPR base.

A paired word read or write occurs as a pair of word reads or writes that is double-word atomic.

The instruction has no offset. The effective address is equal to the contents of GPR base.

The SCWPE completes the RMW sequence begun by the preceding LLWPE instruction executed on the processor.

To complete the RMW sequence atomically, the following occur:

Otherwise, memory is not modified and a 0, indicating failure, is written into GPR rt.

Though legal programming requires LLWPE to start the atomic read-modify-write sequence and SCWPE to end the same sequence, whether the SCWPE completes is only dependent on the state of LLbit and LLAddr, which are set by a preceding load-linked instruction of any type. Software must assume that pairing load-linked and store-conditional instructions in an inconsistent manner causes UNPREDICTABLE behavior.

The SCWPE must always compare its double-word aligned address against that of the preceding LLWPE. The

SCWPE will fail if the address does not match that of the preceding LLWPE.

The SCWPE instruction functions the same as the SCWP instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible. Refer to Volume III, Segmentation Control for additional information.

Events that occur between the execution of load-linked and store-conditional instruction types that must cause the sequence to fail are given in the legacy SC instruction definition..

Additional events that occur between the execution of load-linked and store-conditional instruction types that may cause success of the sequence to be UNPREDICTABLE are defined in the SC instruction definition.

A load that executes on the processor executing the LLWPE/SCWPE sequence to the block of synchronizable physical memory containing the paired word, will not cause the SCWPE to fail.

Effect of CACHE operations, both local and remote, on a paired word atomic operation are defined in the SC instruction definition.

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LL/SC semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location. Requirements for Uniprocessor, MP and I/O atomicity are given in the SC definition.

The definition for SCWPE is extended for uncached memory types in a manner identical to SC. The extension is defined in the SC instruction description.

Restrictions:

Load-Linked and Store-Conditional instruction types require that the addressed location must have a memory access type of cached noncoherent or cached coherent, that is the processor must have a cache. If it does not, the result is

UNPREDICTABLE. Release 6 (with Config5ULS =1) extends support to uncached types.

The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.

Providing misaligned support is not a requirement for this instruction.

Availability and Compatibility

This instruction is introduced by Release 6. It is only present if Config5XNP=0 and Config5EVA=1.

Operation:

vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
datadoubleword31..0 = GPR[rt]31..0
datadoubleword63..32 = GPR[rd]31..0
if (LLbit && (pAddr == LLAddr))then
   // PAIREDWORD: two word data-type that is double-word atomic
   StoreMemory (CCA, PAIREDWORD, datadoubleword, pAddr, vAddr, DATA)
   GPR[rt] = 063 || 1’b1
else
   GPR[rt] = 064
endif
LLbit = 0

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Reserved Instruction, Address Error, Watch, Coprocessor Unusable.

Programming Notes:

LLWPE and SCWPE are used to atomically update memory locations, as shown below.

L1:
   LLWPE T2, T3,(T0)  # load T2 and T3 
   BOVC  T2, 1, U32   # check whether least-significant word may overflow
   ADDI  T2, T2, 1    # increment lower - only
   SCWPE T2, T3, (T0)  # store T2 and T3
   BEQC  T2, 0, L1    # if not atomic (0), try again
U32:
   ADDI  T2, T2, 1    # increment lower
   ADDI  T3, T3, 1    # increment upper
   SCWPE T2, T3, (T0)
   BEQC  T2, 0, L1    # if not atomic (0), try again

Exceptions between the LLWPE and SCWPE cause SC to fail, so persistent exceptions must be avoided. Some examples of these are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.

LLWPE and SCWPE function on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.

Encoding:

SPECIAL3

011111

base

rt

rd

0

0000

1

SC

100110

6

5

5

5

4

1

6

Format:

SCWP rt, rd, (base)

MIPS32 Release 6

Store Conditional Word Paired

Purpose:

Store Conditional Word Paired

Conditionally store a paired word to memory to complete an atomic read-modify-write.

Description:

 if atomic_update then memory[GPR[base]] = {GPR[rd],GPR[rt]}, GPR[rt] = 1 else GPR[rt] = 0

The LLWP and SCWP instructions provide primitives to implement a paired word atomic read-modify-write (RMW) operation at a synchronizable memory location.

Release 6 (with Config5ULS =1) formalizes support for uncached LLWP and SCWP sequences. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

A paired word is formed from the concatenation of GPR rd and GPR rt. GPR rd is the most-significant word of the paired word, and GPR rt is the least-significant word of the paired word. The paired word is conditionally stored in memory at the location specified by the double-word aligned effective address from GPR base.

A paired word read or write occurs as a pair of word reads or writes that is double-word atomic.

The instruction has no offset. The effective address is equal to the contents of GPR base.

The SCWP completes the RMW sequence begun by the preceding LLWP instruction executed on the processor. To complete the RMW sequence atomically, the following occur:

Otherwise, memory is not modified and a 0, indicating failure, is written into GPR rt.

Though legal programming requires LLWP to start the atomic read-modify-write sequence and SCWP to end the same sequence, whether the SCWP completes is only dependent on the state of LLbit and LLAddr, which are set by a preceding load-linked instruction of any type. Software must assume that pairing load-linked and store-conditional instructions in an inconsistent manner causes UNPREDICTABLE behavior.

The SCWP must always compare its double-word aligned address against that of the preceding LLWP. The SCWP will fail if the address does not match that of the preceding LLWP.

Events that occur between the execution of load-linked and store-conditional instruction types that must cause the sequence to fail are given in the legacy SC instruction description.

Additional events that occur between the execution of load-linked and store-conditional instruction types that may cause success of the sequence to be UNPREDICTABLE are defined in the SC instruction description.

A load that executes on the processor executing the LLWP/SCWP sequence to the block of synchronizable physical memory containing the paired word, will not cause the SCWP to fail.

Effect of CACHE operations, both local and remote, on a paired word atomic operation are defined in the SC instruction description.

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LL/SC semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location. Requirements for Uniprocessor, MP and I/O atomicity are given in the SC definition.

The definition for SCWP is extended for uncached memory types in a manner identical to SC. The extension is defined in the SC instruction description.

Restrictions:

Load-Linked and Store-Conditional instruction types require that the addressed location must have a memory access type of cached noncoherent or cached coherent, that is the processor must have a cache. If it does not, the result is

UNPREDICTABLE. Release 6 (with Config5ULS =1) extends support to uncached types.

The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.

Providing misaligned support is not a requirement for this instruction.

Availability and Compatibility

This instruction is introduced by Release 6. It is only present if Config5XNP=0.

Operation:

vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
datadoubleword31..0 = GPR[rt]31..0
datadoubleword63..32 = GPR[rd]31..0
if (LLbit && (pAddr == LLAddr))then
// PAIREDWORD: two word data-type that is double-word atomic
   StoreMemory (CCA, PAIREDWORD, datadoubleword, pAddr, vAddr, DATA)
   GPR[rt] = 063 || 1’b1
else
   GPR[rt] = 064
endif
LLbit = 0

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Reserved Instruction, Address Error, Watch

Programming Notes:

LLWP and SCWP are used to atomically update memory locations, as shown below.

L1:
   LLWP  T2, T3, (T0)  # load T2 and T3
   BOVC  T2, 1, U32   # check whether least-significant word may overflow
   ADDI  T2, T2, 1    # increment lower - only
   SCWP  T2, T3, (T0) # store T2 and T3
   BEQC  T2, 0, L1    # if not atomic (0), try again
U32:
   ADDI  T2, T2, 1    # increment lower
   ADDI  T3, T3, 1    # increment upper
   SCWP  T2, T3, (T0)
   BEQC  T2, 0, L1    # if not atomic (0), try again

Exceptions between the LLWP and SCWP cause SC to fail, so persistent exceptions must be avoided. Some examples of these are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.

LLWP and SCWP function on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.

Encoding:

pre-Release 6

SPECIAL2

011100

code - use syscall

SDBBP

111111

6

20

6

Release 6

SPECIAL

000000

code - use syscall

SDBBP

001110

6

20

6

Format:

SDBBP code

EJTAG

Software Debug Breakpoint

Purpose:

Software Debug Breakpoint

To cause a debug breakpoint exception

Description:

This instruction causes a debug exception, passing control to the debug exception handler. If the processor is executing in Debug Mode when the SDBBP instruction is executed, the exception is a Debug Mode Exception, which sets the DebugDExcCode field to the value 0x9 (Bp). The code field can be used for passing information to the debug exception handler, and is retrieved by the debug exception handler only by loading the contents of the memory word containing the instruction, using the DEPC register. The CODE field is not used in any way by the hardware.

Restrictions:

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

if Config5.SBRI=1 then /* SBRI is a MIPS Release 6 feature */
   SignalException(ReservedInstruction) endif
If DebugDM = 1 then SignalDebugModeBreakpointException() endif // nested 
SignalDebugBreakpointException() // normal

Exceptions:

Debug Breakpoint Exception

Debug Mode Breakpoint Exception

Programming Notes:

Release 6 changes the instruction encoding. The primary opcode changes from SPECIAL2 to SPECIAL. Also it defines a different function field value for SDBBP.

Encoding:

SDC1

111101

base

ft

offset

6

5

5

16

Format:

SDC1 ft, offset(base)

MIPS32

Store Doubleword from Floating Point

Purpose:

Store Doubleword from Floating Point

To store a doubleword from an FPR to memory.

Description:

 memory[GPR[base] + offset] = FPR[ft]

The 64-bit doubleword in FPR ft is stored in memory at the location specified by the aligned effective address. The

16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress2..0 != 0 (not doubleword-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
datadoubleword = ValueFPR(ft, UNINTERPRETED_DOUBLEWORD)
StoreMemory(CCA, DOUBLEWORD, datadoubleword, pAddr, vAddr, DATA)
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
StoreMemory(CCA, WORD, datadoubleword31..0, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
StoreMemory(CCA, WORD, datadoubleword63..32, pAddr, vAddr+4, DATA)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Encoding:

pre-Release 6

SDC2

111110

base

rt

offset

6

5

5

16

Release 6

COP2

010010

SDC2

01111

rt

base

offset

6

5

5

5

11

Format:

SDC2 rt, offset(base)

MIPS32

Store Doubleword from Coprocessor 2

Purpose:

Store Doubleword from Coprocessor 2

To store a doubleword from a Coprocessor 2 register to memory

Description:

 memory[GPR[base] + offset] = CPR[2,rt,0]

The 64-bit doubleword in Coprocessor 2 register rt is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress2..0 != 0 (not doubleword-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Availability and Compatibility:

This instruction has been recoded for Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
datadoubleword = CPR[2,rt,0]
StoreMemory(CCA, DOUBLEWORD, datadoubleword, pAddr, vAddr, DATA)
lsw = CPR[2,rt,0]
msw = CPR[2,rt+1,0]
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
StoreMemory(CCA, WORD, lsw, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
StoreMemory(CCA, WORD, msw, pAddr, vAddr+4, DATA)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Programming Notes:

As shown in the instruction drawing above, Release 6 implements an 11-bit offset, whereas all release levels lower than Release 6 of the MIPS architecture implement a 16-bit offset.

Encoding:

SDL

101100

base

rt

offset

6

5

5

16

Format:

SDL rt, offset(base)

MIPS64, removed in Release 6

Store Doubleword Left

Purpose:

Store Doubleword Left

To store the most-significant part of a doubleword to an unaligned memory address.

Description:

 memory[GPR[base] + offset] = Some_Bytes_From GPR[rt]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 8 consecutive bytes forming a doubleword (DW) in memory, starting at an arbitrary byte boundary.

A part of DW, the most-significant 1 to 8 bytes, is in the aligned doubleword containing EffAddr. The same number of most-significant (left) bytes of GPR rt are stored into these bytes of DW.

The figure below illustrates this operation for big-endian byte ordering. The 8 consecutive bytes in 2..9 form an unaligned doubleword starting at location 2. A part of DW (6 bytes) is located in the aligned doubleword containing the most-significant byte at 2.

Restrictions:

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory access.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
If BigEndianMem = 0 then 
   pAddr = pAddrPSIZE-1..3 || 03
endif
bytesel = vAddr2..0 xor BigEndianCPU3
datadoubleword = 056-8*bytesel || GPR[rt]63..56-8*bytesel
StoreMemory (CCA, byte, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Reserved Instruction, Watch

Encoding:

SDR

101101

base

rt

offset

6

5

5

16

Format:

SDR rt, offset(base)

MIPS64, removed in Release 6

Store Doubleword Right

Purpose:

Store Doubleword Right

To store the least-significant part of a doubleword to an unaligned memory address.

Description:

 memory[GPR[base] + offset] = Some_Bytes_From GPR[rt]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the least-significant of 8 consecutive bytes forming a doubleword (DW) in memory, starting at an arbitrary byte boundary.

A part of DW, the least-significant 1 to 8 bytes, is in the aligned doubleword containing EffAddr. The same number of least-significant (right) bytes of GPR rt are stored into these bytes of DW.

Figure 3-25 illustrates this operation for big-endian byte ordering. The 8 consecutive bytes in 2..9 form an unaligned doubleword starting at location 2. A part of DW (2 bytes) is located in the aligned doubleword containing the leastsignificant byte at 9.

Restrictions:

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0  xor  ReverseEndian3)
If BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..3 || 03
endif
bytesel = vAddr1..0 xor BigEndianCPU3
datadoubleword = GPR[rt]63-8*bytesel || 08*bytesel
StoreMemory (CCA, DOUBLEWORD-byte, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Reserved Instruction, Watch

Encoding:

SD

111111

base

rt

offset

6

5

5

16

Format:

SD rt, offset(base)

MIPS64

Store Doubleword

Purpose:

Store Doubleword

To store a doubleword to memory.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The 64-bit doubleword in GPR rt is stored in memory at the location specified by the aligned effective address. The

16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-release 6: The effective address must be naturally-aligned. If any of the 3 least-significant bits of the effective address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
datadoubleword = GPR[rt]
StoreMemory (CCA, DOUBLEWORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Reserved Instruction, Watch

Encoding:

COP1X

010011

base

index

fs

0

00000

SDXC1

001001

6

5

5

5

5

6

Format:

SDXC1 fs, index(base)

MIPS64, MIPS32 Release 2, removed in Release 6

Store Doubleword Indexed from Floating Point

Purpose:

Store Doubleword Indexed from Floating Point

To store a doubleword from an FPR to memory (GPR+GPR addressing).

Description:

 memory[GPR[base] + GPR[index]] = FPR[fs]

The 64-bit doubleword in FPR fs is stored in memory at the location specified by the aligned effective address. The contents of GPR index and GPR base are added to form the effective address.

Restrictions:

An Address Error exception occurs if EffectiveAddress2..0 != 0 (not doubleword-aligned). Availability and Compatibility:

This instruction has been removed in Release 6.

Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32 Release 1. Required in

MIPS32 Release 2 and all subsequent versions of MIPS32. When required, these instructions are to be implemented if an FPU is present either in a 32-bit or 64-bit FPU or in a 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1,

StatusFR=0 or 1).

Operation:

vAddr = GPR[base] + GPR[index]
if vAddr2..0 != 03 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
datadoubleword = ValueFPR(fs, UNINTERPRETED_DOUBLEWORD)
StoreMemory(CCA, DOUBLEWORD, datadoubleword, pAddr, vAddr, DATA)
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
StoreMemory(CCA, WORD, datadoubleword31..0, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
StoreMemory(CCA, WORD, datadoubleword63..32, pAddr, vAddr+4, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Coprocessor Unusable, Address Error, Reserved Instruction, Watch.

Encoding:

SPECIAL3

011111

0

00000

rt

rd

SEB

10000

BSHFL

100000

6

5

5

5

5

6

Format:

SEB rd, rt

MIPS32 Release 2

Sign-Extend Byte

Purpose:

Sign-Extend Byte

To sign-extend the least significant byte of GPR rt and store the value into GPR rd.

Description:

 GPR[rd] = SignExtend(GPR[rt]7..0)

The least significant byte from GPR rt is sign-extended and stored in GPR rd.

Restrictions:

Prior to architecture Release 2, this instruction resulted in a Reserved Instruction exception.

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
GPR[rd] = sign_extend(GPR[rt]7..0)

Exceptions:

Reserved Instruction

Programming Notes:

For symmetry with the SEB and SEH instructions, you expect that there would be ZEB and ZEH instructions that zero-extend the source operand and expect that the SEW and ZEW instructions would exist to sign- or zero-extend a word to a doubleword. These instructions do not exist because there are functionally-equivalent instructions already in the instruction set. The following table shows the instructions providing the equivalent functions.

Expected InstructionFunctionEquivalent Instruction

Zero-Extend Byte

            ZEB rx,ry                          ANDI rx,ry,0xFF 

Zero-Extend Halfword

            ZEH rx,ry                         ANDI rx,ry,0xFFFF

Sign-Extend Word

            SEW rx,ry                            SLL rx,ry,0
           ZEW rx,rx1Zero-Extend Word

                                             DINSP32 rx,r0,32,32

1. The equivalent instruction uses rx for both source and destination, so the expected instruction is limited to one register

Encoding:

SPECIAL3

011111

0

00000

rt

rd

SEH

11000

BSHFL

100000

6

5

5

5

5

6

Format:

SEH rd, rt

MIPS32 Release 2

Sign-Extend Halfword

Purpose:

Sign-Extend Halfword

To sign-extend the least significant halfword of GPR rt and store the value into GPR rd.

Description:

 GPR[rd] = SignExtend(GPR[rt]15..0)

The least significant halfword from GPR rt is sign-extended and stored in GPR rd.

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
GPR[rd] = sign_extend(GPR[rt]15..0)

Exceptions:

Reserved Instruction

Programming Notes:

The SEH instruction can be used to convert two contiguous halfwords to sign-extended word values in three instructions. For example:

lw    t0, 0(a1)           /* Read two contiguous halfwords */
seh   t1, t0              /* t1 = lower halfword sign-extended to word */
sra   t0, t0, 16          /* t0 = upper halfword sign-extended to word */

Zero-extended halfwords can be created by changing the SEH and SRA instructions to ANDI and SRL instructions, respectively.

For symmetry with the SEB and SEH instructions, you expect that there would be ZEB and ZEH instructions that zero-extend the source operand and expect that the SEW and ZEW instructions would exist to sign- or zero-extend a word to a doubleword. These instructions do not exist because there are functionally-equivalent instructions already in the instruction set. The following table shows the instructions providing the equivalent functions.

Expected InstructionFunctionEquivalent Instruction

Zero-Extend Byte

            ZEB rx,ry                          ANDI rx,ry,0xFF 

Zero-Extend Halfword

            ZEH rx,ry                         ANDI rx,ry,0xFFFF

Sign-Extend Word

            SEW rx,ry                            SLL rx,ry,0
           ZEW rx,rx1Zero-Extend Word

                                             DINSP32 rx,r0,32,32

1. The equivalent instruction uses rx for both source and destination, so the expected instruction is limited to one register

Encoding:

COP1

010001

fmt

S, D only

ft

fs

fd

SEL

010000

6

5

5

5

5

6

Format:

SEL.fmt 

Select floating point values with FPR condition

SEL.S fd,fs,ft

MIPS32 Release 6

Select floating point values with FPR condition

SEL.D fd,fs,ft

MIPS32 Release 6

Select floating point values with FPR condition

Purpose:

Select floating point values with FPR condition

Description:

FPR[fd] =FPR[fd].bit0 ? FPR[ft] : FPR[fs]

SEL fmt is a select operation, with a condition input in FPR fd, and 2 data inputs in FPRs ft and fs.

The condition input is specified by FPR fd, and is overwritten by the result.

The condition is true only if bit 0 of the condition input FPR fd is set. Other bits are ignored.

This instruction has floating point formats S and D, but these specify only the width of the operands. SEL.S can be used for 32-bit W data, and SEL.D can be used for 64 bit L data.

This instruction does not cause data-dependent exceptions. It does not trap on NaNs, and the FCSRCause and

FCSRFlags fields are not modified.

Restrictions:

None

Availability and Compatibility:

SEL fmt is introduced by and required as of MIPS32 Release 6.

Special Considerations:

Only formats S and D are valid. Other format values may be used to encode other instructions. Unused format encodings are required to signal the Reserved Instruction exception.

Operation:

tmp = ValueFPR(fd, UNINTERPRETED_WORD)
cond = tmp.bit0
if cond then
   tmp = ValueFPR(ft, fmt) 
else
   tmp = ValueFPR(fs, fmt)
endif
StoreFPR(fd, fmt, tmp)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

None

Encoding:

COP1

010001

fmt

S, D only

ft

fs

fd

SELEQZ

010100

COP1

010001

fmt

S, D only

ft

fs

fd

SELNEZ

010111

6

5

5

5

5

6

Format:

SELEQZ.fmt SELNEQZ.fmt 

Select floating point value or zero with FPR condition.

SELEQZ.S fd,fs,ft

MIPS32 Release 6

Select floating point value or zero with FPR condition.

SELEQZ.D fd,fs,ft

MIPS32 Release 6

Select floating point value or zero with FPR condition.

SELNEZ.S fd,fs,ft

MIPS32 Release 6

Select floating point value or zero with FPR condition.

SELNEZ.D fd,fs,ft

MIPS32 Release 6

Select floating point value or zero with FPR condition.

Purpose:

Select floating point value or zero with FPR condition.

Description:

SELEQZ.fmt: FPR[fd] = FPR[ft].bit0 ? 0 : FPR[fs]
SELNEZ.fmt: FPR[fd] = FPR[ft].bit0 ? FPR[fs]: 0

If the condition is true, the value of fs is written to fd.

If the condition is false, the value that has all bits zero is written to fd.

This instruction has floating point formats S and D, but these specify only the width of the operands. Format S can be used for 32-bit W data, and format D can be used for 64 bit L data. The condition test is restricted to bit 0 of FPR ft.

Other bits are ignored.

This instruction has no execution exception behavior. It does not trap on NaNs, and the FCSRCause and FCSRFlags fields are not modified.

Restrictions:

FPR fd destination register bits beyond the format width are UNPREDICTABLE. For example, if fmt is S, then fd bits 0-31 are defined, but bits 32 and above are UNPREDICTABLE. If fmt is D, then fd bits 0-63 are defined.

Availability and Compatibility:

These instructions are introduced by and required as of MIPS32 Release 6.

Special Considerations:

Only formats S and D are valid. Other format values may be used to encode other instructions. Unused format encodings are required to signal the Reserved Instruction exception.

Operation:

tmp = ValueFPR(ft, UNINTERPRETED_WORD)
SELEQZ: cond = tmp.bit0 = 0
SELNEZ: cond = tmp.bit0 != 0
if cond then
   tmp = ValueFPR(fs, fmt)
else
   tmp = 0 /* all bits set to zero */
endif
StoreFPR(fd, fmt, tmp)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Encoding:

SPECIAL

000000

rs

rt

rd

00000

SELEQZ

110101

SPECIAL

000000

rs

rt

rd

00000

SELNEZ

110111

6

5

5

5

5

6

Format:

SELEQZ SELNEZ 

Select integer GPR value or zero

SELEQZ rd,rs,rt

MIPS32 Release 6

Select integer GPR value or zero

SELNEZ rd,rs,rt

MIPS32 Release 6

Select integer GPR value or zero

Purpose:

Select integer GPR value or zero

Description:

SELEQZ: GPR[rd] = GPR[rt] ? 0 : GPR[rs]
SELNEZ: GPR[rd] = GPR[rt] ? GPR[rs] : 0

If the condition is true, the value of rs is written to rd.

If the condition is false, the zero written to rd.

This instruction operates on all GPRLEN bits of the CPU registers, that is, all 32 bits on a 32-bit CPU, and all 64 bits on a 64-bit CPU. All GPRLEN bits of rt are tested.

Restrictions:

None

Availability and Compatibility:

These instructions are introduced by and required as of MIPS32 Release 6.

Special Considerations:

None

Operation:

SELNEZ: cond = GPR[rt] != 0
SELEQZ: cond = GPR[rt] = 0
if cond then
   tmp = GPR[rs]
else
   tmp = 0
endif
GPR[rd] = tmp

Exceptions:

None

Programming Note:

Release 6 removes the Pre-Release 6 instructions MOVZ and MOVN:

MOVZ: if GPR[rt] = 0 then GPR[rd] = GPR[rs] 
MOVN: if GPR[rt] != 0 then GPR[rd] = GPR[rs] 

MOVZ can be emulated using Release 6 instructions as follows:

SELEQZ at, rs, rt
SELNEZ rd, rd, rt
OR rd, rd, at

Similarly MOVN:

SELNEZ at, rs, rt
SELEQZ rd, rd, rt
OR rd, rd, at

The more general select operation requires 4 registers (1 output + 3 inputs (1 condition + 2 data)) and can be expressed:

rD = if rC then rA else rB

The more general select can be created using Release 6 instructions as follows:

SELNEZ at, rB, rC
SELNEZ rD, rA, rC
OR rD, rD, at

Encoding:

SPECIAL3

011111

base

rt

offset

0

SHE

011101

6

5

5

9

1

6

Format:

SHE rt, offset(base) 

MIPS32

Store Halfword EVA

Purpose:

Store Halfword EVA

To store a halfword to user mode virtual address space when executing in kernel mode.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The least-significant 16-bit halfword of register rt is stored in memory at the location specified by the aligned effective address. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The SHE instruction functions the same as the SH instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to 1.

Restrictions:

Only usable in kernel mode when accessing an address within a segment configured using UUSK, MUSK or

MUSUK access mode.

Pre-Release 6: The effective address must be naturally-aligned. If the least-significant bit of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian2 || 0))
bytesel = vAddr2..0 xor (BigEndianCPU2 || 0)
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
StoreMemory (CCA, HALFWORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

SH

101001

base

rt

offset

6

5

5

16

Format:

SH rt, offset(base) 

MIPS32

Store Halfword

Purpose:

Store Halfword

To store a halfword to memory.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The least-significant 16-bit halfword of register rt is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If the least-significant bit of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian2 || 0))
bytesel = vAddr2..0 xor (BigEndianCPU2 || 0)
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
StoreMemory (CCA, HALFWORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Encoding:

REGIMM

000001

00000

SIGRIE

10111

code

6

5

5

16

Format:

SIGRIE code

MIPS32 Release 6

Signal Reserved Instruction Exception

Purpose:

Signal Reserved Instruction Exception

The SIGRIE instruction signals a Reserved Instruction exception.

Description:

 SignalException(ReservedInstruction)

The SIGRIE instruction signals a Reserved Instruction exception. Implementations should use exactly the same mechanisms as they use for reserved instructions that are not defined by the Architecture.

The 16-bit code field is available for software use.

Restrictions:

The 16-bit code field is available for software use. The value zero is considered the default value. Software may provide extended functionality by interpreting nonzero values of the code field in a manner that is outside the scope of this architecture specification.

Availability and Compatibility:

This instruction is introduced by and required as of Release 6.

Pre-Release 6: this instruction encoding was reserved, and required to signal a Reserved Instruction exception. Therefore this instruction can be considered to be both backwards and forwards compatible.

Operation:

SignalException(ReservedInstruction)

Exceptions:

Reserved Instruction

Encoding:

SPECIAL

000000

0

00000

rt

rd

sa

SLL

000000

6

5

5

5

5

6

Format:

SLL rd, rt, sa 

MIPS32

Shift Word Left Logical

Purpose:

Shift Word Left Logical

To left-shift a word by a fixed number of bits.

Description:

 GPR[rd] = GPR[rt] << sa

The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeros into the emptied bits. The word result is sign-extended and placed in GPR rd. The bit-shift amount is specified by sa.

Restrictions:

None

Operation:

s = sa
temp = GPR[rt](31-s)..0 || 0s
GPR[rd] = sign_extend(temp)

Exceptions:

None

Programming Notes:

The SLL input operand does not have to be a properly sign-extended word value to produce a valid sign-extended

32-bit result. The result word is always sign-extended into a 64-bit destination register; this instruction with a zero shift amount truncates a 64-bit value to 32 bits and sign-extends it.

SLL r0, r0, 0, expressed as NOP, is the assembly idiom used to denote no operation.

SLL r0, r0, 1, expressed as SSNOP, is the assembly idiom used to denote no operation that causes an issue break on superscalar processors.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

SLLV

000100

6

5

5

5

5

6

Format:

SLLV rd, rt, rs 

MIPS32

Shift Word Left Logical Variable

Purpose:

Shift Word Left Logical Variable

To left-shift a word by a variable number of bits.

Description:

 GPR[rd] = GPR[rt] << GPR[rs]

The contents of the low-order 32-bit word of GPR rt are shifted left, inserting zeros into the emptied bits. The resulting word is sign-extended and placed in GPR rd. The bit-shift amount is specified by the low-order 5 bits of GPR rs.

Restrictions:

None

Operation:

s = GPR[rs]4..0
temp = GPR[rt](31-s)..0 || 0s
GPR[rd] = sign_extend(temp)

Exceptions:

None

Programming Notes:

The input operand does not have to be a properly sign-extended word value to produce a valid sign-extended 32-bit result. The result word is always sign-extended into a 64-bit destination register; this instruction with a zero shift amount truncates a 64-bit value to 32 bits and sign-extends it.

Encoding:

SLTI

001010

rs

rt

immediate

6

5

5

16

Format:

SLTI rt, rs, immediate 

MIPS32

Set on Less Than Immediate

Purpose:

Set on Less Than Immediate

To record the result of a less-than comparison with a constant.

Description:

 GPR[rt] = (GPR[rs] < sign_extend(immediate) )

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers; record the Boolean result of the comparison in GPR rt. If GPR rs is less than immediate, the result is 1 (true); otherwise, it is 0 (false).

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if GPR[rs] < sign_extend(immediate) then
   GPR[rt] = 0GPRLEN-1|| 1
else
   GPR[rt] = 0GPRLEN
endif

Exceptions:

None

Encoding:

SLTIU

001011

rs

rt

immediate

6

5

5

16

Format:

SLTIU rt, rs, immediate 

MIPS32

Set on Less Than Immediate Unsigned

Purpose:

Set on Less Than Immediate Unsigned

To record the result of an unsigned less-than comparison with a constant.

Description:

 GPR[rt] = (GPR[rs] < sign_extend(immediate))

Compare the contents of GPR rs and the sign-extended 16-bit immediate as unsigned integers; record the Boolean result of the comparison in GPR rt. If GPR rs is less than immediate, the result is 1 (true); otherwise, it is 0 (false).

Because the 16-bit immediate is sign-extended before comparison, the instruction can represent the smallest or largest unsigned numbers. The representable values are at the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the unsigned range.

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if (0 || GPR[rs]) < (0 || sign_extend(immediate)) then
   GPR[rt] = 0GPRLEN-1 || 1
else
   GPR[rt] = 0GPRLEN
endif

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

SLT

101010

6

5

5

5

5

6

Format:

SLT rd, rs, rt 

MIPS32

Set on Less Than

Purpose:

Set on Less Than

To record the result of a less-than comparison.

Description:

 GPR[rd] = (GPR[rs] < GPR[rt])

Compare the contents of GPR rs and GPR rt as signed integers; record the Boolean result of the comparison in

GPR rd. If GPR rs is less than GPR rt, the result is 1 (true); otherwise, it is 0 (false).

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if GPR[rs] < GPR[rt] then
   GPR[rd] = 0GPRLEN-1 || 1
else
   GPR[rd] = 0GPRLEN
endif

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

SLTU

101011

6

5

5

5

5

6

Format:

SLTU rd, rs, rt 

MIPS32

Set on Less Than Unsigned

Purpose:

Set on Less Than Unsigned

To record the result of an unsigned less-than comparison.

Description:

 GPR[rd] = (GPR[rs] < GPR[rt])

Compare the contents of GPR rs and GPR rt as unsigned integers; record the Boolean result of the comparison in

GPR rd. If GPR rs is less than GPR rt, the result is 1 (true); otherwise, it is 0 (false).

The arithmetic comparison does not cause an Integer Overflow exception.

Restrictions:

None

Operation:

if (0 || GPR[rs]) < (0 || GPR[rt]) then
   GPR[rd] = 0GPRLEN-1 || 1
else
   GPR[rd] = 0GPRLEN
endif

Exceptions:

None

Encoding:

COP1

010001

fmt

0

00000

fs

fd

SQRT

000100

6

5

5

5

5

6

Format:

SQRT.fmt 

Floating Point Square Root

SQRT.S fd, fs

MIPS32

Floating Point Square Root

SQRT.D fd, fs

MIPS32

Floating Point Square Root

Purpose:

Floating Point Square Root

To compute the square root of an FP value.

Description:

 FPR[fd] = SQRT(FPR[fs])

The square root of the value in FPR fs is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operand and result are values in format fmt.

If the value in FPR fs corresponds to - 0, the result is - 0.

Restrictions:

If the value in FPR fs is less than 0, an Invalid Operation condition is raised.

The fields fs and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, fmt, SquareRoot(ValueFPR(fs, fmt)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Invalid Operation, Inexact, Unimplemented Operation

Encoding:

SPECIAL

000000

0

00000

rt

rd

sa

SRA

000011

6

5

5

5

5

6

Format:

SRA rd, rt, sa

MIPS32

Shift Word Right Arithmetic

Purpose:

Shift Word Right Arithmetic

To execute an arithmetic right-shift of a word by a fixed number of bits.

Description:

 GPR[rd] = GPR[rt] >> sa   (arithmetic)

The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the sign-bit (bit 31) in the emptied bits; the word result is sign-extended and placed in GPR rd. The bit-shift amount is specified by sa.

Restrictions:

On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then 
   UNPREDICTABLE 
endif
s = sa
temp = GPR[rt]31)s || GPR[rt]31..s
GPR[rd] = sign_extend(temp)

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

SRAV

000111

6

5

5

5

5

6

Format:

SRAV rd, rt, rs

MIPS32

Shift Word Right Arithmetic Variable

Purpose:

Shift Word Right Arithmetic Variable

To execute an arithmetic right-shift of a word by a variable number of bits.

Description:

 GPR[rd] = GPR[rt] >> GPR[rs]  (arithmetic)

The contents of the low-order 32-bit word of GPR rt are shifted right, duplicating the sign-bit (bit 31) in the emptied bits; the word result is sign-extended and placed in GPR rd. The bit-shift amount is specified by the low-order 5 bits of GPR rs.

Restrictions:

On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then 
   UNPREDICTABLE 
endif
s = GPR[rs]4..0
temp = (GPR[rt]31)s || GPR[rt]31..s
GPR[rd] = sign_extend(temp)

Exceptions:

None

Encoding:

SPECIAL

000000

0000

R

0

rt

rd

sa

SRL

000010

6

4

1

5

5

5

6

Format:

SRL rd, rt, sa

MIPS32

Shift Word Right Logical

Purpose:

Shift Word Right Logical

To execute a logical right-shift of a word by a fixed number of bits.

Description:

 GPR[rd] = GPR[rt] >> sa   (logical)

The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros into the emptied bits. The word result is sign-extended and placed in GPR rd. The bit-shift amount is specified by sa.

Restrictions:

On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then 
   UNPREDICTABLE 
endif
s = sa
temp = 0s || GPR[rt]31..s
GPR[rd] = sign_extend(temp)

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0000

R

0

SRLV

000110

6

5

5

5

4

1

6

Format:

SRLV rd, rt, rs

MIPS32

Shift Word Right Logical Variable

Purpose:

Shift Word Right Logical Variable

To execute a logical right-shift of a word by a variable number of bits.

Description:

 GPR[rd] = GPR[rt] >> GPR[rs]   (logical)

The contents of the low-order 32-bit word of GPR rt are shifted right, inserting zeros into the emptied bits; the word result is sign-extended and placed in GPR rd. The bit-shift amount is specified by the low-order 5 bits of GPR rs.

Restrictions:

On 64-bit processors, if GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then 
   UNPREDICTABLE 
endif
s = GPR[rs]4..0
temp = 0s || GPR[rt]31..s
GPR[rd] = sign_extend(temp)

Exceptions:

None

Encoding:

SPECIAL

000000

0

00000

0

00000

0

00000

1

00001

SLL

000000

6

5

5

5

5

6

Format:

SSNOP 

Assembly Idiom MIPS32

Superscalar No Operation

Purpose:

Superscalar No Operation

Break superscalar issue on a superscalar processor.

Description:

SSNOP is the assembly idiom used to denote superscalar no operation. The actual instruction is interpreted by the hardware as SLL r0, r0, 1.

This instruction alters the instruction issue behavior on a superscalar processor by forcing the SSNOP instruction to single-issue. The processor must then end the current instruction issue between the instruction previous to the SSNOP and the SSNOP. The SSNOP then issues alone in the next issue slot.

On a single-issue processor, this instruction is a NOP that takes an issue slot.

Restrictions:

None

Availability and Compatibility

Release 6: the special no-operation instruction SSNOP is deprecated: it behaves the same as a conventional NOP. Its special behavior with respect to instruction issue is no longer guaranteed. The EHB and JR.HB instructions are provided to clear execution and instruction hazards.

Assemblers targeting specifically Release 6 should reject the SSNOP instruction with an error.

Operation:

None

Exceptions:

None

Programming Notes:

SSNOP is intended for use primarily to allow the programmer control over CP0 hazards by converting instructions into cycles in a superscalar processor. For example, to insert at least two cycles between an MTC0 and an ERET, one would use the following sequence:

mtc0  x,y
ssnop
ssnop
eret

The MTC0 issues in cycle T. Because the SSNOP instructions must issue alone, they may issue no earlier than cycle

T+1 and cycle T+2, respectively. Finally, the ERET issues no earlier than cycle T+3. Although the instruction after an

SSNOP may issue no earlier than the cycle after the SSNOP is issued, that instruction may issue later. This is because other implementation-dependent issue rules may apply that prevent an issue in the next cycle. Processors should not introduce any unnecessary delay in issuing SSNOP instructions.

Encoding:

COP1

010001

fmt

ft

fs

fd

SUB

000001

6

5

5

5

5

6

Format:

SUB.fmt 

Floating Point Subtract

SUB.S fd, fs, ft

MIPS32

Floating Point Subtract

SUB.D fd, fs, ft

MIPS32

Floating Point Subtract

SUB.PS fd, fs, ft

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Subtract

Purpose:

Floating Point Subtract

To subtract FP values.

Description:

 FPR[fd] = FPR[fs] - FPR[ft]

The value in FPR ft is subtracted from the value in FPR fs. The result is calculated to infinite precision, rounded according to the current rounding mode in FCSR, and placed into FPR fd. The operands and result are values in format fmt. SUB.PS subtracts the upper and lower halves of FPR fs and FPR ft independently, and ORs together any generated exceptional conditions.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type fmt. If the fields are not valid, the result is

UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of SUB.PS is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility:

SUB.PS has been removed in Release 6.

Operation:

StoreFPR (fd, fmt, ValueFPR(fs, fmt) -fmt ValueFPR(ft, fmt))

CPU Exceptions:

Coprocessor Unusable, Reserved Instruction

FPU Exceptions:

Inexact, Overflow, Underflow, Invalid Op, Unimplemented Op

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

SUB

100010

6

5

5

5

5

6

Format:

SUB rd, rs, rt

MIPS32

Subtract Word

Purpose:

Subtract Word

To subtract 32-bit integers. If overflow occurs, then trap.

Description:

 GPR[rd] = GPR[rs] - GPR[rt]

The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs to produce a 32-bit result. If the subtraction results in 32-bit 2's complement arithmetic overflow, then the destination register is not modified and an Integer Overflow exception occurs. If it does not overflow, the 32-bit result is sign-extended and placed into GPR rd.

Restrictions:

On 64-bit processors, if either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = (GPR[rs]31||GPR[rs]31..0) - (GPR[rt]31||GPR[rt]31..0)
if temp32 != temp31 then
   SignalException(IntegerOverflow)
else
   GPR[rd] = sign_extend(temp31..0)
endif

Exceptions:

Integer Overflow

Programming Notes:

SUBU performs the same arithmetic operation but does not trap on overflow.

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

SUBU

100011

6

5

5

5

5

6

Format:

SUBU rd, rs, rt

MIPS32

Subtract Unsigned Word

Purpose:

Subtract Unsigned Word

To subtract 32-bit integers.

Description:

 GPR[rd] = GPR[rs] - GPR[rt]

The 32-bit word value in GPR rt is subtracted from the 32-bit value in GPR rs and the 32-bit arithmetic result is signextended and placed into GPR rd.

No integer overflow exception occurs under any circumstances.

Restrictions:

On 64-bit processors, if either GPR rt or GPR rs does not contain sign-extended 32-bit values (bits 63..31 equal), then the result of the operation is UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rs]) or NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
temp = GPR[rs] - GPR[rt]
GPR[rd] = sign_extend(temp)

Exceptions:

None

Programming Notes:

The term "unsigned" in the instruction name is a misnomer; this operation is 32-bit modulo arithmetic that does not trap on overflow. It is appropriate for unsigned arithmetic, such as address arithmetic, or integer arithmetic environments that ignore overflow, such as C language arithmetic.

Encoding:

COP1X

010011

base

index

fs

0

00000

SUXC1

001101

6

5

5

5

5

6

Format:

SUXC1 fs, index(base)

MIPS64, MIPS32 Release 2, removed in Release 6

Store Doubleword Indexed Unaligned from Floating Point

Purpose:

Store Doubleword Indexed Unaligned from Floating Point

To store a doubleword from an FPR to memory (GPR+GPR addressing) ignoring alignment.

Description:

 memory[(GPR[base] + GPR[index])PSIZE-1..3] = FPR[fs]

The contents of the 64-bit doubleword in FPR fs is stored at the memory location specified by the effective address.

The contents of GPR index and GPR base are added to form the effective address. The effective address is doubleword-aligned; EffectiveAddress2..0 are ignored.

Restrictions:

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. The instruction is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Availability and Compatibility

This instruction has been removed in Release 6.

Operation:

vAddr = (GPR[base]+GPR[index])63..3  || 03
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
datadoubleword = ValueFPR(fs, UNINTERPRETED_DOUBLEWORD)
StoreMemory(CCA, DOUBLEWORD, datadoubleword, pAddr, vAddr, DATA)
paddr = paddr xor ((BigEndianCPU xor ReverseEndian) || 02)
StoreMemory(CCA, WORD, datadoubleword31..0, pAddr, vAddr, DATA)
paddr = paddr xor 0b100
StoreMemory(CCA, WORD, datadoubleword63..32, pAddr, vAddr+4, DATA)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, TLB Modified, Watch

Encoding:

SWC1

111001

base

ft

offset

6

5

5

16

Format:

SWC1 ft, offset(base)

MIPS32

Store Word from Floating Point

Purpose:

Store Word from Floating Point

To store a word from an FPR to memory.

Description:

 memory[GPR[base] + offset] = FPR[ft]

The low 32-bit word from FPR ft is stored in memory at the location specified by the aligned effective address. The

16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress1..0 != 0 (not word-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = ValueFPR(ft, UNINTERPRETED_WORD) || 08*bytesel
StoreMemory(CCA, WORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Encoding:

pre-Release 6

SWC2

111010

base

rt

offset

6

5

5

16

Release 6

COP2

010010

SWC2

01011

rt

base

offset

6

5

5

5

11

Format:

SWC2 rt, offset(base)

MIPS32

Store Word from Coprocessor 2

Purpose:

Store Word from Coprocessor 2

To store a word from a COP2 register to memory

Description:

 memory[GPR[base] + offset] = CPR[2,rt,0]

The low 32-bit word from COP2 (Coprocessor 2) register rt is stored in memory at the location specified by the aligned effective address. The signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: An Address Error exception occurs if EffectiveAddress1..0 != 0 (not word-aligned).

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Availability and Compatibility

This instruction has been recoded for Release 6.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = CPR[2,rt,0]63-8*bytesel..0 || 08*bytesel
StoreMemory(CCA, WORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

Coprocessor Unusable, Reserved Instruction, TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Programming Notes:

As shown in the instruction drawing above, Release 6 implements an 11-bit offset, whereas all release levels lower than Release 6 of the MIPS architecture implement a 16-bit offset.

Encoding:

SPECIAL3

011111

base

rt

offset

0

SWE

011111

6

5

5

9

1

6

Format:

SWE rt, offset(base)

MIPS32

Store Word EVA

Purpose:

Store Word EVA

To store a word to user mode virtual address space when executing in kernel mode.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The least-significant 32-bit word of GPR rt is stored in memory at the location specified by the aligned effective address. The 9-bit signed offset is added to the contents of GPR base to form the effective address.

The SWE instruction functions the same as the SW instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5EVA field being set to 1.

Restrictions:

Only usable in kernel mode when accessing an address within a segment configured using UUSK, MUSK or

MUSUK access mode.

Pre-Release 6: The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
StoreMemory (CCA, WORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

SPECIAL3

011111

base

rt

offset

0

SWLE

100001

6

5

5

9

1

6

Format:

SWLE rt, offset(base)

MIPS32, removed in Release 6

Store Word Left EVA

Purpose:

Store Word Left EVA

To store the most-significant part of a word to an unaligned user mode virtual address while operating in kernel mode.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The 9-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

A part of W (the most-significant 1 to 4 bytes) is in the aligned word containing EffAddr. The same number of the most-significant (left) bytes from the word in GPR rt are stored into these bytes of W.

If GPR rt is a 64-bit register, the source word is the low word of the register.

The following figure shows this operation using big-endian byte ordering for 32-bit and 64-bit registers. The 4 consecutive bytes in 2..5 form an unaligned word starting at location 2. A part of W (2 bytes) is located in the aligned word containing the most-significant byte at 2.

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0  xor  ReverseEndian3)
If BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..2 || 02
endif
byte = vAddr1..0 xor BigEndianCPU2
if (vAddr2 xor BigEndianCPU) = 0 then
   datadoubleword = 032 || 024-8*byte || GPR[rt]31..24-8*byte
else
   datadoubleword = 024-8*byte || GPR[rt]31..24-8*byte || 032
endif
dataword = 024-8*byte || GPR[rt]31..24-8*byte
StoreMemory(CCA, byte, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Encoding:

SWL

101010

base

rt

offset

6

5

5

16

Format:

SWL rt, offset(base)

MIPS32, removed in Release 6

Store Word Left

Purpose:

Store Word Left

To store the most-significant part of a word to an unaligned memory address.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the most-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

A part of W (the most-significant 1 to 4 bytes) is in the aligned word containing EffAddr. The same number of the most-significant (left) bytes from the word in GPR rt are stored into these bytes of W.

If GPR rt is a 64-bit register, the source word is the low word of the register.

The following figure illustrates this operation using big-endian byte ordering for 32-bit and 64-bit registers. The four consecutive bytes in 2..5 form an unaligned word starting at location 2. A part of W (2 bytes) is located in the aligned word containing the most-significant byte at 2.

Restrictions:

None

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0  xor  ReverseEndian3)
If BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..2 || 02
endif
byte = vAddr1..0 xor BigEndianCPU2
if (vAddr2 xor BigEndianCPU) = 0 then
   datadoubleword = 032 || 024-8*byte || GPR[rt]31..24-8*byte
else
   datadoubleword = 024-8*byte || GPR[rt]31..24-8*byte || 032
endif
dataword = 024-8*byte || GPR[rt]31..24-8*byte
StoreMemory(CCA, byte, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Watch

Encoding:

SPECIAL3

011111

base

rt

offset

0

SWRE

100010

6

5

5

9

1

6

Format:

SWRE rt, offset(base)

MIPS32, removed in Release 6

Store Word Right EVA

Purpose:

Store Word Right EVA

To store the least-significant part of a word to an unaligned user mode virtual address while operating in kernel mode.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The 9-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the least-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

A part of W (the least-significant 1 to 4 bytes) is in the aligned word containing EffAddr. The same number of the least-significant (right) bytes from the word in GPR rt are stored into these bytes of W.

If GPR rt is a 64-bit register, the source word is the low word of the register.

The following figure illustrates this operation using big-endian byte ordering for 32-bit and 64-bit registers. The 4 consecutive bytes in 2..5 form an unaligned word starting at location 2. A part of W (2 bytes) is contained in the aligned word containing the least-significant byte at 5.

Restrictions:

Only usable when access to Coprocessor0 is enabled and when accessing an address within a segment configured using UUSK, MUSK or MUSUK access mode.

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
If BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..2 || 02
endif
byte = vAddr1..0 xor BigEndianCPU2
if (vAddr2 xor BigEndianCPU) = 0 then
   datadoubleword = 032 || GPR[rt]31-8*byte..0 || 08*byte
else
   datadoubleword = GPR[rt]31-8*byte..0 || 08*byte || 032
endif
dataword = GPR[rt]31-8*byte || 08*byte
StoreMemory(CCA, WORD-byte, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Watch, Coprocessor Unusable

Encoding:

SWR

101110

base

rt

offset

6

5

5

16

Format:

SWR rt, offset(base)

MIPS32, removed in Release 6

Store Word Right

Purpose:

Store Word Right

To store the least-significant part of a word to an unaligned memory address.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The 16-bit signed offset is added to the contents of GPR base to form an effective address (EffAddr). EffAddr is the address of the least-significant of 4 consecutive bytes forming a word (W) in memory starting at an arbitrary byte boundary.

A part of W (the least-significant 1 to 4 bytes) is in the aligned word containing EffAddr. The same number of the least-significant (right) bytes from the word in GPR rt are stored into these bytes of W.

If GPR rt is a 64-bit register, the source word is the low word of the register.

The following figure illustrates this operation using big-endian byte ordering for 32-bit and 64-bit registers. The 4 consecutive bytes in 2..5 form an unaligned word starting at location 2. A part of W (2 bytes) is contained in the aligned word containing the least-significant byte at 5.

Restrictions:

None

Availability and Compatibility:

Release 6 removes the load/store-left/right family of instructions, and requires the system to support misaligned memory accesses.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor ReverseEndian3)
If BigEndianMem = 0 then
   pAddr = pAddrPSIZE-1..2 || 02
endif
byte = vAddr1..0 xor BigEndianCPU2
if (vAddr2 xor BigEndianCPU) = 0 then
   datadoubleword = 032 || GPR[rt]31-8*byte..0 || 08*byte
else
   datadoubleword = GPR[rt]31-8*byte..0 || 08*byte || 032
endif
dataword = GPR[rt]31-8*byte || 08*byte
StoreMemory(CCA, WORD-byte, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Bus Error, Address Error, Watch

Encoding:

SW

101011

base

rt

offset

6

5

5

16

Format:

SW rt, offset(base)

MIPS32

Store Word

Purpose:

Store Word

To store a word to memory.

Description:

 memory[GPR[base] + offset] = GPR[rt]

The least-significant 32-bit word of GPR rt is stored in memory at the location specified by the aligned effective address. The 16-bit signed offset is added to the contents of GPR base to form the effective address.

Restrictions:

Pre-Release 6: The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an Address Error exception occurs.

Release 6 allows hardware to provide address misalignment support in lieu of requiring natural alignment.

Note: The pseudocode is not completely adapted for Release 6 misalignment support as the handling is implementation dependent.

Operation:

vAddr = sign_extend(offset) + GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = GPR[rt]63-8*bytesel..0 || 08*bytesel
StoreMemory (CCA, WORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch

Encoding:

COP1X

010011

base

index

fs

0

00000

SWXC1

001000

6

5

5

5

5

6

Format:

SWXC1 fs, index(base)

MIPS64, MIPS32 Release 2, removed in Release 6

Store Word Indexed from Floating Point

Purpose:

Store Word Indexed from Floating Point

To store a word from an FPR to memory (GPR+GPR addressing)

Description:

 memory[GPR[base] + GPR[index]] = FPR[fs]

The low 32-bit word from FPR fs is stored in memory at the location specified by the aligned effective address. The contents of GPR index and GPR base are added to form the effective address.

Restrictions:

An Address Error exception occurs if EffectiveAddress1..0 != 0 (not word-aligned).

Availability and Compatibility:

This instruction has been removed in Release 6.

Required in all versions of MIPS64 since MIPS64 Release 1. Not available in MIPS32 Release 1. Required in

MIPS32 Release 2 and all subsequent versions of MIPS32. When required, required whenever FPU is present, whether a 32-bit or 64-bit FPU, whether in 32-bit or 64-bit FP Register Mode (FIRF64=0 or 1, StatusFR=0 or 1).

Operation:

vAddr = GPR[base] + GPR[index]
if vAddr1..0 != 03 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation(vAddr, DATA, STORE)
pAddr = pAddrPSIZE-1..3 || (pAddr2..0 xor (ReverseEndian || 02))
bytesel = vAddr2..0 xor (BigEndianCPU || 02)
datadoubleword = ValueFPR(fs, UNINTERPRETED_WORD) || 08*bytesel
StoreMemory(CCA, WORD, datadoubleword, pAddr, vAddr, DATA)

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Reserved Instruction, Coprocessor Unusable, Watch

Encoding:

Release 6

REGIMM

000001

base

SYNCI

11111

offset

6

5

5

16

Format:

SYNCI offset(base)

MIPS32 Release 2

Synchronize Caches to Make Instruction Writes Effective

Purpose:

Synchronize Caches to Make Instruction Writes Effective

To synchronize all caches to make instruction writes effective.

Description:

This instruction is used after a new instruction stream is written to make the new instructions effective relative to an instruction fetch, when used in conjunction with the SYNC and JALR.HB, JR.HB, or ERET instructions, as described below. Unlike the CACHE instruction, the SYNCI instruction is available in all operating modes in an implementation of Release 2 of the architecture.

The 16-bit offset is sign-extended and added to the contents of the base register to form an effective address. The effective address is used to address the cache line in all caches which may need to be synchronized with the write of the new instructions. The operation occurs only on the cache line which may contain the effective address. One

SYNCI instruction is required for every cache line that was written. See the Programming Notes below.

A TLB Refill and TLB Invalid (both with cause code equal TLBL) exception can occur as a by product of this instruction. This instruction never causes TLB Modified exceptions nor TLB Refill exceptions with a cause code of

TLBS. This instruction never causes Execute-Inhibit nor Read-Inhibit exceptions.

A Cache Error exception may occur as a by product of this instruction. For example, if a writeback operation detects a cache or bus error during the processing of the operation, that error is reported via a Cache Error exception. Similarly, a Bus Error Exception may occur if a bus operation invoked by this instruction is terminated in an error.

An Address Error Exception (with cause code equal AdEL) may occur if the effective address references a portion of the kernel address space which would normally result in such an exception. It is implementation dependent whether such an exception does occur.

It is implementation dependent whether a data watch is triggered by a SYNCI instruction whose address matches the

Watch register address match conditions.

Restrictions:

The operation of the processor is UNPREDICTABLE if the effective address references any instruction cache line that contains instructions to be executed between the SYNCI and the subsequent JALR.HB, JR.HB, or ERET instruction required to clear the instruction hazard.

The SYNCI instruction has no effect on cache lines that were previously locked with the CACHE instruction. If correct software operation depends on the state of a locked line, the CACHE instruction must be used to synchronize the caches.

Full visibility of the new instruction stream requires execution of a subsequent SYNC instruction, followed by a

JALR.HB, JR.HB, DERET, or ERET instruction. The operation of the processor is UNPREDICTABLE if this sequence is not followed.

SYNCI globalization:

The SYNCI instruction acts on the current processor at a minimum. Implementations are required to affect caches outside the current processor to perform the operation on the current processor (as might be the case if multiple processors share an L2 or L3 cache).

In multiprocessor implementations where instruction caches are coherently maintained by hardware, the SYNCI instruction should behave as a NOP instruction.

In multiprocessor implementations where instruction caches are not coherently maintained by hardware, the SYNCI instruction may optionally affect all coherent icaches within the system. If the effective address uses a coherent

Cacheability and Coherency Attribute (CCA), then the operation may be globalized, meaning it is broadcast to all of the coherent instruction caches within the system. If the effective address does not use one of the coherent CCAs, there is no broadcast of the SYNCI operation. If multiple levels of caches are to be affected by one SYNCI instruction, all of the affected cache levels must be processed in the same manner - either all affected cache levels use the globalized behavior or all affected cache levels use the non-globalized behavior.

Pre-Release 6: Portable software could not rely on the optional globalization of SYNCI. Strictly portable software without implementation specific awareness could only rely on expensive "instruction cache shootdown" using interprocessor interrupts.

Release 6: SYNCI globalization is required. Compliant implementations must globalize SYNCI, and portable software can rely on this behavior.

Operation:

vaddr = GPR[base] + sign_extend(offset)
SynchronizeCacheLines(vaddr)     /* Operate on all caches */

Exceptions:

Reserved Instruction exception (Release 1 implementations only)

TLB Refill Exception

TLB Invalid Exception

Address Error Exception

Cache Error Exception

Bus Error Exception

Programming Notes:

When the instruction stream is written, the SYNCI instruction should be used in conjunction with other instructions to make the newly-written instructions effective. The following example shows a routine which can be called after the new instruction stream is written to make those changes effective. The SYNCI instruction could be replaced with the corresponding sequence of CACHE instructions (when access to Coprocessor 0 is available), and that the JR.HB instruction could be replaced with JALR.HB, ERET, or DERET instructions, as appropriate. A SYNC instruction is required between the final SYNCI instruction in the loop and the instruction that clears instruction hazards.

/*
 * This routine makes changes to the instruction stream effective to the
 * hardware.  It should be called after the instruction stream is written.
 * On return, the new instructions are effective.
 *
 * Inputs:
 *    a0 = Start address of new instruction stream
 *    a1 = Size, in bytes, of new instruction stream
 */
   beq   a1, zero, 20f       /* If size==0, */
   nop                       /*   branch around */
   addu  a1, a0, a1          /* Calculate end address + 1 */
                              /* (daddu for 64-bit addressing) */
   rdhwr v0, HW_SYNCI_Step   /* Get step size for SYNCI from new */
                              /*   Release 2 instruction */
   beq   v0, zero, 20f       /* If no caches require synchronization, */
   nop                       /*   branch around */
10: synci 0(a0)               /* Synchronize all caches around address */
   addu  a0, a0, v0          /* Add step size in delay slot */
                              /* (daddu for 64-bit addressing) */
   sltu  v1, a0, a1          /* Compare current with end address */
   bne   v1, zero, 10b       /* Branch if more to do */
   nop                       /*   branch around */
   sync                      /* Clear memory hazards */
20: jr.hb ra                 /* Return, clearing instruction hazards */
   nop

Encoding:

SPECIAL

000000

0

00 0000 0000 0000 0

stype

SYNC

001111

6

15

5

6

Format:

SYNC stype

MIPS32

Synchronize Shared Memory

Purpose:

Synchronize Shared Memory

To order loads and stores for shared memory.

Release 6 (with Config5GI =10/11) extends SYNC for Global Invalidate instructions (GINVI/GINVT).

Description:

These types of ordering guarantees are available through the SYNC instruction:

Completion Barrier - Simple Description:

Completion Barrier - Detailed Description:

SYNC behavior when the stype field is zero:

Ordering Barrier - Simple Description:

Ordering Barrier - Detailed Description:

SYNC instruction in the instruction stream reaches the same stage in the load/store datapath.

As compared to the completion barrier, the ordering barrier is a lighter-weight operation as it does not require the specified instructions before the SYNC to be already completed. Instead it only requires that those specified instructions which are subsequent to the SYNC in the instruction stream are never re-ordered for processing ahead of the specified instructions which are before the SYNC in the instruction stream. This potentially reduces how many cycles the barrier instruction must stall before it completes.

The Acquire and Release barrier types are used to minimize the memory orderings that must be maintained and still have software synchronization work.

Implementations that do not use any of the non-zero values of stype to define different barriers, such as ordering barriers, must make those stype values act the same as stype zero.

For the purposes of this description, the CACHE, PREF and PREFX instructions are treated as loads and stores. That is, these instructions and the memory transactions sourced by these instructions obey the ordering and completion rules of the SYNC instruction.

If Global Invalidate instructions are supported in Release 6, then SYNC (stype=0x14) acts as a completion barrier to ensure completion of any preceding GINVI or GINVT operation. This SYNC operation is globalized and only completes if all preceding GINVI or GINVT operations related to the same program have completed in the system. (Any references to GINVT also imply GINVGT, available in a virtualized MIPS system. GINVT however will be used exclusively.)

A system that implements the Global Invalidates also requires that the completion of this SYNC be constrained by legacy SYNCI operations. Thus SYNC (stype=0x14) can also be used to determine whether preceding (in program order) SYNCI operations have completed.

The SYNC (stype=0x14) also act as an ordering barrier as described in Table 5.5.

In the typical use cases, a single GINVI is used by itself to invalidate caches and would be followed by a SYNC

(stype=0x14).

In the case of GINVT, multiple GINVT could be used to invalidate multiple TLB mappings, and the SYNC

(stype=0x14) would be used to guaranteed completion of any number of GINVTs preceding it.

Table 5.5 lists the available completion barrier and ordering barriers behaviors that can be specified using the stype field.

Table 5.5 Encodings of the Bits[10:6] of the SYNC instruction; the SType Field

Code

Name

Older instructions

which must reach

the load/store

ordering point

before the SYNC

instruction

completes.

Younger

instructions

which must reach

the load/store

ordering point

only after the

SYNC instruction

completes.

Older instructions

which must be

globally

performed when

the SYNC

instruction

completes

Compliance

0x0

SYNC or

SYNC 0

Loads, Stores

Loads, Stores

Loads, Stores

Required

0x4

SYNC_WMB or

SYNC 4

Stores

Stores

Optional

0x10

SYNC_MB or

SYNC 16

Loads, Stores

Loads, Stores

Optional

0x11

SYNC_ACQUIRE or

SYNC 17

Loads

Loads, Stores

Optional

0x12

SYNC_RELEASE or

SYNC 18

Loads, Stores

Stores

Optional

0x13

SYNC_RMB or

SYNC 19

Loads

Loads

Optional

0x1-0x3, 0x5-0xF

Implementation-Specific and Vendor Specific Sync Types

0x14

SYNC_GINV

Loads, Stores

Loads, Stores

GINVI, GINVT,

SYNCI

Release 6 w/

Config5GI =10/11 otherwise Reserved

0x15 - 0x1F

RESERVED

Reserved for MIPS Technologies for future extension of the architecture.

Terms:

Synchronizable: A load or store instruction is synchronizable if the load or store occurs to a physical location in

shared memory using a virtual location with a memory access type of either uncached or cached coherent. Shared

memory is memory that can be accessed by more than one processor or by a coherent I/O system module.

Performed load: A load instruction is performed when the value returned by the load has been determined. The result

of a load on processor A has been determined with respect to processor or coherent I/O module B when a subsequent store to the location by B cannot affect the value returned by the load. The store by B must use the same memory access type as the load.

Performed store: A store instruction is performed when the store is observable. A store on processor A is observable

with respect to processor or coherent I/O module B when a subsequent load of the location by B returns the value written by the store. The load by B must use the same memory access type as the store.

Globally performed load: A load instruction is globally performed when it is performed with respect to all processors

and coherent I/O modules capable of storing to the location.

Globally performed store: A store instruction is globally performed when it is globally observable. It is globally observable when it is observable by all processors and I/O modules capable of loading from the location.

Coherent I/O module: A coherent I/O module is an Input/Output system component that performs coherent Direct

Memory Access (DMA). It reads and writes memory independently as though it were a processor doing loads and stores to locations with a memory access type of cached coherent.

Load/Store Datapath: The portion of the processor which handles the load/store data requests coming from the processor pipeline and processes those requests within the cache and memory system hierarchy.

Restrictions:

The effect of SYNC on the global order of loads and stores for memory access types other than uncached and cached

coherent is UNPREDICTABLE.

Operation:

SyncOperation(stype)

Exceptions:

None

Programming Notes:

A processor executing load and store instructions observes the order in which loads and stores using the same memory access type occur in the instruction stream; this is known as program order.

A parallel program has multiple instruction streams that can execute simultaneously on different processors. In multiprocessor (MP) systems, the order in which the effects of loads and stores are observed by other processors-the

global order of the loads and store-determines the actions necessary to reliably share data in parallel programs.

When all processors observe the effects of loads and stores in program order, the system is strongly ordered. On such systems, parallel programs can reliably share data without explicit actions in the programs. For such a system, SYNC has the same effect as a NOP. Executing SYNC on such a system is not necessary, but neither is it an error.

If a multiprocessor system is not strongly ordered, the effects of load and store instructions executed by one processor may be observed out of program order by other processors. On such systems, parallel programs must take explicit actions to reliably share data. At critical points in the program, the effects of loads and stores from an instruction stream must occur in the same order for all processors. SYNC separates the loads and stores executed on the processor into two groups, and the effect of all loads and stores in one group is seen by all processors before the effect of any load or store in the subsequent group. In effect, SYNC causes the system to be strongly ordered for the executing processor at the instant that the SYNC is executed.

Many MIPS-based multiprocessor systems are strongly ordered or have a mode in which they operate as strongly ordered for at least one memory access type. The MIPS architecture also permits implementation of MP systems that are not strongly ordered; SYNC enables the reliable use of shared memory on such systems. A parallel program that does not use SYNC generally does not operate on a system that is not strongly ordered. However, a program that does use SYNC works on both types of systems. (System-specific documentation describes the actions needed to reliably share data in parallel programs for that system.)

The behavior of a load or store using one memory access type is UNPREDICTABLE if a load or store was previously made to the same physical location using a different memory access type. The presence of a SYNC between the references does not alter this behavior.

SYNC affects the order in which the effects of load and store instructions appear to all processors; it does not generally affect the physical memory-system ordering or synchronization issues that arise in system programming. The effect of SYNC on implementation-specific aspects of the cached memory system, such as writeback buffers, is not defined.

# Processor A (writer)
# Conditions at entry: 
# The value 0 has been stored in FLAG and that value is observable by B
SW    R1, DATA        # change shared DATA value
LI    R2, 1
SYNC                   # Perform DATA store before performing FLAG store
SW    R2, FLAG        # say that the shared DATA value is valid
   # Processor B (reader)
      LI    R2, 1
   1: LW    R1, FLAG  # Get FLAG
      BNE   R2, R1, 1B# if it says that DATA is not valid, poll again
      NOP
      SYNC            # FLAG value checked before doing DATA read
      LW    R1, DATA  # Read (valid) shared DATA value

The code fragments above shows how SYNC can be used to coordinate the use of shared data between separate writer and reader instruction streams in a multiprocessor environment. The FLAG location is used by the instruction streams to determine whether the shared data item DATA is valid. The SYNC executed by processor A forces the store of

DATA to be performed globally before the store to FLAG is performed. The SYNC executed by processor B ensures that DATA is not read until after the FLAG value indicates that the shared data is valid.

Encoding:

SPECIAL

000000

code

SYSCALL

001100

6

20

6

Format:

SYSCALL 

MIPS32

System Call

Purpose:

System Call

To cause a System Call exception.

Description:

A system call exception occurs, immediately and unconditionally transferring control to the exception handler.

The code field is available for use as software parameters, but may be retrieved by the exception handler by loading the contents of the memory word containing the instruction. Alternatively, if CP0 BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

SignalException(SystemCall)

Exceptions:

System Call

Encoding:

REGIMM

000001

rs

TEQI

01100

immediate

6

5

5

16

Format:

TEQI rs, immediate

MIPS32, removed in Release 6

Trap if Equal Immediate

Purpose:

Trap if Equal Immediate

To compare a GPR to a constant and do a conditional trap.

Description:

 if GPR[rs] = immediate then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers. If GPR rs is equal to immediate, then take a Trap exception.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if GPR[rs] = sign_extend(immediate) then
      SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

SPECIAL

000000

rs

rt

code

TEQ

110100

6

5

5

10

6

Format:

TEQ rs, rt

MIPS32

Trap if Equal

Purpose:

Trap if Equal

To compare GPRs and do a conditional trap.

Description:

 if GPR[rs] = GPR[rt] then Trap

Compare the contents of GPR rs and GPR rt as signed integers. If GPR rs is equal to GPR rt, then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode information for system software.

To retrieve the information, system software may load the instruction word from memory. Alternatively, if CP0

BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

if GPR[rs] = GPR[rt] then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

REGIMM

000001

rs

TGEI

01000

immediate

6

5

5

16

Format:

TGEI rs, immediate

MIPS32, removed in Release 6

Trap if Greater or Equal Immediate

Purpose:

Trap if Greater or Equal Immediate

To compare a GPR to a constant and do a conditional trap.

Description:

 if GPR[rs] >= immediate then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers. If GPR rs is greater than or equal to immediate, then take a Trap exception.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if GPR[rs] >= sign_extend(immediate) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

REGIMM

000001

rs

TGEIU

01001

immediate

6

5

5

16

Format:

TGEIU rs, immediate

MIPS32, removed in Release 6

Trap if Greater or Equal Immediate Unsigned

Purpose:

Trap if Greater or Equal Immediate Unsigned

To compare a GPR to a constant and do a conditional trap.

Description:

 if GPR[rs] >= immediate then Trap

Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned integers. If GPR rs is greater than or equal to immediate, then take a Trap exception.

Because the 16-bit immediate is sign-extended before comparison, the instruction can represent the smallest or largest unsigned numbers. The representable values are at the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the unsigned range.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if (0 || GPR[rs]) >= (0 || sign_extend(immediate)) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

SPECIAL

000000

rs

rt

code

TGE

110000

6

5

5

10

6

Format:

TGE rs, rt

MIPS32

Trap if Greater or Equal

Purpose:

Trap if Greater or Equal

To compare GPRs and do a conditional trap.

Description:

 if GPR[rs] >= GPR[rt] then Trap

Compare the contents of GPR rs and GPR rt as signed integers. If GPR rs is greater than or equal to GPR rt, then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode information for system software.

To retrieve the information, the system software may load the instruction word from memory. Alternatively, if CP0

BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

if GPR[rs] >= GPR[rt] then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

SPECIAL

000000

rs

rt

code

TGEU

110001

6

5

5

10

6

Format:

TGEU rs, rt

MIPS32

Trap if Greater or Equal Unsigned

Purpose:

Trap if Greater or Equal Unsigned

To compare GPRs and do a conditional trap.

Description:

 if GPR[rs] >= GPR[rt] then Trap

Compare the contents of GPR rs and GPR rt as unsigned integers. If GPR rs is greater than or equal to GPR rt, then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode information for system software.

To retrieve the information, the system software may load the instruction word from memory. Alternatively, if CP0

BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

if (0 || GPR[rs]) >= (0 || GPR[rt]) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

TLBINV

000011

6

1

19

6

Format:

TLBINV 

MIPS32

TLB Invalidate

Purpose:

TLB Invalidate

TLBINV invalidates a set of TLB entries based on ASID and Index match. The virtual address is ignored in the entry match. TLB entries which have their G bit set to 1 are not modified.

Implementation of the TLBINV instruction is optional. The implementation of this instruction is indicated by the IE field in Config4.

Support for TLBINV is recommend for implementations supporting VTLB/FTLB type of MMU.

Implementation of EntryHIEHINV field is required for implementation of TLBINV instruction.

Description:

On execution of the TLBINV instruction, the set of TLB entries with matching ASID are marked invalid, excluding those TLB entries which have their G bit set to 1.

The EntryHIASID field has to be set to the appropriate ASID value before executing the TLBINV instruction.

Behavior of the TLBINV instruction applies to all applicable TLB entries and is unaffected by the setting of the Wired register.

All matching entries in the JTLB are invalidated. The Index register is unused.

If TLB invalidate walk is implemented in software (Config4IE=2), then software must do these steps to flush the entire MMU:

1.one TLBINV instruction is executed with an index in VTLB range (invalidates all matching VTLB entries)

2.a TLBINV instruction is executed for each FTLB set (invalidates all matching entries in FTLB set)

If TLB invalidate walk is implemented in hardware (Config4IE=3), then software must do these steps to flush the entire MMU:

1.one TLBINV instruction is executed (invalidates all matching entries in both FTLB & VTLB). In this case,

Index is unused.

Restrictions:

When Config4MT = 4 and Config4IE = 2, the operation is UNDEFINED if the contents of the Index register are greater than or equal to the number of available TLB entries.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Availability and Compatibility:

Implementation of the TLBINV instruction is optional. The implementation of this instruction is indicated by the IE field in Config4.

Implementation of EntryHIEHINV field is required for implementation of TLBINV instruction.

Pre-Release 6, support for TLBINV is recommended for implementations supporting VTLB/FTLB type of MMU.

Release 6 (and subsequent releases) support for TLBINV is required for implementations supporting VTLB/FTLB type of MMU.

Release 6: On processors that include a Block Address Translation (BAT) or Fixed Mapping (FM) MMU (ConfigMT =

2 or 3), the operation of this instruction causes a Reserved Instruction exception (RI).

Operation:

   if ( ConfigMT=1 or (ConfigMT=4 & Config4IE=2 & Index < VTLBsize() )) 
      startnum = 0
      endnum = VTLBsize() - 1
   endif
   // treating VTLB and FTLB as one array 
   if (ConfigMT=4 & Config4IE=2 & Index >= VTLBsize(); ) 
      startnum = start of selected FTLB set // implementation specific
      endnum = end of selected FTLB set - 1 //implementation specifc
   endif
   if (ConfigMT=4 & Config4IE=3)) 
      startnum = 0
      endnum = VTLBsize() + FTLBsize() - 1;
   endif
   for (i = startnum to endnum) 
      if (TLB[i]ASID = EntryHiASID & TLB[i]G = 0) 
         TLB[i]VPN2_invalid = 1
      endif
   endfor
function VTLBsize 
   SizeExt = ArchRev() >= 6          ?  Config4VTLBSizeExt
         : Config4MMUExtDef == 3     ?  Config4VTLBSizeExt
         : Config4MMUExtDef == 1     ?  Config4MMUSizeExt
         :               0
         ;
    return 1 + ( (SizeExt << 6) | Config1.MMUSize );
endfunction
function FTLBsize 
   if ( Config1MT == 4 ) then
      return ( Config4FTLBWays + 2 ) * ( 1 << C0_Config4FTLBSets );
   else 
      return 0;
   endif
endfunction

Exceptions:

Coprocessor Unusable,

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

TLBINVF

000100

6

1

19

6

Format:

TLBINVF 

MIPS32

TLB Invalidate Flush

Purpose:

TLB Invalidate Flush

TLBINVF invalidates a set of TLB entries based on Index match. The virtual address and ASID are ignored in the entry match.

Implementation of the TLBINVF instruction is optional. The implementation of this instruction is indicated by the IE field in Config4.

Support for TLBINVF is recommend for implementations supporting VTLB/FTLB type of MMU.

Implementation of the EntryHIEHINV field is required for implementation of TLBINV and TLBINVF instructions.

Description:

On execution of the TLBINVF instruction, all entries within range of Index are invalidated.

Behavior of the TLBINVF instruction applies to all applicable TLB entries and is unaffected by the setting of the

Wired register.

TLBINVF causes all entries in the JTLB to be invalidated. Index is unused.

If TLB invalidate walk is implemented in your software (Config4IE=2), then your software must do these steps to flush the entire MMU:

1.one TLBINVF instruction is executed with an index in VTLB range (invalidates all VTLB entries)

2.a TLBINVF instruction is executed for each FTLB set (invalidates all entries in FTLB set)

If TLB invalidate walk is implemented in hardware (Config4IE=3), then software must do these steps to flush the entire MMU:

1.one TLBINVF instruction is executed (invalidates all entries in both FTLB & VTLB). In this case, Index is unused.

Restrictions:

When ConfigMT=4 and ConfigIE=2, the operation is UNDEFINED if the contents of the Index register are greater than or equal to the number of available TLB entries.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Availability and Compatibility:

Implementation of the TLBINVF instruction is optional. The implementation of this instruction is indicated by the IE field in Config4.

Implementation of EntryHIEHINV field is required for implementation of TLBINVF instruction.

Pre-Release 6, support for TLBINVF is recommended for implementations supporting VTLB/FTLB type of MMU.

Release 6 (and subsequent releases) support for TLBINV is required for implementations supporting VTLB/FTLB type of MMU.

Release 6: On processors that include a Block Address Translation (BAT) or Fixed Mapping (FM) MMU (ConfigMT =

2 or 3), the operation of this instruction causes a Reserved Instruction exception (RI).

Operation:

if ( ConfigMT=1 or (ConfigMT=4 & Config4IE=2 & Index < VTLBsize() )) 
   startnum = 0
   endnum = VTLBsize() - 1
endif
// treating VTLB and FTLB as one array 
if (ConfigMT=4 & Config4IE=2 & Index >= VTLBsize(); ) 
   startnum = start of selected FTLB set // implementation specific
   endnum = end of selected FTLB set - 1 //implementation specifc
endif
if (ConfigMT=4 & Config4IE=3)) 
   startnum = 0
   endnum = TLBsize() + FTLBsize() - 1;
endif
for (i = startnum to endnum) 
   TLB[i]VPN2_invalid = 1
endfor
function VTLBsize 
   SizeExt = ArchRev() >= 6          ?  Config4VTLBSizeExt
         : Config4MMUExtDef == 3     ?  Config4VTLBSizeExt
         : Config4MMUExtDef == 1     ?  Config4MMUSizeExt
         :               0
         ;
    return 1 + ( (SizeExt << 6) | Config1.MMUSize );
endfunction
function FTLBsize 
   if ( Config1MT == 4 ) then
      return ( Config4FTLBWays + 2 ) * ( 1 << C0_Config4FTLBSets );
   else 
      return 0;
   endif
endfunction

Exceptions:

Coprocessor Unusable,

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

TLBP

001000

6

1

19

6

Format:

TLBP 

MIPS32

Probe TLB for Matching Entry

Purpose:

Probe TLB for Matching Entry

To find a matching entry in the TLB.

Description:

The Index register is loaded with the address of the TLB entry whose contents match the contents of the EntryHi register. If no TLB entry matches, the high-order bit of the Index register is set.

TLBP. However, implementations are strongly encouraged to report multiple TLB matches only on a TLB write.

Restrictions:

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Release 6: Processors that include a Block Address Translation (BAT) or Fixed Mapping (FM) MMU (ConfigMT = 2 or 3), the operation of this instruction causes a Reserved Instruction exception (RI).

Operation:

Index = 1 || UNPREDICTABLE31
for i in 00 ... TLBEntries-1
   if ((TLB[i]VPN2 and not (TLB[i]Mask)) =
               (EntryHiVPN2 and not (TLB[i]Mask))) and
       (TLB[i]R = EntryHiR) and
       ((TLB[i]G = 1) or (TLB[i]ASID = EntryHiASID)) then
       Index = i
   endif
endfor

Exceptions:

Coprocessor Unusable, Machine Check

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

TLBR

000001

6

1

19

6

Format:

TLBR 

MIPS32

Read Indexed TLB Entry

Purpose:

Read Indexed TLB Entry

To read an entry from the TLB.

Description:

The EntryHi, EntryLo0, EntryLo1, and PageMask registers are loaded with the contents of the TLB entry pointed to by the Index register.

TLBR. However, implementations are strongly encouraged to report multiple TLB matches only on a TLB write.

In an implementation supporting TLB entry invalidation (Config4IE >= 1), reading an invalidated TLB entry causes

EntryLo0 and EntryLo1 to be set to 0, EntryHiEHINV to be set to 1, all other EntryHi bits to be set to 0, and

PageMask to be set to a value representing the minimum supported page size..

The value written to the EntryHi, EntryLo0, and EntryLo1 registers may be different from the original written value to the TLB via these registers in that:

The value returned in the VPN2 field of the EntryHi register may have those bits set to zero corresponding to the

The value returned in the PFN field of the EntryLo0 and EntryLo1 registers may have those bits set to zero cor-

The value returned in the G bit in both the EntryLo0 and EntryLo1 registers comes from the single G bit in the

Restrictions:

The operation is UNDEFINED if the contents of the Index register are greater than or equal to the number of TLB entries in the processor.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled. Release 6: Processors that include a Block Address Translation (BAT) or Fixed Mapping (FM) MMU (ConfigMT = 2 or 3), the operation of this instruction causes a Reserved Instruction exception (RI).

Operation:

i = Index
if i > (TLBEntries - 1) then
   UNDEFINED
endif
if ( (Config4IE >= 1) and TLB[i]VPN2_invalid = 1) then
   PagemaskMask = 0 // or value representing minimum page size
   EntryHi = 0
   EntryLo1 = 0
   EntryLo0 = 0
   EntryHiEHINV = 1
else
   PageMaskMask = TLB[i]Mask
   EntryHi = TLB[i]R || 0Fill ||
             (TLB[i]VPN2 and not TLB[i]Mask) || # Masking implem dependent
             05 || TLB[i]ASID
   EntryLo1 = 0Fill ||
             (TLB[i]PFN1 and not TLB[i]Mask) || # Masking mplem dependent
             TLB[i]C1 || TLB[i]D1 || TLB[i]V1 || TLB[i]G
   EntryLo0 = 0Fill ||
          (TLB[i]PFN0 and not TLB[i]Mask) || # Masking mplem dependent
          TLB[i]C0 || TLB[i]D0 || TLB[i]V0 || TLB[i]G
endif

Exceptions:

Coprocessor Unusable, Machine Check

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

TLBWI

000010

6

1

19

6

Format:

TLBWI 

MIPS32

Write Indexed TLB Entry

Purpose:

Write Indexed TLB Entry

To write or invalidate a TLB entry indexed by the Index register.

Description:

If Config4IE == 0 or EntryHiEHINV=0:

The TLB entry pointed to by the Index register is written from the contents of the EntryHi, EntryLo0, EntryLo1, and PageMask registers. It is implementation dependent whether multiple TLB matches are detected on a

TLBWI. In such an instance, a Machine Check Exception is signaled.

In Release 2 of the Architecture, multiple TLB matches may only be reported on a TLB write. The information written to the TLB entry may be different from that in the EntryHi, EntryLo0, and EntryLo1 registers, in that:

If Config4IE >= 1 and EntryHiEHINV = 1:

The TLB entry pointed to by the Index register has its VPN2 field marked as invalid. This causes the entry to be ignored on TLB matches for memory accesses. No Machine Check is generated.

Restrictions:

The operation is UNDEFINED if the contents of the Index register are greater than or equal to the number of TLB entries in the processor.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Release 6: Processors that include a Block Address Translation (BAT) or Fixed Mapping (FM) MMU (ConfigMT = 2 or 3), the operation of this instruction causes a Reserved Instruction exception (RI).

Operation:

i = Index
if (Config4IE >= 1) then 
   TLB[i]VPN2_invalid = 0
   if ( EntryHIEHINV=1 ) then
      TLB[i]VPN2_invalid = 1
      break
   endif
endif
TLB[i]Mask = PageMaskMask
TLB[i]R = EntryHiR
TLB[i]VPN2 = EntryHiVPN2 and not PageMaskMask # Implementation dependent
TLB[i]ASID = EntryHiASID
TLB[i]G = EntryLo1G and EntryLo0G
TLB[i]PFN1 = EntryLo1PFN and not PageMaskMask # Implementation dependent
TLB[i]C1 = EntryLo1C
TLB[i]D1 = EntryLo1D
TLB[i]V1 = EntryLo1V
TLB[i]PFN0 = EntryLo0PFN and not PageMaskMask # Implementation dependent
TLB[i]C0 = EntryLo0C
TLB[i]D0 = EntryLo0D
TLB[i]V0 = EntryLo0V

Exceptions:

Coprocessor Unusable, Machine Check

Encoding:

COP0

010000

CO

1

0

000 0000 0000 0000 0000

TLBWR

000110

6

1

19

6

Format:

TLBWR 

MIPS32

Write Random TLB Entry

Purpose:

Write Random TLB Entry

To write a TLB entry indexed by the Random register, or, in Release 6, write a TLB entry indexed by an implementation-defined location.

Description:

The TLB entry pointed to by the Random register is written from the contents of the EntryHi, EntryLo0, EntryLo1, and PageMask registers. It is implementation dependent whether multiple TLB matches are detected on a TLBWR.

In such an instance, a Machine Check Exception is signaled.

In Release 6, the Random register has been removed. References to Random refer to an implementation-determined value that is not visible to software.

In Release 2 of the Architecture, multiple TLB matches may only be reported on a TLB write. The information written to the TLB entry may be different from that in the EntryHi, EntryLo0, and EntryLo1 registers, in that:

TLB write.

TLB write.

Restrictions:

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Release 6: Processors that include a Block Address Translation (BAT) or Fixed Mapping (FM) MMU (ConfigMT = 2 or 3), the operation of this instruction causes a Reserved Instruction exception (RI).

Operation:

i = Random
if (Config4IE >= 1) then 
   TLB[i]VPN2_invalid = 0
   endif
TLB[i]Mask = PageMaskMask
TLB[i]R = EntryHiR
TLB[i]VPN2 = EntryHiVPN2 and not PageMaskMask # Implementation dependent
TLB[i]ASID = EntryHiASID
TLB[i]G = EntryLo1G and EntryLo0G
TLB[i]PFN1 = EntryLo1PFN and not PageMaskMask # Implementation dependent
TLB[i]C1 = EntryLo1C
TLB[i]D1 = EntryLo1D
TLB[i]V1 = EntryLo1V
TLB[i]PFN0 = EntryLo0PFN and not PageMaskMask # Implementation dependent
TLB[i]C0 = EntryLo0C
TLB[i]D0 = EntryLo0D
TLB[i]V0 = EntryLo0V

Exceptions:

Coprocessor Unusable, Machine Check

Encoding:

REGIMM

000001

rs

TLTI

01010

immediate

6

5

5

16

Format:

TLTI rs, immediate

MIPS32, removed in Release 6

Trap if Less Than Immediate

Purpose:

Trap if Less Than Immediate

To compare a GPR to a constant and do a conditional trap.

Description:

 if GPR[rs] < immediate then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers. If GPR rs is less than immediate, then take a Trap exception.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if GPR[rs] < sign_extend(immediate) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

REGIMM

000001

rs

TLTIU

01011

immediate

6

5

5

16

Format:

TLTIU rs, immediate

MIPS32, removed in Release 6

Trap if Less Than Immediate Unsigned

Purpose:

Trap if Less Than Immediate Unsigned

To compare a GPR to a constant and do a conditional trap.

Description:

 if GPR[rs] < immediate then Trap

Compare the contents of GPR rs and the 16-bit sign-extended immediate as unsigned integers. If GPR rs is less than

immediate, then take a Trap exception.

Because the 16-bit immediate is sign-extended before comparison, the instruction can represent the smallest or largest unsigned numbers. The representable values are at the minimum [0, 32767] or maximum [max_unsigned-32767, max_unsigned] end of the unsigned range.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if (0 || GPR[rs]) < (0 || sign_extend(immediate)) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

SPECIAL

000000

rs

rt

code

TLT

110010

6

5

5

10

6

Format:

TLT rs, rt

MIPS32

Trap if Less Than

Purpose:

Trap if Less Than

To compare GPRs and do a conditional trap.

Description:

 if GPR[rs] < GPR[rt] then Trap

Compare the contents of GPR rs and GPR rt as signed integers. If GPR rs is less than GPR rt, then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode information for system software.

To retrieve the information, system software must load the instruction word from memory. Alternatively, if CP0

BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

if GPR[rs] < GPR[rt] then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

SPECIAL

000000

rs

rt

code

TLTU

110011

6

5

5

10

6

Format:

TLTU rs, rt

MIPS32

Trap if Less Than Unsigned

Purpose:

Trap if Less Than Unsigned

To compare GPRs and do a conditional trap.

Description:

 if GPR[rs] < GPR[rt] then Trap

Compare the contents of GPR rs and GPR rt as unsigned integers. If GPR rs is less than GPR rt, then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode information for system software.

To retrieve the information, system software must load the instruction word from memory. Alternatively, if CP0

BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

if (0 || GPR[rs]) < (0 || GPR[rt]) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

REGIMM

000001

rs

TNEI

01110

immediate

6

5

5

16

Format:

TNEI rs, immediate

MIPS32, removed in Release 6

Trap if Not Equal Immediate

Purpose:

Trap if Not Equal Immediate

To compare a GPR to a constant and do a conditional trap.

Description:

 if GPR[rs] != immediate then Trap

Compare the contents of GPR rs and the 16-bit signed immediate as signed integers. If GPR rs is not equal to immediate, then take a Trap exception.

Restrictions:

None

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if GPR[rs] != sign_extend(immediate) then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

SPECIAL

000000

rs

rt

code

TNE

110110

6

5

5

10

6

Format:

TNE rs, rt

MIPS32

Trap if Not Equal

Purpose:

Trap if Not Equal

To compare GPRs and do a conditional trap.

Description:

 if GPR[rs] != GPR[rt] then Trap 

Compare the contents of GPR rs and GPR rt as signed integers. If GPR rs is not equal to GPR rt, then take a Trap exception.

The contents of the code field are ignored by hardware and may be used to encode information for system software.

To retrieve the information, system software must load the instruction word from memory. Alternatively, if CP0

BadInstr is implemented, the code field may be obtained from BadInstr.

Restrictions:

None

Operation:

if GPR[rs] != GPR[rt] then
   SignalException(Trap) 
endif

Exceptions:

Trap

Encoding:

COP1

010001

fmt

0

00000

fs

fd

TRUNC.L

001001

6

5

5

5

5

6

Format:

TRUNC.L.fmt 

Floating Point Truncate to Long Fixed Point

TRUNC.L.S fd, fs

MIPS64, MIPS32 Release 2

Floating Point Truncate to Long Fixed Point

TRUNC.L.D fd, fs

MIPS64, MIPS32 Release 2

Floating Point Truncate to Long Fixed Point

Purpose:

Floating Point Truncate to Long Fixed Point

To convert an FP value to 64-bit fixed point, rounding toward zero.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 64-bit long-fixed point format and rounded toward zero

(rounding mode 1). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -263 to 263-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. In this case the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is 263-1. On cores with FCSRNAN2008=1, the default result is:

263-1 when the input value is +inf or rounds to a number larger than 263-1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for long fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model; it is predictable if executing on a 64-bit FPU in the FR=1 mode, but not with FR=0, and not on a 32-bit FPU.

Operation:

StoreFPR(fd, L, ConvertFmt(ValueFPR(fs, fmt), fmt, L))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Inexact

Encoding:

COP1

010001

fmt

0

00000

fs

fd

TRUNC.W

001101

6

5

5

5

5

6

Format:

TRUNC.W.fmt 

Floating Point Truncate to Word Fixed Point

TRUNC.W.S fd, fs

MIPS32

Floating Point Truncate to Word Fixed Point

TRUNC.W.D fd, fs

MIPS32

Floating Point Truncate to Word Fixed Point

Purpose:

Floating Point Truncate to Word Fixed Point

To convert an FP value to 32-bit fixed point, rounding toward zero.

Description:

 FPR[fd] = convert_and_round(FPR[fs])

The value in FPR fs, in format fmt, is converted to a value in 32-bit word fixed point format using rounding toward zero (rounding mode 1). The result is placed in FPR fd.

When the source value is Infinity, NaN, or rounds to an integer outside the range -231 to 231-1, the result cannot be represented correctly and an IEEE Invalid Operation condition exists. In this case the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, a default result is written to fd. On cores with FCSRNAN2008=0, the default result is 231-1. On cores with FCSRNAN2008=1, the default result is:

231-1 when the input value is +inf or rounds to a number larger than 231-1

The fields fs and fd must specify valid FPRs: fs for type fmt and fd for word fixed point. If the fields are not valid, the result is UNPREDICTABLE.

The operand must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand

FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, W, ConvertFmt(ValueFPR(fs, fmt), fmt, W))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Inexact, Invalid Operation, Unimplemented Operation

Encoding:

COP0

010000

CO

1

Implementation-dependent code

WAIT

100000

6

1

19

6

Format:

WAIT 

MIPS32

Enter Standby Mode

Purpose:

Enter Standby Mode

Wait for Event

Description:

The WAIT instruction performs an implementation-dependent operation, involving a lower power mode. Software may use the code bits of the instruction to communicate additional information to the processor. The processor may use this information as control for the lower power mode. A value of zero for code bits is the default and must be valid in all implementations.

The WAIT instruction is implemented by stalling the pipeline at the completion of the instruction and entering a lower power mode. The pipeline is restarted when an external event, such as an interrupt or external request occurs, and execution continues with the instruction following the WAIT instruction. It is implementation-dependent whether the pipeline restarts when a non-enabled interrupt is requested. In this case, software must poll for the cause of the restart. The assertion of any reset or NMI must restart the pipeline and the corresponding exception must be taken.

If the pipeline restarts as the result of an enabled interrupt, that interrupt is taken between the WAIT instruction and the following instruction (EPC for the interrupt points at the instruction following the WAIT instruction).

In Release 6, the behavior of WAIT has been modified to make it a requirement that a processor that has disabled operation as a result of executing a WAIT will resume operation on arrival of an interrupt even if interrupts are not enabled.

In Release 6, the encoding of WAIT with bits 24:6 of the opcode set to 0 will never disable CP0 Count on an active

WAIT instruction. In particular, this modification has been added to architecturally specify that CP0 Count is not disabled on execution of WAIT with default code of 0. Prior to Release 6, whether Count is disabled was implementation-dependent. In the future, other encodings of WAIT may be defined which specify other forms of power-saving or stand-by modes. If not implemented, then such unimplemented encodings must default to WAIT 0.

Restrictions:

Pre-Release 6: The operation of the processor is UNDEFINED if a WAIT instruction is executed in the delay slot of a branch or jump instruction.

Release 6: Implementations are required to signal a Reserved Instruction exception if WAIT is encountered in the delay slot or forbidden slot of a branch or jump instruction.

If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.

Operation:

Pre-Release 6:

I: Enter implementation dependent lower power mode
I+1:/* Potential interrupt taken here */

Release 6:

I: if IsCoprocessorEnabled(0) then
      while ( !interrupt_pending_and_not_masked_out() && 
             !implementation_dependent_wake_event() ) 
         < enter or remain in low power mode or stand-by mode>
   else
      SignalException(CoprocessorUnusable, 0)
   endif
 
I+1:  if ( interrupt_pending() && interrupts_enabled() ) then
         EPC = PC + 4
         < process interrupt; execute ERET eventually >
      else 
             // unblock on non-enabled interrupt or imp dep wake event.
             PC = PC + 4
             < continue execution at instruction after wait >
      endif
 
 function interrupt_pending_and_not_masked_out 
     return (Config3VEIC && IntCtlVS && CauseIV && !StatusBEV)
             ? CauseRIPL > StatusIPL : CauseIP & StatusIM;
 endfunction
 
 function interrupts_enabled 
     return StatusIE && !StatusEXL && !StatusERL && !DebugDM;
  endfunction
 
function implementation_dependent_wake_event
    <return true if implementation dependent waking-up event occurs>
 endfunction

Exceptions:

Coprocessor Unusable Exception

Encoding:

COP0

0100 00

WRPGPR

01 110

rt

rd

0

000 0000 0000

6

5

5

5

11

Format:

WRPGPR rd, rt

MIPS32 Release 2

Write to GPR in Previous Shadow Set

Purpose:

Write to GPR in Previous Shadow Set

To move the contents of a current GPR to a GPR in the previous shadow set.

Description:

 SGPR[SRSCtlPSS, rd] = GPR[rt]

The contents of the current GPR rt is moved to the shadow GPR register specified by SRSCtlPSS (signifying the previous shadow set number) and rd (specifying the register number within that set).

Restrictions:

In implementations prior to Release 2 of the Architecture, this instruction resulted in a Reserved Instruction exception.

Operation:

SGPR[SRSCtlPSS, rd] = GPR[rt]

Exceptions:

Coprocessor Unusable, Reserved Instruction

Encoding:

SPECIAL3

011111

0

00000

rt

rd

WSBH

00010

BSHFL

100000

6

5

5

5

5

6

Format:

WSBH rd, rt

MIPS32 Release 2

Word Swap Bytes Within Halfwords

Purpose:

Word Swap Bytes Within Halfwords

To swap the bytes within each halfword of GPR rt and store the value into GPR rd.

Description:

 GPR[rd] = SwapBytesWithinHalfwords(GPR[rt])

Within each halfword of the lower word of GPR rt the bytes are swapped, the result is sign-extended, and stored in

GPR rd.

Restrictions:

In implementations prior to Release 2 of the architecture, this instruction resulted in a Reserved Instruction exception.

If GPR rt does not contain a sign-extended 32-bit value (bits 63..31 equal), then the result of the operation is

UNPREDICTABLE.

Operation:

if NotWordValue(GPR[rt]) then
   UNPREDICTABLE
endif
GPR[rd] = sign_extend(GPR[rt]23..16 || GPR[rt]31..24 || GPR[rt]7..0 || GPR[rt]15..8)

Exceptions:

Reserved Instruction

Programming Notes:

The WSBH instruction can be used to convert halfword and word data of one endianness to another endianness. The endianness of a word value can be converted using the following sequence:

lw    t0, 0(a1)           /* Read word value */
wsbh  t0, t0              /* Convert endiannes of the halfwords */
rotr  t0, t0, 16          /* Swap the halfwords within the words */

Combined with SEH and SRA, two contiguous halfwords can be loaded from memory, have their endianness converted, and be sign-extended into two word values in four instructions. For example:

lw    t0, 0(a1)           /* Read two contiguous halfwords */
wsbh  t0, t0              /* Convert endiannes of the halfwords */
seh   t1, t0              /* t1 = lower halfword sign-extended to word */
sra   t0, t0, 16          /* t0 = upper halfword sign-extended to word */

Zero-extended words can be created by changing the SEH and SRA instructions to ANDI and SRL instructions, respectively.

.

Encoding:

XORI

001110

rs

rt

immediate

6

5

5

16

Format:

XORI rt, rs, immediate

MIPS32

Exclusive OR Immediate

Purpose:

Exclusive OR Immediate

To do a bitwise logical Exclusive OR with a constant.

Description:

 GPR[rt] = GPR[rs] XOR immediate

Combine the contents of GPR rs and the 16-bit zero-extended immediate in a bitwise logical Exclusive OR operation and place the result into GPR rt.

Restrictions:

None

Operation:

GPR[rt] = GPR[rs] xor zero_extend(immediate)

Exceptions:

None

Encoding:

SPECIAL

000000

rs

rt

rd

0

00000

XOR

100110

6

5

5

5

5

6

Format:

XOR rd, rs, rt

MIPS32

Exclusive OR

Purpose:

Exclusive OR

To do a bitwise logical Exclusive OR.

Description:

 GPR[rd] = GPR[rs] XOR GPR[rt]

Combine the contents of GPR rs and GPR rt in a bitwise logical Exclusive OR operation and place the result into

GPR rd.

Restrictions:

None

Operation:

GPR[rd] = GPR[rs] xor GPR[rt]

Exceptions:

None