Encoding:

COP1 (010001) | fmt (10110) | ADDR.PS (011000)

Format:

ADDR.PS fd, fs, ft

MIPS-3D

Floating Point Reduction Add

Purpose:

Floating Point Reduction Add

To perform a reduction add on two paired-single ﬂoating point values

Description:

FPR[fd].PL = FPR[ft].PU + FPR[ft].PL; FPR[fd].PU = FPR[fs].PU + FPR[fs].PL

The paired-single values in FPR ft are added together and the result is put in the lower paired-single position of FPR fd. Similarly, the paired-single values in FPR fs are added together and the result is put in the upper paired-single position of FPR fd. The two results are calculated to infinite precision and rounded using the current rounding mode in FCSR. The operands and result are values in format PS.

Any exceptions generated by the two independent adds are ORed together. Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If they are not valid, the result is UNPREDICTABLE.

The operands must be values in format PS. If they are not, the result is UNPREDICTABLE and the values in the operand FPRs become UNPREDICTABLE.

The result of ADDR.PS is UNPREDICTABLE if the processor is executing in 16 FP registers mode.

Operation:

lower = ValueFPR(ft, PS)_31..0 + ValueFPR(ft, PS)_63..32
upper = ValueFPR(fs, PS)_31..0 + ValueFPR(fs, PS)_63..32
StoreFPR(fd, PS, upper || lower)
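The operation above can be modeled in a few lines of Python. This is a minimal sketch rather than the architectural definition: the names `to_single` and `addr_ps` are hypothetical, a paired-single register is represented as an `(upper, lower)` tuple, round-to-nearest is assumed, and exceptions are not modeled.

```python
import struct

def to_single(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    # struct raises OverflowError if x is finite but too large for binary32.
    return struct.unpack("<f", struct.pack("<f", x))[0]

def addr_ps(fs, ft):
    """Model ADDR.PS; fs and ft are (upper, lower) tuples of single values."""
    lower = to_single(ft[0] + ft[1])   # FPR[fd].PL = FPR[ft].PU + FPR[ft].PL
    upper = to_single(fs[0] + fs[1])   # FPR[fd].PU = FPR[fs].PU + FPR[fs].PL
    return (upper, lower)

fd = addr_ps((1.0, 2.0), (3.0, 4.0))
```

Here `fd` is `(3.0, 7.0)`: the sum of the two halves of fs lands in the upper half and the sum of the two halves of ft lands in the lower half.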

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Overﬂow, Inexact, Underﬂow

Encoding:

COP1 (010001) | BC1ANY2 (01001) | xx0 | offset

Format:

BC1ANY2F cc,offset

MIPS-3D

Branch on Any of Two Floating Point Condition Codes False

Purpose:

Branch on Any of Two Floating Point Condition Codes False

To test two consecutive ﬂoating point condition codes and do a PC-relative conditional branch

Description:

If FPConditionCode(CCn+1) = 0 or FPConditionCode(CCn) = 0, then branch

An 18-bit signed offset (the 16-bit offset ﬁeld shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If either one of the two FP condition code bits CC is false (0), the program branches to the effective target address after the instruction in the delay slot is executed.

The CC specified must align to 2, so bit 18 must always be zero. For example, specifying a value of 4 will check if either one of CC5 or CC4 is 0 and branch accordingly. Specifying an illegally aligned CC will result in UNPREDICTABLE behavior.

An FP condition code is set by an FP compare instruction, C.cond.fmt, and the MIPS-3D compare absolute instruction CABS.cond.fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Operation:

This operation specification is for the general Branch On Any Two Condition operation with tf (true/false) as a variable. The individual instructions BC1ANY2F and BC1ANY2T have specific values for tf.

I:    condition = (FPConditionCode(cc) = 0) or
                   (FPConditionCode(cc+1) = 0)
      target_offset = (offset₁₅)^{GPRLEN-(16+2)}|| offset || 0²
I+1:  if condition then
         PC = PC + target_offset
      endif
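The target computation and the two-code test can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive model: `branch_target` and `bc1any2f_taken` are hypothetical names, the eight FP condition codes are assumed to be packed into an integer `fcc` with bit n holding condition code n, and address wraparound at the register width is ignored.

```python
def branch_target(branch_pc: int, offset16: int) -> int:
    """Target address for a taken branch located at branch_pc."""
    offset16 &= 0xFFFF
    if offset16 & 0x8000:          # sign-extend the 16-bit offset field
        offset16 -= 0x10000
    # The shifted offset is added to the address of the delay-slot
    # instruction (branch_pc + 4), not to the branch itself.
    return (branch_pc + 4) + (offset16 << 2)

def bc1any2f_taken(fcc: int, cc: int) -> bool:
    """True if condition code cc or cc+1 is 0 (false)."""
    if cc % 2 != 0 or not 0 <= cc <= 6:
        # Hardware behavior is UNPREDICTABLE; the sketch simply refuses.
        raise ValueError("cc must be even and in the range 0..6")
    pair = (fcc >> cc) & 0b11
    return pair != 0b11            # at least one of the two bits is 0
```

For example, `branch_target(0x1000, 0xFFFF)` returns `0x1000`, because an offset field of -1 moves one instruction back from the delay slot at `0x1004`.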

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ±128 KBytes. Use jump (J) or jump register (JR) instructions to branch to addresses outside this range.

Encoding:

COP1 (010001) | BC1ANY2 (01001) | xx0 | offset

Format:

BC1ANY2T cc,offset

MIPS-3D

Branch on Any of Two Floating Point Condition Codes True

Purpose:

Branch on Any of Two Floating Point Condition Codes True

To test two consecutive FP condition codes and do a PC-relative conditional branch

Description:

If FPConditionCode(CCn+1) = 1 or FPConditionCode(CCn) = 1, then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If either one of the two FP condition code bits CC is true (1), the program branches to the effective target address after the instruction in the delay slot is executed.

The CC specified must align to 2, so bit 18 must always be zero. For example, specifying a value of 2 will check if either one of CC3 or CC2 is 1 and branch accordingly. Specifying an illegally aligned CC will result in UNPREDICTABLE behavior.

An FP condition code is set by an FP compare instruction, C.cond.fmt, and the MIPS-3D compare absolute instruction CABS.cond.fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Operation:

This operation specification is for the general Branch On Any Two Condition operation with tf (true/false) as a variable. The individual instructions BC1ANY2F and BC1ANY2T have specific values for tf.

I:    condition = (FPConditionCode(cc) = 1) or
                   (FPConditionCode(cc+1) = 1)
      target_offset = (offset₁₅)^{GPRLEN-(16+2)}|| offset || 0²
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ±128 KBytes. Use jump (J) or jump register (JR) instructions to branch to addresses outside this range.
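The ±128 KByte figure follows directly from the width of the offset field. The short Python calculation below is only a worked check of that arithmetic; the variable names are illustrative.

```python
OFFSET_FIELD_BITS = 16   # width of the offset field in the instruction
INSTRUCTION_BYTES = 4    # the field is shifted left 2 bits, i.e. scaled by 4

# Extremes of a signed 16-bit field, scaled to a byte offset.
min_offset = -(2 ** (OFFSET_FIELD_BITS - 1)) * INSTRUCTION_BYTES
max_offset = (2 ** (OFFSET_FIELD_BITS - 1) - 1) * INSTRUCTION_BYTES
```

The result is a reach of -131072 to +131068 bytes, measured from the address of the instruction in the delay slot.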

Encoding:

COP1 (010001) | BC1ANY4 (01010) | x00 | offset

Format:

BC1ANY4F cc,offset

MIPS-3D

Branch on Any of Four Floating Point Condition Codes False

Purpose:

Branch on Any of Four Floating Point Condition Codes False

To test four consecutive FP condition codes and do a PC-relative conditional branch

Description:

If FPConditionCode(CCn+3) = 0 or FPConditionCode(CCn+2) = 0  or
FPConditionCode(CCn+1) = 0 or FPConditionCode(CCn) = 0, then branch

An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If any of the four FP condition code bits CC is false (0), the program branches to the effective target address after the instruction in the delay slot is executed.

The CC specified must align to 4, so bits 18 and 19 must always be zero. For example, specifying a value of 0 will check if any one of CC3..0 is 0 and branch accordingly. Specifying an illegally aligned CC will result in UNPREDICTABLE behavior.

An FP condition code is set by an FP compare instruction, C.cond.fmt, and the MIPS-3D compare absolute instruction CABS.cond.fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Operation:

This operation specification is for the general Branch On Any Four Condition operation with tf (true/false) as a variable. The individual instructions BC1ANY4F and BC1ANY4T have specific values for tf.

I:    condition = (FPConditionCode(cc) = 0) or
                   (FPConditionCode(cc+1) = 0) or
                   (FPConditionCode(cc+2) = 0) or
                   (FPConditionCode(cc+3) = 0)
      target_offset = (offset₁₅)^{GPRLEN-(16+2)}|| offset || 0²
I+1:  if condition then
         PC = PC + target_offset
      endif
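The general form of the test, with tf as a parameter, might be sketched in Python as shown below. The function name `bc1any_taken` and the packing of the condition codes into an integer `fcc` (bit n holds condition code n) are assumptions made for illustration; the alignment check stands in for behavior that is UNPREDICTABLE in hardware.

```python
def bc1any_taken(fcc: int, cc: int, width: int, tf: int) -> bool:
    """True if any of condition codes cc .. cc+width-1 equals tf.

    width is 2 for BC1ANY2x and 4 for BC1ANY4x; tf is 0 for the F
    variants and 1 for the T variants.
    """
    if width not in (2, 4) or tf not in (0, 1):
        raise ValueError("width must be 2 or 4, tf must be 0 or 1")
    if cc % width != 0 or not 0 <= cc <= 8 - width:
        raise ValueError("cc is not aligned to the group width")
    bits = [(fcc >> (cc + i)) & 1 for i in range(width)]
    return any(bit == tf for bit in bits)
```

With `fcc = 0b0111`, `bc1any_taken(fcc, 0, 4, 0)` is true because CC3 is 0, which corresponds to BC1ANY4F with a CC value of 0.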

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ±128 KBytes. Use jump (J) or jump register (JR) instructions to branch to addresses outside this range.

Encoding:

COP1 (010001) | BC1ANY4 (01010) | x00 | offset

Format:

BC1ANY4T cc,offset

MIPS-3D

Branch on Any of Four Floating Point Condition Codes True

Purpose:

Branch on Any of Four Floating Point Condition Codes True

To test four consecutive FP condition codes and do a PC-relative conditional branch

Description:

If FPConditionCode(CCn+3) = 1 or FPConditionCode(CCn+2) = 1  or
FPConditionCode(CCn+1) = 1 or FPConditionCode(CCn) = 1, then branch

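An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself) in the branch delay slot to form a PC-relative effective target address. If any of the four FP condition code bits CC is true (1), the program branches to the effective target address after the instruction in the delay slot is executed.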

The CC specified must align to 4, so bits 18 and 19 must always be zero. For example, specifying a value of 4 will check if any of the bits CC7..4 is 1 and branch accordingly. Specifying an illegally aligned CC will result in UNPREDICTABLE behavior.

An FP condition code is set by an FP compare instruction, C.cond.fmt, and the MIPS-3D compare absolute instruction CABS.cond.fmt.

Restrictions:

Processor operation is UNPREDICTABLE if a branch, jump, ERET, DERET, or WAIT instruction is placed in the delay slot of a branch or jump.

Operation:

I:    condition = (FPConditionCode(cc) = 1) or
                   (FPConditionCode(cc+1) = 1) or
                   (FPConditionCode(cc+2) = 1) or
                   (FPConditionCode(cc+3) = 1)
      target_offset = (offset₁₅)^{GPRLEN-(16+2)}|| offset || 0²
I+1:  if condition then
         PC = PC + target_offset
      endif

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Programming Notes:

With the 18-bit signed instruction offset, the conditional branch range is ±128 KBytes. Use jump (J) or jump register (JR) instructions to branch to addresses outside this range.

Encoding:

COP1 (010001) | fmt | cond

Format:

CABS.cond.fmt		Floating Point Absolute Compare
CABS.cond.S cc,fs,ft	MIPS-3D	Floating Point Absolute Compare
CABS.cond.D cc,fs,ft	MIPS-3D	Floating Point Absolute Compare
CABS.cond.PS cc,fs,ft	MIPS-3D	Floating Point Absolute Compare

Purpose:

Floating Point Absolute Compare

To compare FP values and record the boolean result in one or more condition codes

Description:

FPConditionCode(cc) = FPR[fs] compare_absolute_cond FPR[ft]

The absolute value in FPR fs is compared to the absolute value in FPR ft; the values are in format fmt. The comparison is exact and neither overﬂows nor underﬂows.

If the comparison specified by cond_2..1 is true for the operand values, the result is true; otherwise, the result is false. If no exception is taken, the result is written into condition code CC; true is 1 and false is 0.

CABS.cond.PS compares the upper and lower halves of FPR fs and FPR ft independently and writes the results into condition codes CC+1 and CC respectively. The CC number must be even. If the number is not even, the operation of the instruction is UNPREDICTABLE.

See the description of the C.cond.fmt instruction in Volume II of this multi-volume set for a complete description of the cond value and the behavior of the compare instruction.

Restrictions:

The fields fs and ft must specify FPRs valid for operands of type fmt; if they are not valid, the result is UNPREDICTABLE.

The operands must be values in format fmt; if they are not, the result is UNPREDICTABLE and the value of the operand FPRs becomes UNPREDICTABLE.

The result of CABS.cond.PS is UNPREDICTABLE if the processor is executing in 16 FP registers mode, or if the condition code number is odd.

Operation:

if SNaN(ValueFPR(fs, fmt)) or SNaN(ValueFPR(ft, fmt)) or
   QNaN(ValueFPR(fs, fmt)) or QNaN(ValueFPR(ft, fmt)) then
   less = false
   equal = false
   unordered = true
   if (SNaN(ValueFPR(fs, fmt)) or SNaN(ValueFPR(ft, fmt))) or
      (cond₃ and (QNaN(ValueFPR(fs, fmt)) or QNaN(ValueFPR(ft, fmt)))) then
      SignalException(InvalidOperation)
   endif
else
   less =   AbsoluteValue(ValueFPR(fs, fmt)) <_fmt
             AbsoluteValue(ValueFPR(ft, fmt))
   equal =  AbsoluteValue(ValueFPR(fs, fmt)) =_fmt
             AbsoluteValue(ValueFPR(ft, fmt))
   unordered = false
endif
condition = (cond₂ and less) or (cond₁ and equal)
      or (cond₀ and unordered)
SetFPConditionCode(cc, condition)

For CABS.cond.PS, the pseudocode above is repeated for both halves of the operand registers, treating each half as an independent single-precision value. Exceptions on the two halves are logically ORed and reported together. The result of the lower half comparison is written to condition code CC; the result of the upper half comparison is written to condition code CC+1.
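The predicate computed by the pseudocode can be sketched in Python as follows. The names `cabs_cond` and `cabs_cond_ps` are hypothetical, a paired-single register is represented as an `(upper, lower)` tuple, and the sketch deliberately leaves out the Invalid Operation signaling controlled by cond bit 3 and by signaling NaNs.

```python
import math

def cabs_cond(cond: int, fs: float, ft: float) -> bool:
    """Evaluate the low three cond bits against |fs| and |ft|."""
    if math.isnan(fs) or math.isnan(ft):
        less, equal, unordered = False, False, True
    else:
        less = abs(fs) < abs(ft)
        equal = abs(fs) == abs(ft)
        unordered = False
    cond2, cond1, cond0 = (cond >> 2) & 1, (cond >> 1) & 1, cond & 1
    return bool((cond2 and less) or (cond1 and equal) or (cond0 and unordered))

def cabs_cond_ps(cond: int, fs, ft):
    """Paired-single form: returns (result for CC+1, result for CC)."""
    return (cabs_cond(cond, fs[0], ft[0]), cabs_cond(cond, fs[1], ft[1]))
```

For instance, `cabs_cond(0b100, -1.0, 2.0)` is true because only the "less" bit is selected and |-1.0| < |2.0|, whereas an ordinary signed compare of the same operands would also be true only by coincidence of sign.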

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation

Encoding:

COP1 (010001) | fmt (10100) | 00000 | CVT.PS.PW (100110)

Format:

CVT.PS.PW fd,fs

MIPS-3D

Floating Point Convert Paired Word to Paired Single

Purpose:

Floating Point Convert Paired Word to Paired Single

To convert a pair of 32-bit fixed point words to an FP paired-single value

Description:

FPR[fd] = convert_and_round(FPR[fs]_63..32) || convert_and_round(FPR[fs]_31..0)

The value in FPR fs, in format PW, is converted to a value in paired-single ﬂoating point format and rounded according to the current rounding mode in FCSR. The result is placed in FPR fd.

Restrictions:

The fields fs and fd must specify valid FPRs: fs for type PW and fd for type PS. If they are not valid, the result is UNPREDICTABLE. The operand in register fs must be a value in format PW; if it is not, the result is UNPREDICTABLE and the value in the operand FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in 16 FP registers mode.

Operation:

StoreFPR(fd, PS,
                ConvertFmt(ValueFPR(fs, PW)_63..32, W, S) ||
                ConvertFmt(ValueFPR(fs, PW)_31..0, W, S)
      )
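The per-half conversion can be illustrated with the Python sketch below. It assumes round-to-nearest, represents the PW operand as an `(upper, lower)` tuple of Python integers, and uses hypothetical helper names `to_single` and `cvt_ps_pw`.

```python
import struct

def to_single(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def cvt_ps_pw(fs_words):
    """Convert a pair of 32-bit signed words to a pair of single values."""
    for word in fs_words:
        if not -(1 << 31) <= word <= (1 << 31) - 1:
            raise ValueError("operand is not a 32-bit signed word")
    upper, lower = fs_words
    # Every 32-bit integer is exact in binary64, so only one rounding
    # step (to binary32) takes place.
    return (to_single(float(upper)), to_single(float(lower)))
```

Rounding matters only for magnitudes above 2^24: `cvt_ps_pw((16777217, -3))` yields `(16777216.0, -3.0)`, since 2^24 + 1 has no exact single-precision representation.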

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation

Encoding:

COP1 (010001) | fmt (10110) | 00000 | CVT.PW.PS (100100)

Format:

CVT.PW.PS fd,fs

MIPS-3D

Floating Point Convert Paired Single to Paired Word

Purpose:

Floating Point Convert Paired Single to Paired Word

To convert an FP paired-single value to a pair of 32-bit fixed point words

Description:

FPR[fd].PU = convert_and_round(FPR[fs].PU); FPR[fd].PL = convert_and_round(FPR[fs].PL)

The values in FPR fs, in format PS, are converted to a pair of values in 32-bit word fixed point format and rounded according to the current rounding mode in FCSR. The result is placed in FPR fd. The conversions of the two halves are done independently.

When either source value is Infinity, NaN, or rounds to an integer outside the range -2^31 to 2^31-1, the result cannot be represented correctly, an IEEE Invalid Operation condition exists, and the Invalid Operation flag is set in the FCSR. If the Invalid Operation Enable bit is set in the FCSR, no result is written to fd and an Invalid Operation exception is taken immediately. Otherwise, the default result, 2^31-1, is written to the half of FPR fd corresponding to the source value that caused the exception.

Restrictions:

The fields fs and fd must specify valid FPRs: fs for type PS and fd for type PW. If they are not valid, the result is UNPREDICTABLE. The data in the specified operand register fs must be a value in format PS; if it is not, the result is UNPREDICTABLE and the value in the operand FPR becomes UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in 16 FP registers mode.

Operation:

StoreFPR(fd, PW,
                ConvertFmt(ValueFPR(fs, PS)_63..32, S, W) ||
                ConvertFmt(ValueFPR(fs, PS)_31..0, S, W)
      )
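The default-result behavior described for this instruction can be sketched in Python. The sketch assumes round-to-nearest mode and that the Invalid Operation Enable bit is clear, so no exception is taken; `cvt_w_s` and `cvt_pw_ps` are hypothetical names, and the returned boolean stands in for the Invalid Operation flag.

```python
import math

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1

def cvt_w_s(x: float):
    """Convert one value; returns (word, invalid_flag)."""
    if math.isnan(x) or math.isinf(x):
        return INT32_MAX, True          # default result for NaN / Infinity
    rounded = round(x)                  # Python rounds half to even
    if rounded < INT32_MIN or rounded > INT32_MAX:
        return INT32_MAX, True          # out of range: default result
    return rounded, False

def cvt_pw_ps(fs):
    """fs is an (upper, lower) tuple; the halves convert independently."""
    (upper, bad_upper), (lower, bad_lower) = cvt_w_s(fs[0]), cvt_w_s(fs[1])
    return (upper, lower), (bad_upper or bad_lower)
```

Only the offending half receives the default value: `cvt_pw_ps((float("inf"), 1.0))` returns `((2147483647, 1), True)`.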

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Overﬂow, Inexact

Encoding:

COP1 (010001) | fmt (10110) | MULR.PS (011010)

Format:

MULR.PS fd, fs, ft

MIPS-3D

Floating Point Reduction Multiply

Purpose:

Floating Point Reduction Multiply

To perform a reduction multiply on two paired-single ﬂoating point values

Description:

FPR[fd].PL = FPR[ft].PU ∗ FPR[ft].PL; FPR[fd].PU = FPR[fs].PU ∗ FPR[fs].PL

The paired-single values in FPR ft are multiplied together and the result is put in the lower paired-single position of FPR fd. Similarly, the paired-single values in FPR fs are multiplied together and the result is put in the upper paired-single position of FPR fd. The two results are calculated to infinite precision and rounded using the current rounding mode in FCSR. The operands and result are values in format PS.

Any exceptions generated by the two independent multiplies are ORed together. Cause bits are ORed into the Flag bits if no exception is taken.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If they are not valid, the result is UNPREDICTABLE.

The operands must be values in format PS. If they are not, the result is UNPREDICTABLE and the values in the operand FPRs become UNPREDICTABLE.

The result of MULR.PS is UNPREDICTABLE if the processor is executing in 16 FP registers mode.

Operation:

lower = ValueFPR(ft, PS)_31..0 × ValueFPR(ft, PS)_63..32
upper = ValueFPR(fs, PS)_31..0 × ValueFPR(fs, PS)_63..32
StoreFPR(fd, PS, upper || lower)

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Overﬂow, Inexact, Underﬂow

Encoding:

COP1 (010001) | fmt | 00000 | RECIP1 (011101)

Format:

RECIP1.fmt		Floating Point Reduced Precision Reciprocal (Sequence Step 1)
RECIP1.S fd,fs	MIPS-3D	Floating Point Reduced Precision Reciprocal (Sequence Step 1)
RECIP1.D fd,fs	MIPS-3D	Floating Point Reduced Precision Reciprocal (Sequence Step 1)
RECIP1.PS fd,fs	MIPS-3D	Floating Point Reduced Precision Reciprocal (Sequence Step 1)

Purpose:

Floating Point Reduced Precision Reciprocal (Sequence Step 1)

Generate a reduced-precision reciprocal of one or two FP values

Description:

FPR[fd] = 1.0 / FPR[fs]

The reciprocal of the value in FPR fs is approximated and placed in FPR fd. The operand and result are values in format S, D, or PS.

The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating Point standard. A minimum accuracy of 14 bits is recommended for both the S and D input data formats.

It is implementation dependent whether the result is affected by the current rounding mode in FCSR. This instruction is meant to operate in RN (round to nearest) mode for the best accuracy. It is also meant to operate in the Flush to Zero (FS=0) mode. In this mode, if the incoming data is in the denormalized range, it is assumed to be zero, and if the output is in the denormalized range, it is forced to zero.

In addition, if the input to this instruction is zero, the output is not inﬁnity, but the maximum normalized value. This property is useful for 3D graphics applications. If the input is inﬁnity, the output is zero.

This instruction is used as the ﬁrst step of an instruction sequence that can be used to produce a full precision reciprocal value. See the description of RECIP2.fmt for an example of how to use this instruction in a code sequence to produce a full precision reciprocal result.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If they are not valid, the result is UNPREDICTABLE. The data in the specified operand register fs must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, fmt, (1.0 / ValueFPR(fs, fmt))_{ReducedPrecision})

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Overﬂow, Inexact, Underﬂow, Division-by-zero

Encoding:

COP1 (010001) | fmt | RECIP2 (011100)

Format:

RECIP2.fmt		Floating Point Reduced Precision Reciprocal (Sequence Step 2)
RECIP2.S fd,fs,ft	MIPS-3D	Floating Point Reduced Precision Reciprocal (Sequence Step 2)
RECIP2.D fd,fs,ft	MIPS-3D	Floating Point Reduced Precision Reciprocal (Sequence Step 2)
RECIP2.PS fd,fs,ft	MIPS-3D	Floating Point Reduced Precision Reciprocal (Sequence Step 2)

Purpose:

Floating Point Reduced Precision Reciprocal (Sequence Step 2)

Take the result of RECIP1.fmt and iterate towards obtaining a full precision reciprocal FP value

Description:

FPR[fd] = iterate with FPR[fs] and FPR[ft]

This is the second step in the instruction sequence used to generate a full precision reciprocal result. (RECIP1.fmt instruction is the ﬁrst step). The operand and result are values in format S, D, or PS.

The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating Point standard.

This instruction is meant to operate in the Flush to Zero (FS=0) mode. In this mode, if the incoming data is in the denormalized range, it is assumed to be zero, and if the output is in the denormalized range, it is forced to zero.

The example below shows how a full precision reciprocal result can be obtained using the RECIP1 and RECIP2 instructions. Assume that a value b is in register f0 in format S. Assume that RECIP1.fmt produces a 16-bit result. At the end of the three-instruction sequence shown below, register f3 contains the full precision 24-bit reciprocal 1/b.

RECIP1.S  f1, f0          /* reduced precision 16-bit 1/b */
RECIP2.S  f2, f1, f0      /* -(b * f1 - 1.0) */
MADD.S    f3, f1, f1, f2  /* 24-bit 1/b */

The instruction sequence to produce a double-precision, 53-bit result is as follows:

RECIP1.D  f1, f0          /* reduced precision 16-bit 1/b */
RECIP2.D  f2, f1, f0      /* -(b * f1 - 1.0) */
MADD.D    f3, f1, f1, f2  /* 32-bit 1/b */
RECIP2.D  f4, f3, f0      /* -(b * f3 - 1.0) */
MADD.D    f5, f3, f3, f4  /* 53-bit 1/b */

The instruction sequence to take a paired single value and produce a paired single result is as follows. Assume that register f0 holds two single values a and b in a paired single format, i.e., f0 = a | b.

RECIP1.PS f1, f0          /* ( reduced precision 16-bit 1/a and 1/b ) */
RECIP2.PS f2, f1, f0      /* ( -(a*f1-1.0) and -(b*f1-1.0) ) */
MADD.PS   f3, f1, f1, f2  /* ( 24-bit 1/a and 1/b ) */

If the hardware does not implement the RECIP1.PS instruction, it is still possible to obtain a paired single result, but this takes three more instructions in the required sequence. Assume that register f0 holds a single value a and register f1 holds a single value b.

RECIP1.S  f2, f0          /* ( f2 gets reduced precision 1/a ) */
RECIP1.S  f3, f1          /* ( f3 gets reduced precision 1/b ) */
CVT.PS.S  f4, f1, f0      /* ( f4 now holds the PS values b | a ) */
CVT.PS.S  f5, f3, f2      /* ( f5 holds PS seed 1/b | 1/a ) */
RECIP2.PS f6, f5, f4      /* ( f6 holds intermediate 1/b | 1/a ) */
MADD.PS   f7, f5, f5, f6  /* ( f7 holds full precision PS 1/b | 1/a ) */
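The RECIP2 and MADD pair in the sequences above amounts to one Newton-Raphson refinement step, x1 = x0 + x0 * (1 - b * x0). The Python sketch below mirrors that arithmetic using ordinary Python floats rather than the hardware formats; the function names are hypothetical, `recip2` follows the `-(b * f1 - 1.0)` comment in the listings, and the seed value is an arbitrary stand-in for a RECIP1 result.

```python
def recip2(fs: float, ft: float) -> float:
    """Model of the RECIP2 step as commented above: -(fs * ft - 1.0)."""
    return -(fs * ft - 1.0)

def madd(fr: float, fs: float, ft: float) -> float:
    """Model of MADD fd, fr, fs, ft: fs * ft + fr."""
    return fs * ft + fr

def refine_reciprocal(b: float, seed: float) -> float:
    """One RECIP2 + MADD step applied to an approximate 1/b."""
    correction = recip2(seed, b)          # 1 - b * seed
    return madd(seed, seed, correction)   # seed + seed * (1 - b * seed)

b = 3.0
x0 = 0.333                       # stand-in for a reduced-precision seed
x1 = refine_reciprocal(b, x0)    # relative error is roughly squared
x2 = refine_reciprocal(b, x1)    # and squared again
```

Each step roughly doubles the number of correct bits, which is why the single-precision sequence needs one step and the double-precision sequence needs two.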

Restrictions:

The ﬁelds fs, ft, and fd must specify FPRs valid for operands of type fmt. If they are not valid, the result is UNPREDICTABLE. The format of the data in the speciﬁed operand register fs must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value in the operand FPR becomes UNPREDICTABLE.

The result of RECIP2.PS is UNPREDICTABLE if the processor is executing in 16 FP registers mode.

Operation:

StoreFPR(fd, fmt, RECIP_iteration(ValueFPR(fs, fmt), ValueFPR(ft, fmt)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Inexact, Invalid Operation, Overﬂow, Underﬂow

Encoding:

COP1 (010001) | fmt | 00000 | RSQRT1 (011110)

Format:

RSQRT1.fmt		Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 1)
RSQRT1.S fd, fs	MIPS-3D	Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 1)
RSQRT1.D fd, fs	MIPS-3D	Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 1)
RSQRT1.PS fd, fs	MIPS-3D	Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 1)

Purpose:

Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 1)

To produce a reduced-precision reciprocal of the square root of one or two FP values

Description:

FPR[fd] = 1.0 / sqrt (FPR[fs])

The reciprocal of the positive square root of the value in FPR fs is approximated and placed in FPR fd. The operand and result are values in format S, D, or PS.

The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating Point standard. A minimum accuracy of 14 bits is recommended for the S input data format, and 23 bits for the D data format.

It is implementation dependent whether the result is affected by the current rounding mode in FCSR.

This instruction is used as the ﬁrst step of an instruction sequence that can be used to produce a full precision reciprocal square root value. See the description of RSQRT2.fmt for an example of how to use this instruction in a code sequence to produce a full precision reciprocal square root result.

Restrictions:

The fields fs and fd must specify FPRs valid for operands of type fmt. If they are not valid, the result is UNPREDICTABLE. The data in the specified operand register fs must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value in the operand FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, fmt, (1.0 / SquareRoot(ValueFPR(fs, fmt)))_{ReducedPrecision})

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Overﬂow, Inexact, Underﬂow, Division-by-zero

Encoding:

COP1 (010001) | fmt | RSQRT2 (011111)

Format:

RSQRT2.fmt		Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2)
RSQRT2.S fd, fs, ft	MIPS-3D	Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2)
RSQRT2.D fd, fs, ft	MIPS-3D	Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2)
RSQRT2.PS fd, fs, ft	MIPS-3D	Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2)

Purpose:

Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2)

Iterate towards obtaining a full precision reciprocal square root FP value

Description:

FPR[fd] = iterate with FPR[fs] and FPR[ft]

This is a step of iteration towards generating the full precision reciprocal square root value. The operand and result are values in format S, D, or PS.

The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating Point standard.

It is implementation dependent whether the result is affected by the current rounding mode in FCSR.

A full precision reciprocal square root result is obtained by using the instruction sequence shown below. Assume that a value b is in register f0 in format S. Assume that RSQRT1.fmt has a 16-bit precision in the example implementation. At the end of the four-instruction sequence shown below, register f4 contains the full precision 24-bit reciprocal square root 1/sqrt(b).

RSQRT1.S  f1, f0          /* 16-bit 1/sqrt(b) */
MUL.S     f2, f1, f0      /* b * f1 */
RSQRT2.S  f3, f2, f1      /* -(f1 * f2 - 1.0)/2 */
MADD.S    f4, f1, f1, f3  /* 24-bit 1/sqrt(b) */

The instruction sequence to produce a double-precision, 53-bit result is as follows:

RSQRT1.D  f1, f0          /* 16-bit 1/sqrt(b) */
MUL.D     f2, f1, f0      /* b * f1 */
RSQRT2.D  f3, f2, f1      /* -(f1 * f2 - 1.0)/2 */
MADD.D    f4, f1, f1, f3  /* 31-bit 1/sqrt(b) */
MUL.D     f5, f0, f4      /* b * f4 */
RSQRT2.D  f6, f5, f4      /* -(f4 * f5 - 1.0)/2 */
MADD.D    f7, f4, f4, f6  /* 53-bit 1/sqrt(b) */

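The instruction sequence to take a paired single value and produce a paired single result is as follows. Assume that register f0 holds two single values a and b in a paired single format, i.e., f0 = a | b.

RSQRT1.PS f1, f0          /* ( 16-bit 1/sqrt(a) and 1/sqrt(b) ) */
MUL.PS    f2, f1, f0      /* ( a * f1 and b * f1 ) */
RSQRT2.PS f3, f2, f1      /* ( -(f1*f2-1.0)/2 ) */
MADD.PS   f4, f1, f1, f3  /* ( 24-bit 1/sqrt(a) and 1/sqrt(b) ) */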

If the hardware does not implement the RSQRT1.PS instruction, it is still possible to obtain a paired single result, but this takes three more instructions in the required sequence. Assume that register f0 holds a single value a and register f1 holds a single value b.

RSQRT1.S  f2, f0          /* ( f2 gets reduced precision 1/sqrt(a) ) */
RSQRT1.S  f3, f1          /* ( f3 gets reduced precision 1/sqrt(b) ) */
CVT.PS.S  f4, f1, f0      /* ( f4 now holds the PS values b | a ) */
CVT.PS.S  f5, f3, f2      /* ( f5 holds PS seed 1/sqrt(b) | 1/sqrt(a) ) */
MUL.PS    f6, f5, f4      /* ( f6 holds intermediate1 results ) */
RSQRT2.PS f7, f6, f5      /* ( f7 holds intermediate2 results ) */
MADD.PS   f8, f5, f5, f7  /* ( f8 holds full precision PS 1/sqrt(b) | 1/sqrt(a) ) */
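The MUL, RSQRT2, and MADD steps in the sequences above together form one Newton-Raphson refinement, x1 = x0 + x0 * (1 - b * x0 * x0) / 2. The Python sketch below mirrors that arithmetic with ordinary Python floats instead of the hardware formats; the function names are hypothetical, `rsqrt2` follows the `-(f1 * f2 - 1.0)/2` comment in the listings, and the seed is an arbitrary stand-in for an RSQRT1 result.

```python
def rsqrt2(fs: float, ft: float) -> float:
    """Model of the RSQRT2 step as commented above: -(fs * ft - 1.0) / 2."""
    return -(fs * ft - 1.0) / 2.0

def madd(fr: float, fs: float, ft: float) -> float:
    """Model of MADD fd, fr, fs, ft: fs * ft + fr."""
    return fs * ft + fr

def refine_rsqrt(b: float, seed: float) -> float:
    """One MUL + RSQRT2 + MADD step applied to an approximate 1/sqrt(b)."""
    product = seed * b                    # MUL:    b * seed
    correction = rsqrt2(product, seed)    # RSQRT2: (1 - b * seed^2) / 2
    return madd(seed, seed, correction)   # MADD:   seed + seed * correction

b = 4.0
x0 = 0.49                      # stand-in for a reduced-precision seed (true value 0.5)
x1 = refine_rsqrt(b, x0)
x2 = refine_rsqrt(b, x1)
```

As with the reciprocal sequence, the error shrinks roughly quadratically per step, so one step suffices for single precision and two for double precision in the listings above.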

Restrictions:

The ﬁelds fs, ft, and fd must specify FPRs valid for operands of type fmt. If they are not valid, the result is UNPREDICTABLE. The format of the data in the speciﬁed operand register fs must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand FPR becomes UNPREDICTABLE.

Operation:

StoreFPR(fd, fmt, RSQRT_iteration(ValueFPR(fs, fmt), ValueFPR(ft, fmt)))

Exceptions:

Coprocessor Unusable, Reserved Instruction

Floating Point Exceptions:

Unimplemented Operation, Invalid Operation, Overﬂow, Inexact, Underﬂow