COP1 010001 | fmt | ft | fs | fd | RSQRT2 011111 |
6 | 5 | 5 | 5 | 5 | 6 |
RSQRT2.fmt | Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2) |
RSQRT2.S fd, fs, ft | MIPS-3D |
RSQRT2.D fd, fs, ft | MIPS-3D |
RSQRT2.PS fd, fs, ft | MIPS-3D |
Floating Point Reduced Precision Reciprocal Square Root (Sequence Step 2)
Iterate towards obtaining a full precision reciprocal square root FP value
FPR[fd] = iterate with FPR[fs] and FPR[ft]
This instruction performs one step of the iteration toward a full precision reciprocal square root value. The operands and result are values in format S, D, or PS.
The numeric accuracy of this operation is implementation dependent; it does not meet the accuracy specified by the IEEE 754 Floating Point standard.
It is implementation dependent whether the result is affected by the current rounding mode in FCSR.
A full precision reciprocal square root result is obtained by using the instruction sequence shown below. Assume that a value b is in register f0 in format S, and that RSQRT1.fmt has 16-bit precision in the example implementation. At the end of the four-instruction sequence, register f4 contains the full precision 24-bit reciprocal square root 1/sqrt(b).
RSQRT1.S f1, f0          /* 16-bit 1/sqrt(b) */
MUL.S    f2, f1, f0      /* b * f1 */
RSQRT2.S f3, f2, f1      /* -(f1 * f2 - 1.0)/2 */
MADD.S   f4, f1, f1, f3  /* 24-bit 1/sqrt(b) */
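The four-instruction single precision sequence above can be modeled numerically. The following is a minimal Python sketch, not a description of any hardware: `f32`, `rsqrt1_model`, `rsqrt2_model`, and `rsqrt_single` are hypothetical helper names, the 16-bit seed is simulated by truncating the exact value, and the RSQRT2 formula is taken from the comments in the sequence. Real implementations may round differently.

```python
import math
import struct

def f32(x):
    """Round a Python float (a double) to IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def rsqrt1_model(b, bits=16):
    """Assumed stand-in for RSQRT1.S: 1/sqrt(b) truncated to `bits` significand bits."""
    exact = 1.0 / math.sqrt(b)
    m, e = math.frexp(exact)              # exact == m * 2**e with 0.5 <= m < 1
    scale = 1 << bits
    return f32(math.ldexp(math.floor(m * scale) / scale, e))

def rsqrt2_model(fs, ft):
    """Assumed model of RSQRT2.S, per the sequence comments: -(fs * ft - 1.0) / 2."""
    return f32(-(fs * ft - 1.0) / 2.0)

def rsqrt_single(b):
    """Return (seed, refined) estimates of 1/sqrt(b), mirroring the sequence."""
    f0 = f32(b)
    f1 = rsqrt1_model(f0)                 # RSQRT1.S f1, f0
    f2 = f32(f1 * f0)                     # MUL.S    f2, f1, f0
    f3 = rsqrt2_model(f2, f1)             # RSQRT2.S f3, f2, f1
    f4 = f32(f1 * f3 + f1)                # MADD.S   f4, f1, f1, f3
    return f1, f4
```

In this model the refined value equals f1 * (1 + (1 - b * f1 * f1) / 2), which is one Newton-Raphson step for 1/sqrt(b), so the relative error of the seed is roughly squared.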
The instruction sequence to produce a full precision 53-bit result in format D is as follows:
RSQRT1.D f1, f0          /* 16-bit 1/sqrt(b) */
MUL.D    f2, f1, f0      /* b * f1 */
RSQRT2.D f3, f2, f1      /* -(f1 * f2 - 1.0)/2 */
MADD.D   f4, f1, f1, f3  /* 31-bit 1/sqrt(b) */
MUL.D    f5, f0, f4      /* b * f4 */
RSQRT2.D f6, f5, f4      /* -(f4 * f5 - 1.0)/2 */
MADD.D   f7, f4, f4, f6  /* 53-bit 1/sqrt(b) */
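To see why format D needs two refinement steps, the sequence above can be traced in Python, whose floats are IEEE 754 doubles. This is only a sketch under stated assumptions: `rsqrt1_model`, `rsqrt2_model`, `rsqrt_double`, and `rel_error` are hypothetical names, the 16-bit seed is simulated by truncation, and MADD.D is modeled as a separate multiply and add.

```python
import math

def rsqrt1_model(b, bits=16):
    """Assumed stand-in for RSQRT1.D: 1/sqrt(b) truncated to `bits` significand bits."""
    exact = 1.0 / math.sqrt(b)
    m, e = math.frexp(exact)              # exact == m * 2**e with 0.5 <= m < 1
    scale = 1 << bits
    return math.ldexp(math.floor(m * scale) / scale, e)

def rsqrt2_model(fs, ft):
    """Assumed model of RSQRT2.D, per the sequence comments: -(fs * ft - 1.0) / 2."""
    return -(fs * ft - 1.0) / 2.0

def rsqrt_double(b):
    """Return the estimates held in f1, f4, and f7 by the seven-instruction sequence."""
    f0 = b
    f1 = rsqrt1_model(f0)                 # RSQRT1.D f1, f0
    f2 = f1 * f0                          # MUL.D    f2, f1, f0
    f3 = rsqrt2_model(f2, f1)             # RSQRT2.D f3, f2, f1
    f4 = f1 * f3 + f1                     # MADD.D   f4, f1, f1, f3
    f5 = f0 * f4                          # MUL.D    f5, f0, f4
    f6 = rsqrt2_model(f5, f4)             # RSQRT2.D f6, f5, f4
    f7 = f4 * f6 + f4                     # MADD.D   f7, f4, f4, f6
    return f1, f4, f7

def rel_error(x, b):
    """Relative error of x as an estimate of 1/sqrt(b)."""
    return abs(x * math.sqrt(b) - 1.0)
```

Calling `rsqrt_double(2.0)` and passing each returned value to `rel_error` shows the error shrinking at every stage: each step roughly doubles the number of correct bits, so a 16-bit seed cannot reach the full double precision significand in a single step.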
The instruction sequence to take a paired single value and produce a paired single result is as follows. Assume that register f0 holds two single values a and b in paired single format, i.e., f0 = a | b.
RSQRT1.PS f1, f0          /* ( 16-bit 1/sqrt(a) and 1/sqrt(b) ) */
MUL.PS    f2, f1, f0      /* ( a and b, each times its 16-bit estimate in f1 ) */
RSQRT2.PS f3, f2, f1      /* ( -(f1*f2-1.0)/2 ) */
MADD.PS   f4, f1, f1, f3  /* ( 24-bit 1/sqrt(a) and 1/sqrt(b) ) */
If the hardware does not implement the RSQRT1.PS instruction, it is still possible to obtain a paired single result, but the required sequence is three instructions longer. Assume that register f0 holds a single value a and register f1 holds a single value b.
RSQRT1.S  f2, f0          /* ( f2 gets reduced precision 1/sqrt(a) ) */
RSQRT1.S  f3, f1          /* ( f3 gets reduced precision 1/sqrt(b) ) */
CVT.PS.S  f4, f1, f0      /* ( f4 now holds the PS values b | a ) */
CVT.PS.S  f5, f3, f2      /* ( f5 holds PS seed 1/sqrt(b) | 1/sqrt(a) ) */
MUL.PS    f6, f5, f4      /* ( f6 holds intermediate1 results ) */
RSQRT2.PS f7, f6, f5      /* ( f7 holds intermediate2 results ) */
MADD.PS   f8, f5, f5, f7  /* ( f8 holds full precision PS 1/sqrt(b) | */
                          /* 1/sqrt(a) ) */
The fields fs, ft, and fd must specify FPRs valid for operands of type fmt. If they are not valid, the result is UNPREDICTABLE. The data in the specified operand register fs must be a value in format fmt; if it is not, the result is UNPREDICTABLE and the value of the operand FPR becomes UNPREDICTABLE.
StoreFPR(fd, fmt, RSQRT_iteration(ValueFPR(fs, fmt), ValueFPR(ft, fmt)))
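The operation line above leaves RSQRT_iteration undefined. A minimal sketch, assuming the formula given in the example comments, -(fs * ft - 1.0)/2, and ignoring the implementation dependent rounding and accuracy, could look like this in Python. The names `rsqrt_iteration` and `rsqrt2_ps` are hypothetical, and a paired single value is represented here as an (upper, lower) tuple.

```python
def rsqrt_iteration(fs, ft):
    """Assumed form of RSQRT_iteration, from the example comments: -(fs * ft - 1.0) / 2."""
    return -(fs * ft - 1.0) / 2.0

def rsqrt2_ps(fs, ft):
    """Apply the step independently to each lane of two (upper, lower) pairs, as for format PS."""
    return tuple(rsqrt_iteration(s, t) for s, t in zip(fs, ft))
```

When fs holds b times the current estimate and ft holds the estimate itself, the product is close to 1.0, so the result is a small correction term that the following MADD scales by the estimate and adds back.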
Coprocessor Unusable, Reserved Instruction
Unimplemented Operation, Invalid Operation, Overflow, Inexact, Underflow