COP1X 010011 | rs | ft | fs | fd | ALNV.PS 011110
6 | 5 | 5 | 5 | 5 | 6
ALNV.PS fd, fs, ft, rs
MIPS64, MIPS32 Release 2, removed in Release 6
Floating Point Align Variable
To align a misaligned pair of paired single values.
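The field layout in the encoding (a 6-bit COP1X opcode, four 5-bit register fields in the order rs, ft, fs, fd, and a 6-bit function code) can be checked with a small encoder. This is only a sketch: the function name `encode_alnv_ps` and its argument order are hypothetical choices for illustration, not part of any toolchain.

```c
#include <stdint.h>

/* Hypothetical helper: pack the ALNV.PS fields into a 32-bit instruction word.
 * Field order from most to least significant bit: COP1X, rs, ft, fs, fd, function. */
static uint32_t encode_alnv_ps(unsigned fd, unsigned fs, unsigned ft, unsigned rs)
{
    const uint32_t COP1X   = 0x13u; /* 010011, bits 31..26 */
    const uint32_t ALNV_PS = 0x1Eu; /* 011110, bits 5..0   */

    return (COP1X << 26) |
           ((uint32_t)(rs & 31u) << 21) |
           ((uint32_t)(ft & 31u) << 16) |
           ((uint32_t)(fs & 31u) << 11) |
           ((uint32_t)(fd & 31u) << 6)  |
           ALNV_PS;
}
```

For example, under this layout `encode_alnv_ps(2, 0, 1, 8)` (fd = 2, fs = 0, ft = 1, rs = 8) yields the word 0x4D01009E.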
FPR[fd] = ByteAlign(GPR[rs]_{2..0}, FPR[fs], FPR[ft])
FPR fs is concatenated with FPR ft and this value is funnel-shifted by GPR rs_{2..0} bytes, and written into FPR fd. If GPR rs_{2..0} is 0, FPR fd receives FPR fs. If GPR rs_{2..0} is 4, the operation depends on the current endianness.
Figure 3-1 illustrates the following example: for a big-endian operation and a byte alignment of 4, the upper half of FPR fd receives the lower half of the paired single value in FPR fs, and the lower half of FPR fd receives the upper half of the paired single value in FPR ft.
The move is non-arithmetic; it causes no IEEE 754 exceptions, and the Cause and Flags fields of FCSR are not modified.
The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is
UNPREDICTABLE.
If GPR rs_{1..0} is non-zero, the results are UNPREDICTABLE.
The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. The instruction is predictable only when executing on a 64-bit FPU in FR=1 mode; it is not predictable with FR=0 or on a 32-bit FPU.
This instruction has been removed in Release 6.
if GPR[rs]_{2..0} = 0 then
    StoreFPR(fd, PS, ValueFPR(fs, PS))
else if GPR[rs]_{2..0} != 4 then
    UNPREDICTABLE
else if BigEndianCPU then
    StoreFPR(fd, PS, ValueFPR(fs, PS)_{31..0} || ValueFPR(ft, PS)_{63..32})
else
    StoreFPR(fd, PS, ValueFPR(ft, PS)_{31..0} || ValueFPR(fs, PS)_{63..32})
endif
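The conditional above can be modelled in C to make the two defined alignments concrete. The following is a minimal sketch, not a simulator: the name `alnv_ps_model`, the `big_endian` parameter standing in for BigEndianCPU, and the use of a `false` return value to signal the UNPREDICTABLE case are all assumptions made for illustration. FPR contents are represented as raw 64-bit values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of ALNV.PS on raw 64-bit register images.
 * Returns false where the architecture leaves the result UNPREDICTABLE;
 * in that case *fd is not written. */
static bool alnv_ps_model(uint64_t fs, uint64_t ft, uint64_t rs,
                          bool big_endian, uint64_t *fd)
{
    unsigned align = (unsigned)(rs & 7u);   /* only GPR rs bits 2..0 are used */

    if (align == 0) {
        *fd = fs;                           /* aligned: fd receives fs unchanged */
        return true;
    }
    if (align != 4) {
        return false;                       /* bits 1..0 non-zero: UNPREDICTABLE */
    }
    if (big_endian) {
        *fd = (fs << 32) | (ft >> 32);      /* fs[31..0] || ft[63..32] */
    } else {
        *fd = (ft << 32) | (fs >> 32);      /* ft[31..0] || fs[63..32] */
    }
    return true;
}
```

With fs = 0x1111111122222222 and ft = 0x3333333344444444, an alignment of 4 gives 0x2222222233333333 in the big-endian case and 0x4444444411111111 in the little-endian case.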
Coprocessor Unusable, Reserved Instruction
ALNV.PS is designed to be used with LUXC1 to load 8 bytes of data from any 4-byte boundary. For example:
/* Copy T2 bytes (a multiple of 16) of data T0 to T1, T0 unaligned, T1 aligned.
   Reads one dw beyond the end of T0. */
      LUXC1   F0, 0(T0)       /* set up by reading 1st src dw */
      LI      T3, 0           /* index into src and dst arrays */
      ADDIU   T4, T0, 8       /* base for odd dw loads */
      ADDIU   T5, T1, -8      /* base for odd dw stores */
LOOP: LUXC1   F1, T3(T4)
      ALNV.PS F2, F0, F1, T0  /* switch F0, F1 for little-endian */
      SDC1    F2, T3(T1)
      ADDIU   T3, T3, 16
      LUXC1   F0, T3(T0)
      ALNV.PS F2, F1, F0, T0  /* switch F1, F0 for little-endian */
      BNE     T3, T2, LOOP
      SDC1    F2, T3(T5)
DONE:
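The load pattern in this loop (two doubleword loads that ignore the low address bits, followed by one align step) can be sketched in C for the big-endian case only. Everything here is an illustrative assumption rather than a hardware model: memory is a plain byte array indexed by offset, and the helper names `load_be64`, `luxc1_be`, `alnv_ps_be`, and `load_dw_4aligned` are hypothetical. `luxc1_be` relies on LUXC1 ignoring bits 2..0 of the effective address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 8 bytes as one big-endian doubleword. */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* LUXC1-style load: the low three bits of the address are ignored. */
static uint64_t luxc1_be(const uint8_t *mem, size_t addr)
{
    return load_be64(mem + (addr & ~(size_t)7));
}

/* Big-endian ALNV.PS for the two defined alignments (0 and 4) only. */
static uint64_t alnv_ps_be(uint64_t fs, uint64_t ft, size_t rs)
{
    return (rs & 7) == 0 ? fs : (fs << 32) | (ft >> 32);
}

/* Fetch the doubleword at a 4-byte-aligned offset. Like the loop above,
 * this always reads the following aligned doubleword as well. */
static uint64_t load_dw_4aligned(const uint8_t *mem, size_t addr)
{
    uint64_t f0 = luxc1_be(mem, addr);      /* LUXC1 F0, 0(T0)      */
    uint64_t f1 = luxc1_be(mem, addr + 8);  /* LUXC1 F1, T3(T4)     */
    return alnv_ps_be(f0, f1, addr);        /* ALNV.PS F2,F0,F1,T0  */
}
```

For a buffer whose byte at offset i holds the value i, `load_dw_4aligned(mem, 4)` returns 0x0405060708090A0B, the same value a direct big-endian read of bytes 4 through 11 would give.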
ALNV.PS is also useful with SUXC1 to store paired-single results in a vector loop to a possibly misaligned address:
/* T1[i] = T0[i] + F8, T0 aligned, T1 unaligned. */
      CVT.PS.S F8, F8, F8     /* make addend paired-single */
/* Loop header computes 1st pair into F0, stores high half if T1 misaligned */
LOOP: LDC1    F2, T3(T4)      /* get T0[i+2]/T0[i+3] */
      ADD.PS  F1, F2, F8      /* compute T1[i+2]/T1[i+3] */
      ALNV.PS F3, F0, F1, T1  /* align to dst memory */
      SUXC1   F3, T3(T1)      /* store to T1[i+0]/T1[i+1] */
      ADDIU   T3, 16          /* i = i + 4 */
      LDC1    F2, T3(T0)      /* get T0[i+0]/T0[i+1] */
      ADD.PS  F0, F2, F8      /* compute T1[i+0]/T1[i+1] */
      ALNV.PS F3, F1, F0, T1  /* align to dst memory */
      BNE     T3, T2, LOOP
      SUXC1   F3, T3(T5)      /* store to T1[i+2]/T1[i+3] */
/* Loop trailer stores all or half of F0, depending on T1 alignment */