Encoding:

COP1X 010011 (opcode field)

ALNV.PS 011110 (function field)
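As an illustration of how the two fixed fields above combine with the register operands, the following is a minimal sketch of an encoder. The function name `encode_alnv_ps` is hypothetical. The positions of the rs, ft, fs, and fd fields are not shown in the encoding above; the layout assumed here (opcode, rs, ft, fs, fd, function, with widths 6, 5, 5, 5, 5, 6) is the usual COP1X register layout and should be checked against the encoding diagram.

```python
COP1X = 0b010011    # opcode field value, from the encoding above
ALNV_PS = 0b011110  # function field value, from the encoding above


def encode_alnv_ps(fd, fs, ft, rs):
    """Build a 32-bit ALNV.PS instruction word (hypothetical helper).

    Assumed field layout, most significant bit first:
    opcode[31:26] rs[25:21] ft[20:16] fs[15:11] fd[10:6] function[5:0]
    """
    for name, reg in (("fd", fd), ("fs", fs), ("ft", ft), ("rs", rs)):
        if not 0 <= reg <= 31:
            raise ValueError(f"{name} must be a register number in 0..31")
    return (
        (COP1X << 26)
        | (rs << 21)
        | (ft << 16)
        | (fs << 11)
        | (fd << 6)
        | ALNV_PS
    )
```

For example, `encode_alnv_ps(fd=2, fs=0, ft=1, rs=8)` corresponds to the assembly form `ALNV.PS $f2, $f0, $f1, $8` under the assumed layout.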

Format:

ALNV.PS fd, fs, ft, rs

MIPS64, MIPS32 Release 2, removed in Release 6

Floating Point Align Variable

Purpose:

To align a misaligned pair of paired single values.

Description:

FPR[fd] = ByteAlign(GPR[rs]_2..0, FPR[fs], FPR[ft])

FPR fs is concatenated with FPR ft, and this value is funnel-shifted by GPR rs_2..0 bytes and written into FPR fd. If GPR rs_2..0 is 0, FPR fd receives FPR fs. If GPR rs_2..0 is 4, the operation depends on the current endianness.

Figure 3-1 illustrates the following example: for a big-endian operation and a byte alignment of 4, the upper half of FPR fd receives the lower half of the paired single value in FPR fs, and the lower half of FPR fd receives the upper half of the paired single value in FPR ft.

The move is non-arithmetic; it causes no IEEE 754 exceptions, and the FCSR_Cause and FCSR_Flags fields are not modified.

Restrictions:

The fields fs, ft, and fd must specify FPRs valid for operands of type PS. If the fields are not valid, the result is UNPREDICTABLE.

If GPR rs_1..0 are non-zero, the results are UNPREDICTABLE.

The result of this instruction is UNPREDICTABLE if the processor is executing in the FR=0 32-bit FPU register model. The instruction is predictable only when executing on a 64-bit FPU in FR=1 mode; it is not predictable with FR=0 or on a 32-bit FPU.
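The alignment and register-model restrictions above can be summarized as a small predicate. This is a sketch only: the function name and its parameters (`rs_val`, `fr_bit`, `fpu_is_64bit`) are hypothetical, and the check that fs, ft, and fd name FPRs valid for type PS is not modeled.

```python
def alnv_ps_is_predictable(rs_val, fr_bit, fpu_is_64bit):
    """Return True when the restrictions above do not make the result UNPREDICTABLE.

    rs_val       -- contents of GPR rs (only the low bits matter)
    fr_bit       -- current FR mode, 0 or 1
    fpu_is_64bit -- True when the FPU is a 64-bit FPU
    """
    # Requires a 64-bit FPU running in FR=1 mode.
    if not fpu_is_64bit or fr_bit != 1:
        return False
    # GPR rs bits 1..0 must be zero, so the byte alignment is 0 or 4.
    return (rs_val & 0x3) == 0
```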

Availability and Compatibility:

This instruction has been removed in Release 6.

Operation:

if GPR[rs]_2..0 = 0 then
   StoreFPR(fd, PS, ValueFPR(fs, PS))
else if GPR[rs]_2..0 != 4 then
   UNPREDICTABLE
else if BigEndianCPU then
   StoreFPR(fd, PS, ValueFPR(fs, PS)_31..0 || ValueFPR(ft, PS)_63..32)
else
   StoreFPR(fd, PS, ValueFPR(ft, PS)_31..0 || ValueFPR(fs, PS)_63..32)
endif
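The pseudocode above can be modeled in software on 64-bit register images. The following is a minimal sketch: the function name `alnv_ps` and its parameters are hypothetical, register contents are passed as plain integers, and the UNPREDICTABLE case is represented by raising an exception, which is a modeling choice rather than architectural behavior.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def alnv_ps(fs_val, ft_val, rs_val, big_endian):
    """Return the 64-bit value written to FPR fd, following the Operation pseudocode.

    fs_val, ft_val -- 64-bit paired-single register images
    rs_val         -- contents of GPR rs; only bits 2..0 are examined
    big_endian     -- True to model BigEndianCPU
    """
    fs_val &= MASK64
    ft_val &= MASK64
    align = rs_val & 0x7
    if align == 0:
        # Aligned case: fd receives fs unchanged.
        return fs_val
    if align != 4:
        # Modeled as an error; the architecture says UNPREDICTABLE.
        raise ValueError("UNPREDICTABLE: GPR rs bits 2..0 must be 0 or 4")
    if big_endian:
        # fs_31..0 || ft_63..32
        return ((fs_val & MASK32) << 32) | (ft_val >> 32)
    # ft_31..0 || fs_63..32
    return ((ft_val & MASK32) << 32) | (fs_val >> 32)
```

With fs holding 0x1111111122222222 and ft holding 0x3333333344444444, an alignment of 4 selects the inner two words in big-endian mode and the outer two words (swapped) in little-endian mode.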

Exceptions:

Coprocessor Unusable, Reserved Instruction

Programming Notes:

ALNV.PS is designed to be used with LUXC1 to load 8 bytes of data from any 4-byte boundary. For example:

/* Copy T2 bytes (a multiple of 16) of data from T0 to T1, T0 unaligned, T1 aligned.
   Reads one dw beyond the end of T0. */
   LUXC1     F0, 0(T0)       /* set up by reading 1st src dw */
   LI        T3, 0           /* index into src and dst arrays */
   ADDIU     T4, T0, 8       /* base for odd dw loads */
   ADDIU     T5, T1, -8      /* base for odd dw stores */
LOOP:
   LUXC1     F1, T3(T4)
   ALNV.PS   F2, F0, F1, T0  /* switch F0, F1 for little-endian */
   SDC1      F2, T3(T1)
   ADDIU     T3, T3, 16
   LUXC1     F0, T3(T0)
   ALNV.PS   F2, F1, F0, T0  /* switch F1, F0 for little-endian */
   BNE       T3, T2, LOOP
   SDC1      F2, T3(T5)      /* executes in the branch delay slot */
DONE:
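To make the data flow of the copy loop above easier to follow, here is a sketch that simulates it over a byte array. It models big-endian operation only, with the operand order exactly as written in the listing; little-endian operation is not modeled. All function names are hypothetical. The sketch assumes that LUXC1 ignores the low three bits of the effective address, and it treats the final SDC1 as executing in the branch delay slot on every iteration.

```python
MASK32 = 0xFFFFFFFF


def luxc1(mem, addr):
    """Load a doubleword; the low three address bits are ignored (assumed LUXC1 behavior)."""
    base = addr & ~0x7
    return int.from_bytes(mem[base:base + 8], "big")


def sdc1(mem, addr, value):
    """Store a doubleword to an 8-byte-aligned address."""
    if addr & 0x7:
        raise ValueError("SDC1 address must be 8-byte aligned")
    mem[addr:addr + 8] = value.to_bytes(8, "big")


def alnv_ps_be(fs_val, ft_val, rs_val):
    """Big-endian ALNV.PS model, following the Operation pseudocode."""
    align = rs_val & 0x7
    if align == 0:
        return fs_val
    if align != 4:
        raise ValueError("UNPREDICTABLE: alignment must be 0 or 4")
    return ((fs_val & MASK32) << 32) | (ft_val >> 32)


def copy_unaligned_src(mem, t0, t1, t2):
    """Copy t2 bytes from t0 (4-byte aligned) to t1 (8-byte aligned)."""
    if t2 <= 0 or t2 % 16:
        raise ValueError("t2 must be a positive multiple of 16")
    f0 = luxc1(mem, t0)              # set up by reading 1st src dw
    t3 = 0                           # index into src and dst arrays
    t4 = t0 + 8                      # base for odd dw loads
    t5 = t1 - 8                      # base for odd dw stores
    while True:
        f1 = luxc1(mem, t4 + t3)
        f2 = alnv_ps_be(f0, f1, t0)
        sdc1(mem, t1 + t3, f2)
        t3 += 16
        f0 = luxc1(mem, t0 + t3)     # last pass reads one dw beyond the source
        f2 = alnv_ps_be(f1, f0, t0)
        taken = t3 != t2             # BNE condition, evaluated before the delay slot
        sdc1(mem, t5 + t3, f2)       # delay slot store runs whether or not BNE is taken
        if not taken:
            break
```

A caller would place the source data and the destination buffer in the same `bytearray`, leaving at least one readable doubleword after the end of the source, as the comment in the listing requires.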

ALNV.PS is also useful with SUXC1 to store paired-single results in a vector loop to a possibly misaligned address:

/* T1[i] = T0[i] + F8, T0 aligned, T1 unaligned. */
   CVT.PS.S  F8, F8, F8      /* make addend paired-single */
/* Loop header computes 1st pair into F0, stores high half if T1 */
/* misaligned */
LOOP:
   LDC1      F2, T3(T4)      /* get T0[i+2]/T0[i+3] */
   ADD.PS    F1, F2, F8      /* compute T1[i+2]/T1[i+3] */
   ALNV.PS   F3, F0, F1, T1  /* align to dst memory */
   SUXC1     F3, T3(T1)      /* store to T1[i+0]/T1[i+1] */
   ADDIU     T3, T3, 16      /* i = i + 4 */
   LDC1      F2, T3(T0)      /* get T0[i+0]/T0[i+1] */
   ADD.PS    F0, F2, F8      /* compute T1[i+0]/T1[i+1] */
   ALNV.PS   F3, F1, F0, T1  /* align to dst memory */
   BNE       T3, T2, LOOP
   SUXC1     F3, T3(T5)      /* store to T1[i+2]/T1[i+3] */
/* Loop trailer stores all or half of F0, depending on T1 alignment */