Encoding:

POOL32A

000000

rt

rs

sz

0000

0000110000

6

5

5

2

4

10

Format:

CRC32B, CRC32H, CRC32W, CRC32D 

Generate CRC with reversed polynomial 0xEDB88320

CRC32B rt, rs, rt

microMIPS32 Release 6

Generate CRC with reversed polynomial 0xEDB88320

CRC32H rt, rs, rt

microMIPS32 Release 6

Generate CRC with reversed polynomial 0xEDB88320

CRC32W rt, rs, rt

microMIPS32 Release 6

Generate CRC with reversed polynomial 0xEDB88320

CRC32D rt, rs, rt

microMIPS64 Release 6

Generate CRC with reversed polynomial 0xEDB88320

Purpose:

Generate CRC with reversed polynomial 0xEDB88320

Description:

GPR[rt] = CRC32(GRP[rs], GPR[rt])

CRC32B/H/W/D generates a 32-bit Cyclic Redundancy Check (CRC) value based on the reversed polynomial

0xEDB88320. The new 32-bit CRC value is generated with a cumulative 32-bit CRC value input as GPR[rt] and a byte or half-word or word or double-word message right-justified in GPR[rs]. The message size is encoded in field sz of the instruction.

The generated value overwrites the input CRC value in GPR[rt], after sign extension, as the original value is considered redundant once the cumulative CRC value is re-generated with the additional message. More importantly, source-destroying definition of the CRC instruction allows the instruction to be included in a loop without having to move the destination to the source for the next iteration of the instruction.

The CRC32B/H/W/D instruction does not pad the input message. It is software's responsibility to ensure the input message, whether byte, half-word, word or double-word is fully-defined, otherwise the result is UNPREDICTABLE and thus unusable.

The reversed polynomial is a 33-bit polynomial of degree 32. Since the coefficient of most significance is always 1, it is dropped from the 32-bit binary number, as per standard representation. The order of the remaining coefficients increases from right to left in the binary representation.

Since the CRC is processed more than a bit at a time, the order of bits in the data elements of size byte, half-word, word or double-word is important. The specification assumes support for an "lsb-first" (little-endian) standard, and thus coefficients of polynomial terms that represent the message must be ordered from right to left in order of deccreasing significance.

The specification of the CRC instruction assumes the following in regards to a message of arbitrary length whose 32bit CRC value is to be generated. The message itself is a polynomial represented by binary coefficients of each term of the polynomial.

The typical use of CRC is to generate a checksum to accompany a message that is transmitted electronically in order to detect transmission errors at the receiving end. If the message is considered to be a polynomial, the coefficient of the most-significant term is transmitted first followed by remaining bits in order of decreasing significance, followed by the 32-bit CRC. The specification for these CRC instruction is thus most appropriate for standards that transmit least significant bit of data first (little-endian), such as IEEE 802 Ethernet. The least-significant bit of data conveniently maps to the coefficient of the most-significant term of the message polynomial.

Restrictions:

No data-dependent exceptions are possible.

Operation:

if (Config5CRCP = 0) then
   SignalException(ReservedInstruction)
endif
if (sz = 0b00) then
   temp = CRC32(GPR[rt], GPR[rs], 1, 0xEDB88320)
else if (sz = 0b01) then
   temp = CRC32(GPR[rt], GPR[rs], 2, 0xEDB88320)
else if (sz = 0b10) then
   temp = CRC32(GPR[rt], GPR[rs], 4, 0xEDB88320)
else if (sz = 0b11) then
   if (Are64BitOperationsEnabled()) then
      temp = CRC32(GPR[rt], GPR[rs], 8, 0xEDB88320)
   else
      SignalException(ReservedInstruction)
   endif
endif
GPR[rt] = sign_extend(temp)
// Bit oriented definition of CRC32 function
function CRC32(value, message, numbytes, poly)
   # value - right-justified current 32-bit CRC value
   # message - right-justified byte/half-word/word/double-word message
   # numbytes - size of message in bytes: byte/half-word/word/double-word
   # poly - 32-bit reversed polynomial
   value = (value and 0xffffffff) xor {(64-(numbytes*8))'b0,message} 
   for (i=0; i<numbytes*8; i++)
      if (value and 0d1) then // check most significant coefficient
         value = (value >> 1) xor poly
      else
         value = (value >> 1) 
      endif
   endfor
   return value
endfunction

Exceptions:

Reserved Instruction Exception

Restriction:

These instructions are implemented in Release 6 only if CP0 Config5CRCP is set to 1.

Programing Notes:

When calculating CRC, it is recommended that the initial value of GPR[rt] be all ones, when the CRC instruction is the first in the sequence to be referenced. This allows the CRC to differentiate between actual leading zeroes in the message element, and zeros added by transmission errors. The initial all one's value makes no difference to the CRC calculation as long as both sender and receiver use the same assumption on the initial value, to generate and check respectively.

If the order of bits in bytes assumes most-significant bit first, then Release 6 BITSWAP can be used to reverse the order of bits in order to operate with these instructions. However BITSWAP would only apply to byte messages.

CRC32B/H/W/D instructions are interchangeable: a series of low-order CRC instructions can be reduced to a series of high-order CRC32H operations, to increase throughput of the overall CRC generation process. The process of doing this will add trailing zeroes to the message for which CRC is being generated, since the data element is now larger, however, this will not change the CRC value that is generated. It is the original message that must be transmitted along with the CRC, without the trailing zeroes.

In pseudo-assembly, the following sequence of byte CRC operations may be used to generate a cumulative CRC value. (Pseudo-assembly is used to clearly indicate terms which need to be modified for interchangeability.)

li $3, 0xFFFF_FFFF     // initialize CRC value
la $4, memaddr         // assume word-aligned for convenience
for (i=0; i < byte_cnt; i++)
lb $2, 0($4)        // read message bytes
crc32b $3, $2, $3
add $4, $4, 1       // increment byte memory address by 1
endfor

This is equivalent to the sequence of word CRC operations. The simple example assumes some multiple of 4 bytes are processed.

for (i=0; i < byte_cnt/4; i++)
lw $2, 0($4)        // read message words
crc32w $3, $2, $3
add $4, $4, 4       // increment word memory address by 4
endfor

The throughput is thus increased by a multiple of 4 as only a quarter of the byte oriented operations occur.