ADD rd, rs, rt |
nanoMIPS, not available in NMS |
Add |
Add. Add two 32-bit signed integers in registers $rs and $rt, placing the 32-bit result in register $rd, and trapping on overflow.
nanoMIPS, not available in NMS
001000 | rt | rs | rd | x | 0100010 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

if C0.Config5.NMS == 1:
    raise exception('RI')
sum = GPR[rs] + GPR[rt]
if overflows(sum, nbits=32):
    raise exception('OV')
GPR[rd] = sign_extend(sum, from_nbits=32)
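For illustration, a minimal sketch of the overflow test used in the pseudocode above, assuming register values are modeled as unbounded Python integers; the overflows() helper itself is not defined in this excerpt:

def overflows(value, nbits=32):
    # A two's-complement add overflows when the mathematical result
    # does not fit in the signed nbits range.
    lo = -(1 << (nbits - 1))
    hi = (1 << (nbits - 1)) - 1
    return not (lo <= value <= hi)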
Overflow.
ADDIU rt, rs, imm |
nanoMIPS, availability varies by format. |
Add Immediate (Untrapped) |
Add Immediate (Untrapped). Add immediate value imm to the 32-bit integer value in register $rs, placing the 32-bit result in register $rt, and not trapping on overflow.
nanoMIPS, availability varies by format.
000000 | rt (rt!=0) | rs | u
6 | 5 | 5 | 16

imm = u

011000 | rt | 00001 | s[15:0] | s[31:16]
6 | 5 | 5 | 16 | 16

if C0.Config5.NMS == 1:
    raise exception('RI')
imm = sign_extend(s, from_nbits=32)
rs = rt

011000 | rt | 00010 | s[15:0] | s[31:16]
6 | 5 | 5 | 16 | 16

if C0.Config5.NMS == 1:
    raise exception('RI')
if pointers_are_64_bits():
    raise behaves_like('DADDIU[GP48]')
imm = sign_extend(s, from_nbits=32)
rs = 28
010001 | rt | 011 | u
6 | 5 | 3 | 18

if pointers_are_64_bits():
    raise behaves_like('DADDIU[GP.B]')
imm = u
rs = 28

010000 | rt | u[20:2] | 00
6 | 5 | 19 | 2

if pointers_are_64_bits():
    raise behaves_like('DADDIU[GP.W]')
imm = u
rs = 28

011100 | rt3 | 1 | u[7:2]
6 | 3 | 1 | 6

if pointers_are_64_bits():
    raise behaves_like('DADDIU[R1.SP]')
rt = decode_gpr(rt3, 'gpr3')
rs = 29
imm = u

100100 | rt3 | rs3 | 0 | u[4:2]
6 | 3 | 3 | 1 | 3

rt = decode_gpr(rt3, 'gpr3')
rs = decode_gpr(rs3, 'gpr3')
imm = u

100100 | rt (rt!=0) | s[3] | 1 | s[2:0]
6 | 5 | 1 | 1 | 3

rs = rt
imm = sign_extend(s, from_nbits=4)

ADDIU[RS5] with rt=0 is used to provide a 16 bit NOP instruction.

100000 | rt | rs | 1000 | u
6 | 5 | 5 | 4 | 12

imm = -u

sum = GPR[rs] + imm
GPR[rt] = sign_extend(sum, from_nbits=32)
Reserved Instruction for ADDIU[48] and ADDIU[GP48] formats on NMS cores.
ADDIUPC rt, imm |
nanoMIPS, availability varies by format. |
Add Immediate (Untrapped) to PC |
Add Immediate (Untrapped) to PC. Compute address by adding immediate value imm to the PC and placing the result in register $rt, not trapping on overflow.
nanoMIPS, availability varies by format.
000001 | rt | s[20:1] | s[21]
6 | 5 | 20 | 1

if pointers_are_64_bits():
    raise behaves_like('DADDIUPC[32]')
s = sign_extend(s[21] @ s[20:1] @ '0')
imm = s + 4

011000 | rt | 00011 | s[15:0] | s[31:16]
6 | 5 | 5 | 16 | 16

if C0.Config5.NMS == 1:
    raise exception('RI')
if pointers_are_64_bits():
    raise behaves_like('DADDIUPC[48]')
s = sign_extend(s[31:16] @ s[15:0])
imm = s + 6

GPR[rt] = effective_address(CPU.next_pc, s)
Reserved Instruction for ADDIUPC[48] format on NMS cores.
ADDU dst, src1, src2 |
nanoMIPS, availability varies by format. |
Add (Untrapped) |
Add (Untrapped). Add two 32-bit integers in registers $src1 and $src2, placing the 32-bit result in register $dst, and not trapping on overflow.
nanoMIPS, availability varies by format.
001000 | rt | rs | rd | x | 0101010 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

dst = rd
src1 = rs
src2 = rt
not_in_nms = False

101100 | rt3 | rs3 | rd3 | 0
6 | 3 | 3 | 3 | 1

dst = decode_gpr(rd3, 'gpr3')
src1 = decode_gpr(rs3, 'gpr3')
src2 = decode_gpr(rt3, 'gpr3')
not_in_nms = False

001111 | rt4[3] | 0 | rt4[2:0] | rs4[3] | 0 | rs4[2:0]
6 | 1 | 1 | 3 | 1 | 1 | 3

dst = decode_gpr(rt4, 'gpr4')
src1 = decode_gpr(rt4, 'gpr4')
src2 = decode_gpr(rs4, 'gpr4')
not_in_nms = True

if not_in_nms and C0.Config5.NMS:
    raise exception('RI')
sum = GPR[src1] + GPR[src2]
GPR[dst] = sign_extend(sum, from_nbits=32)
Reserved Instruction for ADDU[4X4] format on NMS cores.
ALIGN rd, rs, rt, bp |
Assembly alias |
Align |
Align. Concatenate the 32 bit values in registers $rt and $rs, extract the word at specified byte position bp, and place the result in register $rd.
Assembly alias
bp != 0: EXTW rd, rs, rt, (4-bp)<<3
bp == 0: MOVE rd, rt
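As a cross-check of the alias, here is a small Python model of ALIGN written in terms of the EXTW semantics defined later in this section; the function name and the modeling of registers as unsigned Python integers are illustrative assumptions:

def align(rt, rs, bp):
    # bp == 0 degenerates to MOVE rd, rt: an EXTW shift of 32 is not
    # encodable in the 5-bit shift field.
    if bp == 0:
        return rt & 0xFFFFFFFF
    # EXTW rd, rs, rt, (4-bp)<<3 takes the low 32 bits of {rt, rs} >> ((4-bp)*8).
    tmp = ((rt & 0xFFFFFFFF) << 32) | (rs & 0xFFFFFFFF)
    return (tmp >> ((4 - bp) * 8)) & 0xFFFFFFFF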
ALUIPC rt, %pcrel_hi(address) |
nanoMIPS |
Add aLigned Upper Immediate to PC |
Add aLigned Upper Immediate to PC. Compute a 4KB aligned PC relative address by adding an upper 20 bit immediate value to NextPC, discarding the lower 12 bits, and placing the result in register $rt.
nanoMIPS
111000 | rt | s[20:12] | s[30:21] | 1 | s[31]
6 | 5 | 9 | 10 | 1 | 1

offset = sign_extend(s, from_nbits=32)
address = effective_address(CPU.next_pc, offset) & ~0xfff
GPR[rt] = address
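As a worked illustration (not part of the manual text), a Python sketch of the computation, assuming 32-bit wrap-around arithmetic inside effective_address():

def aluipc(next_pc, s):
    # The low 12 bits of the sum are discarded, yielding a 4KB-aligned address.
    return ((next_pc + s) & 0xFFFFFFFF) & ~0xFFF

# aluipc(0x10000004, 0x00003000) == 0x10003000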
None.
AND rd, rs, rt |
nanoMIPS |
AND |
AND. Compute logical AND of registers $rs and $rt, placing the result in register $rd.
nanoMIPS
001000 | rt | rs | rd | x | 1001010 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

010100 | rt3 | rs3 | 10 | 0 | 0
6 | 3 | 3 | 2 | 1 | 1

rt = decode_gpr(rt3, 'gpr3')
rs = decode_gpr(rs3, 'gpr3')
rd = rt

GPR[rd] = GPR[rs] & GPR[rt]
None.
ANDI rt, rs, u |
nanoMIPS |
AND Immediate |
AND Immediate. Compute logical AND of register $rs and immediate u, placing the result in register $rt.
nanoMIPS
100000 | rt | rs | 0010 | u
6 | 5 | 5 | 4 | 12

111100 | rt3 | rs3 | eu
6 | 3 | 3 | 4

rt = decode_gpr(rt3, 'gpr3')
rs = decode_gpr(rs3, 'gpr3')
u = (0x00ff if eu == 12 else 0xffff if eu == 13 else eu)
GPR[rt] = GPR[rs] & u
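The 4-bit eu field of the 16-bit format cannot hold the two most common masks directly, so two encodings are repurposed for them; a sketch of that decode (the helper name is illustrative):

def decode_andi16_immediate(eu):
    # eu values 12 and 13 select the byte and halfword masks;
    # all other values are used literally.
    return {12: 0x00FF, 13: 0xFFFF}.get(eu, eu)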
None.
BALC address |
nanoMIPS |
Branch And Link, Compact |
Branch And Link, Compact. Unconditional PC relative branch to address, placing return address in register $31.
nanoMIPS
001010 | 1 | s[24:1] | s[25]
6 | 1 | 24 | 1

offset = sign_extend(s, from_nbits=26)

001110 | s[9:1] | s[10]
6 | 9 | 1

offset = sign_extend(s, from_nbits=11)

address = effective_address(CPU.next_pc, offset)
GPR[31] = CPU.next_pc
CPU.next_pc = address
None.
BALRSC rt, rs |
nanoMIPS |
Branch And Link Register Scaled, Compact |
Branch And Link Register Scaled, Compact. Unconditional branch to address NextPC + 2*$rs, placing return address in register $rt.
nanoMIPS
010010 | rt (rt!=0) | rs | 1000 | x
6 | 5 | 5 | 4 | 12

address = effective_address(CPU.next_pc, offset=GPR[rs]<<1)
GPR[rt] = CPU.next_pc
CPU.next_pc = address
None.
BBEQZC rt, bit, address |
nanoMIPS, not available in NMS |
Branch if Bit Equals Zero, Compact |
Branch if Bit Equals Zero, Compact. PC relative branch to address if bit bit of register $rt is equal to zero.
nanoMIPS, not available in NMS
110010 | rt | 001 | x | bit | s[10:1] | s[11]
6 | 5 | 3 | 1 | 6 | 10 | 1

offset = sign_extend(s, from_nbits=12)

if C0.Config5.NMS == 1:
    raise exception('RI')
if bit >= 32 and not Are64BitOperationsEnabled():
    raise exception('RI')
address = effective_address(CPU.next_pc, offset)
testbit = (GPR[rt] >> bit) & 1
if testbit == 0:
    CPU.next_pc = address
Reserved Instruction on NMS cores.
BBNEZC rt, bit, address |
nanoMIPS, not available in NMS |
Branch if Bit Not Equal to Zero, Compact |
Branch if Bit Not Equal to Zero, Compact. PC relative branch to address if bit bit of register $rt is not equal to zero.
nanoMIPS, not available in NMS
110010 | rt | 101 | x | bit | s[10:1] | s[11]
6 | 5 | 3 | 1 | 6 | 10 | 1

offset = sign_extend(s, from_nbits=12)

if C0.Config5.NMS == 1:
    raise exception('RI')
if bit >= 32 and not Are64BitOperationsEnabled():
    raise exception('RI')
address = effective_address(CPU.next_pc, offset)
testbit = (GPR[rt] >> bit) & 1
if testbit == 1:
    CPU.next_pc = address
Reserved Instruction on NMS cores.
BC address |
nanoMIPS |
Branch, Compact |
Branch, Compact. Unconditional PC relative branch to address.
nanoMIPS
001010 | 0 | s[24:1] | s[25]
6 | 1 | 24 | 1

offset = sign_extend(s, from_nbits=26)
address = effective_address(CPU.next_pc, offset)

000110 | s[9:1] | s[10]
6 | 9 | 1

offset = sign_extend(s, from_nbits=11)
address = effective_address(CPU.next_pc, offset)

CPU.next_pc = address
None.
BEQC rs, rt, address |
nanoMIPS, availability varies by format. |
Branch if Equal, Compact |
Branch if Equal, Compact. PC relative branch to address if registers $rs and $rt are equal.
nanoMIPS, availability varies by format.
100010 | rt | rs | 00 | s[13:1] | s[14]
6 | 5 | 5 | 2 | 13 | 1

offset = sign_extend(s, from_nbits=15)
address = effective_address(CPU.next_pc, offset)
not_in_nms = False

110110 | rt3 (rs3<rt3 && u!=0) | rs3 | u[4:1]
6 | 3 | 3 | 4

rs = decode_gpr(rs3, 'gpr3')
rt = decode_gpr(rt3, 'gpr3')
offset = u
address = effective_address(CPU.next_pc, offset)
not_in_nms = True

if not_in_nms and C0.Config5.NMS == 1:
    raise exception('RI')
if GPR[rs] == GPR[rt]:
    CPU.next_pc = address
Reserved Instruction for BEQC[16] format on NMS cores.
BEQIC rt, u, address |
nanoMIPS |
Branch if Equal to Immediate, Compact |
Branch if Equal to Immediate, Compact. PC relative branch to address if value of register $rt is equal to immediate value u.
nanoMIPS
110010 | rt | 000 | u | s[10:1] | s[11]
6 | 5 | 3 | 7 | 10 | 1

offset = sign_extend(s, from_nbits=12)
address = effective_address(CPU.next_pc, offset)
if GPR[rt] == u:
    CPU.next_pc = address
None.
BEQZC rt, address # when rt and address are in range |
nanoMIPS |
Branch if Equal to Zero, Compact |
Branch if Equal to Zero, Compact. PC relative branch to address if register $rt equals zero.
nanoMIPS
100110 | rt3 | s[6:1] | s[7]
6 | 3 | 6 | 1

rt = decode_gpr(rt3, 'gpr3')
offset = sign_extend(s, from_nbits=8)
address = effective_address(CPU.next_pc, offset)
if GPR[rt] == 0:
    CPU.next_pc = address
None.
BGEC rs, rt, address |
nanoMIPS |
Branch if Greater than or Equal, Compact |
Branch if Greater than or Equal, Compact. PC relative branch to address if register $rs is greater than or equal to register $rt.
nanoMIPS
100010 | rt | rs | 10 | s[13:1] | s[14]
6 | 5 | 5 | 2 | 13 | 1

offset = sign_extend(s, from_nbits=15)
address = effective_address(CPU.next_pc, offset)
if GPR[rs] >= GPR[rt]:
    CPU.next_pc = address
None.
BGEIC rt, u, address |
nanoMIPS |
Branch if Greater than or Equal to Immediate, Compact |
Branch if Greater than or Equal to Immediate, Compact. PC relative branch to address if signed register value $rt is greater than or equal to immediate u.
nanoMIPS
110010 | rt | 010 | u | s[10:1] | s[11]
6 | 5 | 3 | 7 | 10 | 1

offset = sign_extend(s, from_nbits=12)
address = effective_address(CPU.next_pc, offset)
if GPR[rt] >= u:
    CPU.next_pc = address
None.
BGEIUC rt, u, address |
nanoMIPS |
Branch if Greater than or Equal to Immediate Unsigned, Compact |
Branch if Greater than or Equal to Immediate Unsigned, Compact. PC relative branch to address if unsigned register $rt is greater than or equal to immediate u.
nanoMIPS
110010 | rt | 011 | u | s[10:1] | s[11]
6 | 5 | 3 | 7 | 10 | 1

offset = sign_extend(s, from_nbits=12)
address = effective_address(CPU.next_pc, offset)
if unsigned(GPR[rt]) >= u:
    CPU.next_pc = address
None.
BGEUC rs, rt, address |
nanoMIPS |
Branch if Greater than or Equal Unsigned, Compact |
Branch if Greater than or Equal Unsigned, Compact. PC relative branch to address if unsigned register $rs is greater than or equal to unsigned register $rt.
nanoMIPS
100010 | rt | rs | 11 | s[13:1] | s[14]
6 | 5 | 5 | 2 | 13 | 1

offset = sign_extend(s, from_nbits=15)
address = effective_address(CPU.next_pc, offset)
if unsigned(GPR[rs]) >= unsigned(GPR[rt]):
    CPU.next_pc = address
None.
BITREVB rt, rs |
Assembly alias, not available in NMS |
Bit Reverse in Bytes |
Bit Reverse in Bytes. Reverse bits in each byte of 32-bit value in register $rs, placing the result in register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 7, 8, 1
BITREVH rt, rs |
Assembly alias, not available in NMS |
Bit Reverse in Halfs |
Bit Reverse in Halfs. Reverse bits in each halfword of 32-bit value in register $rs, placing the result in register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 15, 16
BITREVW rt, rs |
Assembly alias, not available in NMS |
Bit Reverse in Word |
Bit Reverse in Word. Reverse all bits in 32 bit register $rs, placing the result in register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 31, 0
BITSWAP rt, rs |
Assembly alias, not available in NMS |
Bitswap |
Bitswap. Reverse bits in each byte of 32-bit value in register $rs, placing the result in register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 7, 8, 1
The assembly alias BITSWAP is provided for compatibility with MIPS32™. Its behavior is equivalent to the new assembly alias BITREVB, whose name is chosen to fit consistently with the naming of other reversing instructions in nanoMIPS™.
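A small Python model of the BITREVB/BITSWAP effect, assuming the 32-bit value is held as an unsigned Python integer (the underlying ROTX mechanics are not modeled):

def bitrevb(x):
    # Reverse the bit order within each of the four bytes independently.
    out = 0
    for byte in range(4):
        b = (x >> (8 * byte)) & 0xFF
        rev = int('{:08b}'.format(b)[::-1], 2)
        out |= rev << (8 * byte)
    return out

# bitrevb(0x01020380) == 0x8040C001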
BLTC rs, rt, address |
nanoMIPS |
Branch if Less Than, Compact |
Branch if Less Than, Compact. PC relative branch to address if signed register $rs is less than signed register $rt.
nanoMIPS
101010 | rt | rs | 10 | s[13:1] | s[14]
6 | 5 | 5 | 2 | 13 | 1

offset = sign_extend(s, from_nbits=15)
address = effective_address(CPU.next_pc, offset)
if GPR[rs] < GPR[rt]:
    CPU.next_pc = address
None.
BLTIC rt, u, address |
nanoMIPS |
Branch if Less Than Immediate, Compact |
Branch if Less Than Immediate, Compact. PC relative branch to address if signed register $rt is less than immediate u.
nanoMIPS
110010 | rt | 110 | u | s[10:1] | s[11]
6 | 5 | 3 | 7 | 10 | 1

offset = sign_extend(s, from_nbits=12)
address = effective_address(CPU.next_pc, offset)
if GPR[rt] < u:
    CPU.next_pc = address
None.
BLTIUC rt, u, address |
nanoMIPS |
Branch if Less Than Immediate Unsigned, Compact |
Branch if Less Than Immediate Unsigned, Compact. PC relative branch to address if unsigned register $rt is less than immediate u.
nanoMIPS
110010 | rt | 111 | u | s[10:1] | s[11]
6 | 5 | 3 | 7 | 10 | 1

offset = sign_extend(s, from_nbits=12)
address = effective_address(CPU.next_pc, offset)
if unsigned(GPR[rt]) < u:
    CPU.next_pc = address
None.
BLTUC rs, rt, address |
nanoMIPS |
Branch if Less Than Unsigned, Compact |
Branch if Less Than Unsigned, Compact. PC relative branch to address if unsigned register $rs is less than unsigned register $rt.
nanoMIPS
101010 | rt | rs | 11 | s[13:1] | s[14]
6 | 5 | 5 | 2 | 13 | 1

offset = sign_extend(s, from_nbits=15)
address = effective_address(CPU.next_pc, offset)
if unsigned(GPR[rs]) < unsigned(GPR[rt]):
    CPU.next_pc = address
None.
BNEC rs, rt, address |
nanoMIPS, availability varies by format. |
Branch Not Equal, Compact |
Branch Not Equal, Compact. PC relative branch to address if register $rs is not equal to register $rt.
nanoMIPS, availability varies by format.
101010 | rt | rs | 00 | s[13:1] | s[14]
6 | 5 | 5 | 2 | 13 | 1

offset = sign_extend(s, from_nbits=15)

110110 | rt3 (rs3>=rt3 && u!=0) | rs3 | u[4:1]
6 | 3 | 3 | 4

if C0.Config5.NMS == 1:
    raise exception('RI')
rs = decode_gpr(rs3, 'gpr3')
rt = decode_gpr(rt3, 'gpr3')
offset = u

address = effective_address(CPU.next_pc, offset)
if GPR[rs] != GPR[rt]:
    CPU.next_pc = address
Reserved Instruction for BNEC[16] format on NMS cores.
BNEIC rt, u, address |
nanoMIPS |
Branch if Not Equal to Immediate, Compact |
Branch if Not Equal to Immediate, Compact. PC relative branch to address if register $rt is not equal to immediate u.
nanoMIPS
110010 | rt | 100 | u | s[10:1] | s[11]
6 | 5 | 3 | 7 | 10 | 1

offset = sign_extend(s, from_nbits=12)
address = effective_address(CPU.next_pc, offset)
if GPR[rt] != u:
    CPU.next_pc = address
None.
BNEZC rt, address |
nanoMIPS |
Branch if Not Equal to Zero, Compact |
Branch if Not Equal to Zero, Compact. PC relative branch to address if register $rt is not equal to zero.
nanoMIPS
101110 | rt3 | s[6:1] | s[7]
6 | 3 | 6 | 1

rt = decode_gpr(rt3, 'gpr3')
offset = sign_extend(s, from_nbits=8)
address = effective_address(CPU.next_pc, offset)
if GPR[rt] != 0:
    CPU.next_pc = address
None.
BREAK code |
nanoMIPS |
Break |
Break. Cause a Breakpoint exception.
nanoMIPS
000000 | 00000 | 10 | code
6 | 5 | 2 | 19

000100 | 00000 | 10 | code
6 | 5 | 2 | 3
raise exception('BP')
Breakpoint.
BRSC rs |
nanoMIPS |
Branch Register Scaled, Compact |
Branch Register Scaled, Compact. Unconditional branch to address NextPC + 2*$rs.
nanoMIPS
010010 | 00000 | rs | 1000 | x
6 | 5 | 5 | 4 | 12

address = effective_address(CPU.next_pc, offset=GPR[rs]<<1)
CPU.next_pc = address
None.
BYTEREVH rt, rs |
Assembly alias, not available in NMS |
Byte Reverse in Halfs |
Byte Reverse in Halfs. Reverse bytes in each halfword of 32-bit value in register $rs, placing the result in register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 8, 24
BYTEREVW rt, rs |
Assembly alias, not available in NMS |
Byte Reverse in Word |
Byte Reverse in Word. Reverse each byte in word value in register $rs, placing the result in register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 24, 8
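For reference, a Python sketch of the two byte-reversal aliases above, assuming 32-bit unsigned values (again, the ROTX encoding itself is not modeled):

def byterevh(x):
    # Swap the two bytes within each halfword.
    return ((x & 0x00FF00FF) << 8) | ((x >> 8) & 0x00FF00FF)

def byterevw(x):
    # Swap all four bytes of the word (full endianness swap).
    return (((x & 0xFF) << 24) | ((x & 0xFF00) << 8) |
            ((x >> 8) & 0xFF00) | ((x >> 24) & 0xFF))

# byterevh(0xAABBCCDD) == 0xBBAADDCC
# byterevw(0xAABBCCDD) == 0xDDCCBBAA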
CACHE op, offset(rs) |
nanoMIPS. Requires CP0 privilege, availability varies by format. |
Cache operation/Cache operation using EVA addressing |
CACHEE op, offset(rs) |
nanoMIPS. Requires CP0 privilege, availability varies by format. |
Cache operation/Cache operation using EVA addressing |
Cache operation/Cache operation using EVA addressing. Perform cache operation of type op at address $rs + offset (register plus immediate). For CACHEE, translate the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Requires CP0 privilege, availability varies by format.
101001 | op | rs | s[8] | 0111 | 0 | 01 | s[7:0]
6 | 5 | 5 | 1 | 4 | 1 | 2 | 8

offset = sign_extend(s, from_nbits=9)
is_eva = False

101001 | op | rs | s[8] | 0111 | 0 | 10 | s[7:0]
6 | 5 | 5 | 1 | 4 | 1 | 2 | 8

offset = sign_extend(s, from_nbits=9)
is_eva = True

# NMS core without caches gives RI (not Coprocessor Unusable) exception.
if (C0.Config5.NMS and C0.Config1.DL == 0 and C0.Config1.IL == 0 and
        C0.Config2.SL == 0 and C0.Config2.TL == 0 and C0.Config5.L2C == 0):
    raise exception('RI')
if is_eva and not C0.Config5.EVA:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
va = effective_address(GPR[rs], offset, 'Load', eva=is_eva)
# Behavior for index cacheops is unpredictable if address is not unmapped.
if op <= 11:  # Index cacheop
    translation_type, description, result_args = decode_va(va, eva=is_eva)
    if translation_type != 'unmapped':
        raise UNPREDICTABLE('Index cacheop unpredictable with VA not unmapped')
pa, cca = va2pa(va, 'Cacheop', eva=is_eva)
if cca == 2 or cca == 7:
    if C0.Config.AT >= 2:
        pass  # Cacheop to uncached address is a nop in R6
    else:
        raise UNPREDICTABLE('Cacheop to uncached address is unpredictable')
else:
    cacheop(va, pa, op)
The CACHE/CACHEE instructions perform the cache operation specified by argument ’op’ on the register plus immediate address $rs + offset. For CACHEE, the virtual address is translated as though the core is in user mode, although it is actually in kernel mode.
The ’op’ argument is a 5 bit value specifying one of the following possible cache operations, which are described in more detail below:
’op’ | Operation | Availability
0 | ICache Index Invalidate | Required (if ICache present)
1 | DCache Index Writeback Invalidate | Required (if DCache present)
2 | TCache Index Writeback Invalidate | Required (if TCache present)
3 | SCache Index Writeback Invalidate | Required (if SCache present)
4 | ICache Index Load Tag | Recommended (if ICache present)
5 | DCache Index Load Tag | Recommended (if DCache present)
6 | TCache Index Load Tag | Recommended (if TCache present)
7 | SCache Index Load Tag | Recommended (if SCache present)
8 | ICache Index Store Tag | Required (if ICache present)
9 | DCache Index Store Tag | Required (if DCache present)
10 | TCache Index Store Tag | Required (if TCache present)
11 | SCache Index Store Tag | Required (if SCache present)
12 | ICache Implementation Dependent Op | Optional (if ICache present)
13 | DCache Implementation Dependent Op | Optional (if DCache present)
14 | TCache Implementation Dependent Op | Optional (if TCache present)
15 | SCache Implementation Dependent Op | Optional (if SCache present)
16 | ICache Hit Invalidate | Required (if ICache present)
17 | DCache Hit Invalidate | Optional (if DCache present)
18 | TCache Hit Invalidate | Optional (if TCache present)
19 | SCache Hit Invalidate | Optional (if SCache present)
20 | ICache Fill | Recommended (if ICache present)
21 | DCache Hit Writeback Invalidate | Recommended (if DCache present)
22 | TCache Hit Writeback Invalidate | Recommended (if TCache present)
23 | SCache Hit Writeback Invalidate | Recommended (if SCache present)
24 | Unused |
25 | DCache Hit Writeback | Recommended (if DCache present)
26 | TCache Hit Writeback | Recommended (if TCache present)
27 | SCache Hit Writeback | Recommended (if SCache present)
28 | ICache Fetch and Lock | Recommended (if ICache present)
29 | DCache Fetch and Lock | Recommended (if DCache present)
30 | Unused |
31 | Unused |
Index cacheops (those with op <= 11 and optionally the implementation dependent cases 12 <= op <= 15) are operations where the input address is treated as an index into the target cache array. The rules for constructing the index are given in the cacheop() function pseudocode.
’Hit’ cacheops are operations where the input address is treated as a virtual memory address. The operation will target the cache line containing data for that virtual address, if it is present in the cache.
The operations listed above behave as follows:
ICache Index Invalidate (op=0): Set the state of the instruction cache line at the specified index to invalid.
D/T/S Cache Index Writeback Invalidate (op=1,2,3): If the cache line at the specified index is valid and dirty, write the line back to the memory address specified by the cache tag. Whether or not the line was dirty, set the state of the cache line to invalid. For a write-through cache, the writeback step is not required and this is effectively a Cache Index Invalidate operation. This cache operation is required and may be used by software to invalidate the entire data cache by stepping through all indices. Note that the Index Store Tag operation must be used to initialize the cache at power up.
I/D/T/S Cache Index Load Tag (op=4,5,6,7): Read the tag for the cache line at the specified index into the TagLo and TagHi Coprocessor 0 registers. If the DataLo and DataHi registers are implemented, also read the data corresponding to the byte index into the DataLo and DataHi registers. This operation must not cause a Cache Error Exception. The granularity and alignment of the data read into the DataLo and DataHi registers is implementation-dependent, but is typically the result of an aligned access to the cache, ignoring the appropriate low-order bits of the byte index.
I/D/T/S Cache Index Store Tag (op=8,9,10,11): Write the tag for the cache block at the specified index from the TagLo and TagHi Coprocessor 0 registers. This operation must not cause a Cache
Error Exception. This required encoding may be used by software to initialize the entire instruction or data caches by stepping through all valid indices. Doing so requires that the TagLo and
TagHi registers associated with the cache be initialized to zero first.
I/D/T/S Cache Implementation Dependent Op (op=12,13,14,15): Available for implementation dependent operation.
I/D/T/S Cache Hit Invalidate (op=16,17,18,19): If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the instruction cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.
ICache Fill (op=20): Fill the cache from the specified virtual address.
D/T/S Hit Writeback Invalidate (op=21,22,23): For the cache line (if any) which contains the specified address: if the cache line is valid and dirty, write the line back to the memory address specified by the cache tag. Whether or not the line was dirty, set the state of the cache line to invalid. For a write-through cache, the writeback step is not required and this is effectively a Cache Hit Invalidate operation. This cache operation is required and may be used by software to invalidate a range of addresses from the data cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.
D/T/S Hit Writeback (op=25,26,27): If the cache block contains the specified address and it is valid and dirty, write the contents back to memory. After the operation is completed, leave the state of the line valid, but clear the dirty state. For a write-through cache, this operation may be treated as a nop. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system.
I/D Fetch and Lock (op=28,29): If the cache does not contain the specified virtual address, fill it from memory, performing a write-back if required. Set the state to valid and locked. The way selected on a fill from memory is implementation dependent. The lock state may be cleared by executing an Index Invalidate, Index Writeback Invalidate, Hit Invalidate, or Hit Writeback Invalidate operation to the locked line, or via an Index Store Tag operation to the line that clears the lock bit. It is implementation dependent whether a locked line is displaced as the result of an external invalidate or intervention that hits on the locked line. Software must not depend on the locked line remaining in the cache if an external invalidate or intervention would invalidate the line if it were not locked. It is implementation dependent whether a Fetch and Lock operation affects more than one line. For example, more than one line around the referenced address may be fetched and locked. It is recommended that only the single line containing the referenced address be affected.
It is implementation dependent whether the input address for an Index cacheop is converted into a physical address by the MMU, so to avoid the possibility of generating a TLB exception, the index value should always be converted to an unmapped address (such as a kseg0 address by ORing the index with 0x80000000) before being used by the cache instruction. For example, the following code sequence performs a data cache Index Store Tag operation using the index passed in GPR a0:
li    a1, 0x80000000        /* Base of kseg0 segment */
or    a0, a0, a1            /* Convert index to kseg0 address */
cache DCIndexStTag, 0(a0)   /* Perform the index store tag operation */
Some CACHE/CACHEE operations may result in a Cache Error exception. For example, if a Writeback operation detects a cache or bus error during the processing of the operation, that error is reported via a Cache Error exception. Also, a Bus Error Exception may occur if a bus operation invoked by this instruction is terminated in an error. However, cache error exceptions must not be triggered by an Index Load Tag or Index Store Tag operation, as these operations are used for initialization and diagnostic purposes.
It is implementation dependent whether a data watch is triggered by a cache instruction whose address matches the Watch register address match conditions. The preferred implementation is not to match
on the CACHE/CACHEE instructions.
The operation of the instruction is UNPREDICTABLE if the cache line that contains the CACHE instruction is the target of an invalidate or a writeback invalidate operation.
If this instruction is used to lock all ways of a cache at a specific cache index, the behavior of that cache to subsequent cache misses to that cache index is UNDEFINED.
The effective address may be arbitrarily aligned. The CACHE/CACHEE instructions never cause an Address Error Exception due to a non-aligned address.
The CACHE instruction and the memory transactions which are sourced by the CACHE instruction, such as cache refill or cache writeback, obey the ordering and completion rules of the SYNC instruction.
Any use of this instruction that can cause cacheline writebacks should be followed by a subsequent SYNC instruction to avoid hazards where the writeback data is not yet visible at the next level of the
memory hierarchy.
For multiprocessor implementations that maintain coherent caches, some of the Hit type operations may optionally affect all coherent caches within the implementation. In this case, if the effective address uses a coherent Cache Coherency Attribute (CCA), then the operation is globalized, meaning it is broadcast to all of the coherent caches within the system. If the effective address does not use one of the coherent CCAs, there is no broadcast of the operation. If multiple levels of caches are to be affected by one CACHE instruction, all of the affected cache levels must be processed in the same manner - either all affected cache levels use the globalized behavior or all affected cache levels use the non-globalized behavior.
Address Error. Bus Error. Cache Error. Coprocessor Unusable. Reserved Instruction on NMS cores without caches. Reserved Instruction for CACHEE if EVA not implemented. TLB Invalid. TLB Refill.
CLO rt, rs |
nanoMIPS, not available in NMS |
Count Leading Ones |
Count Leading Ones. Count leading ones in 32-bit register value $rs, placing the result in register $rt.
nanoMIPS, not available in NMS
001000 | rt | rs | 0100101 | 100 | 111 | 111
6 | 5 | 5 | 7 | 3 | 3 | 3

if C0.Config5.NMS == 1:
    raise exception('RI')
input = GPR[rs]
i = 0
while i < 32:
    if input[31 - i] != 1:
        break
    i += 1
GPR[rt] = i
Reserved Instruction on NMS cores.
CLZ rt, rs |
nanoMIPS, not available in NMS |
Count Leading Zeros |
Count Leading Zeros. Count leading zeros in 32-bit register value $rs, placing the result in register $rt.
nanoMIPS, not available in NMS
001000 | rt | rs | 0101101 | 100 | 111 | 111
6 | 5 | 5 | 7 | 3 | 3 | 3

if C0.Config5.NMS == 1:
    raise exception('RI')
input = GPR[rs]
i = 0
while i < 32:
    if input[31 - i] != 0:
        break
    i += 1
GPR[rt] = i
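As a self-contained illustration of the CLO/CLZ loops above, a Python sketch operating on a 32-bit unsigned value (the function name is illustrative):

def count_leading(x, bit):
    # Count how many copies of `bit` lead the 32-bit value x;
    # returns 32 when every bit matches (e.g. CLZ of 0).
    n = 0
    while n < 32 and ((x >> (31 - n)) & 1) == bit:
        n += 1
    return n

# count_leading(0x0000FFFF, 0) == 16   (CLZ)
# count_leading(0xFF000000, 1) == 8    (CLO)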
Reserved Instruction on NMS cores.
CRC32B rt, rs |
nanoMIPS. Optional, present when Config5.CRCP=1. |
CRC32 Byte. |
CRC32 Byte. Generate a 32-bit CRC value $rt based on the reversed polynomial 0xEDB88320, using cumulative 32-bit CRC value $rt and right-justified byte-sized message $rs as inputs.
nanoMIPS. Optional, present when Config5.CRCP=1.
001000 | rt | rs | x | 000 | 1111 | 1 | 01 | 000
6 | 5 | 5 | 3 | 3 | 4 | 1 | 2 | 3

if C0.Config5.CRCP == 0:
    raise exception('RI')
result = crc32(value=GPR[rt], message=GPR[rs], nbits=8, poly=0xEDB88320)
GPR[rt] = sign_extend(result, from_nbits=32)
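The crc32() helper is not defined in this excerpt; the following Python sketch shows the conventional bit-serial update for a reflected (LSB-first) CRC, which is one plausible model of the operation described above, assuming unsigned 32-bit inputs:

def crc32(value, message, nbits, poly):
    # XOR the right-justified message into the running CRC, then run one
    # shift-and-conditional-XOR step per message bit.
    crc = value ^ (message & ((1 << nbits) - 1))
    for _ in range(nbits):
        crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc & 0xFFFFFFFF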
Reserved Instruction on cores without CRC support.
CRC32CB rt, rs |
nanoMIPS. Optional, present when Config5.CRCP=1. |
CRC32 (Castagnoli) Byte |
CRC32 (Castagnoli) Byte. Generate a 32-bit CRC value $rt based on the reversed polynomial 0x82F63B78, using cumulative 32-bit CRC value $rt and right-justified byte-sized message $rs as inputs.
nanoMIPS. Optional, present when Config5.CRCP=1.
001000 | rt | rs | x | 100 | 1111 | 1 | 01 | 000
6 | 5 | 5 | 3 | 3 | 4 | 1 | 2 | 3

if C0.Config5.CRCP == 0:
    raise exception('RI')
result = crc32(value=GPR[rt], message=GPR[rs], nbits=8, poly=0x82F63B78)
GPR[rt] = sign_extend(result, from_nbits=32)
Reserved Instruction on cores without CRC support.
CRC32CH rt, rs |
nanoMIPS. Optional, present when Config5.CRCP=1. |
CRC32 (Castagnoli) Half |
CRC32 (Castagnoli) Half. Generate a 32-bit CRC value $rt based on the reversed polynomial 0x82F63B78, using cumulative 32-bit CRC value $rt and right-justified halfword-sized message $rs as inputs.
nanoMIPS. Optional, present when Config5.CRCP=1.
001000 | rt | rs | x | 101 | 1111 | 1 | 01 | 000
6 | 5 | 5 | 3 | 3 | 4 | 1 | 2 | 3

if C0.Config5.CRCP == 0:
    raise exception('RI')
result = crc32(value=GPR[rt], message=GPR[rs], nbits=16, poly=0x82F63B78)
GPR[rt] = sign_extend(result, from_nbits=32)
Reserved Instruction on cores without CRC support.
CRC32CW rt, rs |
nanoMIPS. Optional, present when Config5.CRCP=1. |
CRC32 (Castagnoli) Word |
CRC32 (Castagnoli) Word. Generate a 32-bit CRC value $rt based on the reversed polynomial 0x82F63B78, using cumulative 32-bit CRC value $rt and right-justified word-sized message $rs as inputs.
nanoMIPS. Optional, present when Config5.CRCP=1.
001000 | rt | rs | x | 110 | 1111 | 1 | 01 | 000
6 | 5 | 5 | 3 | 3 | 4 | 1 | 2 | 3

if C0.Config5.CRCP == 0:
    raise exception('RI')
result = crc32(value=GPR[rt], message=GPR[rs], nbits=32, poly=0x82F63B78)
GPR[rt] = sign_extend(result, from_nbits=32)
Reserved Instruction on cores without CRC support.
CRC32H rt, rs |
nanoMIPS. Optional, present when Config5.CRCP=1. |
CRC32 Half. |
CRC32 Half. Generate a 32-bit CRC value $rt based on the reversed polynomial 0xEDB88320, using cumulative 32-bit CRC value $rt and right-justified halfword-sized message $rs as inputs.
nanoMIPS. Optional, present when Config5.CRCP=1.
001000 | rt | rs | x | 001 | 1111 | 1 | 01 | 000
6 | 5 | 5 | 3 | 3 | 4 | 1 | 2 | 3

if C0.Config5.CRCP == 0:
    raise exception('RI')
result = crc32(value=GPR[rt], message=GPR[rs], nbits=16, poly=0xEDB88320)
GPR[rt] = sign_extend(result, from_nbits=32)
Reserved Instruction on cores without CRC support.
CRC32W rt, rs |
nanoMIPS. Optional, present when Config5.CRCP=1. |
CRC32 Word. |
CRC32 Word. Generate a 32-bit CRC value $rt based on the reversed polynomial 0xEDB88320, using cumulative 32-bit CRC value $rt and right-justified word-sized message $rs as inputs.
nanoMIPS. Optional, present when Config5.CRCP=1.
001000 | rt | rs | x | 010 | 1111 | 1 | 01 | 000
6 | 5 | 5 | 3 | 3 | 4 | 1 | 2 | 3

if C0.Config5.CRCP == 0:
    raise exception('RI')
result = crc32(value=GPR[rt], message=GPR[rs], nbits=32, poly=0xEDB88320)
GPR[rt] = sign_extend(result, from_nbits=32)
Reserved Instruction on cores without CRC support.
DERET |
nanoMIPS. Optional, present when Debug implemented. |
Debug Exception Return |
Debug Exception Return. Return from a debug exception by jumping to the address in the DEPC register, and clearing Debug.DM.
nanoMIPS. Optional, present when Debug implemented.
001000 | x | 11 | 10001 | 101 | 111 | 111
6 | 10 | 2 | 5 | 3 | 3 | 3

if C0.Config1.EP == 0:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
if C0.Debug.DM == 0:
    raise exception('RI')
CPU.next_pc = sign_extend(Root.C0.DEPC)
C0.Debug.DM = 0
# If single stepping, forward progress is allowed on the next instruction.
CPU.debug_sst_progress_allowed = True
clear_execution_hazards()
clear_instruction_hazards()
The DERET instruction implements a software barrier that resolves all execution and instruction hazards. See the EHB and JALRC.HB instructions for an explanation of execution and instruction hazards respectively, and also the SYNCI/SYNCIE instruction for additional information on resolving instruction hazards created by writing to the instruction stream.
The effects of the DERET barrier are seen starting with the fetch and decode of the instruction at the PC to which the DERET returns. This means, for instance, that if C0.DEPC is modified by an MTC0 instruction prior to a DERET, an EHB is required between the MTC0 and the DERET to ensure that the DERET uses the correct DEPC value.
The DERET instruction is only legal in debug mode and will give a Coprocessor Unusable exception when executed in user mode or a Reserved Instruction exception when executed in kernel mode.
Coprocessor Unusable. Reserved Instruction when not in Debug Mode or on cores without Debug support.
DI rt |
nanoMIPS. Requires CP0 privilege. |
Disable Interrupts |
Disable Interrupts. Disable interrupts by setting Status.IE to 0, and return the previous value of the Status register in register $rt.
nanoMIPS. Requires CP0 privilege.
001000 | rt | x | 01 | 00011 | 101 | 111 | 111
6 | 5 | 5 | 2 | 5 | 3 | 3 | 3

if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
GPR[rt] = C0.Status
C0.Status.IE = 0
Coprocessor Unusable.
DIV rd, rs, rt |
nanoMIPS |
Divide |
Divide. Divide signed word $rs by signed word $rt and place the result in $rd.
nanoMIPS
001000 | rt | rs | rd | x | 0100011 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

numerator = GPR[rs]
denominator = GPR[rt]
if denominator == 0:
    quotient, remainder = (UNKNOWN, UNKNOWN)
else:
    quotient, remainder = divide_integers(numerator, denominator)
GPR[rd] = sign_extend(quotient, from_nbits=32)
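divide_integers() is not defined in this excerpt; a plausible model, assuming the conventional MIPS semantics of truncation toward zero with the remainder taking the sign of the numerator:

def divide_integers(numerator, denominator):
    # Truncating (C-style) division: divide_integers(-7, 2) == (-3, -1).
    quotient = abs(numerator) // abs(denominator)
    if (numerator < 0) != (denominator < 0):
        quotient = -quotient
    return quotient, numerator - quotient * denominator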
None.
DIVU rd, rs, rt |
nanoMIPS |
Divide Unsigned |
Divide Unsigned. Divide unsigned word $rs by unsigned word $rt and place the result in register $rd.
nanoMIPS
001000 | rt | rs | rd | x | 0110011 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

numerator = zero_extend(GPR[rs], from_nbits=32)
denominator = zero_extend(GPR[rt], from_nbits=32)
if denominator == 0:
    quotient, remainder = (UNKNOWN, UNKNOWN)
else:
    quotient, remainder = divide_integers(numerator, denominator)
GPR[rd] = sign_extend(quotient, from_nbits=32)
None.
DVP rt |
nanoMIPS. Optional, present when Config5.VP=1, otherwise NOP. Requires CP0 privilege. |
Disable Virtual Processors |
Disable Virtual Processors. Disable all virtual processors in a physical core other than the one that issued the instruction. Set VPControl.DIS to 1, and place the previous value of the VPControl CP0 register in register $rt.
nanoMIPS. Optional, present when Config5.VP=1, otherwise NOP. Requires CP0 privilege.
001000 | rt | x | 00000 | 0 | 1110010 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

if C0.Config5.VP == 0:
    pass  # No operation when VP not implemented
else:
    if not IsCoprocessor0Enabled():
        raise coprocessor_exception(0)
    GPR[rt] = C0.VPControl
    C0.VPControl.DIS = 1
    disable_virtual_processors()
The DVP instruction is used to halt instruction fetch for all virtual processors in a VP core, other than the one which issued the DVP instruction. Possible uses for DVP include:
Performing cache operations where the cache state must not be affected by the actions of other threads on the same core.
Reprogramming virtual processor scheduling priority.
All outstanding instructions for the affected virtual processors must be complete before the DVP itself is allowed to retire. Any outstanding events such as hardware instruction or data prefetch, or page-table walks, must also be terminated.
Memory ordering equivalent to that provided by SYNC(stype=0) is guaranteed between subsequent instructions on the virtual processor which issued the DVP, and instructions which have already graduated on the disabled virtual processors.
If a virtual processor is already disabled by another event, for instance, if it has executed a WAIT or a PAUSE instruction or has been halted by some external hardware event, then the disabled virtual processor will not be re-enabled until both an EVP instruction has been executed on the controlling thread, and an event which would otherwise have woken the virtual processor (such as an interrupt for a WAIT instruction or an interrupt or clearing of the LLBit for a PAUSE instruction) has also occurred.
The effect of a DVP instruction is undone by an EVP instruction, which causes execution to resume immediately (where applicable) on all other virtual processors. From the perspective of the disabled
virtual processors, after the EVP, execution continues as though the DVP had not occurred.
If an event occurs in between the DVP and EVP that renders state of a disabled virtual processor UNPREDICTABLE (such as power-gating), then the effect of EVP is UNPREDICTABLE.
A disabled virtual processor cannot be woken by an interrupt or a deferred exception, at least until execution is re-enabled by an EVP instruction on the controlling thread. The virtual processor that executes the DVP, however, continues to be interruptible.
A DVP which is executed when VPControl.DIS=1 will return the current value of the VPControl register but otherwise will leave the other virtual processors in a disabled state. Software should only re-enable
virtual processors (via the EVP instruction) if it has verified from the VPControl value returned by the DVP that virtual processors were previously enabled. Performing this check allows DVP/EVP pairs to
be safely nested.
In a core with multiple virtual processors, more than one virtual processor may execute a DVP simultaneously. The implementation should ensure that the selection of which virtual processor’s DVP successfully graduates is not biased towards any one virtual processor, in order to prevent the possibility
of live-lock.
The DVP instruction behaves like a NOP on cores which do not implement virtual processors (i.e. when Config5.VP=0). This behavior allows kernel code to enclose critical sequences within DVP/EVP blocks without first checking whether it is running on a VP core. The encoding of the DVP instruction is equivalent to a SLTU instruction targeting $0, i.e. a NOP, which leads to the correct behavior on non-VP cores with no additional hardware special casing.
Coprocessor Unusable.
EHB |
nanoMIPS |
Execution hazard barrier |
Execution hazard barrier. Clear all execution hazards before allowing any subsequent instructions to graduate.
nanoMIPS
100000 | 00000 | x | 1100 | x | 0000 | 00011
6 | 5 | 5 | 4 | 3 | 4 | 5
clear_execution_hazards()
The EHB instruction creates an execution hazard barrier, meaning that it ensures that subsequent instructions will be aware of changes to CP0 state caused by prior instructions. Examples of instructions which change CP0 state and which need an execution hazard barrier to ensure that subsequent instructions see those updates are MTC0, EI, DI, TLBR and CACHE/CACHEE.
In the absence of an execution hazard barrier, the CP0 register value used as input to an instruction may be out of date, since it may have been read before the write to the CP0 register by a prior instruction
has actually been committed.
An execution hazard barrier is sufficient to ensure that a fetched instruction is aware of all prior CP0 updates. However, it is not sufficient to ensure that the correct instruction is being fetched as a result of those CP0 updates. Ensuring that the correct instruction is fetched requires an instruction hazard barrier, which is provided by the JALRC.HB instruction, or any of the exception return instructions ERET/ERETNC or DERET.
None.
EI rt |
nanoMIPS. Requires CP0 privilege. |
Enable Interrupts |
Enable Interrupts. Enable interrupts by setting Status.IE to 1, and return the previous value of the Status register in register $rt.
nanoMIPS. Requires CP0 privilege.
001000 | rt | x | 01 | 01011 | 101 | 111 | 111
6 | 5 | 5 | 2 | 5 | 3 | 3 | 3

if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
GPR[rt] = C0.Status
C0.Status.IE = 1
Coprocessor Unusable.
ERET |
nanoMIPS, availability varies by format. |
Exception Return/Exception Return Not Clearing LLBit |
ERETNC |
nanoMIPS, availability varies by format. |
Exception Return/Exception Return Not Clearing LLBit |
Exception Return/Exception Return Not Clearing LLBit. Return from an exception: either by clearing Status.ERL if set and jumping to the address in ErrorEPC; otherwise by clearing Status.EXL, jumping to the address in EPC, and updating the current Shadow Register Set to SRSCtl.PSS if required.
nanoMIPS, availability varies by format.
001000 | x | 0 | 11 | 11001 | 101 | 111 | 111
6 | 9 | 1 | 2 | 5 | 3 | 3 | 3

nc = False

001000 | x | 1 | 11 | 11001 | 101 | 111 | 111
6 | 9 | 1 | 2 | 5 | 3 | 3 | 3

nc = True

if nc and C0.Config5.LLB == 0:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
if C0.Status.ERL == 1:
    effective_epc = sign_extend(C0.ErrorEPC)
    C0.Status.ERL = 0
else:
    effective_epc = sign_extend(C0.EPC)
    C0.Status.EXL = 0
    if C0.SRSCtl.HSS > 0 and C0.Status.BEV == 0:
        C0.SRSCtl.CSS = C0.SRSCtl.PSS
CPU.next_pc = effective_epc
# Clear LLBit unless this is an ERETNC
if not nc:
    C0.LLAddr.LLB = 0
clear_execution_hazards()
clear_instruction_hazards()
The ERET/ERETNC instructions implement a software barrier that resolves all execution and instruction hazards. See the EHB and JALRC.HB instructions for an explanation of execution and instruction hazards respectively, and also the SYNCI/SYNCIE instruction for additional information on resolving instruction hazards created by writing to the instruction stream.
The effects of the ERET/ERETNC barrier are seen starting with the fetch and decode of the instruction at the PC to which the ERET returns. This means, for instance, that if C0.EPC is modified by an MTC0
instruction prior to an ERET, an EHB is required between the MTC0 and the ERET to ensure that the ERET uses the correct EPC value.
Config5.LLB indicates support for the ERETNC instruction. It is always 1 for R6 cores, except for those implementing the nanoMIPS™ subset. In other words, ERETNC is required for nanoMIPS™ cores and optional for NMS cores.
Coprocessor Unusable. Reserved Instruction allowed for ERETNC on NMS cores.
EVP rt |
nanoMIPS. Optional, present when Config5.VP=1, otherwise NOP. Requires CP0 privilege. |
Enable Virtual Processors |
Enable Virtual Processors. Enable all virtual processors in a physical core. Set VPControl.DIS to 0, and place the previous value of the VPControl CP0 register in register $rt.
nanoMIPS. Optional, present when Config5.VP=1, otherwise NOP. Requires CP0 privilege.
001000 | rt | x | 00000 | 1 | 1110010 | 000
6 | 5 | 5 | 5 | 1 | 7 | 3

if C0.Config5.VP == 0:
    pass  # No operation when VP not implemented
else:
    if not IsCoprocessor0Enabled():
        raise coprocessor_exception(0)
    GPR[rt] = C0.VPControl
    C0.VPControl.DIS = 0
    enable_virtual_processors()
The EVP instruction is used on VP cores to undo the effect of a DVP instruction, and the reader should refer to the DVP description for details regarding its usage.
The EVP instruction behaves like a NOP on cores which do not implement virtual processors (i.e. when Config5.VP=0). This behavior allows kernel code to enclose critical sequences within DVP/EVP blocks without first checking whether it is running on a VP core. The encoding of the EVP instruction is equivalent to a SLTU instruction targeting $0, i.e. a NOP, which leads to the correct behavior on non-VP cores with no additional hardware special casing.
Coprocessor Unusable.
EXT rt, rs, pos, size |
nanoMIPS, not available in NMS |
Extract |
Extract. Extract a bit field of size size at position pos from register $rs and store it right justified into register $rt.
nanoMIPS, not available in NMS
100000 | rt | rs | 1111 | 0 | msbd | 0 | lsb
6 | 5 | 5 | 4 | 1 | 5 | 1 | 5

if C0.Config5.NMS == 1:
    raise exception('RI')
pos = lsb
size = msbd + 1
if pos + size > 32:
    raise UNPREDICTABLE()
result = zero_extend(GPR[rs] >> pos, from_nbits=size)
GPR[rt] = sign_extend(result, from_nbits=32)
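A short worked example of the extract semantics in Python, assuming unsigned 32-bit register values (the helper name is illustrative):

def ext(rs, pos, size):
    # Right-justify the size-bit field found at bit position pos.
    return (rs >> pos) & ((1 << size) - 1)

# ext(0x0000ABC0, 4, 8) == 0xBC   (bits 11..4 of the source)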
Reserved Instruction on NMS cores.
EXTW rd, rs, rt, shift |
nanoMIPS |
Extract Word |
Extract Word. Concatenate the 32 bit values in registers $rt and $rs, extract the word at specified bit position shift, and place the result in register $rd.
nanoMIPS
001000 | rt | rs | rd | shift | 011 | 111
6 | 5 | 5 | 5 | 5 | 3 | 3

tmp = GPR[rt][31:0] @ GPR[rs][31:0]
result = tmp >> shift
GPR[rd] = sign_extend(result, from_nbits=32)
None.
GINVI rs |
nanoMIPS. Optional, present when Config5.GI >= 2. Requires CP0 privilege. |
Globally Invalidate Instruction caches |
Globally Invalidate Instruction caches.
nanoMIPS. Optional, present when Config5.GI >= 2. Requires CP0 privilege.
001000 | x | rs | 00 | 01111 | 101 | 111 | 111
6 | 5 | 5 | 2 | 5 | 3 | 3 | 3

if C0.Config5.GI < 2:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
if GPR[rs] == 0:
    cores = get_all_cores_in_system()
else:
    cores = implementation_dependent_ginvi_cores(GPR[rs])
for core in cores:
    # Find encoded line size, sets, and associativity for the target cache.
    (L, S, A) = get_cache_parameters('I', core)
    num_sets = 2 ** (S + 6)
    num_ways = A + 1
    for way_index in range(num_ways):
        for set_index in range(num_sets):
            cache_line = get_cache_line('I', way_index, set_index, core)
            cache_line.valid = False
When $rs is 0, GINVI fully invalidates all instruction caches of all cores in the system, including the local instruction cache. For non-zero $rs values, GINVI invalidates the instruction cache of a specific, implementation dependent core in the system.
The GINVI instruction must be followed by a SYNC (stype=0x14) and an instruction hazard barrier (e.g. JRC.HB) to ensure that all instruction caches in the system have been invalidated.
Coprocessor Unusable. Reserved Instruction if Global Invalidate I-cache not implemented.
GINVT rs, type |
nanoMIPS. Optional, present when Config5.GI=3. Requires CP0 privilege. |
Globally invalidate TLBs |
Globally invalidate TLBs.
nanoMIPS. Optional, present when Config5.GI=3. Requires CP0 privilege.
001000 | x | type | rs | 00 | 00111 | 101 | 111 | 111
6 | 3 | 2 | 5 | 2 | 5 | 3 | 3 | 3

if C0.Config5.GI != 3:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
if not C0.Config5.MI:
    raise exception('RI', 'Config5.MI not set')
ginvt(type, va=GPR[rs])
Perform type invalidation of all TLBs in the system, where type is one of:
type=0: invALL - invalidate all non wired entries.
type=1: invVA - invalidate all entries which match the VA specified by $rs.
type=2: invMMID - invalidate all entries which match C0.MemoryMapID.MMID and are not global.
type=3: invVAMMID - invalidate all entries which match the VA specified by $rs and either match C0.MemoryMapID or are global.
The GINVT instruction must be followed by a SYNC (stype=0x14) and an instruction hazard barrier (e.g. JRC.HB) to ensure that matching entries have been removed from all TLBs in the system and that all instructions in the instruction stream can only access the new context.
invMMID and invVAMMID operations use the C0.MemoryMapID value of the currently running process. The kernel must save/restore C0.MemoryMapID appropriately before it modifies it for the invalidation
operation. Between the save and restore, it must utilize unmapped addresses.
Coprocessor Unusable. Reserved Instruction if Global Invalidate TLB not implemented. Reserved Instruction if MemoryMapID not enabled (i.e. Config5.MI==0).
INS rt, rs, pos, size |
nanoMIPS, not available in NMS |
Insert |
Insert. Merge a right justified bit field of size size from register $rs into position pos of register $rt.
nanoMIPS, not available in NMS
100000 | rt | rs | 1110 | 0 | msbd | 0 | lsb
6 | 5 | 5 | 4 | 1 | 5 | 1 | 5

if C0.Config5.NMS == 1:
    raise exception('RI')
pos = lsb
size = 1 + msbd - lsb
if size < 1:
    raise UNPREDICTABLE()
merge_mask = ((1 << size) - 1) << pos
result = (GPR[rt] & ~merge_mask | (GPR[rs] << pos) & merge_mask)
GPR[rt] = sign_extend(result, from_nbits=32)
The INS instruction is not available on NMS cores. It can be emulated using a sequence of three EXTW instructions:
INS rt, rs, pos, size
can be emulated using the following sequence of instructions (provided rt is not equal to rs):
EXTW rt, rt, rt, pos
EXTW rt, rt, rs, size
EXTW rt, rt, rt, 32-size-pos
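A Python cross-check of this emulation against the INS definition above; the helper mirrors the EXTW pseudocode (low 32 bits of the concatenation {rt, rs} shifted right), and registers are modeled as unsigned Python integers:

def extw(rs, rt, shift):
    # EXTW rd, rs, rt, shift: rt supplies the upper word, rs the lower.
    tmp = ((rt & 0xFFFFFFFF) << 32) | (rs & 0xFFFFFFFF)
    return (tmp >> shift) & 0xFFFFFFFF

def ins_via_extw(rt, rs, pos, size):
    rt = extw(rt, rt, pos)                # rotate rt right by pos
    rt = extw(rt, rs, size)               # shift the field in from rs
    return extw(rt, rt, 32 - size - pos)  # rotate the field into place

# ins_via_extw(0xFFFFFFFF, 0x00000000, 8, 8) == 0xFFFF00FF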
Reserved Instruction on NMS cores.
JALRC.HB rt, rs |
nanoMIPS |
Jump And Link Register, Compact, with Hazard Barrier |
Jump And Link Register, Compact, with Hazard Barrier. Unconditional jump to address in register $rs, placing the return address in register $rt. Clear all instruction and execution hazards before allowing any subsequent instructions to graduate.
nanoMIPS
010010 | rt | rs | 0001 | x
6 | 5 | 5 | 4 | 12

address = GPR[rs] + 0
GPR[rt] = CPU.next_pc
CPU.next_pc = address
clear_instruction_hazards()
clear_execution_hazards()
The JALRC.HB instruction creates an instruction hazard barrier, meaning that it ensures that subsequent instruction fetches will be aware of state changes caused by prior instructions. Examples of state changes which affect instruction fetch and which need an instruction hazard barrier to ensure that subsequent instructions see those updates are:
Writes to the instruction stream (which must also have been synchronized by a SYNCI/SYNCIE and a SYNC).
Updates to the TLB.
Changes in CP0 state which affect address mappings.
In the absence of an instruction hazard barrier, the state used as input to an instruction fetch may be out of date, since it may have been read before the updates to that state have actually completed.
JALRC.HB also provides an execution hazard barrier; see the EHB instruction definition for details. An instruction hazard barrier is also provided by any of the exception return instructions ERET/ERETNC or DERET, but those instructions are only available to privileged software, whereas JALRC.HB is available from all operating modes.
None.
JALRC dst, src |
nanoMIPS |
Jump And Link Register, Compact |
Jump And Link Register, Compact. Unconditional jump to address in register $src, placing the return address in register $dst.
nanoMIPS
010010 | rt | rs | 0000 | x
6 | 5 | 5 | 4 | 12

src = rs
dst = rt

110110 | rt | 1 | 0000
6 | 5 | 1 | 4

src = rt
dst = 31

address = GPR[src] + 0
GPR[dst] = CPU.next_pc
CPU.next_pc = address
None.
JRC rt |
nanoMIPS |
Jump Register, Compact |
Jump Register, Compact. Unconditional jump to address in register $rt.
nanoMIPS
110110 | rt | 0 | 0000
6 | 5 | 1 | 4

address = GPR[rt]
CPU.next_pc = address
None.
LAPC rt, address |
Assembly alias. NMS cores restricted to 21 bit signed offset from PC. |
Load Address, PC relative |
Load Address, PC relative. Load PC relative address to register $rt.
Assembly alias. NMS cores restricted to 21 bit signed offset from PC.
address = $PC + imm (imm in 21 bit signed range): ADDIUPC[32] rt, imm
address = $PC + imm (imm in 32 bit signed range): ADDIUPC[48] rt, imm
LAPC uses the ADDIUPC instruction to load a PC relative address into register $rt. In order to determine the correct immediate value for the ADDIUPC instruction, the assembler must assume a value for the PC that the instruction will be executed from. If the instruction is executed from a different PC, then the generated address will be shifted by a PC relative amount.
LB rt, offset(rs) |
nanoMIPS |
Load Byte |
Load Byte. Load signed byte to register $rt from memory address $rs + offset (register plus immediate).
nanoMIPS
100001 | rt | rs | 0000 | u
6 | 5 | 5 | 4 | 12

offset = u

010111 | rt3 | rs3 | 00 | u
6 | 3 | 3 | 2 | 2

rt = decode_gpr(rt3, 'gpr3')
rs = decode_gpr(rs3, 'gpr3')
offset = u

010001 | rt | 000 | u
6 | 5 | 3 | 18

rs = 28
offset = u

101001 | rt | rs | s[8] | 0000 | 0 | 00 | s[7:0]
6 | 5 | 5 | 1 | 4 | 1 | 2 | 8

offset = sign_extend(s, from_nbits=9)

va = effective_address(GPR[rs], offset, 'Load')
data = read_memory_at_va(va, nbytes=1)
GPR[rt] = sign_extend(data, from_nbits=8)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LBE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Load Byte using EVA addressing |
Load Byte using EVA addressing. Load signed byte to register $rt from virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 | rt | rs | s[8] | 0000 | 0 | 10 | s[7:0]
6 | 5 | 5 | 1 | 4 | 1 | 2 | 8

offset = sign_extend(s, from_nbits=9)
if not C0.Config5.EVA:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
va = effective_address(GPR[rs], offset, 'Load', eva=True)
data = read_memory_at_va(va, nbytes=1, eva=True)
GPR[rt] = sign_extend(data, from_nbits=8)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LBU rt, offset(rs) |
nanoMIPS |
Load Byte Unsigned |
Load Byte Unsigned. Load unsigned byte to register $rt from memory address $rs + offset (register plus immediate).
nanoMIPS
100001 | rt | rs | 0010 | u
6 | 5 | 5 | 4 | 12

offset = u

010111 | rt3 | rs3 | 10 | u
6 | 3 | 3 | 2 | 2

rt = decode_gpr(rt3, 'gpr3')
rs = decode_gpr(rs3, 'gpr3')
offset = u

010001 | rt | 010 | u
6 | 5 | 3 | 18

rs = 28
offset = u

101001 | rt | rs | s[8] | 0010 | 0 | 00 | s[7:0]
6 | 5 | 5 | 1 | 4 | 1 | 2 | 8

offset = sign_extend(s, from_nbits=9)

va = effective_address(GPR[rs], offset, 'Load')
GPR[rt] = read_memory_at_va(va, nbytes=1)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LBUE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Load Byte Unsigned using EVA addressing |
Load Byte Unsigned using EVA addressing. Load unsigned byte to register $rt from virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 | rt | rs | s[8] | 0010 | 0 | 10 | s[7:0]
6 | 5 | 5 | 1 | 4 | 1 | 2 | 8

offset = sign_extend(s, from_nbits=9)
if not C0.Config5.EVA:
    raise exception('RI')
if not IsCoprocessor0Enabled():
    raise coprocessor_exception(0)
va = effective_address(GPR[rs], offset, 'Load', eva=True)
GPR[rt] = read_memory_at_va(va, nbytes=1, eva=True)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LBUX rd, rs(rt) |
nanoMIPS |
Load Byte Unsigned indeXed |
Load Byte Unsigned indeXed. Load unsigned byte to register $rd from memory address $rt + $rs (register plus register).
nanoMIPS
001000 |
rt |
rs |
rd |
0010 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs], GPR[rt], 'Load') GPR[rd] = read_memory_at_va(va, nbytes=1)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LBX rd, rs(rt) |
nanoMIPS |
Load Byte indeXed |
Load Byte indeXed. Load signed byte to register $rd from memory address $rt + $rs (register plus register).
nanoMIPS
001000 |
rt |
rs |
rd |
0000 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs], GPR[rt], 'Load') data = read_memory_at_va(va, nbytes=1) GPR[rd] = sign_extend(data, from_nbits=8)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LH rt, offset(rs) |
nanoMIPS |
Load Half |
Load Half. Load signed halfword to register $rt from memory address $rs + offset (register plus immediate).
nanoMIPS
100001 |
rt |
rs |
0100 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u
011111 |
rt3 |
rs3 |
0 |
u[2:1] |
0 |
6 |
3 |
3 |
1 |
2 |
1 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') offset = u
010001 |
rt |
100 |
u[17:1] |
0 |
6 |
5 |
3 |
17 |
1 |
rs = 28 offset = u
101001 |
rt |
rs |
s[8] |
0100 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
va = effective_address(GPR[rs], offset, 'Load') data = read_memory_at_va(va, nbytes=2) GPR[rt] = sign_extend(data, from_nbits=16)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Load Half using EVA addressing |
Load Half using EVA addressing. Load signed halfword to register $rt from virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 |
rt |
rs |
s[8] |
0100 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
if not C0.Config5.EVA: raise exception('RI') if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) offset = sign_extend(s, from_nbits=9) va = effective_address(GPR[rs], offset, 'Load', eva=True) data = read_memory_at_va(va, nbytes=2, eva=True) GPR[rt] = sign_extend(data, from_nbits=16)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHU rt, offset(rs) |
nanoMIPS |
Load Half Unsigned |
Load Half Unsigned. Load unsigned halfword to register $rt from memory address $rs + offset (register plus immediate).
nanoMIPS
100001 |
rt |
rs |
0110 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u
011111 |
rt3 |
rs3 |
1 |
u[2:1] |
0 |
6 |
3 |
3 |
1 |
2 |
1 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') offset = u
010001 |
rt |
100 |
u[17:1] |
1 |
6 |
5 |
3 |
17 |
1 |
rs = 28 offset = u
101001 |
rt |
rs |
s[8] |
0110 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
va = effective_address(GPR[rs], offset, 'Load') GPR[rt] = read_memory_at_va(va, nbytes=2)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHUE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Load Half Unsigned using EVA addressing |
Load Half Unsigned using EVA addressing. Load unsigned halfword to register $rt from virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 |
rt |
rs |
s[8] |
0110 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
if not C0.Config5.EVA: raise exception('RI') if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) offset = sign_extend(s, from_nbits=9) va = effective_address(GPR[rs], offset, 'Load', eva=True) GPR[rt] = read_memory_at_va(va, nbytes=2, eva=True)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHUX rd, rs(rt) |
nanoMIPS |
Load Half Unsigned indeXed |
Load Half Unsigned indeXed. Load unsigned halfword to register $rd from memory address $rt + $rs (register plus register).
nanoMIPS
001000 |
rt |
rs |
rd |
0110 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs], GPR[rt], 'Load') GPR[rd] = read_memory_at_va(va, nbytes=2)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHUXS rd, rs(rt) |
nanoMIPS |
Load Half Unsigned indeXed Scaled |
Load Half Unsigned indeXed Scaled. Load unsigned halfword to register $rd from memory address $rt + 2*$rs (register plus scaled register).
nanoMIPS
001000 |
rt |
rs |
rd |
0110 |
1 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs]<<1, GPR[rt], 'Load') GPR[rd] = read_memory_at_va(va, nbytes=2)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHX rd, rs(rt) |
nanoMIPS |
Load Half indeXed |
Load Half indeXed. Load signed halfword to register $rd from memory address $rt + $rs(register plus register).
nanoMIPS
001000 |
rt |
rs |
rd |
0100 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs], GPR[rt], 'Load') data = read_memory_at_va(va, nbytes=2) GPR[rd] = sign_extend(data, from_nbits=16)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LHXS rd, rs(rt) |
nanoMIPS |
Load Half indeXed Scaled |
Load Half indeXed Scaled. Load signed halfword to register $rd from memory address $rt + 2*$rs (register plus scaled register).
nanoMIPS
001000 |
rt |
rs |
rd |
0100 |
1 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs]<<1, GPR[rt], 'Load') data = read_memory_at_va(va, nbytes=2) GPR[rd] = sign_extend(data, from_nbits=16)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LI rt, s |
nanoMIPS, availability varies by format. |
Load Immediate |
Load Immediate. Load immediate value s to register $rt.
nanoMIPS, availability varies by format.
110100 |
rt3 |
eu |
6 |
3 |
7 |
rt = decode_gpr(rt3, 'gpr3') s = -1 if eu == 127 else eu not_in_nms = False
011000 |
rt |
00000 |
s[15:0] |
s[31:16] |
6 |
5 |
5 |
16 |
16 |
s = sign_extend(s[31:16] @ s[15:0]) not_in_nms = True
if not_in_nms and C0.Config5.NMS == 1: raise exception('RI') GPR[rt] = s
Reserved Instruction for LI[48] format on NMS cores.
LL rt, offset(rs) |
nanoMIPS, availability varies by format. |
Load Linked word/Load Linked word using EVA addressing/Load Linked Word Pair/Load Linked Word Pair using EVA addressing |
LLE rt, offset(rs) |
nanoMIPS, availability varies by format. |
Load Linked word/Load Linked word using EVA addressing/Load Linked Word Pair/Load Linked Word Pair using EVA addressing |
LLWP rt, ru, (rs) |
nanoMIPS, availability varies by format. |
Load Linked word/Load Linked word using EVA addressing/Load Linked Word Pair/Load Linked Word Pair using EVA addressing |
LLWPE rt, ru, (rs) |
nanoMIPS, availability varies by format. |
Load Linked word/Load Linked word using EVA addressing/Load Linked Word Pair/Load Linked Word Pair using EVA addressing |
Load Linked word/Load Linked word using EVA addressing/Load Linked Word Pair/Load Linked Word Pair using EVA addressing. For LL/LLE, load word for atomic RMW to register $rt from address $rs + offset (register plus immediate). For LLWP/LLWPE, load words for atomic RMW to
registers $rt and $ru from address $rs. For LLE/LLWPE, translate the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS, availability varies by format.
101001 |
rt |
rs |
s[8] |
1010 |
0 |
01 |
s[7:2] |
00 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
6 |
2 |
offset = sign_extend(s, from_nbits=9) nbytes = 4 is_eva = False
101001 |
rt |
rs |
s[8] |
1010 |
0 |
10 |
s[7:2] |
00 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
6 |
2 |
offset = sign_extend(s, from_nbits=9) nbytes = 4 is_eva = True
101001 |
rt |
rs |
x |
1010 |
0 |
01 |
ru |
x |
01 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
5 |
1 |
2 |
offset = 0 nbytes = 8 is_eva = False
101001 |
rt |
rs |
x |
1010 |
0 |
10 |
ru |
x |
01 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
5 |
1 |
2 |
offset = 0 nbytes = 8 is_eva = True
if nbytes == 8 and C0.Config5.XNP: raise exception('RI', 'LLWP[E] requires word-paired support') if is_eva and not C0.Config5.EVA: raise exception('RI') va = effective_address(GPR[rs], offset, 'Load', eva=is_eva) # Linked access must be aligned. if va & (nbytes-1): raise exception('ADEL', badva=va) pa, cca = va2pa(va, 'Load', eva=is_eva) if (cca == 2 or cca == 7) and not C0.Config5.ULS: raise UNPREDICTABLE('uncached CCA not synchronizable when Config5.ULS=0') # (Preferred behavior for a non-synchronizable address is Bus Error). # Indicate that there is an active RMW sequence on this processor. C0.LLAddr.LLB = 1 # Save target address of active RMW sequence. record_linked_address(va, pa, cca, nbytes=nbytes) data = read_memory(va, pa, cca, nbytes=nbytes) if nbytes == 4: # LL/LLE GPR[rt] = sign_extend(data, from_nbits=32) else: # LLWP/LLWPE word0 = data[63:32] if C0.Config.BE else data[31:0] word1 = data[31:0] if C0.Config.BE else data[63:32] if rt == ru: raise UNPREDICTABLE() GPR[rt] = sign_extend(word0, from_nbits=32) GPR[ru] = sign_extend(word1, from_nbits=32)
The LL/LLE/LLWP/LLWPE instructions are used to initiate an atomic read-modify-write sequence. C0.LLAddr.LLB is set to 1, indicating that there is an active RMW sequence on the current processor,
and an implementation dependent set of state is saved which indicates the address and access type of the active RMW sequence. There can be only one active RMW sequence per processor.
The RMW sequence will be completed by a matching SC/SCE/SCWP/SCWPE instruction. The store-conditional instruction will only complete if the system can guarantee that the accessed memory location has not been modified since the load-linked instruction occurred, as discussed in more detail
in the SC/SCE/SCWP/SCWPE instruction description.
The address and CCA targeted by the LL/LLE/LLWP/LLWPE must be synchronizable by all processors and I/O devices sharing the location; if it is not, the result is UNPREDICTABLE. Which storage is
synchronizable is a function of both CPU and system implementations - see the SC/SCE/SCWP/SCWPE
instruction for the formal definition. The preferred behavior for a load-linked instruction which attempts to access an address which is not synchronizable is a Bus Error exception.
If Config5.ULS is set, then the system supports uncached load-linked/store-conditional accesses. Otherwise, the result of uncached accesses is unpredictable.
A LL/LLE/LLWP/LLWPE instruction on one processor must not take action that, by itself, causes a
store-conditional instruction for the same block on another processor to fail. For example, if an implementation depends on retaining the data in the cache during the RMW sequence, cache misses caused
by a load-linked instruction must not fetch data in the exclusive state, since that would remove it from another core's cache if it were present.
An execution of a load-linked instruction does not have to be followed by execution of a store-conditional instruction; a program is free to abandon the RMW sequence without attempting a write.
Support for the paired word instructions LLWP/LLWPE is indicated by the Config5.XNP bit. Paired word support is required for nanoMIPS™ cores, except for NMS cores, where it is optional.
The result of LLWP/LLWPE is unpredictable if $rt and $ru are the same register.
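As a non-normative illustration of the RMW protocol described above, the following Python sketch models a simplified link monitor: the load-linked records the linked address and sets LLB, any intervening store to that address breaks the link, and the store-conditional succeeds only if the link is still intact. Block granularity and the other failure conditions described in the SC/SCE/SCWP/SCWPE entry are abstracted away.

class LinkMonitor:
    """Toy model of one processor's LL/SC state (LLB plus linked address)."""
    def __init__(self, memory):
        self.memory = memory      # dict: address -> word
        self.llb = 0              # models C0.LLAddr.LLB
        self.linked_addr = None

    def ll(self, addr):
        self.llb = 1              # active RMW sequence begins
        self.linked_addr = addr
        return self.memory[addr]

    def snoop_store(self, addr, value):
        # A store (from any agent) to the linked location breaks the link.
        self.memory[addr] = value
        if self.llb and addr == self.linked_addr:
            self.llb = 0

    def sc(self, addr, value):
        success = self.llb and addr == self.linked_addr
        if success:
            self.memory[addr] = value
        self.llb = 0              # SC always clears LLB
        return 1 if success else 0

m = LinkMonitor({0x1000: 5})
old = m.ll(0x1000)
assert m.sc(0x1000, old + 1) == 1      # uncontended: store succeeds
old = m.ll(0x1000)
m.snoop_store(0x1000, 99)              # conflicting store breaks the link
assert m.sc(0x1000, old + 1) == 0      # store-conditional fails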
Address Error. Bus Error. Coprocessor Unusable for LLE/LLWPE. Reserved Instruction for LLE/LLWPE if EVA not implemented. Reserved Instruction for LLWP/LLWPE if load linked pair not implemented.
TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LSA rd, rs, rt, u2 |
nanoMIPS |
Load Scaled Address |
Load Scaled Address. Add register $rs scaled by a left shift u2 to register $rt and place the 32-bit result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
u2 |
x |
001 |
111 |
6 |
5 |
5 |
5 |
2 |
3 |
3 |
3 |
sum = (GPR[rs] << u2) + GPR[rt] GPR[rd] = sign_extend(sum, from_nbits=32)
In nanoMIPS™, the shift field directly encodes the shift amount, meaning that the supported shift values are in the range 0 to 3 (instead of 1 to 4 in MIPSR6™).
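For example, LSA is convenient for indexing arrays of power-of-two-sized elements. A minimal, non-normative Python sketch of the computation (the 32-bit wrap is shown explicitly; the values are illustrative):

def lsa(rs, rt, u2):
    # GPR[rd] = (GPR[rs] << u2) + GPR[rt], truncated to 32 bits
    return ((rs << u2) + rt) & 0xFFFFFFFF

base = 0x80001000            # illustrative start of a word array
index = 7
addr = lsa(index, base, 2)   # element address = base + 4*index
assert addr == base + 4 * index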
None.
LUI rt, %hi(imm) |
nanoMIPS |
Load Upper Immediate. |
Load Upper Immediate. Load upper 20 bits of immediate value imm to upper 20 bits of register $rt, and set the lower 12 bits to zero.
nanoMIPS
111000 |
rt |
s[20:12] |
s[30:21] |
0 |
s[31] |
6 |
5 |
9 |
10 |
1 |
1 |
imm = sign_extend(s, from_nbits=32)
GPR[rt] = imm
For backwards compatibility, instances of LUI which use a literal value for the immediate will be treated as containing a 16-bit immediate which should be loaded into the upper 16 bits of the target register.
To access the upper 20 bits of the register, the '%hi(imm)' form of the immediate must be used.
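As a sketch of that convention: %hi(imm) selects the upper 20 bits, which LUI places in bits 31..12, and the remaining 12 low bits can be filled in with ORI, which zero-extends its immediate. Illustrative Python, not normative:

def hi20(imm):
    # Value deposited by LUI rt, %hi(imm): upper 20 bits, low 12 bits zero.
    return imm & 0xFFFFF000

def lo12(imm):
    # 12-bit immediate for a following ORI.
    return imm & 0x00000FFF

imm = 0x12345ABC
assert hi20(imm) | lo12(imm) == imm   # LUI + ORI rebuilds the constant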
None.
LW rt, offset(rs) |
nanoMIPS, availability varies by format. |
Load Word |
Load Word. Load word to register $rt from memory address $rs + offset (register plus immediate).
nanoMIPS, availability varies by format.
100001 |
rt |
rs |
1000 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u
000101 |
rt3 |
rs3 |
u[5:2] |
6 |
3 |
3 |
4 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') offset = u
011101 |
rt4[3] |
u[2] |
rt4[2:0] |
rs4[3] |
u[3] |
rs4[2:0] |
6 |
1 |
1 |
3 |
1 |
1 |
3 |
if C0.Config5.NMS == 1: raise exception('RI') rt = decode_gpr(rt4[3] @ rt4[2:0], 'gpr4') rs = decode_gpr(rs4[3] @ rs4[2:0], 'gpr4') offset = u
010101 |
rt3 |
u[8:2] |
6 |
3 |
7 |
rt = decode_gpr(rt3, 'gpr3') rs = 28 offset = u
010000 |
rt |
u[20:2] |
10 |
6 |
5 |
19 |
2 |
rs = 28 offset = u
101001 |
rt |
rs |
s[8] |
1000 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
001101 |
rt |
u[6:2] |
6 |
5 |
5 |
rs = 29 offset = u
va = effective_address(GPR[rs], offset, 'Load') data = read_memory_at_va(va, nbytes=4) GPR[rt] = sign_extend(data, from_nbits=32)
Address Error. Bus Error. Reserved Instruction for LW[4X4] format on NMS cores. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LWE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Load Word using EVA addressing |
Load Word using EVA addressing. Load word to register $rt from virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 |
rt |
rs |
s[8] |
1000 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) if not C0.Config5.EVA: raise exception('RI') if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) va = effective_address(GPR[rs], offset, 'Load', eva=True) data = read_memory_at_va(va, nbytes=4, eva=True) GPR[rt] = sign_extend(data, from_nbits=32)
Address Error. Bus Error. Coprocessor unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LWM rt, offset(rs), count |
nanoMIPS, not available in NMS |
Load Word Multiple |
Load Word Multiple. Load count words of data to registers $rt, $(rt+1), ..., $(rt+count-1) from consecutive memory addresses starting at $rs + offset (register plus immediate).
nanoMIPS, not available in NMS
101001 |
rt |
rs |
s[8] |
count3 |
0 |
1 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
3 |
1 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) count = 8 if count3 == 0 else count3
if C0.Config5.NMS == 1: raise exception('RI') i = 0 while i != count: this_rt = ( rt + i if rt + i < 32 else rt + i - 16 ) this_offset = offset + (i<<2) va = effective_address(GPR[rs], this_offset, 'Load') data = read_memory_at_va(va, nbytes=4) GPR[this_rt] = sign_extend(data, from_nbits=32) if this_rt == rs and i != count - 1: raise UNPREDICTABLE() i += 1
LWM loads count words to sequentially numbered registers from sequential memory addresses. After loading $31, the sequence of registers continues from $16. Some example encodings of the register list are (see the sketch below):
rt=15, count=3: loads [$15, $16, $17]
rt=31, count=3: loads [$31, $16, $17]
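The register sequence, including the wrap from $31 back to $16, can be computed as in this illustrative sketch, which mirrors the operation pseudocode above:

def lwm_register_list(rt, count):
    # Registers written by LWM: after $31 the sequence wraps to $16.
    return [rt + i if rt + i < 32 else rt + i - 16 for i in range(count)]

assert lwm_register_list(15, 3) == [15, 16, 17]
assert lwm_register_list(31, 3) == [31, 16, 17]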
The result is unpredictable if an LWM instruction updates the base register prior to the final load.
LWM must be implemented in such a way as to make the instruction restartable, but the implementation does not need to be fully atomic. For instance, it is allowable for a LWM instruction to be aborted by
an exception after a subset of the register updates have occurred. To ensure restartability, any write to GPR $rs (which may be used as the final output register) must be completed atomically, that is, the
instruction must graduate if and only if that write occurs.
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LWPC rt, address |
nanoMIPS, not available in NMS |
Load Word PC relative |
Load Word PC relative. Load word to register $rt from PC relative address address.
nanoMIPS, not available in NMS
011000 |
rt |
01011 |
s[15:0] |
s[31:16] |
6 |
5 |
5 |
16 |
16 |
offset = sign_extend(s, from_nbits=32)
if C0.Config5.NMS == 1: raise exception('RI') address = effective_address(CPU.next_pc, offset) data = read_memory_at_va(address, nbytes=4) GPR[rt] = sign_extend(data, from_nbits=32)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LWX rd, rs(rt) |
nanoMIPS |
Load Word indeXed |
Load Word indeXed. Load word to register $rd from memory address $rt + $rs (register plus register).
nanoMIPS
001000 |
rt |
rs |
rd |
1000 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
va = effective_address(GPR[rs], GPR[rt], 'Load') data = read_memory_at_va(va, nbytes=4) GPR[rd] = sign_extend(data, from_nbits=32)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
LWXS rd, rs(rt) |
nanoMIPS |
Load Word indeXed Scaled |
Load Word indeXed Scaled. Load word to register $rd from memory address $rt + 4*$rs (register plus scaled register).
nanoMIPS
001000 |
rt |
rs |
rd |
1000 |
1 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
010100 |
rt3 |
rs3 |
rd3 |
1 |
6 |
3 |
3 |
3 |
1 |
rd = decode_gpr(rd3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') rt = decode_gpr(rt3, 'gpr3')
va = effective_address(GPR[rs]<<2, GPR[rt], 'Load') data = read_memory_at_va(va, nbytes=4) GPR[rd] = sign_extend(data, from_nbits=32)
Address Error. Bus Error. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
MFC0 rt, c0s, sel |
nanoMIPS. Requires CP0 privilege. |
Move From Coprocessor 0 |
Move From Coprocessor 0. Write value of CP0 register indexed by c0s, sel to register $rt.
nanoMIPS. Requires CP0 privilege.
001000 |
rt |
c0s |
sel |
x |
0000110 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) value = read_cp0_register(c0s, sel) GPR[rt] = sign_extend(value, from_nbits=32)
An MFC0 which targets a register which is not used on the current core will return zero.
Coprocessor Unusable.
MFHC0 rt, c0s, sel |
nanoMIPS, required (optional on NMS cores). Requires CP0 privilege. |
Move From High Coprocessor 0 |
Move From High Coprocessor 0. Write bits 63..32 (when present) of CP0 register indexed by c0s, sel to register $rt.
nanoMIPS, required (optional on NMS cores). Requires CP0 privilege.
001000 |
rt |
c0s |
sel |
x |
0000111 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if C0.Config5.MVH == 0: raise exception('RI') if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) value = read_cp0_register(c0s, sel, h=True) GPR[rt] = sign_extend(value, from_nbits=32)
For certain core configurations, specific nanoMIPS32™ CP0 registers may be extended to be 64 bits wide. The MFHC0 instruction is used to read the upper 32 bits of such registers. An MFHC0 which
targets a register for which the 'high' bits are not used will return zero.
This instruction is available when Config5.MVH=1, which is required on nanoMIPS™ cores, except for NMS cores where it is optional.
Coprocessor Unusable. Reserved Instruction on NMS cores without MVH support.
MOD rd, rs, rt |
nanoMIPS |
Modulo |
Modulo. Compute signed division of register $rs by register $rt, and place the remainder in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0101011 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
numerator = GPR[rs] denominator = GPR[rt] if denominator == 0: quotient, remainder = (UNKNOWN, UNKNOWN) else: quotient, remainder = divide_integers(numerator, denominator) GPR[rd] = sign_extend(remainder, from_nbits=32)
None.
MODU rd, rs, rt |
nanoMIPS |
Modulo Unsigned |
Modulo Unsigned. Compute unsigned division of register $rs by register $rt, and place the remainder in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0111011 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
numerator = zero_extend(GPR[rs], from_nbits=32) denominator = zero_extend(GPR[rt], from_nbits=32) if denominator == 0: quotient, remainder = (UNKNOWN, UNKNOWN) else: quotient, remainder = divide_integers(numerator, denominator) GPR[rd] = sign_extend(remainder, from_nbits=32)
None.
MOVE.BALC rd, rt, address |
nanoMIPS, not available in NMS |
Move and Branch and Link, Compact |
Move and Branch and Link, Compact. Copy value of register $rt to register $rd, and perform an unconditional PC relative branch to address, placing the return address in register $31.
nanoMIPS, not available in NMS
000010 |
rtz4[3] |
rd1 |
rtz4[2:0] |
s[20:1] |
s[21] |
6 |
1 |
1 |
3 |
20 |
1 |
if C0.Config5.NMS == 1: raise exception('RI') rd = decode_gpr(rd1, 'gpr1') rt = decode_gpr(rtz4[3] @ rtz4[2:0], 'gpr4.zero') offset = sign_extend(s, from_nbits=22) address = effective_address(CPU.next_pc, offset) GPR[rd] = GPR[rt] GPR[31] = CPU.next_pc CPU.next_pc = address
Although this instruction is called MOVE.BALC, the order of the updates to PC, $31 and $rd is invisible to software, and an implementation may choose any order for carrying out these steps.
Reserved Instruction on NMS cores.
MOVE rt, rs |
nanoMIPS |
Move |
Move. Copy value of register $rs to register $rt.
nanoMIPS
000100 |
rt!=0 rt |
rs |
6 |
5 |
5 |
GPR[rt] = GPR[rs]
None.
MOVEP dst1, dst2, src1, src2 |
nanoMIPS, not available in NMS |
Move Pair |
Move Pair. Copy value of register $src1 to register $dst1, and copy value of register $src2 to register $dst2.
nanoMIPS, not available in NMS
101111 |
rtz4[3] |
rd2[0] |
rtz4[2:0] |
rsz4[3] |
rd2[1] |
rsz4[2:0] |
6 |
1 |
1 |
3 |
1 |
1 |
3 |
dst1 = decode_gpr(rd2[1] @ rd2[0], 'gpr2.reg1') dst2 = decode_gpr(rd2[1] @ rd2[0], 'gpr2.reg2') src1 = decode_gpr(rsz4[3] @ rsz4[2:0], 'gpr4.zero') src2 = decode_gpr(rtz4[3] @ rtz4[2:0], 'gpr4.zero')
111111 |
rt4[3] |
rd2[0] |
rt4[2:0] |
rs4[3] |
rd2[1] |
rs4[2:0] |
6 |
1 |
1 |
3 |
1 |
1 |
3 |
dst1 = decode_gpr(rs4[3] @ rs4[2:0], 'gpr4') dst2 = decode_gpr(rt4[3] @ rt4[2:0], 'gpr4') src1 = decode_gpr(rd2[1] @ rd2[0], 'gpr2.reg1') src2 = decode_gpr(rd2[1] @ rd2[0], 'gpr2.reg2')
if C0.Config5.NMS == 1: raise exception('RI') if dst1 == src1 or dst1 == src2 or dst2 == src1 or dst2 == src2: GPR[dst1] = UNKNOWN GPR[dst2] = UNKNOWN else: GPR[dst1] = GPR[src1] GPR[dst2] = GPR[src2]
The output register values are unpredictable if either of the output registers is also used as an input.
Reserved Instruction on NMS cores.
MOVN rd, rs, rt |
nanoMIPS |
Move if Not zero |
Move if Not zero. Copy value of register $rs to register $rd if register $rt is not zero.
nanoMIPS
001000 |
rt |
rs |
rd |
1 |
1000010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
GPR[rd] = GPR[rs] if GPR[rt] != 0 else GPR[rd]
None.
MOVZ rd, rs, rt |
nanoMIPS |
Move if Zero |
Move if Zero. Copy value of register $rs to register $rd if register $rt is zero.
nanoMIPS
001000 |
rt |
rs |
rd |
0 |
1000010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
GPR[rd] = GPR[rs] if GPR[rt] == 0 else GPR[rd]
None.
MTC0 rt, c0s, sel |
nanoMIPS. Requires CP0 privilege. |
Move To Coprocessor 0 |
Move To Coprocessor 0. Write value of register $rt to CP0 register indexed by c0s, sel.
nanoMIPS. Requires CP0 privilege.
001000 |
rt |
c0s |
sel |
x |
0001110 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) write_cp0_register(GPR[rt], c0s, sel)
An MTC0 to a register which is not used on the current core is ignored.
When a register is extended to have high bits for a specific configuration (see MTHC0), legacy software which is not aware of the existence of these high bits still needs to function correctly. In such cases,
the architecture may require that an MTC0 modifies the high 32 bits of the register as well as the low 32 bits to give the correct legacy behavior.
For this reason, when setting an extended CP0 register, the MTC0 to set the low 32 bits should always precede the MTHC0 to set the high 32 bits. Also, a read-modify-write sequence to set a specific bitfield
in the low 32 bits should read both the low 32 and high 32 bits, then do MTC0 followed by MTHC0 to write the modified value back, as sketched below.
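A sketch of that recommended sequence, written in terms of the pseudocode helpers used elsewhere in this manual; the field mask and value are placeholders, and c0s/sel identify the extended register:

# Read-modify-write of a bitfield in the low word of an extended CP0 register.
FIELD_MASK  = 0x0000000F   # placeholder bitfield
FIELD_VALUE = 0x00000003   # placeholder new field value

lo = read_cp0_register(c0s, sel)            # MFC0:  low 32 bits
hi = read_cp0_register(c0s, sel, h=True)    # MFHC0: high 32 bits
lo = (lo & ~FIELD_MASK) | FIELD_VALUE       # modify the low-word field
write_cp0_register(lo, c0s, sel)            # MTC0 first (may affect high bits)
write_cp0_register(hi, c0s, sel, h=True)    # MTHC0 last to set the high bits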
Coprocessor Unusable.
MTHC0 rt, c0s, sel |
nanoMIPS, required (optional on NMS cores). Requires CP0 privilege. |
Move To High Coprocessor 0 |
Move To High Coprocessor 0. Write value of register $rt to bits 63..32 (when present) of CP0 register indexed by c0s, sel.
nanoMIPS, required (optional on NMS cores). Requires CP0 privilege.
001000 |
rt |
c0s |
sel |
x |
0001111 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if C0.Config5.MVH == 0: raise exception('RI') if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) write_cp0_register(GPR[rt], c0s, sel, h=True)
For certain core configurations, specific nanoMIPS32™ CP0 registers may be extended to be 64 bits wide. The MTHC0 instruction is used to write the upper 32 bits of such registers. An MTHC0 to a
register for which the 'high' bits are not used will be ignored.
When a register is extended to have high bits for a specific configuration, legacy software which is not aware of the existence of these high bits still needs to function correctly. In such cases, the architecture
may require that an MTC0 modifies the high 32 bits of the register as well as the low 32 bits to give the correct legacy behavior.
For this reason, when setting an extended CP0 register, the MTC0 to set the low 32 bits should always precede the MTHC0 to set the high 32 bits. Also, a read-modify-write sequence to set a specific bitfield
in the low 32 bits should read both the low 32 and high 32 bits, then do MTC0 followed by MTHC0 to write the modified value back.
This instruction is available when Config5.MVH=1, which is required on nanoMIPS™ cores, except for NMS cores where it is optional.
Coprocessor Unusable. Reserved Instruction on NMS cores without MVH support.
MUH rd, rs, rt |
nanoMIPS |
Multiply High |
Multiply High. Multiply signed word values from registers $rs and $rt, and place bits 63..32 of the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0001011 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
result = GPR[rs] * GPR[rt] result_hi = result[63:32] GPR[rd] = sign_extend(result_hi, from_nbits=32)
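Because the operation above is defined on the 64-bit product of signed 32-bit values, a host-language model has to reproduce the sign handling explicitly. An illustrative, non-normative Python equivalent:

def to_signed32(x):
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def muh(rs, rt):
    # Bits 63..32 of the signed 64-bit product, as an unsigned 32-bit pattern.
    product = to_signed32(rs) * to_signed32(rt)
    return (product >> 32) & 0xFFFFFFFF

assert muh(0xFFFFFFFF, 2) == 0xFFFFFFFF   # (-1) * 2 = -2: high word all ones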
None.
MUHU rd, rs, rt |
nanoMIPS |
Multiply High Unsigned |
Multiply High Unsigned. Multiply unsigned word values in registers $rs and $rt, and place bits 63..32 of the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0011011 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
rs_unsigned = zero_extend(GPR[rs], from_nbits=32) rt_unsigned = zero_extend(GPR[rt], from_nbits=32) result = rs_unsigned * rt_unsigned result_hi = result[63:32] GPR[rd] = sign_extend(result_hi, from_nbits=32)
None.
MUL dst, src1, src2 |
nanoMIPS, availability varies by format. |
Multiply |
Multiply. Multiply signed word values in registers $src1 and $src2, and place bits 31..0 of the result in register $dst.
nanoMIPS, availability varies by format.
001000 |
rt |
rs |
rd |
x |
0000011 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
dst = rd src1 = rs src2 = rt not_in_nms = False
001111 |
rt4[3] |
0 |
rt4[2:0] |
rs4[3] |
1 |
rs4[2:0] |
6 |
1 |
1 |
3 |
1 |
1 |
3 |
dst = decode_gpr(rt4, 'gpr4') src1 = decode_gpr(rt4, 'gpr4') src2 = decode_gpr(rs4, 'gpr4') not_in_nms = True
if not_in_nms and C0.Config5.NMS == 1: raise exception('RI') result = GPR[src1] * GPR[src2] GPR[dst] = sign_extend(result, from_nbits=32)
Reserved Instruction for MUL[4X4] format on NMS cores.
MULU rd, rs, rt |
nanoMIPS |
Multiply Unsigned |
Multiply Unsigned. Multiply unsigned word values in registers $rs and $rt, and place bits 31..0 of the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0010011 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
rs_unsigned = zero_extend(GPR[rs], from_nbits=32) rt_unsigned = zero_extend(GPR[rt], from_nbits=32) result = rs_unsigned * rt_unsigned GPR[rd] = sign_extend(result, from_nbits=32)
None.
NOP |
nanoMIPS |
No Operation |
No Operation.
nanoMIPS
100000 |
00000 |
x |
1100 |
x |
0000 |
00000 |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
100100 |
00000 |
x |
1 |
x |
6 |
5 |
1 |
1 |
3 |
pass # No operation
The NOP[32] encoding is equivalent to an SLL[32] instruction using $0 as output and a shift value of 0. The NOP[16] encoding is equivalent to an ADDIU[RS5] instruction using $0 as output. Therefore NOP
does not necessarily need any additional implementation in hardware beyond the normal behavior of the SLL[32] and ADDIU[RS5] instructions.
If software intentionally generates a NOP instruction, it should only generate these specific encodings, rather than other instructions writing to $0 which would also result in no operation.
If hardware implements a performance counter for nops, it can expect these specific instruction encodings to be used. Hardware should ignore the x field of the encoding, treating all values of x as representing
a valid NOP instruction. Software, on the other hand, should only generate NOP instructions with an x value of 0.
As for all instruction definitions containing x fields, this methodology allows for the possibility that the meaning of x values other than zero might be enhanced in the future, with the understanding that cores
prior to the enhanced definition will treat the x!=0 encodings as equivalent to the x==0 instruction.
None.
NOR rd, rs, rt |
nanoMIPS |
NOR |
NOR. Compute logical NOR of registers $rs and $rt, placing the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
1011010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
GPR[rd] = ~(GPR[rs] | GPR[rt])
None.
NOT rt, rs |
nanoMIPS |
NOT |
NOT. Write logical inversion of register $rs to register $rt.
nanoMIPS
010100 |
rt3 |
rs3 |
00 |
0 |
0 |
6 |
3 |
3 |
2 |
1 |
1 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') GPR[rt] = ~GPR[rs]
None.
OR rd, rs, rt |
nanoMIPS |
OR |
OR. Compute logical OR of registers $rs and $rt, placing the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
1010010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
010100 |
rt3 |
rs3 |
11 |
0 |
0 |
6 |
3 |
3 |
2 |
1 |
1 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') rd = rt
GPR[rd] = GPR[rs] | GPR[rt]
None.
ORI rt, rs, u |
nanoMIPS |
OR Immediate |
OR Immediate. Compute logical OR of register $rs with immediate u, placing the result in register $rt.
nanoMIPS
100000 |
rt |
rs |
0000 |
u |
6 |
5 |
5 |
4 |
12 |
GPR[rt] = GPR[rs] | u
None.
PAUSE |
nanoMIPS |
Pause |
Pause. Pause until the LLBit is cleared.
nanoMIPS
100000 |
00000 |
x |
1100 |
x |
0000 |
00101 |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
if C0.LLAddr.LLB: CPU.in_pause_state = True
The purpose of the PAUSE instruction is to halt a thread (rather than entering a spin loop) when it is waiting to acquire an LL/SC lock. This is particularly useful on multi-threaded processors, since the
waiting thread may be using the same instruction pipeline as the thread which currently owns the lock, and hence entering a spin loop would delay the other thread from completing its task and freeing the
lock.
When a thread is in the paused state, it should not issue any instructions. The paused state will be cleared either if the LLBit for the thread gets cleared, or if the thread takes an interrupt. If an interrupt
occurs, it is implementation dependent whether C0.EPC points to the PAUSE instruction or the instruction after the PAUSE.
In LL/SC lock software, the LLBit of the waiting thread will always be cleared when the thread which owns the lock does a store instruction to the lock address in order to clear the lock. Thus the paused
thread will always be woken when it has another opportunity to acquire the lock. After the PAUSE instruction completes, software is expected to attempt to acquire the lock again by re-executing the
LL/SC sequence.
It is legal to implement PAUSE as a NOP instruction. In this case, the behavior of LL/SC lock software will be equivalent to executing a spin loop to acquire the lock. Software using PAUSE will still work,
but the benefit of having the waiting thread not consume instruction issue slots will be lost.
PAUSE is encoded as an SLL instruction with a shift value of 5, targeting GPR $0. Hence PAUSE will behave as a NOP instruction if no additional behavior beyond that of SLL is implemented.
The following assembly code example shows how the PAUSE instruction can be used to halt a thread while it is waiting to acquire an LL/SC lock.
acquire_lock:
  ll    t0, 0(a0)               /* Read software lock, set LLBit. */
  bnezc t0, acquire_lock_retry  /* Branch if software lock is taken. */
  addiu t0, t0, 1               /* Set the software lock. */
  sc    t0, 0(a0)               /* Try to store the software lock. */
  bnezc t0, 10f                 /* Branch if lock acquired successfully. */
  sync
acquire_lock_retry:
  pause                         /* Wait for LLBit to clear before retrying. */
  bc    acquire_lock            /* Now retry the operation. */
10:
  /* Critical Region Code */
  ...
release_lock:
  sync
  sw    zero, 0(a0)             /* Release software lock, clearing LLBit for any PAUSEd waiters. */
None.
PREF hint, offset(rs) |
nanoMIPS, availability varies by format. |
Prefetch/Prefetch using EVA addressing |
PREFE hint, offset(rs) |
nanoMIPS, availability varies by format. |
Prefetch/Prefetch using EVA addressing |
Prefetch/Prefetch using EVA addressing. Perform a prefetch operation of type hint at address $rs + offset (register plus immediate). For PREFE, translate the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS, availability varies by format.
101001 |
hint!=31 hint |
rs |
s[8] |
0011 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) is_eva = False
100001 |
hint!=31 hint |
rs |
0011 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u is_eva = False
101001 |
hint!=31 hint |
rs |
s[8] |
0011 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) is_eva = True
if is_eva and not C0.Config5.EVA: raise exception('RI') if is_eva and not IsCoprocessor0Enabled(): raise coprocessor_exception(0) va = effective_address(GPR[rs], offset, 'Load', eva=is_eva) # Perform implementation dependent prefetch actions pref(va, hint, eva=is_eva)
The PREF and PREFE instructions request that the processor take some action to improve program performance in accordance with the intended data usage specified by the hint argument. This is
typically done by moving data to or from the cache at the specified address. The meanings of hint are as follows:
hint=0: load. Use: Prefetched data is expected to be read (not modified). Action: Fetch data as if for a load.
hint=1: store. Use: Prefetched data is expected to be stored or modified. Action: Fetch data as if for a store.
hint=2: L1 LRU hint. Mark the line as LRU in the L1 cache and thus preferred for next eviction. Implementations can choose to writeback and/or invalidate the line as long as no architectural state is modified.
hint=3: Reserved for Implementation.
hint=4: load_streamed. Use: Prefetched data is expected to be read (not modified) but not reused extensively; it "streams" through cache. Action: Fetch data as if for a load and place it in the cache so that it does not displace data prefetched as "retained".
hint=5: store_streamed. Use: Prefetched data is expected to be stored or modified but not reused extensively; it "streams" through cache. Action: Fetch data as if for a store and place it in the cache so that it does not displace data prefetched as "retained".
hint=6: load_retained. Use: Prefetched data is expected to be read (not modified) and reused extensively; it should be "retained" in the cache. Action: Fetch data as if for a load and place it in the cache so that it is not displaced by data prefetched as "streamed".
hint=7: store_retained. Use: Prefetched data is expected to be stored or modified and reused extensively; it should be "retained" in the cache. Action: Fetch data as if for a store and place it in the cache so that it is not displaced by data prefetched as "streamed".
hint=8..15: L2 operation. In the Release 6 architecture, hint codes 8..15 are treated the same as hint codes 0..7 respectively, but operate on the L2 cache.
hint=16..23: L3 operation. In the Release 6 architecture, hint codes 16..23 are treated the same as hint codes 0..7 respectively, but operate on the L3 cache.
hint=24..30: Reserved for Architecture. These hint codes are reserved in nanoMIPS and should act as a NOP. (This is not the same as the MIPSR6 behavior, where these hints give a Reserved Instruction exception.) Note that hint=31 is not listed as that encoding is decoded as a SYNCI instruction.
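The cache-level structure of the hint space, with codes 8..15 and 16..23 mirroring 0..7 for the L2 and L3 caches, can be summarized by this illustrative (non-normative) decode:

def decode_pref_hint(hint):
    # Returns (cache_level, base_hint) for the mirrored L1/L2/L3 ranges.
    if 0 <= hint <= 23:
        return hint // 8 + 1, hint % 8   # level 1..3, base hint 0..7
    if 24 <= hint <= 30:
        return None, None                # reserved: acts as a NOP in nanoMIPS
    raise ValueError('hint=31 decodes as SYNCI, not PREF')

assert decode_pref_hint(12) == (2, 4)    # L2 load_streamed
assert decode_pref_hint(16) == (3, 0)    # L3 load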
The action taken for a specific PREF instruction is both system and context dependent. Any action, including doing nothing, is permitted as long as it does not change architecturally visible state or alter
the meaning of a program.
PREF does not cause addressing-related exceptions, including TLB exceptions. If the address specified would cause an addressing exception, the exception condition is ignored and no data movement occurs.
For cached addresses, the expected and useful action is for the processor to prefetch a block of data that includes the effective address. The size of the block and the level of the memory hierarchy it is
fetched into are implementation specific.
PREF neither generates a memory operation nor modifies the state of a cache line for addresses with an uncached CCA.
Prefetch operations have no effect on cache lines that were previously locked with the CACHE instruction.
In coherent multiprocessor implementations, if the effective address uses a coherent CCA, then the instruction causes a coherent memory transaction to occur. This means a prefetch issued on one
processor can cause data to be evicted from the cache in another processor.
The memory transactions which occur as a result of a PREF instruction, such as cache refill or cache writeback, obey the same ordering and completion rules as other load or store instructions.
It is implementation dependent whether a Bus Error or Cache Error exception is reported if such an error is detected as a byproduct of the action taken by the PREF instruction. Implementations are
encouraged to report such errors only if there is a specific requirement for high-reliability. Note that
suppressing a bus or cache error in this case may require that the processor communicate to the system that the reference is speculative.
Hint field encodings whose function is described as "streamed" or "retained" convey usage intent from software to hardware. Software should not assume that hardware will always prefetch data in an
optimal way. If data is to be truly retained, software should use the CACHE instruction to lock data into the cache.
It is implementation dependent whether a data watch or EJTAG breakpoint exception is triggered by a prefetch instruction whose address matches the Watch register address match or EJTAG data
breakpoint conditions. The preferred implementation is not to match on the prefetch instruction.
Bus Error. Cache Error. Coprocessor Unusable for PREFE. Reserved Instruction for PREFE if EVA not implemented.
RDHWR rt, hs, sel |
nanoMIPS, not available in NMS |
Read Hardware Register |
Read Hardware Register. Read specific CP0 privileged state (identified by hs, sel) to register $rt. Kernel code can enable or disable user mode RDHWR accesses by programming the enable bits in the HWREna register.
nanoMIPS, not available in NMS
001000 |
rt |
hs |
sel |
x |
0111000 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if C0.Config5.NMS == 1: raise exception('RI') if not IsCoprocessor0Enabled(): if not C0.HWREna & (1 << hs): raise exception('RI', 'Required HWREna bit not set') if sel and hs != 4: raise exception('RI', 'sel field not supported for this hs') if is_guest_mode(): check_gpsi('CP0') if hs == 0: GPR[rt] = C0.EBase.CPUNum elif hs == 1: GPR[rt] = synci_step() elif hs == 2: if is_guest_mode(): check_gpsi('GT') GPR[rt] = guest_count() else: GPR[rt] = C0.Count elif hs == 3: GPR[rt] = CPU.count_resolution elif hs == 4: if not C0.Config1.PC: raise exception('RI', 'Perf Counters not implemented') GPR[rt] = read_cp0_register(25, sel) # Performance counter register elif hs == 5: GPR[rt] = C0.Config5.XNP elif hs == 29: if not C0.Config3.ULRI: raise exception('RI') GPR[rt] = sign_extend(C0.UserLocal) else: raise exception('RI')
Coprocessor Unusable. Reserved Instruction for unsupported register numbers. Reserved Instruction on NMS cores.
RDPGPR rt, rs |
nanoMIPS. Requires CP0 privilege. |
Read Previous GPR |
Read Previous GPR. Write the value of register $rs from the previous shadow register set (SRSCtl.PSS) to register $rt in the current shadow register set (SRSCtl.CSS). If shadow register sets are not implemented, just copy the value from register $rs to register $rt.
nanoMIPS. Requires CP0 privilege.
001000 |
rt |
rs |
11 |
10000 |
101 |
111 |
111 |
6 |
5 |
5 |
2 |
5 |
3 |
3 |
3 |
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) if C0.SRSCtl.HSS > 0: GPR[rt] = SRS[C0.SRSCtl.PSS][rs] else: GPR[rt] = GPR[rs]
Coprocessor Unusable.
RESTORE u[, dst1 [, dst2 [, ...]]] # jr=0 implied |
nanoMIPS, availability varies by format. |
Restore callee saved registers/Restore callee saved registers and Jump to Return address, Compact |
RESTORE.JRC u[, dst1 [, dst2 [, ...]]] # jr=1 implied |
nanoMIPS, availability varies by format. |
Restore callee saved registers/Restore callee saved registers and Jump to Return address, Compact |
Restore callee saved registers/Restore callee saved registers and Jump to Return address, Compact. Restore registers dst1 [, dst2 [, ...]] from addresses at the top of the local stack frame ($29 +
u - 4, $29 + u - 8, ...), then point register $29 back to the caller's stack frame by adding offset u. For RESTORE.JRC, return from the current subroutine by jumping to the address in $31.
nanoMIPS, availability varies by format.
100000 |
rt |
0 |
count |
0011 |
u[11:3] |
gp |
10 |
6 |
5 |
1 |
4 |
4 |
9 |
1 |
2 |
jr = 0
000111 |
rt1 |
1 |
u[7:4] |
count |
6 |
1 |
1 |
4 |
4 |
rt = 30 if rt1 == 0 else 31 gp = 0 jr = 1
100000 |
rt |
0 |
count |
0011 |
u[11:3] |
gp |
11 |
6 |
5 |
1 |
4 |
4 |
9 |
1 |
2 |
jr = 1
if gp and C0.Config5.NMS: raise exception('RI') i = 0 while i != count: this_rt = ( 28 if gp and (i + 1 == count) else rt + i if rt + i < 32 else rt + i - 16 ) this_offset = u - ( (i+1) << 2 ) va = effective_address(GPR[29], this_offset, 'Load') if va & 3: raise exception('ADEL', badva=va) data = read_memory_at_va(va, nbytes=4) GPR[this_rt] = sign_extend(data, from_nbits=32) if this_rt == 29: raise UNPREDICTABLE() i += 1 GPR[29] = effective_address(GPR[29], u) if jr: CPU.next_pc = GPR[31]
The purpose of the RESTORE and RESTORE.JRC instructions is to restore callee saved registers from the stack on exit from a subroutine, adjust the stack pointer register $29 to point to the caller’s stack
frame, and for RESTORE.JRC to return from the subroutine by jumping to the address in register $31. RESTORE/RESTORE.JRC will usually be paired with a matching SAVE instruction at the start of the
subroutine, and SAVE and RESTORE take the same arguments.
The arguments for RESTORE/RESTORE.JRC consist of the amount to increment the stack pointer by, and a list of registers to restore from the stack. The increment is a double word aligned immediate value u
in the range 0 to 4092. The register list can contain up to 16 consecutive registers. The count of the number of registers is encoded in the instruction's count field. The first register in the list is encoded
in the rt field of the instruction.
The register list is allowed to wrap around from register $31 back to register $16 and still be considered consecutive; this allows fp ($30) and ra ($31) and the saved temporary registers s0-s7 ($16 - $23) to
be restored in one instruction.
Additionally, $28 (the global pointer register) will be used in place of the last register in the sequence if the 'gp' bit in the instruction encoding is set. This feature (which is not available for NMS cores) makes it
possible to treat $28 as a callee saved register for environments such as Linux which require it.
The restored registers are read from memory addresses $29 + u - 4, $29 + u - 8, $29 + u - 12, ... etc., i.e. at the top of the local stack frame. The stack pointer is then adjusted by adding the size u of
the local stack frame, so that it points back to the caller's stack frame.
RESTORE.JRC with count=0 adjusts the stack pointer and jumps to the address in $31, but does not restore any registers from memory. Thus the RESTORE.JRC[16] instruction format can be used to
provide ADDIU $29, $29, u; JRC $31 behavior using a single 16-bit instruction.
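As a worked, non-normative example of the address computation (mirroring the operation pseudocode above, with illustrative arguments):

def restore_plan(rt, count, u, gp=0):
    # (register, offset from $29) pairs loaded by RESTORE, plus the final
    # $29 increment. Mirrors the operation pseudocode above.
    plan = []
    for i in range(count):
        this_rt = 28 if gp and i + 1 == count else \
                  rt + i if rt + i < 32 else rt + i - 16
        plan.append((this_rt, u - 4 * (i + 1)))
    return plan, u

# RESTORE 16, $30, $31: $30 <- [$29+12], $31 <- [$29+8], then $29 += 16
assert restore_plan(30, 2, 16) == ([(30, 12), (31, 8)], 16)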
The result of a RESTORE instruction is UNPREDICTABLE if the register list includes register $29.
RESTORE/RESTORE.JRC must be implemented in such a way as to make the instructions restartable, but the implementation does not need to be fully atomic. For instance, it is allowable for a
RESTORE/RESTORE.JRC instruction to be aborted by an exception after a subset of the register updates have occurred. To ensure restartability, the write to GPR $29 and the jump (for RESTORE.JRC) must
be completed atomically, that is, the instruction must graduate if and only if those writes occur.
Address Error. Bus Error. Reserved Instruction for gp=1 cases on NMS cores. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
ROTR rt, rs, shift |
nanoMIPS |
Rotate Right |
Rotate Right. Rotate the word value in register $rs by shift value shift, and place the result in register $rt.
nanoMIPS
100000 |
rt |
rs |
1100 |
x |
0110 |
shift |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
tmp = GPR[rs][31:0] @ GPR[rs][31:0] result = tmp >> shift GPR[rt] = sign_extend(result, from_nbits=32)
None.
ROTRV rd, rs, rt |
nanoMIPS |
Rotate Right Variable |
Rotate Right Variable. Rotate the word value in register $rs by the shift value contained in register $rt, and place the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0011010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
shift = GPR[rt] & 0x1f tmp = GPR[rs][31:0] @ GPR[rs][31:0] result = tmp >> shift GPR[rd] = sign_extend(result, from_nbits=32)
None.
ROTX rt, rs, shift, shiftx, stripe |
nanoMIPS, not available in NMS |
Rotate and eXchange |
Rotate and eXchange. Rotate and exchange bits in the word value in register $rs and place the result in register $rt. Specific choices of the shift, shiftx and stripe arguments allow this instruction to perform bit and byte reordering operations including BYTEREVW, BYTEREVH, BITREVW, BITREVH and BITREVB.
nanoMIPS, not available in NMS
100000 |
rt |
rs |
1101 |
0 |
shiftx[4:1] |
stripe |
0 |
shift |
6 |
5 |
5 |
4 |
1 |
4 |
1 |
1 |
5 |
if C0.Config5.NMS: raise exception('RI') tmp0 = GPR[rs][31:0] @ GPR[rs][31:0] tmp1 = tmp0 for i in range(47): # 0..46 s = shift if (i & 0b01000) else shiftx if stripe and not (i & 0b00100): s = ~s if s[4]: tmp1[i] = tmp0[i+16] tmp2 = tmp1 for i in range(39): # 0..38 s = shift if (i & 0b00100) else shiftx if s[3]: tmp2[i] = tmp1[i+8] tmp3 = tmp2 for i in range(35): # 0..34 s = shift if (i & 0b00010) else shiftx if s[2]: tmp3[i] = tmp2[i+4] tmp4 = tmp3 for i in range(33): # 0..32 s = shift if (i & 0b00001) else shiftx if s[1]: tmp4[i] = tmp3[i+2] tmp5 = tmp4 for i in range(32): # 0..31 s = shift; if s[0]: tmp5[i] = tmp4[i+1] GPR[rt] = sign_extend(tmp5, from_nbits=32)
The ROTX instruction can be used to reverse elements of a selected size within blocks of a different selected size. Some example use cases are shown in the table below. The 'Result' column shows the output value
assuming an input value of abcdefgh ijklmnop qrstuvwx yz012345, where each character represents the value of a single bit.
Alias | Operation | Assembly | Result (from abcdefgh ijklmnop qrstuvwx yz012345)
BITREVW | Reverse all bits | ROTX rt, rs, 31, 0 | 543210zy xwvutsrq ponmlkji hgfedcba
BITREVH | Reverse bits in halves | ROTX rt, rs, 15, 16 | ponmlkji hgfedcba 543210zy xwvutsrq
BITREVB | Reverse bits in bytes | ROTX rt, rs, 7, 8, 1 | hgfedcba ponmlkji xwvutsrq 543210zy
BYTEREVW | Reverse all bytes | ROTX rt, rs, 24, 8 | yz012345 qrstuvwx ijklmnop abcdefgh
BYTEREVH | Reverse bytes in halves | ROTX rt, rs, 8, 24 | ijklmnop abcdefgh yz012345 qrstuvwx
- | Reverse all nibbles | ROTX rt, rs, 28, 4 | 2345yz01 uvwxqrst mnopijkl efghabcd
- | Reverse nibbles in halves | ROTX rt, rs, 12, 20 | mnopijkl efghabcd 2345yz01 uvwxqrst
- | Reverse nibbles in bytes | ROTX rt, rs, 4, 12, 1 | efghabcd mnopijkl uvwxqrst 2345yz01
- | Reverse all bit pairs | ROTX rt, rs, 30, 2 | 452301yz wxuvstqr opmnklij ghefcdab
- | Reverse pairs in halves | ROTX rt, rs, 14, 18 | opmnklij ghefcdab 452301yz wxuvstqr
- | Reverse pairs in bytes | ROTX rt, rs, 6, 10, 1 | ghefcdab opmnklij wxuvstqr 452301yz
Assembler aliases are provided for certain cases, as indicated in the table.
The MIPS32™ instructions BITSWAP and WSBH are equivalent to BITREVB and BYTEREVH respectively, and are also provided as assembler aliases to ROTX.
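A quick, non-normative way to check the alias table: the aliases correspond to standard bit/byte permutations. For instance, BYTEREVW and BYTEREVH can be modeled directly in Python:

def byterev_w(x):
    # BYTEREVW: reverse all four bytes (alias of ROTX rt, rs, 24, 8).
    return ((x & 0x000000FF) << 24 | (x & 0x0000FF00) << 8 |
            (x & 0x00FF0000) >> 8  | (x & 0xFF000000) >> 24)

def byterev_h(x):
    # BYTEREVH: reverse bytes within each halfword (alias of ROTX rt, rs, 8, 24).
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)

assert byterev_w(0x11223344) == 0x44332211
assert byterev_h(0x11223344) == 0x22114433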
The ROTX instruction is designed to be implementable with minimal overhead using existing logic for the ROTR instruction. ROTR can be implemented using a barrel shifter, where the select signals for
the multiplexers at each stage are the bits of the 'shift' argument. For ROTX, the mux select signals depend on the bit position as well as the stage of the shifter, and are a function of the 'shift', 'shiftx'
and 'stripe' arguments.
Reserved Instruction on NMS cores.
SAVE u[, src1 [, src2 [, ...]]] |
nanoMIPS, availability varies by format. |
Save callee saved registers |
Save callee saved registers. Save registers src1 [, src2 [, ...]] to addresses just below the current stack pointer ($29) address and adjust the stack pointer by subtracting offset u to accommodate the saved registers and the local stack frame.
nanoMIPS, availability varies by format.
000111 |
rt1 |
0 |
u[7:4] |
count |
6 |
1 |
1 |
4 |
4 |
rt = 30 if rt1 == 0 else 31 gp = 0
100000 |
rt |
0 |
count |
0011 |
u[11:3] |
gp |
00 |
6 |
5 |
1 |
4 |
4 |
9 |
1 |
2 |
if gp and C0.Config5.NMS: raise exception('RI') i = 0 while i != count: this_rt = ( 28 if gp and (i + 1 == count) else rt + i if rt + i < 32 else rt + i - 16 ) this_offset = - ( (i+1) << 2 ) va = effective_address(GPR[29], this_offset, 'Store') if va & 3: raise exception('ADES', badva=va) data = zero_extend(GPR[this_rt], from_nbits=32) write_memory_at_va(data, va, nbytes=4) i += 1 GPR[29] = effective_address(GPR[29], -u)
The purpose of the SAVE instruction is to save callee saved registers to the stack on entry to a subroutine, and adjust the stack pointer register ($29) to accommodate the saved registers and the subroutine’s local stack frame.
The instruction specification consists of the amount to decrement the stack by, and a list of registers to save to the stack. The stack decrement is a double word aligned immediate value u in the range
0 to 4092. The register list can contain up to 16 consecutive registers. The count of the number of registers in the register list is encoded in the instruction’s count field. The first register in the list is
encoded in the rt field of the instruction.
The register list is allowed to wrap around from register $31 back to register $16 and still be considered consecutive; this allows fp ($30) and ra ($31) and the saved temporary registers s0-s7 ($16 - $23) to
be saved in one instruction.
Additionally, $28 (the global pointer register) will be used in place of the last register in the sequence if the 'gp' bit in the instruction encoding is set. This feature (which is not available for NMS cores) makes it
possible to treat $28 as a callee saved register for environments such as Linux which require it.
The saved registers are written to memory addresses $29 - 4, $29 - 8, $29 - 12, ... etc., i.e. just below the current stack pointer address. The stack pointer is then adjusted by subtracting offset u, which
should be chosen to accommodate the saved registers and the current subroutine's local stack frame, while maintaining the required stack pointer alignment.
SAVE with count=0 adjusts the stack pointer but does not save any registers to memory. Thus the SAVE[16] instruction format can be used to provide ADDIU $29, $29, -u behavior.
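The store addresses are symmetric with RESTORE; a minimal, non-normative sketch with illustrative arguments:

def save_plan(rt, count, u, gp=0):
    # (register, offset from the incoming $29) pairs stored by SAVE,
    # followed by the final stack adjustment ($29 -= u).
    plan = []
    for i in range(count):
        this_rt = 28 if gp and i + 1 == count else \
                  rt + i if rt + i < 32 else rt + i - 16
        plan.append((this_rt, -4 * (i + 1)))
    return plan, -u

# SAVE 16, $30, $31: [$29-4] <- $30, [$29-8] <- $31, then $29 -= 16
assert save_plan(30, 2, 16) == ([(30, -4), (31, -8)], -16)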
SAVE must be implemented in such a way as to make the instruction restartable, but the implementation does not need to be fully atomic. For instance, it is allowable for a SAVE instruction to be aborted
by an exception after a subset of the memory updates have occurred. To ensure restartability, the write to GPR $29 must be completed atomically, that is, the instruction must graduate if and only if
that write occurs.
Address Error. Bus Error. Reserved Instruction for gp=1 cases on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SB rt, offset(rs) |
nanoMIPS |
Store Byte |
Store Byte. Store byte from register $rt to memory address $rs + offset (register plus immediate).
nanoMIPS
100001 |
rt |
rs |
0001 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u
010111 |
rtz3 |
rs3 |
01 |
u |
6 |
3 |
3 |
2 |
2 |
rt = decode_gpr(rtz3, 'gpr3.src.store') rs = decode_gpr(rs3, 'gpr3') offset = u
010001 |
rt |
001 |
u |
6 |
5 |
3 |
18 |
rs = 28 offset = u
101001 |
rt |
rs |
s[8] |
0001 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
va = effective_address(GPR[rs], offset, 'Store') data = zero_extend(GPR[rt], from_nbits=8) write_memory_at_va(data, va, nbytes=1)
Address Error. Bus Error. TLB Invalid. TLB Modified. TLB Refill. Watch.
SBE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Store Byte using EVA addressing |
Store Byte using EVA addressing. Store byte from register $rt to virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 |
rt |
rs |
s[8] |
0001 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) if not C0.Config5.EVA: raise exception('RI') if not IsCoprocessor0Enabled(): raise coprocessor_exception(0) va = effective_address(GPR[rs], offset, 'Store', eva=True) data = zero_extend(GPR[rt], from_nbits=8) write_memory_at_va(data, va, nbytes=1, eva=True)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Modified. TLB Refill. Watch.
SBX rd, rs(rt) |
nanoMIPS, not available in NMS |
Store Byte indeXed |
Store Byte indeXed. Store byte from register $rd to memory address $rt + $rs (register plus register).
nanoMIPS, not available in NMS
001000 |
rt |
rs |
rd |
0001 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
if C0.Config5.NMS == 1: raise exception('RI') va = effective_address(GPR[rs], GPR[rt], 'Store') data = zero_extend(GPR[rd], from_nbits=8) write_memory_at_va(data, va, nbytes=1)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SC rt, offset(rs) |
nanoMIPS, availability varies by format. |
Store Conditional word/Store Conditional word using EVA addressing/Store Conditional Word Pair/Store Conditional Word Pair using EVA addressing |
SCE rt, offset(rs) |
nanoMIPS, availability varies by format. |
Store Conditional word/Store Conditional word using EVA addressing/Store Conditional Word Pair/Store Conditional Word Pair using EVA addressing |
SCWP rt, ru, (rs) |
nanoMIPS, availability varies by format. |
Store Conditional word/Store Conditional word using EVA addressing/Store Conditional Word Pair/Store Conditional Word Pair using EVA addressing |
SCWPE rt, ru, (rs) |
nanoMIPS, availability varies by format. |
Store Conditional word/Store Conditional word using EVA addressing/Store Conditional Word Pair/Store Conditional Word Pair using EVA addressing |
Store Conditional word/Store Conditional word using EVA addressing/Store Conditional Word Pair/Store Conditional Word Pair using EVA addressing. Store conditionally to complete atomic read-modify-write. For SC/SCE, store from register $rt to address $rs + offset (register plus offset).
For SCWP/SCWPE, store from registers $rt and $ru to address $rs. For SCE/SCWPE, translate the virtual address as though the core is in user mode, although it is actually in kernel mode.
Indicate success or failure by writing 1 or 0 respectively to $rt.
nanoMIPS, availability varies by format.
101001 |
rt |
rs |
s[8] |
1011 |
0 |
01 |
s[7:2] |
00 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
6 |
2 |
offset = sign_extend(s, from_nbits=9) nbytes = 4 is_eva = False
101001 |
rt |
rs |
s[8] |
1011 |
0 |
10 |
s[7:2] |
00 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
6 |
2 |
offset = sign_extend(s, from_nbits=9) nbytes = 4 is_eva = True
101001 |
rt |
rs |
x |
1011 |
0 |
01 |
ru |
x |
01 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
5 |
1 |
2 |
offset = 0 nbytes = 8 is_eva = False
101001 |
rt |
rs |
x |
1011 |
0 |
10 |
ru |
x |
01 |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
5 |
1 |
2 |
offset = 0 nbytes = 8 is_eva = True
if nbytes == 8 and C0.Config5.XNP: raise exception('RI', 'SCWP[E] requires word-paired support')
if is_eva and not C0.Config5.EVA: raise exception('RI')
va = effective_address(GPR[rs], offset, 'Store', eva=is_eva)
# Linked access must be aligned.
if va & (nbytes-1): raise exception('ADES', badva=va)
pa, cca = va2pa(va, 'Store', eva=is_eva)
if (cca == 2 or cca == 7) and not C0.Config5.ULS:
    raise UNPREDICTABLE('uncached CCA not synchronizable when Config5.ULS=0')
    # (Preferred behavior for a non-synchronizable address is Bus Error.)
if nbytes == 4:  # SC/SCE
    data = zero_extend(GPR[rt], from_nbits=32)
else:  # SCWP/SCWPE
    word0 = GPR[rt][31:0]
    word1 = GPR[ru][31:0]
    data = word0 @ word1 if C0.Config.BE else word1 @ word0
# Write this data to memory, but only if it can be done atomically with
# respect to a prior linked load. The return value indicates whether the
# write occurred.
success = write_memory(data, va, pa, cca, nbytes=nbytes, atomic=True)
if success:
    GPR[rt] = 1
else:
    GPR[rt] = 0
C0.LLAddr.LLB = 0  # SC always clears LLbit regardless of address match.
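The '@' concatenation above pairs the two 32-bit values into a single 64-bit store whose layout depends on endianness. A minimal Python model of that pairing (pair_words is an illustrative helper, not an architectural function):

def pair_words(word0, word1, big_endian):
    # Big-endian: $rt occupies the most-significant word; little-endian: the reverse.
    return (word0 << 32) | word1 if big_endian else (word1 << 32) | word0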
The SC, SCE, SCWP and SCWPE instructions are used to complete the atomic read-modify-write (RMW) sequence begun by a prior matching LL/LLE/LLWP/LLWPE instruction respectively. If the system can guarantee that the write to memory can be completed prior to any other modification to the targeted data since it was read by the load-linked instruction which initiated the sequence, then the write will complete and register $rt will be set to 1, indicating success. Otherwise, the memory write will not occur, and register $rt will be set to 0, indicating failure.
If any of the following events occur between a load-linked and a store-conditional instruction, the store-conditional will fail:
The store-conditional will fail if a coherent store is completed (by either the current processor, another processor, or a coherent I/O module) into the block of synchronizable physical memory containing the load-linked data. The size and alignment of the block is implementation-dependent, but it is at least one word and at most the minimum page size. Typically, the synchronizable block size is the size of the largest cache line in use.
The store-conditional will fail if an ERET instruction has been executed since the preceding load-linked instruction. (Note that nanoMIPS™ also includes the ERETNC instruction, which will not cause the store-conditional instruction to fail.)
If any of the following events occur between a load-linked and a store-conditional instruction, the store-conditional may fail when it would otherwise have succeeded. Portable programs should not cause any of these events:
The store-conditional may fail if a load or store is executed on a processor executing a load-linked/store-conditional sequence, and that load or store is not to the block of synchronizable physical memory containing the load-linked data. This is because the load or store may cause the load-linked data to be evicted from the cache.
The store-conditional may fail if any PREF instruction is executed on a processor executing a load-linked/store-conditional sequence, due to the possibility of the PREF causing a cache eviction.
The store-conditional may fail on coherent multi-processor systems if a non-coherent store is
executed during a load-linked/store-conditional sequence and that store is to the block of synchronizable physical memory containing the linked data.
The store-conditional may fail if the instructions executed starting with the load-linked instruction and ending with the store-conditional instruction do not lie in a 2048-byte contiguous region of virtual memory. (The region does not have to be aligned, other than the alignment required for instruction words.)
The store-conditional may fail if a CACHE operation is carried out during the load-linked/store-conditional sequence, due to the possibility of modifying or evicting the line containing the linked data. In addition, non-local CACHE operations may cause a store-conditional instruction to fail on either the local processor or on the remote processor in multiprocessor or multi-threaded systems.
The store-conditional must not fail as a result of any of the following events:
The store-conditional must not fail as a result of a load that executes on the processor executing
a load-linked/store-conditional sequence if the load targets the block of synchronizable physical memory containing the load-linked data.
The outcome of the store-conditional is not predictable (it may succeed or fail) under any of the following conditions:
The store-conditional result is unpredictable if the store-conditional was not preceded by a matching load-linked instruction. SC must be preceded by LL, SCE must be preceded by LLE, SCWP
must be preceded by LLWP, and SCWPE must be preceded by LLWPE.
The store-conditional result is unpredictable if the load-linked and store-conditional instructions
do not target identical virtual addresses, physical addresses and CCAs.
The store-conditional result is unpredictable if the targeted memory location is not synchronizable. A synchronizable memory location is one that is associated with the state and logic necessary to track RMW atomicity. Whether a memory location is synchronizable depends on the
processor and system configurations, and on the memory access type used for the location.
The store-conditional result is unpredictable if the memory access does not use a CCA which supports atomic RMW for the targeted address.
For uniprocessor systems, a cached noncoherent or cached coherent CCA must be used; additionally, an uncached CCA can be used when Config5.ULS=1.
For multi-processor systems or systems containing coherent I/O devices, a cached coherent CCA must be used; additionally, an uncached CCA can be used when Config5.ULS=1.
When Config5.ULS=1, uncached load-linked/store-conditional operations are supported, with the following additional constraints:
The result of a store-conditional which is part of an uncached load-linked/store-conditional sequence is unpredictable if, during the sequence, a local or remote CPU accesses the block of memory containing the targeted data using any CCA other than that used by the load-linked and store-conditional instructions.
The result of an uncached load-linked/store-conditional sequence is only predictable if it targets an address in the system which supports uncached RMW accesses. In particular, the system must implement a "monitor", which is responsible for determining whether or not the address can be updated atomically with respect to the prior linked load. In response to a store-conditional instruction, the monitor updates memory where appropriate and communicates the result to the processor that initiated the sequence. It is implementation dependent what form the monitor takes. The recommended response for load-linked/store-conditional instructions which target a non-synchronizable uncached address is that the sub-system report a Bus Error to the processor.
Uncached stores by the same processor will cause an uncached load-linked/store-conditional sequence to fail if the store address matches that of the sequence.
A PAUSE instruction is no-op’d when it is preceded by an uncached load-linked instruction. This
is because the event which would wake the CPU from the paused state may only be visible to the external monitor, not to the local processor.
The rules for uncached load-linked/store-conditional atomic operation apply to any uncached CCA, including UCA (UnCached Accelerated). An implementation that supports UCA must guarantee that a store-conditional instruction does not participate in store gathering and that it ends any gathering initiated by stores preceding the SC in program order when the SC address coincides with a gathering address.
The effective address of a store-conditional operation must be naturally aligned, i.e. word-aligned for SC and SCE, and doubleword-aligned for SCWP and SCWPE; otherwise an Address Error exception occurs.
The following assembly code shows a possible usage of LL and SC to atomically update a memory location:
L1: ll    t1, 0(t0)   # Load counter.
    addiu t2, t1, 1   # Increment.
    sc    t2, 0(t0)   # Try to store, checking for atomicity.
    beqc  t2, 0, L1   # If not atomic (0), try again.
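The same retry loop can be modeled in Python. The memory dict and llbit flag below simulate the linked-load state; they are illustrative stand-ins for the LL/SC hardware, not architectural functions:

# Minimal sketch of the LL/SC retry loop, assuming a simulated memory.
memory = {0x1000: 41}
llbit = {'addr': None}

def load_linked(addr):
    llbit['addr'] = addr          # LL: record the linked address
    return memory[addr]

def store_conditional(addr, value):
    if llbit['addr'] != addr:     # SC fails if the link was lost
        return False
    memory[addr] = value
    llbit['addr'] = None          # SC always clears the link
    return True

def atomic_increment(addr):
    while True:                   # retry until the RMW completes atomically
        value = load_linked(addr)
        if store_conditional(addr, value + 1):
            return

atomic_increment(0x1000)
assert memory[0x1000] == 42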
Exceptions between the load-linked and store-conditional instructions cause the store-conditional to fail, so instructions which can cause persistent exceptions must not be used within the load-linked/store-conditional sequence. Examples of instructions which must be avoided are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.
Load-linked and store-conditional must function correctly on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached
coherent memory access types.
Support for the paired word instructions SCWP/SCWPE is indicated by the Config5.XNP bit. Paired word support is required for nanoMIPS™ cores, except for NMS cores, where it is optional.
Address Error. Bus Error. Coprocessor Unusable for SCE/SCWPE. Reserved Instruction for SCE/SCWPE if EVA not implemented. Reserved Instruction for SCWP/SCWPE if load-linked pair not implemented. TLB Invalid. TLB Modified. TLB Refill. Watch.
SDBBP code |
nanoMIPS. Optional, present when Debug implemented. |
Software Debug Breakpoint |
Software Debug Breakpoint. Cause a Software Debug Breakpoint exception.
nanoMIPS. Optional, present when Debug implemented.
000000 |
00000 |
11 |
code |
6 |
5 |
2 |
19 |
000100 |
00000 |
11 |
code |
6 |
5 |
2 |
3 |
if C0.Config1.EP == 0: raise exception('RI', 'Debug not implemented')
if C0.Config5.SBRI and EffectiveKSU() != 0: raise exception('RI', 'SBRI exception')
if Root.C0.Config5.SBRI and is_guest_mode(): raise exception('RI', 'Root SBRI exception', g=False)
debug_exception('BP')
Root.C0.Debug.DBp = 1
raise EXCEPTION()
Software Debug Breakpoint. Reserved Instruction if Debug not implemented.
SEB rt, rs |
nanoMIPS, not available in NMS |
Sign Extend Byte |
Sign Extend Byte. Take the lower byte of the value in register $rs, sign extend it, and place the result in register $rt.
nanoMIPS, not available in NMS
001000 |
rt |
rs |
x |
0000001 |
000 |
6 |
5 |
5 |
6 |
7 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
GPR[rt] = sign_extend(GPR[rs], from_nbits=8)
Reserved Instruction on NMS cores.
SEH rt, rs |
nanoMIPS |
Sign Extend Half |
Sign Extend Half. Take the lower halfword of the value in register $rs, sign extend it, and place the result in register $rt.
nanoMIPS
001000 |
rt |
rs |
x |
0001001 |
000 |
6 |
5 |
5 |
6 |
7 |
3 |
GPR[rt] = sign_extend(GPR[rs], from_nbits=16)
None.
SEQI rt, rs, u |
nanoMIPS |
Set on Equal to Immediate |
Set on Equal to Immediate. Set the register $rt to 1 if register $rs is equal to immediate value u, and 0 otherwise.
nanoMIPS
100000 |
rt |
rs |
0110 |
u |
6 |
5 |
5 |
4 |
12 |
GPR[rt] = 1 if GPR[rs] == u else 0
None.
SH rt, offset(rs) |
nanoMIPS |
Store Half |
Store Half. Store halfword from register $rt to memory address $rs + offset (register plus immediate).
nanoMIPS
100001 |
rt |
rs |
0101 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u
011111 |
rtz3 |
rs3 |
0 |
u[2:1] |
1 |
6 |
3 |
3 |
1 |
2 |
1 |
rt = decode_gpr(rtz3, 'gpr3.src.store') rs = decode_gpr(rs3, 'gpr3') offset = u
010001 |
rt |
101 |
u[17:1] |
0 |
6 |
5 |
3 |
17 |
1 |
rs = 28 offset = u
101001 |
rt |
rs |
s[8] |
0101 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
va = effective_address(GPR[rs], offset, 'Store')
data = zero_extend(GPR[rt], from_nbits=16)
write_memory_at_va(data, va, nbytes=2)
Address Error. Bus Error. TLB Invalid. TLB Modified. TLB Refill. Watch.
SHE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Store Half using EVA addressing |
Store Half using EVA addressing. Store halfword from register $rt to virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 |
rt |
rs |
s[8] |
0101 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
if not C0.Config5.EVA: raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
va = effective_address(GPR[rs], offset, 'Store', eva=True)
data = zero_extend(GPR[rt], from_nbits=16)
write_memory_at_va(data, va, nbytes=2, eva=True)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Modified. TLB Refill. Watch.
SHX rd, rs(rt) |
nanoMIPS, not available in NMS |
Store Half indeXed |
Store Half indeXed. Store halfword from register $rd to memory address $rt + $rs (register plus register).
nanoMIPS, not available in NMS
001000 |
rt |
rs |
rd |
0101 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
va = effective_address(GPR[rs], GPR[rt], 'Store')
data = zero_extend(GPR[rd], from_nbits=16)
write_memory_at_va(data, va, nbytes=2)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SHXS rd, rs(rt) |
nanoMIPS, not available in NMS |
Store Half indeXed Scaled |
Store Half indeXed Scaled. Store halfword from register $rd to memory address $rt + 2*$rs (register plus scaled register).
nanoMIPS, not available in NMS
001000 |
rt |
rs |
rd |
0101 |
1 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
va = effective_address(GPR[rs]<<1, GPR[rt], 'Store')
data = zero_extend(GPR[rd], from_nbits=16)
write_memory_at_va(data, va, nbytes=2)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SIGRIE code |
nanoMIPS |
Signal Reserved Instruction Exception |
Signal Reserved Instruction Exception.
nanoMIPS
000000 |
00000 |
00 |
code |
6 |
5 |
2 |
19 |
raise exception('RI')
Reserved Instruction.
SLL rt, rs, shift |
nanoMIPS |
Shift Left Logical |
Shift Left Logical. Left shift word value in register $rs by amount shift, and place the result in register $rt.
nanoMIPS
100000 |
rt |
rs |
1100 |
x |
0000 |
shift |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
NOP[32], EHB, PAUSE, and SYNC instruction formats overlap SLL[32]. Opcodes matching those instruction formats should be processed according to the description of those instructions, not as SLL[32].
001100 |
rt3 |
rs3 |
0 |
shift3 |
6 |
3 |
3 |
1 |
3 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') shift = 8 if shift3 == 0 else shift3
result = GPR[rs] << shift
GPR[rt] = sign_extend(result, from_nbits=32)
None.
SLLV rd, rs, rt |
nanoMIPS |
Shift Left Logical Variable |
Shift Left Logical Variable. Left shift word value in register $rs by shift amount in register $rt, and place the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0000010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
shift = GPR[rt] & 0x1f
result = GPR[rs] << shift
GPR[rd] = sign_extend(result, from_nbits=32)
None.
SLT rd, rs, rt |
nanoMIPS |
Set on Less Than |
Set on Less Than. Set the register $rd to 1 if signed register $rs is less than signed register $rt, and 0 otherwise.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
1101010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
GPR[rd] = 1 if GPR[rs] < GPR[rt] else 0
None.
SLTI rt, rs, u |
nanoMIPS |
Set on Less Than Immediate |
Set on Less Than Immediate. Set the register $rt to 1 if the signed value in register $rs is less than immediate u, and 0 otherwise.
nanoMIPS
100000 |
rt |
rs |
0100 |
u |
6 |
5 |
5 |
4 |
12 |
GPR[rt] = 1 if GPR[rs] < u else 0
None.
SLTIU rt, rs, u |
nanoMIPS |
Set on Less Than Immediate, Unsigned |
Set on Less Than Immediate, Unsigned. Set the register $rt to 1 if the unsigned value in register $rs is less than immediate u, and 0 otherwise.
nanoMIPS
100000 |
rt |
rs |
0101 |
u |
6 |
5 |
5 |
4 |
12 |
GPR[rt] = 1 if unsigned(GPR[rs]) < u else 0
None.
SLTU rd, rs, rt |
nanoMIPS |
Set on Less Than, Unsigned |
Set on Less Than, Unsigned. Set the register $rd to 1 if unsigned register $rs is less than unsigned register $rt, and 0 otherwise.
nanoMIPS
001000 |
rt |
rs |
rd!=0 rd |
x |
1110010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
GPR[rd] = 1 if unsigned(GPR[rs]) < unsigned(GPR[rt]) else 0
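The unsigned() comparison used here (and by SLTIU above) can be read as reinterpreting a 32-bit two's-complement value as non-negative. A minimal sketch, assuming 32-bit register values:

def unsigned(x, nbits=32):
    # Reinterpret a two's-complement value as an unsigned integer.
    return x & ((1 << nbits) - 1)

# Example: -1 (0xFFFFFFFF) compares greater than any small positive value.
assert unsigned(-1) == 0xFFFFFFFF
assert not (unsigned(-1) < unsigned(1))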
SLTU encodings with rd=0 are used for the DVP and EVP instructions. DVP and EVP are required to behave as NOPs on cores without Virtual Processor (VP) support. This means that no DVP/EVP special casing is required in hardware for non-VP cores, since a SLTU instruction writing to $0 naturally behaves as a NOP.
None.
SOV rd, rs, rt |
nanoMIPS |
Set on Overflow |
Set on Overflow. Set the register $rd to 1 if the signed addition of registers $rs and $rt overflows 32 bits, and 0 otherwise.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
1111010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
sum = GPR[rs] + GPR[rt]
GPR[rd] = 1 if overflows(sum, nbits=32) else 0
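The overflows() predicate used here (and by the trapping ADD/SUB instructions) checks whether a result fits in signed 32-bit two's complement. A minimal sketch of one possible definition:

def overflows(value, nbits=32):
    # True if value is outside the signed nbits-bit range.
    lo, hi = -(1 << (nbits - 1)), (1 << (nbits - 1)) - 1
    return not (lo <= value <= hi)

# Example: 0x7FFFFFFF + 1 overflows signed 32-bit arithmetic.
assert overflows(0x7FFFFFFF + 1)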
None.
SRA rt, rs, shift |
nanoMIPS |
Shift Right Arithmetic |
Shift Right Arithmetic. Right shift word value in register $rs by amount shift, duplicating the sign bit (bit 31) in the emptied bits. Place the result in register $rt.
nanoMIPS
100000 |
rt |
rs |
1100 |
x |
0100 |
shift |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
GPR[rt] = GPR[rs] >> shift
None.
SRAV rd, rs, rt |
nanoMIPS |
Shift Right Arithmetic Variable |
Shift Right Arithmetic Variable. Right shift word value in register $rs by shift amount in register $rt, duplicating the sign bit (bit 31) in the emptied bits. Place the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0010010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
shift = GPR[rt] & 0x1f
GPR[rd] = GPR[rs] >> shift
None.
SRL rt, rs, shift |
nanoMIPS |
Shift Right Logical |
Shift Right Logical. Right shift word value in register $rs by amount shift, filling the emptied bits with zeroes. Place the result in register $rt.
nanoMIPS
100000 |
rt |
rs |
1100 |
x |
0010 |
shift |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
001100 |
rt3 |
rs3 |
1 |
shift3 |
6 |
3 |
3 |
1 |
3 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') shift = 8 if shift3 == 0 else shift3
result = zero_extend(GPR[rs], from_nbits=32) >> shift
GPR[rt] = sign_extend(result, from_nbits=32)
None.
SRLV rd, rs, rt |
nanoMIPS |
Shift Right Logical Variable |
Shift Right Logical Variable. Right shift word value in register $rs by shift amount in register $rt, filling the emptied bits with zeros. Place the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0001010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
shift = GPR[rt] & 0x1f
result = zero_extend(GPR[rs], from_nbits=32) >> shift
GPR[rd] = sign_extend(result, from_nbits=32)
None.
SUB rd, rs, rt |
nanoMIPS, not available in NMS |
Subtract |
Subtract. Subtract the 32-bit signed integer in register $rt from the 32-bit signed integer in register $rs, placing the 32-bit result in register $rd, and trapping on overflow.
nanoMIPS, not available in NMS
001000 |
rt |
rs |
rd |
x |
0110010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
result = GPR[rs] - GPR[rt]
if overflows(result, nbits=32): raise exception('OV')
GPR[rd] = sign_extend(result, from_nbits=32)
Overflow. Reserved Instruction on NMS cores.
SUBU rd, rs, rt |
nanoMIPS |
Subtract (Untrapped) |
Subtract (Untrapped). Subtract the 32-bit integer in register $rt from the 32-bit integer in register $rs, placing the 32-bit result in register $rd, and not trapping on overflow.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
0111010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
101100 |
rt3 |
rs3 |
rd3 |
1 |
6 |
3 |
3 |
3 |
1 |
rd = decode_gpr(rd3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') rt = decode_gpr(rt3, 'gpr3')
result = GPR[rs] - GPR[rt]
GPR[rd] = sign_extend(result, from_nbits=32)
None.
SW rt, offset(rs) |
nanoMIPS, availability varies by format. |
Store Word |
Store Word. Store word from register $rt to memory address $rs + offset (register plus immediate).
nanoMIPS, availability varies by format.
100001 |
rt |
rs |
1001 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u
100101 |
rtz3 |
rs3 |
u[5:2] |
6 |
3 |
3 |
4 |
rt = decode_gpr(rtz3, 'gpr3.src.store') rs = decode_gpr(rs3, 'gpr3') offset = u
111101 |
rtz4[3] |
u[2] |
rtz4[2:0] |
rs4[3] |
u[3] |
rs4[2:0] |
6 |
1 |
1 |
3 |
1 |
1 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
rt = decode_gpr(rtz4[3] @ rtz4[2:0], 'gpr4.zero')
rs = decode_gpr(rs4[3] @ rs4[2:0], 'gpr4')
offset = u
010000 |
rt |
u[20:2] |
11 |
6 |
5 |
19 |
2 |
rs = 28 offset = u
110101 |
rtz3 |
u[8:2] |
6 |
3 |
7 |
rt = decode_gpr(rtz3, 'gpr3.src.store') rs = 28 offset = u
101001 |
rt |
rs |
s[8] |
1001 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
101101 |
rt |
u[6:2] |
6 |
5 |
5 |
rs = 29 offset = u
va = effective_address(GPR[rs], offset, 'Store')
data = zero_extend(GPR[rt], from_nbits=32)
write_memory_at_va(data, va, nbytes=4)
Address Error. Bus Error. Reserved Instruction for SW[4X4] format on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SWE rt, offset(rs) |
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege. |
Store Word using EVA addressing |
Store Word using EVA addressing. Store word from register $rt to virtual address $rs + offset, translating the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS. Optional, present when Config5.EVA=1. Requires CP0 privilege.
101001 |
rt |
rs |
s[8] |
1001 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9)
if not C0.Config5.EVA: raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
va = effective_address(GPR[rs], offset, 'Store', eva=True)
data = zero_extend(GPR[rt], from_nbits=32)
write_memory_at_va(data, va, nbytes=4, eva=True)
Address Error. Bus Error. Coprocessor Unusable. Reserved Instruction if EVA not implemented. TLB Invalid. TLB Modified. TLB Refill. Watch.
SWM rt, offset(rs), count |
nanoMIPS, not available in NMS |
Store Word Multiple |
Store Word Multiple. Store count words of data from registers $rt, $(rt+1), ..., $(rt+count-1) to consecutive memory addresses starting at $rs + offset (register plus immediate).
nanoMIPS, not available in NMS
101001 |
rt |
rs |
s[8] |
count3 |
1 |
1 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
3 |
1 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) count = 8 if count3 == 0 else count3
if C0.Config5.NMS == 1: raise exception('RI')
i = 0
while i != count:
    this_rt = ( 0 if rt == 0
                else rt + i if rt + i < 32
                else rt + i - 16 )
    this_offset = offset + (i<<2)
    va = effective_address(GPR[rs], this_offset, 'Store')
    data = zero_extend(GPR[this_rt], from_nbits=32)
    write_memory_at_va(data, va, nbytes=4)
    i += 1
SWM stores count words from sequentially numbered registers to sequential memory addresses. After storing $31, the sequence of registers continues from $16. If rt=0, then $0 is stored for all count steps of the instruction. Some example encodings of the register list are (a small executable sketch follows the list):
rt=15, count=3: stores [$15, $16, $17]
rt=31, count=3: stores [$31, $16, $17]
rt=0, count=3: stores [$0, $0, $0].
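A minimal Python sketch of the register sequence, mirroring the this_rt computation in the pseudocode above; it reproduces the three example encodings:

def swm_register_list(rt, count):
    # Wrap from $31 back to $16; rt=0 stores $0 for every step.
    regs = []
    for i in range(count):
        if rt == 0:
            regs.append(0)
        elif rt + i < 32:
            regs.append(rt + i)
        else:
            regs.append(rt + i - 16)
    return regs

assert swm_register_list(15, 3) == [15, 16, 17]
assert swm_register_list(31, 3) == [31, 16, 17]
assert swm_register_list(0, 3) == [0, 0, 0]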
If a TLB exception or interrupt occurs during the execution of this instruction, a subset of the required memory updates may have occurred. A full restart of the instruction will be performed on return from
the exception.
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SWPC rt, address |
nanoMIPS, not available in NMS |
Store Word PC relative |
Store Word PC relative. Store word from register $rt to PC relative address address.
nanoMIPS, not available in NMS
011000 |
rt |
01111 |
s[15:0] |
s[31:16] |
6 |
5 |
5 |
16 |
16 |
offset = sign_extend(s, from_nbits=32)
if C0.Config5.NMS == 1: raise exception('RI')
address = effective_address(CPU.next_pc, offset, 'Store')
data = zero_extend(GPR[rt], from_nbits=32)
write_memory_at_va(data, address, nbytes=4)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SWX rd, rs(rt) |
nanoMIPS, not available in NMS |
Store Word indeXed |
Store Word indeXed. Store word from register $rd to memory address $rt + $rs (register plus register).
nanoMIPS, not available in NMS
001000 |
rt |
rs |
rd |
1001 |
0 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
va = effective_address(GPR[rs], GPR[rt], 'Store')
data = zero_extend(GPR[rd], from_nbits=32)
write_memory_at_va(data, va, nbytes=4)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SWXS rd, rs(rt) |
nanoMIPS, not available in NMS |
Store Word indeXed Scaled |
Store Word indeXed Scaled. Store word from register $rd to memory address $rt + 4*$rs (register plus scaled register).
nanoMIPS, not available in NMS
001000 |
rt |
rs |
rd |
1001 |
1 |
000 |
111 |
6 |
5 |
5 |
5 |
4 |
1 |
3 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
va = effective_address(GPR[rs]<<2, GPR[rt], 'Store')
data = zero_extend(GPR[rd], from_nbits=32)
write_memory_at_va(data, va, nbytes=4)
Address Error. Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
SYNC stype |
nanoMIPS |
Sync |
Sync.
Impose ordering constraints of type stype on prior and subsequent memory operations.
nanoMIPS
100000 |
00000 |
stype |
1100 |
x |
0000 |
00110 |
6 |
5 |
5 |
4 |
3 |
4 |
5 |
sync_memory_access(stype)
The SYNC instruction is used to order loads and stores for shared memory, and also to order operations with respect to the global invalidate instructions GINVI and GINVT. The following types of ordering guarantees are available with different stypes.
Completion Barriers: A completion barrier provides a guarantee that any of the specified memory
instructions before the SYNC are completed and globally performed before any of the specified memory instructions after the SYNC are performed to any extent. Loads are completed when
the destination register is written. Stores are completed when the stored value is visible to every other processor in the system.
Ordering Barriers: An ordering barrier provides a guarantee in the system that any specified
memory instructions before the SYNC are ordered before any of the specified memory instructions after the SYNC. The ordering SYNC is considered complete when the memory instructions
before and after the SYNC are guaranteed thereafter to retain their order relative to the SYNC,
i.e. when it is guaranteed that all specified memory instructions before the SYNC will be globally performed before any of the specified memory accesses after the SYNC are performed to any extent. It is helpful to think of a global ordering point in a coherence domain: a point at which, once an instruction reaches it, the instruction is guaranteed to retain its order relative to any memory instruction that reaches the point after it. The ordering SYNC thus cannot complete before all older specified memory instructions reach the global ordering point.
The following table shows the behavior of the SYNC instruction for each stype value. Operation types listed in the 'What reaches before' column are subject to a pre-SYNC ordering barrier: such operations, when older, must reach the global ordering point before the SYNC instruction completes. Operation types listed in the 'What reaches after' column are subject to a post-SYNC ordering barrier: such operations, when younger, must reach the global ordering point only after the SYNC instruction completes. Operation types listed in the 'What completes before' column are subject to a completion barrier, that is, they must be globally performed when the SYNC instruction completes.
stype | Name | What reaches before | What reaches after | What completes before | Availability
0x0 | SYNC | Loads, Stores | Loads, Stores | Loads, Stores | Required.
0x1-0x3 | | | | | Impl./vendor specific.
0x4 | SYNC_WMB | Stores | Stores | | Optional.
0x5-0xF | | | | | Impl./vendor specific.
0x10 | SYNC_MB | Loads, Stores | Loads, Stores | | Optional.
0x11 | SYNC_ACQUIRE | Loads | Loads, Stores | | Optional.
0x12 | SYNC_RELEASE | Loads, Stores | Stores | | Optional.
0x13 | SYNC_RMB | Loads | Loads | | Optional.
0x14 | SYNC_GINV | Loads, Stores | Loads, Stores | GINVI, GINVT, SYNCI | Config5.GI=2,3.
0x15-0x1F | | | | | Reserved for Architecture.
SYNC barriers affect only uncached and cached coherent loads and stores and do not affect the order in which instruction fetches are performed. For the purposes of this description, the CACHE, PREF and SYNCI instructions are treated as loads and stores. In addition, the optional Global Invalidate instructions are synchronizable through SYNC (stype=0x14).
The effect of SYNC on the global order of loads and stores for memory access types other than uncached and cached coherent is UNPREDICTABLE.
A completion barrier may have an adverse impact on performance compared to an ordering barrier due to the constraint of completion. An implementation may optimize the ordering of memory instructions such that an ordering barrier completes before a completion barrier under the same circumstances. The magnitude of the impact is implementation-dependent, but an implementation must ensure that an ordering barrier performs no worse than the equivalent completion barrier. Software thus needs to use completion and ordering barriers for the appropriate conditions.
An stype of 0 is used to define the SYNC instruction with completion barrier semantics. Non-zero values of stype may be defined by the architecture or specific implementations to perform synchronization behaviors that are less complete than that of stype=0. If an implementation does not use one of these non-zero values to define a different synchronization behavior, then that non-zero value of stype must map to a completion barrier. This allows software written for an implementation with a lighter-weight barrier to work on another implementation which only implements the stype=0 completion barrier.
The Acquire and Release barrier types are used to minimize the memory ordering that must be maintained and still have software synchronization work.
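As an illustration of the acquire/release pattern, the sketch below places SYNC_ACQUIRE after taking a lock and SYNC_RELEASE before releasing it. The helpers sync(), try_take_lock() and store_word() are illustrative stand-ins for the corresponding instructions, not architectural functions:

SYNC_ACQUIRE = 0x11
SYNC_RELEASE = 0x12

def lock(lock_addr):
    while not try_take_lock(lock_addr):  # e.g. an LL/SC test-and-set loop
        pass
    sync(SYNC_ACQUIRE)   # critical-section accesses cannot move above the lock

def unlock(lock_addr):
    sync(SYNC_RELEASE)   # critical-section accesses cannot move below the unlock
    store_word(lock_addr, 0)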
A completion barrier is required, potentially in conjunction with an EHB instruction, to guarantee that memory reference results are visible across operating mode changes. For example, a completion barrier is required on some implementations on entry to and exit from Debug Mode to guarantee that memory effects are handled correctly.
If Global Invalidate instructions are supported, then SYNC (stype=0x14) acts as a completion barrier with respect to any preceding GINVI or GINVT instructions. This SYNC instruction is globalized and
only completes if all preceding GINVI or GINVT operations related to the same program have completed in the system. (Any references to GINVT also imply GINVGT, available in a virtualized MIPS system.)
A system that implements the Global Invalidates also requires that the completion of SYNC (stype=0x14) be constrained by legacy SYNCI operations. Thus SYNC (stype=0x14) can also be used to enforce synchronization of SYNCI instructions. In the typical use cases, a single GINVI is used by itself to invalidate caches and would be followed by a SYNC (stype=0x14). In the case of GINVT, multiple GINVT could be used to invalidate multiple TLB mappings, and the SYNC (stype=0x14) would be used to guarantee completion of any number of GINVTs preceding it.
The following terms are used in this description:
Synchronizable: A load or store instruction is synchronizable if the load or store occurs to a physical location in shared memory using a virtual address with a memory access type of either uncached or cached coherent.
Shared memory: Memory that can be accessed by more than one processor or by a coherent I/O system module.
Performed load: A load instruction is performed when the value returned by the load has been determined. The result of a load on processor A has been determined with respect to processor or coherent
I/O module B when a subsequent store to the location by B cannot affect the value returned by the load. The store by B must use the same memory access type as the load.
Performed store: A store instruction is performed when the store is observable. A store on processor A is observable with respect to processor or coherent I/O module B when a subsequent load of the location by B returns the value written by the store. The load by B must use the same memory access type as the store.
Globally performed load: A load instruction is globally performed when it is performed with respect to all processors and coherent I/O modules capable of storing to the location.
Globally performed store: A store instruction is globally performed when it is globally observable. It is globally observable when it is observable by all processors and I/O modules capable of loading from the location.
Global ordering point: A point in the coherence domain at which, once a memory instruction reaches it, the instruction is guaranteed to retain its order relative to any memory instruction that reaches the point after it.
Coherent I/O module: A coherent I/O module is an Input/Output system component that performs coherent Direct Memory Access (DMA). It reads and writes memory independently as though it were a processor doing loads and stores to locations with a memory access type of cached coherent.
The following notes describe the use of SYNC in parallel programs:
A processor executing load and store instructions observes the order in which loads and stores using the same memory access type occur in the instruction stream; this is known as program order.
A parallel program has multiple instruction streams that can execute simultaneously on different processors.
In multiprocessor (MP) systems, the order in which the effects of loads and stores are observed by other processors - the global order of the loads and stores - determines the actions necessary to reliably share data in parallel programs.
When all processors observe the effects of loads and stores in program order, the system is strongly ordered. On such systems, parallel programs can reliably share data without explicitly using a SYNC. Executing SYNC on such a system is not necessary, will not cause an error, but may reduce overall performance.
If a multiprocessor system is not strongly ordered, the effects of load and store instructions executed by one processor may be observed out of program order by other processors. On such systems, parallel
programs must use SYNC to reliably share data at critical points in the program. SYNC separates the loads and stores executed on the processor into two groups, and the effect of all loads and stores in one group is seen by all processors before the effect of any load or store in the subsequent group. In effect, SYNC causes the system to be strongly ordered for the executing processor at the instant that the SYNC is executed.
The hardware ordering support provided in a MIPS-based multiprocessor system is implementation-dependent. A parallel program that does not use SYNC generally does not operate correctly on a system that is not strongly ordered. However, a program that does use SYNC works on both types of systems. (System-specific documentation describes the actions needed to reliably share data in parallel programs for that system.)
The behavior of a load or store using one memory access type is UNPREDICTABLE if a load or store was previously made to the same physical location using a different memory access type. The presence
of a SYNC between the references does not alter this behavior.
SYNC affects the order in which the effects of load and store instructions appear to all processors; it does not generally affect the physical memory-system ordering or synchronization issues that arise in system programming. The effect of SYNC on implementation-specific aspects of the cached memory system, such as writeback buffers, is not defined.
The code fragments below show how SYNC can be used to coordinate the use of shared data between separate writer and reader instruction streams in a multiprocessor environment. The FLAG location is
used by the instruction streams to determine whether the shared data item DATA is valid. The SYNC executed by processor A forces the store of DATA to be performed globally before the store to FLAG
is performed. The SYNC executed by processor B ensures that DATA is not read until after the FLAG value indicates that the shared data is valid.
# Processor A (writer)
# Conditions at entry:
# The value 0 has been stored in FLAG and that value is observable by B.
      SW   R1, DATA     # change shared DATA value
      LI   R2, 1
      SYNC              # perform DATA store before performing FLAG store
      SW   R2, FLAG     # say that the shared DATA value is valid

# Processor B (reader)
      LI   R2, 1
1:    LW   R1, FLAG     # get FLAG
      BNEC R2, R1, 1B   # if it says that DATA is not valid, poll again
      NOP
      SYNC              # FLAG value checked before doing DATA read
      LW   R1, DATA     # read (valid) shared DATA value
      SYNC
None.
SYNCI offset(rs) |
nanoMIPS, availability varies by format. |
SYNChronize Instruction cache/SYNChronize Instruction cache using EVA addressing |
SYNCIE offset(rs) |
nanoMIPS, availability varies by format. |
SYNChronize Instruction cache/SYNChronize Instruction cache using EVA addressing |
SYNChronize Instruction cache/SYNChronize Instruction cache using EVA addressing. Synchronize the caches to make instruction writes at address $rs + offset (register plus immediate) effective. For SYNCIE, translate the virtual address as though the core is in user mode, although it is actually in kernel mode.
nanoMIPS, availability varies by format.
101001 |
11111 |
rs |
s[8] |
0011 |
0 |
00 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) is_eva = False
100001 |
11111 |
rs |
0011 |
u |
6 |
5 |
5 |
4 |
12 |
offset = u is_eva = False
101001 |
11111 |
rs |
s[8] |
0011 |
0 |
10 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) is_eva = True
if is_eva and not C0.Config5.EVA: raise exception('RI')
if is_eva and not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
va = effective_address(GPR[rs], offset, 'Load', eva=is_eva)
pa, cca = va2pa(va, 'Cacheop', eva=is_eva)
# Make data writes at address=PA visible to the instruction stream (for all
# coherent cores in the system)...
# The precise details of the operation are implementation dependent, and will
# depend on the cache hierarchy and coherency behavior of the system. The
# following code shows a sample implementation for a system where the memory
# hierarchy is unified beyond the L1 instruction and data caches.
# Find index where address is present in D cache, if any.
dcache_hit_index = cache_lookup_index('D', va, pa)
if dcache_hit_index:
    way_index, set_index = dcache_hit_index
    dcache_line = get_cache_line('D', way_index, set_index)
    if dcache_line.valid and dcache_line.dirty:
        dcache_line.write_back()
        # Implementation may or may not invalidate line too, see below.
for core in get_all_cores_in_system():
    # Find index where address is present in this core's I cache, if any.
    icache_hit_index = cache_lookup_index('I', va, pa, core)
    if icache_hit_index:
        way_index, set_index = icache_hit_index
        icache_line = get_cache_line('I', way_index, set_index, core)
        if not icache_line.locked:
            icache_line.valid = 0
SYNCI is a user privilege instruction for synchronizing the caches to make instruction writes to address
$rs + offset effective. SYNCI must be followed by a SYNC instruction and an instruction hazard barrier to guarantee that subsequent instruction fetches see the updated instructions. One SYNCI instruction
is required for every cache line that was written. The size of the cache line can be determined by the RDHWR instruction.
SYNCI can cause TLB Refill and TLB invalid exceptions (with cause code TLBL). It does not cause TLBRI exceptions. A Cache Error or Bus Error exception may occur as a result of a writeback triggered by
the instruction.
An Address Error Exception (with cause code equal ADEL) may occur if a SYNCI targets an address which is not accessible from the current operating mode. It is implementation dependent whether such an exception does occur, but the instruction should not affect cache lines which are not accessible from the current operating mode.
It is implementation dependent whether a data watch exception is triggered by a SYNCI instruction whose address matches the Watch register address match conditions. The preferred implementation
is not to match on the SYNCI instruction.
The operation of the processor is UNPREDICTABLE if the effective address of the SYNCI targets any instruction cache line that contains instructions to be executed between the SYNCI and the subsequent
JALRC.HB, JRC.HB, or ERET instruction required to clear the instruction hazard.
The SYNCI instruction has no effect on cache lines that were previously locked with the CACHE instruction. If correct software operation depends on the state of a locked line, the CACHE instruction must be used to synchronize the caches.
In multi-processor systems, a SYNCI to an address with a coherent CCA must guarantee synchronization of all coherent instruction caches in the system. (Prior to Release 6 of the MIPS™ Architecture, this behavior was recommended but not required.)
The manner in which SYNCI is implemented will depend on the cache hierarchy of the processor. Typically, all caches out to the point at which both instruction and data references become unified are processed. If no caches exist or if instruction cache coherency is already guaranteed, the instruction must be implemented as a NOP.
In a typical implementation in which only the L1 instruction and data caches are affected, this instruction would perform a Hit
Invalidate operation on the instruction cache and a Hit Writeback or Hit
Writeback Invalidate on the data cache. The decision to invalidate the data cache line is implementation dependent, but should be made under the assumption that the data will not be written again soon.
If a Hit Writeback Invalidate (as opposed to a Hit Writeback) would cause the line to be selected for replacement, the invalidate option might be selected.
The following example shows a routine which could be called after the new instruction stream is written to make those changes effective.
/*
 * This routine makes changes to the instruction stream effective to the
 * hardware. It should be called after the instruction stream is written.
 * On return, the new instructions are effective.
 *
 * Inputs:
 * a0 = Start address of new instruction stream
 * a1 = Size in bytes of new instruction stream
 */
    beqc  a1, zero, 20f      /* If size==0, branch around. */
    addu  a1, a0, a1         /* Calculate end address + 1. */
    rdhwr v0, HW_SYNCI_Step  /* Get step size for SYNCI. */
    beqc  v0, zero, 20f      /* Nothing to do if no caches. */
10: synci 0(a0)              /* Sync all caches around address. */
    addu  a0, a0, v0         /* Add step size. */
    sltu  v1, a0, a1         /* Not past the end address? */
    bnec  v1, zero, 10b      /* Branch if more to do. */
    sync                     /* Clear memory hazards. */
20: jrc.hb ra                /* Return, clearing instruction hazards. */
Address Error. Bus Error. Cache Error. Coprocessor Unusable for SYNCIE. Reserved Instruction for SYNCIE if EVA not implemented. TLB Invalid. TLB Refill.
SYSCALL code |
nanoMIPS |
System Call |
System Call. Cause a System Call exception.
nanoMIPS
000000 |
00000 |
01 |
0 |
code |
6 |
5 |
2 |
1 |
18 |
000100 |
00000 |
01 |
0 |
code |
6 |
5 |
2 |
1 |
2 |
raise exception('SYSCALL')
System Call.
TEQ rs, rt, code |
nanoMIPS, not available in NMS |
Trap if Equal |
Trap if Equal. Cause a Trap exception if registers $rs and $rt are equal.
nanoMIPS, not available in NMS
001000 |
rt |
rs |
code |
0 |
0000000 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
if GPR[rs] == GPR[rt]: raise exception('TRAP')
Trap.
TLBINV |
nanoMIPS. Required on TLB cores, unless Config4.IE<2. Requires CP0 privilege. |
TLB Invalidate |
TLB Invalidate.
Invalidate a set of TLB entries based on ASID match.
nanoMIPS. Required on TLB cores, unless Config4.IE<2. Requires CP0 privilege.
001000 |
x |
00 |
00011 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if C0.Config4.IE < 2: raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
tlbinv()
Coprocessor Unusable. Reserved Instruction if TLB invalidate not implemented.
TLBINVF |
nanoMIPS. Required on TLB cores, unless Config4.IE<2. Requires CP0 privilege. |
TLB Invalidate Flush |
TLB Invalidate Flush.
Invalidate a set of TLB entries, ignoring ASID match.
nanoMIPS. Required on TLB cores, unless Config4.IE<2. Requires CP0 privilege.
001000 |
x |
00 |
01011 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if C0.Config4.IE < 2: raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
tlbinv(flush=True)
Coprocessor Unusable. Reserved Instruction if TLB invalidate not implemented.
TLBP |
nanoMIPS. Required on TLB cores. Requires CP0 privilege. |
TLB Probe |
TLB Probe. Probe the TLB for an entry matching C0.EntryHi. If found, write the index of the matching entry to C0.Index, otherwise set C0.Index.P to 1.
nanoMIPS. Required on TLB cores. Requires CP0 privilege.
001000 |
x |
00 |
00001 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if not got_tlb(): raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
tlbp()
Coprocessor Unusable. Reserved Instruction if TLB not implemented.
TLBR |
nanoMIPS. Required on TLB cores. Requires CP0 privilege. |
TLB Read |
TLB Read. Read the TLB entry indexed by C0.Index into the TLB CP0 registers EntryHi, EntryLo0, EntryLo1, PageMask.
nanoMIPS. Required on TLB cores. Requires CP0 privilege.
001000 |
x |
00 |
01001 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if not got_tlb(): raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
tlbr()
Coprocessor Unusable. Reserved Instruction if TLB not implemented.
TLBWI |
nanoMIPS. Required on TLB cores. Requires CP0 privilege. |
TLB Write Indexed |
TLB Write Indexed. Write the TLB entry indexed by C0.Index using the values in the TLB CP0 registers EntryHi, EntryLo0, EntryLo1, PageMask.
nanoMIPS. Required on TLB cores. Requires CP0 privilege.
001000 |
x |
00 |
10001 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if not got_tlb(): raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
tlbwi(C0.Index.Index)
Coprocessor Unusable. Reserved Instruction if TLB not implemented.
TLBWR |
nanoMIPS. Required on TLB cores. Requires CP0 privilege. |
TLB Write Random |
TLB Write Random. Write a randomly chosen TLB entry using the values in the TLB CP0 registers EntryHi, EntryLo0, EntryLo1, PageMask.
nanoMIPS. Required on TLB cores. Requires CP0 privilege.
001000 |
x |
00 |
11001 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if not got_tlb(): raise exception('RI')
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
tlbwr()
Coprocessor Unusable. Reserved Instruction if TLB not implemented.
TNE rs, rt, code |
nanoMIPS, not available in NMS |
Trap if Not Equal |
Trap if Not Equal. Cause a Trap exception if registers $rs and $rt are not equal.
nanoMIPS, not available in NMS
001000 |
rt |
rs |
code |
1 |
0000000 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
if C0.Config5.NMS == 1: raise exception('RI')
if GPR[rs] != GPR[rt]: raise exception('TRAP')
Trap.
UALH rt, offset(rs) |
nanoMIPS, not available in NMS |
Unaligned Load Half |
Unaligned Load Half. Load signed halfword to register $rt from memory address $rs + offset (register plus immediate), guaranteeing that the operation completes even if the address is not halfword aligned.
nanoMIPS, not available in NMS
101001 |
rt |
rs |
s[8] |
0100 |
0 |
01 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
if C0.Config5.NMS == 1: raise exception('RI')
offset = sign_extend(s, from_nbits=9)
va = effective_address(GPR[rs], offset, 'Load')
data = read_memory_at_va(va, nbytes=2, unaligned_support='always')
GPR[rt] = sign_extend(data, from_nbits=16)
UALH will not cause an Address Error exception for unaligned addresses.
An unaligned load/store instruction may be implemented using more than one memory transaction. It is possible for a subset of these memory transactions to have completed and then for a TLB exception to occur on a remaining transaction. It is also possible that memory could be modified by another thread or device in between the completion of the memory transactions. This behavior is equivalent to what might occur if the unaligned load/store was carried out in software using a series of separate aligned instructions, for instance using LWL/LWR on a pre-R6 MIPS™ core. Software should take equivalent steps to accommodate this lack of guaranteed atomicity as it would for the multiple instruction case.
Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Refill. TLB Read Inhibit. Watch.
UALW rt, offset(rs) |
Assembly alias, not available in NMS |
Unaligned Load Word |
Unaligned Load Word. Load word to register $rt from memory address $rs + offset (register plus immediate), guaranteeing that the operation completes even if the address is not word aligned.
Assembly alias, not available in NMS
UALWM rt, offset(rs), 1
UALWM rt, offset(rs), count |
nanoMIPS, not available in NMS |
Unaligned Load Word Multiple |
Unaligned Load Word Multiple. Load count words of data to registers $rt, $(rt+1), ..., $(rt+count-1) from consecutive memory addresses starting at $rs + offset (register plus immediate). Guarantee that the operation completes even if the address is not word aligned.
nanoMIPS, not available in NMS
101001 |
rt |
rs |
s[8] |
count3 |
0 |
1 |
01 |
s[7:0] |
6 |
5 |
5 |
1 |
3 |
1 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) count = 8 if count3 == 0 else count3
if C0.Config5.NMS == 1: raise exception('RI')
i = 0
while i != count:
    this_rt = rt + i if rt + i < 32 else rt + i - 16
    this_offset = offset + (i<<2)
    va = effective_address(GPR[rs], this_offset, 'Load')
    data = read_memory_at_va(va, nbytes=4, unaligned_support='always')
    GPR[this_rt] = sign_extend(data, from_nbits=32)
    if this_rt == rs and i != count - 1: raise UNPREDICTABLE()
    i += 1
UALWM loads count words to sequentially numbered registers from sequential memory addresses which are potentially unaligned. After loading $31, the sequence of registers continues from $16. See
LWM for example encodings of the register list.
UALWM will not cause an Address Error exception for unaligned addresses.
The result is unpredictable if an UALWM instruction updates the base register prior to the final load.
If a TLB exception or interrupt occurs during the execution of this instruction, a subset of the required register updates may have occurred.
An unaligned load/store instruction may be implemented using more than one memory transaction. It is possible for a subset of these memory transactions to have completed and then for a TLB exception to occur on a remaining transaction. It is also possible that memory could be modified by another thread or device in between the completion of the memory transactions. This behavior is equivalent to what might occur if the unaligned load/store was carried out in software using a series of separate aligned instructions, for instance using LWL/LWR on a pre-R6 MIPS™ core. Software should take equivalent steps to accommodate this lack of guaranteed atomicity as it would for the multiple instruction case.
UALWM must be implemented in such a way as to make the instruction restartable, but the implementation does not need to be fully atomic. For instance, it is allowable for a UALWM instruction to be aborted by an exception after a subset of the register updates have occurred. To ensure restartability, any write to GPR $rs (which may be used as the final output register) must be completed atomically, that is, the instruction must graduate if and only if that write occurs.
Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Read Inhibit. TLB Refill. Watch.
UASH rt, offset(rs) |
nanoMIPS, not available in NMS |
Unaligned Store Half |
Unaligned Store Half. Store halfword from register $rt to memory address $rs + offset (register plus immediate), guaranteeing that the operation completes even if the address is not halfword aligned.
nanoMIPS, not available in NMS
101001 |
rt |
rs |
s[8] |
0101 |
0 |
01 |
s[7:0] |
6 |
5 |
5 |
1 |
4 |
1 |
2 |
8 |
if C0.Config5.NMS == 1: raise exception('RI')
offset = sign_extend(s, from_nbits=9)
va = effective_address(GPR[rs], offset, 'Store')
data = zero_extend(GPR[rt], from_nbits=16)
write_memory_at_va(data, va, nbytes=2, unaligned_support='always')
UASH will not cause an Address Error exception for unaligned addresses.
An unaligned load/store instruction may be implemented using more than one memory transaction. It is possible for a subset of these memory transactions to have completed and then for a TLB exception to occur on a remaining transaction. It is also possible that memory could be modified by another thread or device in between the completion of the memory transactions. This behavior is equivalent to what might occur if the unaligned load/store was carried out in software using a series of separate aligned instructions, for instance using LWL/LWR on a pre-R6 MIPS™ core. Software should take equivalent steps to accommodate this lack of guaranteed atomicity as it would for the multiple instruction case.
Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
UASW rt, offset(rs) |
Assembly alias, not available in NMS |
Unaligned Store Word |
Unaligned Store Word. Store word from register $rt to memory address $rs + offset (register plus immediate), guaranteeing that the operation completes even if the address is not word aligned.
Assembly alias, not available in NMS
UASWM rt, offset(rs), 1
UASWM rt, offset(rs), count |
nanoMIPS, not available in NMS |
Unaligned Store Word Multiple |
Unaligned Store Word Multiple. Store count words of data from registers $rt, $(rt+1), ..., $(rt+count-1) to consecutive memory addresses starting at $rs + offset (register plus immediate). Guarantee that the operation completes even if the address is not word aligned.
nanoMIPS, not available in NMS
101001 |
rt |
rs |
s[8] |
count3 |
1 |
1 |
01 |
s[7:0] |
6 |
5 |
5 |
1 |
3 |
1 |
1 |
2 |
8 |
offset = sign_extend(s, from_nbits=9) count = 8 if count3 == 0 else count3
if C0.Config5.NMS == 1: raise exception('RI')
i = 0
while i != count:
    this_rt = ( 0 if rt == 0
                else rt + i if rt + i < 32
                else rt + i - 16 )
    this_offset = offset + (i<<2)
    va = effective_address(GPR[rs], this_offset, 'Store')
    data = zero_extend(GPR[this_rt], from_nbits=32)
    write_memory_at_va(data, va, nbytes=4, unaligned_support='always')
    i += 1
UASWM stores count words from sequentially numbered registers to sequential memory addresses which are potentially unaligned. After storing $31, the sequence of registers continues from $16. If rt=0, then $0 is stored for all count steps of the instruction. See SWM for example encodings of the register list.
UASWM will not cause an Address Error exception for unaligned addresses.
If a TLB exception or interrupt occurs during the execution of this instruction, a subset of the required memory updates may have occurred. A full restart of the instruction will be performed on return from
the exception.
An unaligned load/store instruction may be implemented using more than one memory transaction. It is possible for a subset of these memory transactions to have completed and then for a TLB exception to occur on a remaining transaction. It is also possible that memory could be modified by another thread or device in between the completion of the memory transactions. This behavior is equivalent to what might occur if the unaligned load/store was carried out in software using a series of separate aligned instructions, for instance using LWL/LWR on a pre-R6 MIPS™ core. Software should take equivalent steps to accommodate this lack of guaranteed atomicity as it would for the multiple instruction case.
Bus Error. Reserved Instruction on NMS cores. TLB Invalid. TLB Modified. TLB Refill. Watch.
WAIT code |
nanoMIPS |
Wait |
Wait. Enter wait state.
nanoMIPS
001000 |
code |
11 |
00001 |
101 |
111 |
111 |
6 |
10 |
2 |
5 |
3 |
3 |
3 |
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
CPU.in_wait_state = True
Coprocessor Unusable.
WRPGPR rt, rs |
nanoMIPS. Requires CP0 privilege. |
Write Previous GPR |
Write Previous GPR. Write the value of register $rs from the current shadow register set (SRSCtl.CSS) to register $rt in the previous shadow register set (SRSCtl.PSS). If shadow register sets are not implemented, just copy the value from register $rs to register $rt.
nanoMIPS. Requires CP0 privilege.
001000 |
rt |
rs |
11 |
11000 |
101 |
111 |
111 |
6 |
5 |
5 |
2 |
5 |
3 |
3 |
3 |
if not IsCoprocessor0Enabled(): raise coprocessor_exception(0)
if C0.SRSCtl.HSS > 0:
    SRS[C0.SRSCtl.PSS][rt] = GPR[rs]
else:
    GPR[rt] = GPR[rs]
Coprocessor Unusable.
WSBH rt, rs |
Assembly alias, not available in NMS |
Word Swap Byte Half |
Word Swap Byte Half. Swap the bytes within both halves of the word value in register $rs, and write the result to register $rt.
Assembly alias, not available in NMS
ROTX rt, rs, 8, 24
The assembly alias WSBH is provided for compatibility with MIPS32™. Its behavior is equivalent to the new assembly alias BYTEREVH, whose name is chosen to fit consistently with the naming of other reversing instructions in nanoMIPS™.
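For reference, the word transformation performed by this alias can be sketched in Python; the masks are the usual halfword byte-swap idiom, shown here as an illustration rather than taken from this manual:

def wsbh(x):
    # Swap the two bytes within each 16-bit half of a 32-bit word.
    x &= 0xFFFFFFFF
    return ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)

# Example: 0x11223344 -> 0x22114433
assert wsbh(0x11223344) == 0x22114433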
XOR rd, rs, rt |
nanoMIPS |
XOR |
XOR. Compute logical XOR of registers $rs and $rt, placing the result in register $rd.
nanoMIPS
001000 |
rt |
rs |
rd |
x |
1100010 |
000 |
6 |
5 |
5 |
5 |
1 |
7 |
3 |
010100 |
rt3 |
rs3 |
01 |
0 |
0 |
6 |
3 |
3 |
2 |
1 |
1 |
rt = decode_gpr(rt3, 'gpr3') rs = decode_gpr(rs3, 'gpr3') rd = rt
GPR[rd] = GPR[rs] ^ GPR[rt]
None.
XORI rt, rs, u |
nanoMIPS |
XOR Immediate |
XOR Immediate. Compute logical XOR of register $rs with immediate u, placing the result in register $rt.
nanoMIPS
100000 |
rt |
rs |
0001 |
u |
6 |
5 |
5 |
4 |
12 |
GPR[rt] = GPR[rs] ^ u
None.