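Encoding:

 31      26 25    21 20    16 15          7  6  5        0
+----------+--------+--------+-------------+---+----------+
| SPECIAL3 |  base  |   rt   |   offset    | 0 |   SCE    |
|  011111  |        |        |             |   |  011110  |
+----------+--------+--------+-------------+---+----------+
     6         5        5          9         1      6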
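The field layout above can be exercised with a small encoder and decoder. This is a minimal sketch, not part of the architecture specification: the function names `encode_sce` and `decode_sce` are hypothetical, and the bit positions are derived from the field widths listed in the encoding (6, 5, 5, 9, 1, 6, most-significant field first).

```python
SPECIAL3 = 0b011111   # major opcode field, bits 31..26
SCE_FUNCT = 0b011110  # function field, bits 5..0

def encode_sce(rt, offset, base):
    """Pack 'SCE rt, offset(base)' into a 32-bit instruction word."""
    if not (0 <= rt < 32 and 0 <= base < 32):
        raise ValueError("register numbers must be in 0..31")
    if not -256 <= offset <= 255:
        raise ValueError("offset must fit in 9 signed bits")
    # Bit 6 is the fixed 0 field, so nothing is OR-ed in at that position.
    return ((SPECIAL3 << 26) | (base << 21) | (rt << 16)
            | ((offset & 0x1FF) << 7) | SCE_FUNCT)

def decode_sce(word):
    """Return (rt, offset, base), or None if word is not an SCE encoding."""
    if (word >> 26) != SPECIAL3 or (word & 0x3F) != SCE_FUNCT or (word >> 6) & 1:
        return None
    offset = (word >> 7) & 0x1FF
    if offset & 0x100:          # bit 8 is the sign bit of the 9-bit field
        offset -= 0x200
    return ((word >> 16) & 0x1F, offset, (word >> 21) & 0x1F)
```

For example, under these assumptions `encode_sce(10, -4, 8)` yields `0x7D0AFE1E`, and `decode_sce` recovers `(10, -4, 8)` from that word.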

Format:

SCE rt, offset(base)

MIPS32

Store Conditional Word EVA

Purpose:

To store a word to user mode virtual memory while operating in kernel mode to complete an atomic read-modify-write.

Description:

if atomic_update then memory[GPR[base] + offset] = GPR[rt], GPR[rt] = 1 else GPR[rt] = 0

The LL and SC instructions provide primitives to implement atomic read-modify-write (RMW) operations for synchronizable memory locations.

Release 6 (with Config5.ULS = 1) formalizes support for uncached LLE and SCE sequences. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)

The least-significant 32-bit word in GPR rt is conditionally stored in memory at the location specified by the aligned effective address. The 9-bit signed offset is added to the contents of GPR base to form an effective address.
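The address formation described above can be sketched as follows. The helper names and the `width` parameter (the assumed register width used for wraparound) are illustrative assumptions; the alignment test reflects the requirement that the two least-significant address bits be zero.

```python
def sign_extend9(field):
    """Interpret the low 9 bits of field as a two's-complement value."""
    field &= 0x1FF
    return field - 0x200 if field & 0x100 else field

def effective_address(base_value, offset_field, width=32):
    """Add the sign-extended offset to the base register contents.

    width is an assumed register width; the sum wraps to that many bits.
    """
    return (base_value + sign_extend9(offset_field)) & ((1 << width) - 1)

def is_word_aligned(vaddr):
    """True when the two least-significant address bits are both zero."""
    return (vaddr & 0x3) == 0
```

With a base of `0x1000` and an offset field of `0x1FC` (that is, -4), the sketch gives an effective address of `0xFFC`, which is word aligned.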

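The SCE completes the RMW sequence begun by the preceding LLE instruction executed on the processor. To complete the RMW sequence atomically, the following occurs:

- The least-significant 32-bit word of GPR rt is stored to memory at the location specified by the aligned effective address.
- A 1, indicating success, is written into GPR rt.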

Otherwise, memory is not modified and a 0, indicating failure, is written into GPR rt.
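The two outcomes can be written as a single pure function. This is a minimal sketch only: `sce_step` is a hypothetical name, memory is modeled as a dictionary keyed by word address, and the `llbit` argument stands in for whatever hardware state records that the RMW sequence is still atomic.

```python
def sce_step(memory, llbit, addr, rt_value):
    """Model one SCE: return (value written back to rt, resulting memory)."""
    if llbit:
        updated = dict(memory)                   # leave the caller's dict untouched
        updated[addr] = rt_value & 0xFFFFFFFF    # store the low 32-bit word
        return 1, updated                        # 1 indicates success
    return 0, memory                             # 0 indicates failure, no store
```

Note that in both cases the register that supplied the data is overwritten with the status value, so code that needs the stored value afterwards has to keep a copy in another register.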

If either of the following events occurs between the execution of LLE and SCE, the SCE fails:

If either of the following events occurs between the execution of LLE and SCE, the SCE may succeed or it may fail; the success or failure is not predictable. Portable programs should not cause one of these events.

The following conditions must be true or the result of the SCE is UNPREDICTABLE:

Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LLE/SCE semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location:

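- Uniprocessor atomicity: To provide atomic RMW on a single processor, all accesses to the location must be made with memory access type of either cached non coherent or cached coherent. All accesses must be to one or the other access type, and they may not be mixed.

- MP atomicity: To provide atomic RMW among multiple processors, all accesses to the location must be made with a memory access type of cached coherent.

- I/O System: To provide atomic RMW with a coherent I/O system, all accesses to the location must be made with a memory access type of cached coherent. If the I/O system does not use coherent memory operations, then atomic RMW cannot be provided with respect to the I/O reads and writes.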

The SCE instruction functions the same as the SC instruction, except that address translation is performed using the user mode virtual address space mapping in the TLB when accessing an address within a memory segment configured to use the MUSUK access mode. Memory segments using UUSK or MUSK access modes are also accessible.

Refer to Volume III, Enhanced Virtual Addressing section for additional information.

Implementation of this instruction is specified by the Config5.EVA field being set to 1.

The definition for SCE is extended for uncached memory types in a manner identical to SC. The extension is defined in the SC instruction description.

Restrictions:

The addressed location must have a memory access type of cached non coherent or cached coherent; if it does not, the result is UNPREDICTABLE. Release 6 (with Config5.ULS = 1) extends support to uncached types.

The effective address must be naturally-aligned. If either of the 2 least-significant bits of the address is non-zero, an Address Error exception occurs.

Providing misaligned support for Release 6 is not a requirement for this instruction.

Operation:

vAddr = sign_extend(offset) + GPR[base]
if vAddr[1..0] != 0^2 then
   SignalException(AddressError)
endif
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
pAddr = pAddr[PSIZE-1..3] || (pAddr[2..0] xor (ReverseEndian || 0^2))
bytesel = vAddr[2..0] xor (BigEndianCPU || 0^2)
datadoubleword = GPR[rt][63-8*bytesel..0] || 0^(8*bytesel)
if LLbit then
   StoreMemory (CCA, WORD, datadoubleword, pAddr, vAddr, DATA)
endif
GPR[rt] = 0^63 || LLbit
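The byte-lane arithmetic in the operation above can be checked with a short sketch. It assumes a 64-bit datapath, as the doubleword in the pseudocode implies, and reads the concatenation of BigEndianCPU with two zero bits as a left shift by 2. The function names are hypothetical, and address translation and the StoreMemory call itself are not modeled.

```python
MASK64 = (1 << 64) - 1

def byte_select(vaddr, big_endian_cpu):
    """bytesel: low three address bits XOR (BigEndianCPU followed by two 0 bits)."""
    return (vaddr & 0x7) ^ (big_endian_cpu << 2)

def position_word(rt_value, vaddr, big_endian_cpu):
    """datadoubleword: shift rt left by 8*bytesel bits, zero-filling from the right."""
    shift = 8 * byte_select(vaddr, big_endian_cpu)
    return (rt_value << shift) & MASK64

def sce_result(llbit):
    """Value written back to GPR[rt]: 63 zero bits followed by LLbit."""
    return 1 if llbit else 0
```

For a word at an address whose low three bits are 4, the sketch places the data in the upper half of the doubleword on a little-endian CPU (`big_endian_cpu=0`) and in the lower half on a big-endian CPU (`big_endian_cpu=1`).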

Exceptions:

TLB Refill, TLB Invalid, TLB Modified, Address Error, Watch, Reserved Instruction, Coprocessor Unusable

Programming Notes:

LLE and SCE are used to atomically update memory locations, as shown below.

L1:
   LLE   T1, (T0)  # load counter
   ADDI  T2, T1, 1 # increment
   SCE   T2, (T0)  # try to store, checking for atomicity
   BEQ   T2, 0, L1 # if not atomic (0), try again
   NOP             # branch-delay slot

Exceptions between the LLE and SCE cause SCE to fail, so persistent exceptions must be avoided. Examples are arithmetic operations that trap, system calls, and floating point operations that trap or require software emulation assistance.
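The retry behavior can be illustrated with a toy model of one processor. Everything here is an assumption made for illustration: the `Core` class, its `take_exception` method, and the choice to clear the link after every SCE are modeling conveniences, not architectural definitions. The loop mirrors the LLE/ADDI/SCE/BEQ sequence, and an injected exception between LLE and SCE forces another pass.

```python
class Core:
    """Toy single-processor model holding memory and a link flag."""

    def __init__(self):
        self.mem = {}
        self.llbit = False

    def lle(self, addr):
        self.llbit = True                 # start of an RMW sequence
        return self.mem.get(addr, 0)

    def sce(self, addr, value):
        ok, self.llbit = self.llbit, False   # modeling choice: link is consumed
        if ok:
            self.mem[addr] = value
        return int(ok)                    # 1 on success, 0 on failure

    def take_exception(self):
        self.llbit = False                # an exception breaks the sequence

def atomic_increment(core, addr, exceptions_to_inject=0):
    """Retry LLE/SCE until the store succeeds; return the number of attempts."""
    attempts = 0
    while True:
        attempts += 1
        t1 = core.lle(addr)               # load counter
        if exceptions_to_inject > 0:      # simulate an exception mid-sequence
            exceptions_to_inject -= 1
            core.take_exception()
        t2 = core.sce(addr, t1 + 1)       # try to store the incremented value
        if t2 == 1:
            return attempts
```

In this model an exception that recurs on every pass would keep the loop from ever terminating, which is why persistent exceptions inside the sequence must be avoided.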

LLE and SCE function on a single processor for cached non coherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.