SPECIAL3 011111 | base | rt | rd | 0 0000 | 1 | SCD 100111
6               | 5    | 5  | 5  | 4      | 1 | 6
SCDP rt, rd, (base) | MIPS32 Release 6
Store Conditional DoubleWord Paired
Conditionally store a paired double-word to memory to complete an atomic read-modify-write.
if atomic_update then memory[GPR[base]] = {GPR[rd], GPR[rt]}, GPR[rt] = 1 else GPR[rt] = 0
The LLDP and SCDP instructions provide primitives to implement a paired double-word atomic read-modify-write (RMW) operation at a synchronizable memory location.
Release 6 (with Config5ULS = 1) formalizes support for uncached LLDP and SCDP sequences. (The description for uncached support does not modify the description for cached support and is written in a self-contained manner.)
A paired double-word is conditionally written to memory in a single atomic memory operation. GPR rd is the most-significant double-word and GPR rt is the least-significant double-word of the quad-word in memory. The write targets the quad-word aligned effective address held in GPR base.
A paired double-word read or write occurs as a pair of double-word reads or writes that is quad-word atomic.
The instruction has no offset. The effective address is equal to the contents of GPR base.
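The addressing and register-ordering rules above can be sketched in C. This is only an illustrative model, not part of the architecture definition: the type `paired_dw` and the functions `effective_address`, `is_quadword_aligned`, and `make_paired_dw` are hypothetical names, and a quad-word is assumed to be 16 bytes (two 64-bit double-words). The sketch models only the logical most/least-significant halves and says nothing about byte order in memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a paired double-word: two 64-bit halves
 * that together form one 128-bit quad-word. */
typedef struct {
    uint64_t ms; /* most-significant double-word, taken from GPR rd  */
    uint64_t ls; /* least-significant double-word, taken from GPR rt */
} paired_dw;

/* SCDP has no offset field, so the effective address is simply
 * the contents of GPR base. */
uint64_t effective_address(uint64_t gpr_base)
{
    return gpr_base;
}

/* A quad-word aligned address has its low four bits clear
 * (assuming a 16-byte quad-word). */
bool is_quadword_aligned(uint64_t vaddr)
{
    return (vaddr & 0xFu) == 0;
}

/* Form the value to be stored: rd supplies the upper half,
 * rt supplies the lower half. */
paired_dw make_paired_dw(uint64_t gpr_rd, uint64_t gpr_rt)
{
    paired_dw q = { .ms = gpr_rd, .ls = gpr_rt };
    return q;
}
```

For example, under these assumptions `is_quadword_aligned(0x1008)` is false because the address is only double-word aligned, while `0x1000` qualifies.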
The SCDP completes the RMW sequence begun by the preceding LLDP instruction executed on the processor. To complete the RMW sequence atomically, the following occur:
• The paired double-word formed from the concatenation of GPRs rd and rt is stored to memory at the location specified by the quad-word aligned effective address.
• A one, indicating success, is written into GPR rt.
Otherwise, memory is not modified and a 0, indicating failure, is written into GPR rt.
Though legal programming requires LLDP to start the atomic read-modify-write sequence and SCDP to end the same sequence, whether the SCDP completes is only dependent on the state of LLbit and LLAddr, which are set by a preceding load-linked instruction of any type. Software must assume that pairing load-linked and store-conditional instructions in an inconsistent manner causes UNPREDICTABLE behavior.
The SCDP must always compare its quad-word aligned address against that of the preceding LLDP. The SCDP will fail if the address does not match that of the preceding LLDP.
Events that occur between the execution of load-linked and store-conditional instruction types that must cause the sequence to fail are given in the legacy SCD instruction definition, except the block of synchronizable memory is a quad-word, not a double-word.
Additional events that occur between the execution of load-linked and store-conditional instruction types that may cause success of the sequence to be UNPREDICTABLE are defined in the SCD instruction definition.
A load that executes on the processor executing the LLDP/SCDP sequence to the block of synchronizable physical memory containing the paired double-word will not cause the SCDP to fail.
The effects of CACHE operations, both local and remote, on a paired double-word atomic operation are defined in the SC instruction definition.
Atomic RMW is provided only for synchronizable memory locations. A synchronizable memory location is one that is associated with the state and logic necessary to implement the LL/SC semantics. Whether a memory location is synchronizable depends on the processor and system configurations, and on the memory access type used for the location. Requirements for Uniprocessor, MP and I/O atomicity are given in the SC definition.
The definition for SCDP is extended for uncached memory types in a manner identical to SC. The extension is defined in the SC instruction description.
Load-Linked and Store-Conditional instruction types require that the addressed location have a memory access type of cached noncoherent or cached coherent; that is, the processor must have a cache. If it does not, the result is UNPREDICTABLE. Release 6 (with Config5ULS = 1) extends support to uncached types.
The architecture optionally allows support for Load-Linked and Store-Conditional instruction types in a cacheless processor. Support for cacheless operation is implementation dependent. In this case, LLAddr is optional.
Providing misaligned support is not a requirement for this instruction.
Availability and Compatibility
This instruction is introduced by Release 6. It is only present if Config5XNP=0.
vAddr = GPR[base]
(pAddr, CCA) = AddressTranslation (vAddr, DATA, STORE)
dataquadword = {GPR[rd],GPR[rt]}
if (LLbit && (pAddr == LLAddr)) then    // quadword aligned monitor
    // PAIREDDOUBLEWORD: two double-word data-type that is quad-word atomic
    StoreMemory (CCA, PAIREDDOUBLEWORD, dataquadword, pAddr, vAddr, DATA)
    GPR[rt] = 0^63 || 1'b1
else
    GPR[rt] = 0^64
endif
LLbit = 0
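To make the operation pseudocode above easier to experiment with, here is a minimal software model in C. It is a sketch under several assumptions rather than a description of any implementation: address translation is treated as an identity mapping, memory is a tiny array of quad-words, exceptions are not modeled, and the names `cpu_model`, `model_lldp`, and `model_scdp` are hypothetical. Storing the least-significant double-word at index 0 is a convention of this model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEM_QUADWORDS 4 /* hypothetical tiny memory: 4 quad-words (64 bytes) */

typedef struct {
    bool     llbit;                 /* models LLbit  */
    uint64_t lladdr;                /* models LLAddr */
    uint64_t mem[MEM_QUADWORDS][2]; /* [n][0] = least-significant half,
                                       [n][1] = most-significant half
                                       (model convention, an assumption) */
} cpu_model;

/* Model of the load-linked side: read both halves, arm the monitor. */
void model_lldp(cpu_model *c, uint64_t paddr, uint64_t *rt, uint64_t *rd)
{
    assert(paddr % 16 == 0 && paddr / 16 < MEM_QUADWORDS);
    *rt = c->mem[paddr / 16][0];
    *rd = c->mem[paddr / 16][1];
    c->lladdr = paddr;
    c->llbit  = true;
}

/* Model of SCDP: returns the value written back to GPR[rt]
 * (1 on success, 0 on failure). LLbit is always cleared. */
uint64_t model_scdp(cpu_model *c, uint64_t paddr, uint64_t rt, uint64_t rd)
{
    uint64_t result = 0;
    assert(paddr % 16 == 0 && paddr / 16 < MEM_QUADWORDS);
    if (c->llbit && paddr == c->lladdr) {
        c->mem[paddr / 16][0] = rt; /* least-significant double-word */
        c->mem[paddr / 16][1] = rd; /* most-significant double-word  */
        result = 1;
    }
    c->llbit = false;
    return result;
}
```

In this model a second `model_scdp` call without an intervening `model_lldp` returns 0 and leaves memory unchanged, because the first call already cleared `llbit`; a call to a different quad-word address than the one linked also returns 0.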
TLB Refill, TLB Invalid, TLB Modified, Reserved Instruction, Address Error, Watch
LLDP and SCDP are used to atomically update memory locations, as shown below.
L1:
    LLDP  T2, T3, (T0)    # load T2 and T3
    BOVC  T2, 1, U32      # check whether least-significant double-word may overflow
    ADDI  T2, T2, 1       # increment lower - only
    SCDP  T2, T3, (T0)    # store T2 and T3
    BEQC  T2, 0, L1       # if not atomic (0), try again
U32:
    ADDI  T2, T2, 1       # increment lower
    ADDI  T3, T3, 1       # increment upper
    SCDP  T2, T3, (T0)
    BEQC  T2, 0, L1       # if not atomic (0), try again
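The retry structure of the sequence above (load-linked, modify, store-conditional, loop on failure) can be sketched in C as follows. This is not a transcription of the assembly: the carry here is taken on unsigned wrap-around of the lower half, which is one conventional way to increment a 128-bit counter, and the hardware monitor is replaced by a hypothetical `monitor` struct whose `forced_failures` field simulates interference. All names are illustrative assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct { uint64_t lo, hi; } pair128;

/* Hypothetical stand-in for the LLbit/LLAddr monitor on one location. */
typedef struct {
    pair128 mem;             /* the synchronizable quad-word               */
    bool    llbit;           /* armed by load_linked, cleared by any store */
    int     forced_failures; /* number of store attempts to fail on purpose */
} monitor;

/* Stand-in for LLDP: read both halves and arm the monitor. */
pair128 load_linked(monitor *m)
{
    m->llbit = true;
    return m->mem;
}

/* Stand-in for SCDP: store only if the monitor is still armed. */
bool store_conditional(monitor *m, pair128 v)
{
    bool ok = m->llbit;
    if (ok && m->forced_failures > 0) { /* simulate an intervening event */
        m->forced_failures--;
        ok = false;
    }
    if (ok)
        m->mem = v;
    m->llbit = false;
    return ok;
}

/* Increment the 128-bit value, carrying into the upper half on wrap. */
pair128 increment(pair128 v)
{
    v.lo += 1;
    if (v.lo == 0)
        v.hi += 1;
    return v;
}

/* Retry until the store-conditional succeeds; returns the attempt count. */
int atomic_increment(monitor *m)
{
    int attempts = 0;
    for (;;) {
        pair128 v = load_linked(m);
        attempts++;
        if (store_conditional(m, increment(v)))
            return attempts;
    }
}
```

The loop recomputes the new value from a fresh load on every attempt, which mirrors the branch back to L1 in the assembly: a failed store-conditional means the previously loaded pair can no longer be trusted.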
Exceptions between the LLDP and SCDP cause the SCDP to fail, so persistent exceptions must be avoided. Examples include arithmetic operations that trap, system calls, and floating-point operations that trap or require software emulation assistance.
LLDP and SCDP function on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached coherent memory access types.