SC rt, offset(rs) | nanoMIPS, availability varies by format.
SCE rt, offset(rs) | nanoMIPS, availability varies by format.
SCWP rt, ru, (rs) | nanoMIPS, availability varies by format.
SCWPE rt, ru, (rs) | nanoMIPS, availability varies by format.
Store Conditional Word/Store Conditional Word using EVA addressing/Store Conditional Word Pair/Store Conditional Word Pair using EVA addressing.
Store conditionally to complete atomic read-modify-write. For SC/SCE, store from register $rt to address $rs + offset (register plus offset). For SCWP/SCWPE, store from registers $rt and $ru to address $rs. For SCE/SCWPE, translate the virtual address as though the core is in user mode, although it is actually in kernel mode. Indicate success or failure by writing 1 or 0 respectively to $rt.
nanoMIPS, availability varies by format.
101001 | rt | rs | s[8] | 1011 | 0 | 01 | s[7:2] | 00
6 | 5 | 5 | 1 | 4 | 1 | 2 | 6 | 2
offset = sign_extend(s, from_nbits=9)
nbytes = 4
is_eva = False
101001 | rt | rs | s[8] | 1011 | 0 | 10 | s[7:2] | 00
6 | 5 | 5 | 1 | 4 | 1 | 2 | 6 | 2
offset = sign_extend(s, from_nbits=9)
nbytes = 4
is_eva = True
101001 | rt | rs | x | 1011 | 0 | 01 | ru | x | 01
6 | 5 | 5 | 1 | 4 | 1 | 2 | 5 | 1 | 2
offset = 0
nbytes = 8
is_eva = False
101001 | rt | rs | x | 1011 | 0 | 10 | ru | x | 01
6 | 5 | 5 | 1 | 4 | 1 | 2 | 5 | 1 | 2
offset = 0
nbytes = 8
is_eva = True
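The four encodings above can be told apart with a few shifts and masks. The following is a minimal sketch, not a reference decoder: it assumes the fields are laid out most-significant first in a single 32-bit value (so the 6-bit opcode occupies bits 31..26), and the function and key names are hypothetical. Other instructions sharing the same major opcode are simply reported as no match.

```python
def sign_extend(value, from_nbits):
    """Interpret the low from_nbits bits of value as a two's-complement number."""
    sign_bit = 1 << (from_nbits - 1)
    value &= (1 << from_nbits) - 1
    return (value ^ sign_bit) - sign_bit


def decode_store_conditional(word):
    """Return the decoded fields of SC/SCE/SCWP/SCWPE, or None if no match.

    Bit positions are inferred from the field widths listed above (assumption).
    """
    if (word >> 26) & 0x3F != 0b101001:       # bits 31..26: major opcode
        return None
    if (word >> 11) & 0xF != 0b1011:          # bits 14..11: fixed 1011
        return None
    if (word >> 10) & 0x1 != 0:               # bit 10: fixed 0
        return None
    select = (word >> 8) & 0x3                # bits 9..8: 01 = normal, 10 = EVA
    if select not in (0b01, 0b10):
        return None
    is_eva = select == 0b10
    rt = (word >> 21) & 0x1F                  # bits 25..21
    rs = (word >> 16) & 0x1F                  # bits 20..16
    low = word & 0x3                          # bits 1..0: 00 = word, 01 = word pair
    if low == 0b00:
        # s[8] sits in bit 15, s[7:2] in bits 7..2, and s[1:0] is implicitly 0.
        s = (((word >> 15) & 0x1) << 8) | (word & 0xFC)
        return {"name": "SCE" if is_eva else "SC", "rt": rt, "rs": rs,
                "ru": None, "offset": sign_extend(s, 9),
                "nbytes": 4, "is_eva": is_eva}
    if low == 0b01:
        ru = (word >> 3) & 0x1F               # bits 7..3; bits 15 and 2 are don't-care
        return {"name": "SCWPE" if is_eva else "SCWP", "rt": rt, "rs": rs,
                "ru": ru, "offset": 0, "nbytes": 8, "is_eva": is_eva}
    return None
```

Because s[1:0] is not encoded, the decoded offset is always a multiple of 4 in the range -256 to 252.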
if nbytes == 8 and C0.Config5.XNP:
    raise exception('RI', 'SCWP[E] requires word-paired support')
if is_eva and not C0.Config5.EVA:
    raise exception('RI')
va = effective_address(GPR[rs], offset, 'Store', eva=is_eva)
# Linked access must be aligned.
if va & (nbytes-1):
    raise exception('ADES', badva=va)
pa, cca = va2pa(va, 'Store', eva=is_eva)
if (cca == 2 or cca == 7) and not C0.Config5.ULS:
    raise UNPREDICTABLE('uncached CCA not synchronizable when Config5.ULS=0')
    # (Preferred behavior for non-synchronizable address is Bus Error).
if nbytes == 4:  # SC/SCE
    data = zero_extend(GPR[rt], from_nbits=32)
else:  # SCWP/SCWPE
    word0 = GPR[rt][31:0]
    word1 = GPR[ru][31:0]
    data = word0 @ word1 if C0.Config.BE else word1 @ word0
# Write this data to memory, but only if it can be done atomically with
# respect to a prior linked load. The return value indicates whether the write
# occurred.
success = write_memory(data, va, pa, cca, nbytes=nbytes, atomic=True)
if success:
    GPR[rt] = 1
else:
    GPR[rt] = 0
C0.LLAddr.LLB = 0  # SC always clears LLbit regardless of address matches.
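To make the operation description concrete, the sketch below models it on a hypothetical single core with flat memory. It is an illustration under stated assumptions, not the architectural definition: address translation, CCA checks, and EVA are left out; atomicity is reduced to "the link bit is set and the address matches the last load-linked"; `word0 @ word1` is read as bit concatenation with the left operand most significant; and only the word form of load-linked is modeled. The names `Core`, `AddressError`, `llb`, and `link_va` are invented for the example.

```python
class AddressError(Exception):
    """Stands in for the ADES exception in the pseudocode."""


class Core:
    """Hypothetical single-core model: 32 GPRs, flat memory, one link bit."""

    def __init__(self, big_endian=False, mem_size=64):
        self.gpr = [0] * 32
        self.mem = bytearray(mem_size)
        self.big_endian = big_endian
        self.llb = 0              # models C0.LLAddr.LLB
        self.link_va = None       # address recorded by the last load-linked

    def _order(self):
        return "big" if self.big_endian else "little"

    def load_linked(self, rt, rs, offset=0):
        """Word load that also establishes the link (simplified LL)."""
        va = self.gpr[rs] + offset
        if va & 3:
            raise AddressError(hex(va))
        self.gpr[rt] = int.from_bytes(self.mem[va:va + 4], self._order())
        self.llb, self.link_va = 1, va

    def store_conditional(self, rt, rs, offset=0, ru=None):
        """SC when ru is None, SCWP otherwise. Returns True on success."""
        nbytes = 4 if ru is None else 8
        va = self.gpr[rs] + offset
        if va & (nbytes - 1):                 # linked access must be aligned
            raise AddressError(hex(va))
        word0 = self.gpr[rt] & 0xFFFFFFFF
        if ru is None:
            data = word0
        else:
            word1 = self.gpr[ru] & 0xFFFFFFFF
            if self.big_endian:
                data = (word0 << 32) | word1  # word0 @ word1
            else:
                data = (word1 << 32) | word0  # word1 @ word0
        # Simplified stand-in for write_memory(..., atomic=True).
        success = self.llb == 1 and self.link_va == va
        if success:
            self.mem[va:va + nbytes] = data.to_bytes(nbytes, self._order())
        self.gpr[rt] = 1 if success else 0
        self.llb = 0                          # cleared whether or not the store happened
        return success
```

Under this reading of the pseudocode, the endian-dependent concatenation places the word from $rt at the lower address and the word from $ru at address + 4 in both endian modes. A second store-conditional issued without an intervening load-linked fails in the model because the first one has already cleared the link bit.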
The SC, SCE, SCWP and SCWPE instructions are used to complete the atomic read-modify-write (RMW) sequence begun by a prior matching LL/LLE/LLWP/LLWPE instruction respectively. If the system can
guarantee that the write to memory can be completed prior to any other modification to the targeted data since it was read by the load-linked instruction which initiated the sequence, then the write will
complete and register $rt will be set to 1, indicating success. Otherwise, the memory write will not occur, and register $rt will be set to 0, indicating failure.
If any of the following events occur between a load-linked and a store-conditional instruction, the store-conditional will fail:
The store-conditional will fail if a coherent store is completed (by either the current processor, another processor, or a coherent I/O module) into the block of synchronizable physical memory containing the load-linked data. The size and alignment of the block is implementation-dependent, but it is at least one word and at most the minimum page size. Typically, the synchronizable block size is the size of the largest cache line in use.
The store-conditional will fail if an ERET instruction has been executed since the preceding load-linked instruction. (Note that nanoMIPS™ also includes the ERETNC instruction, which will not cause the store-conditional instruction to fail.)
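The two guaranteed-failure rules can be illustrated with a toy link monitor that tracks one link per processor at block granularity. This is a sketch under assumptions: the class `LinkMonitor`, its methods, and the 32-byte default block size are hypothetical, and real hardware may implement the tracking quite differently. A successful store-conditional is treated as a coherent store, so it breaks any other processor's link on the same block.

```python
class LinkMonitor:
    """Hypothetical monitor: one link per processor, tracked per block."""

    def __init__(self, block_size=32):
        # Block must be at least one word; a power of two keeps the math simple.
        if block_size < 4 or block_size & (block_size - 1):
            raise ValueError("block_size must be a power of two >= 4")
        self.block_size = block_size
        self.links = {}                       # cpu id -> linked block number

    def _block(self, pa):
        return pa // self.block_size

    def load_linked(self, cpu, pa):
        self.links[cpu] = self._block(pa)

    def coherent_store(self, pa):
        """A completed coherent store breaks every link on the same block."""
        block = self._block(pa)
        for cpu in [c for c, b in self.links.items() if b == block]:
            del self.links[cpu]

    def eret(self, cpu):
        """ERET on a processor discards that processor's link."""
        self.links.pop(cpu, None)

    def store_conditional(self, cpu, pa):
        """Return 1 if the link is intact for this block, else 0."""
        ok = self.links.get(cpu) == self._block(pa)
        self.links.pop(cpu, None)             # the link is consumed either way
        if ok:
            self.coherent_store(pa)           # the SC itself is a coherent store
        return 1 if ok else 0
```

With a 32-byte block, a store to address 0x11C breaks a link established at 0x100 because both addresses fall in the same block, whereas a store to 0x120 does not. This is why unrelated data sharing a block with a lock word can cause spurious retries.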
If any of the following events occur between a load-linked and a store-conditional instruction, the store-conditional may fail when it would otherwise have succeeded. Portable programs should not cause any of these events:
The store-conditional may fail if a load or store is executed on a processor executing a load-linked/store-conditional sequence, and that load or store is not to the block of synchronizable physical memory containing the load-linked data. This is because the load or store may cause the load-linked data to be evicted from the cache.
The store-conditional may fail if any PREF instruction is executed on a processor executing a load-linked/store-conditional sequence, due to the possibility of the PREF causing a cache eviction.
The store-conditional may fail on coherent multi-processor systems if a non-coherent store is executed during a load-linked/store-conditional sequence and that store is to the block of synchronizable physical memory containing the linked data.
The store-conditional may fail if the instructions executed starting with the load-linked instruction and ending with the store-conditional instruction do not lie in a 2048-byte contiguous region of virtual memory. (The region does not have to be aligned, other than the alignment required for instruction words.)
The store-conditional may fail if a CACHE operation is carried out during the load-linked/store-conditional sequence, due to the possibility of modifying or evicting the line containing the linked data. In addition, non-local CACHE operations may cause a store-conditional instruction to fail on either the local processor or on the remote processor in multiprocessor or multi-threaded systems.
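The 2048-byte rule is easy to check mechanically, for example in a test harness that inspects a trace of an LL/SC sequence. The helper below is a hypothetical sketch: it assumes the caller supplies the virtual address and size of every instruction executed from the load-linked through the store-conditional, inclusive, and it interprets "contiguous region" as the span from the lowest instruction start to the highest instruction end.

```python
LLSC_WINDOW_BYTES = 2048


def fits_llsc_window(executed, window=LLSC_WINDOW_BYTES):
    """Return True if all executed instructions lie within `window` bytes.

    executed: iterable of (virtual_address, size_in_bytes) pairs, covering
    the load-linked, the store-conditional, and everything executed between.
    """
    executed = list(executed)
    if not executed:
        raise ValueError("need at least the load-linked and store-conditional")
    lowest_start = min(addr for addr, _ in executed)
    highest_end = max(addr + size for addr, size in executed)
    return highest_end - lowest_start <= window
```

A straight-line sequence of a few instructions always passes. The check matters when the sequence branches to a distant helper between the load-linked and the store-conditional.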
The store-conditional must not fail as a result of any of the following events:
The store-conditional must not fail as a result of a load that executes on the processor executing
a load-linked/store-conditional sequence if the load targets the block of synchronizable physical memory containing the load-linked data.
The outcome of the store-conditional is not predictable (it may succeed or fail) under any of the following conditions:
The store-conditional result is unpredictable if the store-conditional was not preceded by a matching load-linked instruction. SC must be preceded by LL, SCE must be preceded by LLE, SCWP
must be preceded by LLWP, and SCWPE must be preceded by LLWPE.
The store-conditional result is unpredictable if the load-linked and store-conditional instructions
do not target identical virtual addresses, physical addresses and CCAs.
The store-conditional result is unpredictable if the targeted memory location is not synchronizable. A synchronizable memory location is one that is associated with the state and logic necessary to track RMW atomicity. Whether a memory location is synchronizable depends on the
processor and system configurations, and on the memory access type used for the location.
The store-conditional result is unpredictable if the memory access does not use a CCA which supports atomic RMW for the targeted address.
For uniprocessor systems, a cached noncoherent or cached coherent CCA must be used, or
additionally an uncached CCA can be used in the case that Config5.ULS=1.
For multi-processor systems or systems containing coherent I/O devices, a cached coherent CCA must be used, or additionally an uncached CCA can be used in the case that Config5.ULS=1.
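The CCA requirements above reduce to a small decision function. The sketch below is illustrative only: the string labels for the CCA kinds and the parameter names are hypothetical, `coherent_system` stands for "multi-processor or containing coherent I/O devices", and the uncached kind is taken to include any uncached CCA.

```python
def cca_supports_llsc(cca_kind, coherent_system, uls):
    """Return True if the CCA kind supports a predictable LL/SC sequence.

    cca_kind: 'cached_coherent', 'cached_noncoherent', or 'uncached'
              (labels are assumptions made for this example).
    coherent_system: True for multi-processor systems or systems with
                     coherent I/O devices, False for a plain uniprocessor.
    uls: value of Config5.ULS (truthy when uncached LL/SC is supported).
    """
    if cca_kind == "uncached":
        return bool(uls)                  # only allowed when Config5.ULS=1
    if cca_kind == "cached_coherent":
        return True                       # valid on every system type
    if cca_kind == "cached_noncoherent":
        return not coherent_system        # uniprocessor only
    raise ValueError(f"unknown CCA kind: {cca_kind!r}")
```

Note that a True result is necessary but not sufficient: the targeted location must also be synchronizable, which depends on the system.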
When Config5.ULS=1, uncached load-linked/store-conditional operations are supported, with the following additional constraints:
The result of a store-conditional which is part of an uncached load-linked/store conditional sequence is unpredictable if during the sequence a local or remote CPU accesses the block of
memory containing the targeted data using any other CCA than that used by the load-linked and store-conditional instructions.
The result of an uncached load-linked/store-conditional sequence is only predictable if it targets
an address in the system which supports uncached RMW accesses. In particular, the system must implement a "monitor", which is responsible for determining whether or not the address can
be updated atomically with respect to the prior linked load. In response to a store-conditional instruction, the monitor updates memory where appropriate and communicates the result to
the processor that initiated the sequence. It is implementation dependent as to what form the monitor takes. The recommended response for load-linked/store-conditional instructions which
target a non-synchronizable uncached address is that the sub-system report a Bus Error to the processor.
Same processor uncached stores will cause an uncached load-linked/store-conditional sequence
to fail if the store address matches that of the sequence.
A PAUSE instruction is no-op’d when it is preceded by an uncached load-linked instruction. This
is because the event which would wake the CPU from the paused state may only be visible to the external monitor, not to the local processor.
The rules for uncached load-linked/store-conditional atomic operation apply to any uncached CCA
including UCA (UnCached Accelerated). An implementation that supports UCA must guarantee that a store-conditional instruction does not participate in store gathering and that it ends any
gathering initiated by stores preceding the SC in program order when the SC address coincides with a gathering address.
The effective address of a store-conditional operation must be naturally aligned, i.e. word aligned for SC and SCE, and double-word aligned for SCWP and SCWPE. Otherwise an Address Error exception occurs.
The following assembly code shows a possible usage of LL and SC to atomically update a memory location:
L1:
    ll      t1, 0(t0)       # Load counter.
    addiu   t2, t1, 1       # Increment.
    sc      t2, 0(t0)       # Try to store, checking for atomicity.
    beqc    t2, 0, L1       # If not atomic (0), try again.
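The retry structure of the loop above can be traced in a small simulation. This is a sketch, not a model of real hardware: `Cell` is a hypothetical one-word location with a single link flag, and the optional `interfere` callback stands in for another agent storing to the location between the load-linked and the store-conditional.

```python
class Cell:
    """Hypothetical one-word memory location with a single link flag."""

    def __init__(self, value=0):
        self.value = value
        self.linked = False

    def ll(self):
        self.linked = True
        return self.value

    def store(self, value):
        """Ordinary store by some other agent; breaks the link."""
        self.value = value & 0xFFFFFFFF
        self.linked = False

    def sc(self, value):
        ok = self.linked
        if ok:
            self.value = value & 0xFFFFFFFF
        self.linked = False
        return 1 if ok else 0


def atomic_increment(cell, interfere=None):
    """Mirror of the assembly loop; returns the number of attempts made."""
    attempts = 0
    while True:
        attempts += 1
        t1 = cell.ll()                    # ll    t1, 0(t0)
        t2 = (t1 + 1) & 0xFFFFFFFF        # addiu t2, t1, 1
        if interfere is not None:
            interfere(cell, attempts)     # simulated activity by another agent
        t2 = cell.sc(t2)                  # sc    t2, 0(t0)
        if t2 != 0:                       # beqc  t2, 0, L1
            return attempts
```

If the callback stores 100 to a cell holding 5 during the first attempt only, the first store-conditional fails, the loop reloads 100, and the second attempt stores 101. The increment is applied to the latest value rather than overwriting the interfering store, which is the point of the retry.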
Exceptions between the load-linked and store-conditional instructions cause the store-conditional to
fail, so instructions which can cause persistent exceptions must not be used within the load-linked/store-conditional sequence. Examples of instructions which must be avoided are arithmetic operations
that trap, system calls, and floating point operations that trap or require software emulation assistance.
Load-linked and store-conditional must function correctly on a single processor for cached noncoherent memory so that parallel programs can be run on uniprocessor systems that do not support cached
coherent memory access types.
Support for the paired word instructions SCWP/SCWPE is indicated by the Config5.XNP bit. Paired word support is required for nanoMIPS™ cores, except for NMS cores, where it is optional.
Address Error. Bus Error. Coprocessor Unusable for SCE/SCWPE. Reserved Instruction for SCE/SCWPE if EVA not implemented. Reserved Instruction for SCWP/SCWPE if load linked pair not implemented. TLB Invalid. TLB Modified. TLB Refill. Watch.