Encoding:

POOL32I

010000

SYNCI

01100

base

offset

6

5

5

16

Format:

SYNCI offset(base)

microMIPS

Synchronize Caches to Make Instruction Writes Effective

Purpose:

Synchronize Caches to Make Instruction Writes Effective

To synchronize all caches to make instruction writes effective.

Description:

This instruction is used after a new instruction stream is written to make the new instructions effective relative to an instruction fetch, when used in conjunction with the SYNC and JALR.HB, JR.HB, or ERET instructions, as described below. Unlike the CACHE instruction, the SYNCI instruction is available in all operating modes in an implementation of Release 2 of the architecture.

The 16-bit offset is sign-extended and added to the contents of the base register to form an effective address. The effective address is used to address the cache line in all caches which may need to be synchronized with the write of the new instructions. The operation occurs only on the cache line which may contain the effective address. One

SYNCI instruction is required for every cache line that was written. See the Programming Notes below.

A TLB Refill and TLB Invalid (both with cause code equal TLBL) exception can occur as a by product of this instruction. This instruction never causes TLB Modified exceptions nor TLB Refill exceptions with a cause code of

TLBS. This instruction never causes Execute-Inhibit nor Read-Inhibit exceptions.

A Cache Error exception may occur as a by product of this instruction. For example, if a writeback operation detects a cache or bus error during the processing of the operation, that error is reported via a Cache Error exception. Similarly, a Bus Error Exception may occur if a bus operation invoked by this instruction is terminated in an error.

An Address Error Exception (with cause code equal AdEL) may occur if the effective address references a portion of the kernel address space which would normally result in such an exception. It is implementation dependent whether such an exception does occur.

It is implementation dependent whether a data watch is triggered by a SYNCI instruction whose address matches the

Watch register address match conditions.

Restrictions:

The operation of the processor is UNPREDICTABLE if the effective address references any instruction cache line that contains instructions to be executed between the SYNCI and the subsequent JALR.HB, JR.HB, or ERET instruction required to clear the instruction hazard.

The SYNCI instruction has no effect on cache lines that were previously locked with the CACHE instruction. If correct software operation depends on the state of a locked line, the CACHE instruction must be used to synchronize the caches.

Full visibility of the new instruction stream requires execution of a subsequent SYNC instruction, followed by a

JALR.HB, JR.HB, DERET, or ERET instruction. The operation of the processor is UNPREDICTABLE if this sequence is not followed.

SYNCI globalization:

The SYNCI instruction acts on the current processor at a minimum. Implementations are required to affect caches outside the current processor to perform the operation on the current processor (as might be the case if multiple processors share an L2 or L3 cache).

In multiprocessor implementations where instruction caches are coherently maintained by hardware, the SYNCI instruction should behave as a NOP instruction.

In multiprocessor implementations where instruction caches are not coherently maintained by hardware, the SYNCI instruction may optionally affect all coherent icaches within the system. If the effective address uses a coherent

Cacheability and Coherency Attribute (CCA), then the operation may be globalized, meaning it is broadcast to all of the coherent instruction caches within the system. If the effective address does not use one of the coherent CCAs, there is no broadcast of the SYNCI operation. If multiple levels of caches are to be affected by one SYNCI instruction, all of the affected cache levels must be processed in the same manner - either all affected cache levels use the globalized behavior or all affected cache levels use the non-globalized behavior.

Pre-Release 6: Portable software could not rely on the optional globalization of SYNCI. Strictly portable software without implementation specific awareness could only rely on expensive "instruction cache shootdown" using interprocessor interrupts.

Release 6: SYNCI globalization is required. Compliant implementations must globalize SYNCI, and portable software can rely on this behavior.

Availability and Compatibility

This instruction has been recoded for Release 6.

Operation:

vaddr = GPR[base] + sign_extend(offset)
SynchronizeCacheLines(vaddr)     /* Operate on all caches */

Exceptions:

Reserved Instruction exception (Release 1 implementations only)

TLB Refill Exception

TLB Invalid Exception

Address Error Exception

Cache Error Exception

Bus Error Exception

Programming Notes:

When the instruction stream is written, the SYNCI instruction should be used in conjunction with other instructions to make the newly-written instructions effective. The following example shows a routine which can be called after the new instruction stream is written to make those changes effective. The SYNCI instruction could be replaced with the corresponding sequence of CACHE instructions (when access to Coprocessor 0 is available), and that the JR.HB instruction could be replaced with JALR.HB, ERET, or DERET instructions, as appropriate. A SYNC instruction is required between the final SYNCI instruction in the loop and the instruction that clears instruction hazards.

/*
 * This routine makes changes to the instruction stream effective to the
 * hardware.  It should be called after the instruction stream is written.
 * On return, the new instructions are effective.
 *
 * Inputs:
 *    a0 = Start address of new instruction stream
 *    a1 = Size, in bytes, of new instruction stream
 */
   beq   a1, zero, 20f       /* If size==0, */
   nop                       /*   branch around */
   addu  a1, a0, a1          /* Calculate end address + 1 */
                              /* (daddu for 64-bit addressing) */
   rdhwr v0, HW_SYNCI_Step   /* Get step size for SYNCI from new */
                              /*   Release 2 instruction */
   beq   v0, zero, 20f       /* If no caches require synchronization, */
   nop                       /*   branch around */
10: synci 0(a0)               /* Synchronize all caches around address */
   addu  a0, a0, v0          /* Add step size in delay slot */
                              /* (daddu for 64-bit addressing) */
   sltu  v1, a0, a1          /* Compare current with end address */
   bne   v1, zero, 10b       /* Branch if more to do */
   nop                       /*   branch around */
   sync                      /* Clear memory hazards */
20: jr.hb ra                 /* Return, clearing instruction hazards */
   nop