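pre-Release 6

| Field | CACHE 101111 | base | op | offset |
|---|---|---|---|---|
| Width (bits) | 6 | 5 | 5 | 16 |

Release 6

| Field | SPECIAL3 011111 | base | op | offset | 0 | CACHE 100101 |
|---|---|---|---|---|---|---|
| Width (bits) | 6 | 5 | 5 | 9 | 1 | 6 |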
CACHE op, offset(base)  MIPS32
Perform Cache Operation
To perform the cache operation specified by op.
The offset (9 bits in Release 6, 16 bits in earlier releases) is sign-extended and added to the contents of the base register to form an effective address. The effective address is used in one of the following ways, depending on the operation to be performed and the type of cache, as described in the following table.
Table 3.3 Usage of Effective Address

| Operation Requires an | Type of Cache | Usage of Effective Address |
|---|---|---|
| Address | Virtual | The effective address is used to address the cache. An address translation may or may not be performed on the effective address (with the possibility that a TLB Refill or TLB Invalid exception might occur). |
| Address | Physical | The effective address is translated by the MMU to a physical address. The physical address is then used to address the cache. |
| Index | N/A | The effective address is translated by the MMU to a physical address. It is implementation dependent whether the effective address or the translated physical address is used to index the cache. As such, an unmapped address (such as within kseg0) should always be used for cache operations that require an index. See the Programming Notes section below. |

For operations that require an index, assuming that the total cache size in bytes is CS, the associativity is A, and the number of bytes per tag is BPT, the following calculations give the fields of the address which specify the way and the index:

    OffsetBit = Log2(BPT)
    IndexBit  = Log2(CS / A)
    WayBit    = IndexBit + Ceiling(Log2(A))
    Way       = Addr[WayBit-1..IndexBit]
    Index     = Addr[IndexBit-1..OffsetBit]

For a direct-mapped cache, the Way calculation is ignored and the Index value fully specifies the cache tag. This is shown symbolically in the figure below.
Figure 3.4 Usage of Address Fields to Select Index and Way
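The way and index calculation above can be sketched in C as follows. This is an illustrative sketch only: the names `cache_geometry`, `cache_fields`, and `cache_decode` are hypothetical, and it assumes 32-bit addresses and that BPT and CS / A are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Log2 of a power of two (assumption: BPT and CS / A are powers of two). */
static unsigned log2_pow2(uint32_t x)
{
    unsigned n = 0;
    assert(x != 0 && (x & (x - 1)) == 0);
    while (x > 1) { x >>= 1; n++; }
    return n;
}

/* Ceiling(Log2(x)) for 1 <= x <= 2^31. */
static unsigned ceil_log2(uint32_t x)
{
    unsigned n = 0;
    assert(x != 0 && x <= 0x80000000u);
    while (((uint32_t)1 << n) < x) n++;
    return n;
}

/* Hypothetical description of one cache: CS, A, and BPT from the text. */
struct cache_geometry { uint32_t cs, a, bpt; };

struct cache_fields {
    unsigned offset_bit, index_bit, way_bit;
    uint32_t way, index;
};

/* Apply the formulas above to a 32-bit address. */
static struct cache_fields cache_decode(struct cache_geometry g, uint32_t addr)
{
    struct cache_fields f;
    f.offset_bit = log2_pow2(g.bpt);
    f.index_bit  = log2_pow2(g.cs / g.a);
    f.way_bit    = f.index_bit + ceil_log2(g.a);
    /* Way = Addr[WayBit-1..IndexBit]; zero bits wide for a direct-mapped cache */
    f.way   = (addr >> f.index_bit) & (((uint32_t)1 << (f.way_bit - f.index_bit)) - 1u);
    /* Index = Addr[IndexBit-1..OffsetBit] */
    f.index = (addr >> f.offset_bit) & (((uint32_t)1 << (f.index_bit - f.offset_bit)) - 1u);
    return f;
}
```

For example, with a 32 KB, 4-way cache and 32 bytes per tag, OffsetBit is 5, IndexBit is 13, and WayBit is 15, so the address 0x80004AA0 selects way 2 and index 0x55.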
TLB Refill and TLB Invalid exceptions (both with a cause code of TLBL) can occur on any operation. For index operations (where the address is used to index the cache but need not match the cache tag), software must use unmapped addresses to avoid TLB exceptions. This instruction never causes TLB Modified exceptions or TLB Refill exceptions with a cause code of TLBS. This instruction never causes Execute-Inhibit or Read-Inhibit exceptions.
The effective address may be an arbitrarily aligned byte address. The CACHE instruction never causes an Address Error Exception due to a non-aligned address.
A Cache Error exception may occur as a result of some operations performed by this instruction. For example, if a Writeback operation detects a cache or bus error during the processing of the operation, that error is reported via a Cache Error exception. Also, a Bus Error Exception may occur if a bus operation invoked by this instruction is terminated in an error. However, Cache Error exceptions must not be triggered by an Index Load Tag or Index Store Tag operation, as these operations are used for initialization and diagnostic purposes.
An Address Error Exception (with a cause code of AdEL) may occur if the effective address references a portion of the kernel address space which would normally result in such an exception. It is implementation dependent whether such an exception occurs.
It is implementation dependent whether a data watch is triggered by a CACHE instruction whose address matches the Watch register address match conditions.
The CACHE instruction and the memory transactions which are sourced by the CACHE instruction, such as cache refill or cache writeback, obey the ordering and completion rules of the SYNC instruction.
Bits [17:16] of the instruction specify the cache on which to perform the operation, as follows:
Table 3.4 Encoding of Bits [17:16] of the CACHE Instruction

| Code | Name | Cache |
|---|---|---|
| 0b00 | I | Primary Instruction |
| 0b01 | D | Primary Data or Unified Primary |
| 0b10 | T | Tertiary |
| 0b11 | S | Secondary |
Bits [20:18] of the instruction specify the operation to perform. To provide software with a consistent base of cache operations, certain encodings must be supported on all processors. The remaining encodings are recommended.
When multiple levels of caches are implemented and the hardware maintains the smaller cache as a proper subset of a larger cache (every address that is resident in the smaller cache is also resident in the larger cache; also known as the inclusion property), it is recommended that CACHE instructions that operate on the larger, outer-level cache first operate on the smaller, inner-level cache. For example, a Hit_Writeback_Invalidate operation targeting the Secondary cache must first operate on the primary data cache. If the CACHE instruction implementation does not follow this policy, then any software that flushes the caches must mimic this behavior. That is, the software sequences must first operate on the inner cache and then operate on the outer cache. The software must place a SYNC instruction after the CACHE instruction whenever there are possible writebacks from the inner cache, to ensure that the writeback data is resident in the outer cache before operating on the outer cache. If neither the CACHE instruction implementation nor the software cache flush sequence follows this policy, then the inclusion property of the caches can be broken, which might be a condition that the cache management hardware cannot properly deal with.
When multiple levels of caches are implemented without the inclusion property, a SYNC instruction is still needed after the CACHE instruction whenever writeback data has to be resident in the next level of the memory hierarchy.
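The software ordering described above (inner cache first, SYNC, then outer cache) can be sketched in C. This is a host-side illustration under stated assumptions, not a driver: `do_cache` and `do_sync` are hypothetical stand-ins that only record a trace where real code would issue the CACHE and SYNC instructions, the op values 0x15 and 0x17 are formed as (operation << 2) | cache from the encoding tables in this section, and the line sizes are example parameters assumed to be powers of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIT_WB_INV_D 0x15u       /* operation 0b101 on cache 0b01 (D) */
#define HIT_WB_INV_S 0x17u       /* operation 0b101 on cache 0b11 (S) */
#define TRACE_SYNC   0xFFFFFFFFu /* trace marker for a SYNC */

/* Hypothetical trace used in place of real hardware effects. */
struct trace { uint32_t ev[32]; size_t n; };

/* Stand-in for "cache op, 0(addr)"; records only the op. */
static void do_cache(struct trace *t, uint32_t op, uintptr_t addr)
{
    (void)addr;
    assert(t->n < 32);
    t->ev[t->n++] = op;
}

/* Stand-in for the SYNC instruction. */
static void do_sync(struct trace *t)
{
    assert(t->n < 32);
    t->ev[t->n++] = TRACE_SYNC;
}

/* Apply one Hit-type op to every cache line touched by [start, start+len). */
static void for_each_line(struct trace *t, uint32_t op,
                          uintptr_t start, size_t len, size_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);
    if (len == 0)
        return;
    uintptr_t addr = start & ~(uintptr_t)(line - 1);
    uintptr_t last = (start + len - 1) & ~(uintptr_t)(line - 1);
    for (;;) {
        do_cache(t, op, addr);
        if (addr == last)
            break;
        addr += line;
    }
}

/* Inner cache first, SYNC, then outer cache, then a final SYNC. */
static void flush_range(struct trace *t, uintptr_t start, size_t len,
                        size_t inner_line, size_t outer_line)
{
    for_each_line(t, HIT_WB_INV_D, start, len, inner_line);
    do_sync(t); /* inner writebacks must be resident in the outer cache */
    for_each_line(t, HIT_WB_INV_S, start, len, outer_line);
    do_sync(t); /* outer writebacks visible at the next memory level */
}
```

Flushing 64 bytes starting at 0x80001010 with 32-byte inner lines and 64-byte outer lines touches three inner lines, then a SYNC, then two outer lines, then a final SYNC.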
For multiprocessor implementations that maintain coherent caches, some Hit-type CACHE operations may optionally affect all coherent caches within the implementation. If the effective address uses a coherent Cache Coherency Attribute (CCA), then the operation is globalized, meaning it is broadcast to all of the coherent caches within the system. If the effective address does not use one of the coherent CCAs, there is no broadcast of the operation. If multiple levels of caches are to be affected by one CACHE instruction, all of the affected cache levels must be processed in the same manner: either all affected cache levels use the globalized behavior or all affected cache levels use the non-globalized behavior.
Table 3.5 Encoding of Bits [20:18] of the CACHE Instruction

| Code | Caches | Name | Effective Address Operand Type | Operation | Compliance Implemented |
|---|---|---|---|---|---|
| 0b000 | I | Index Invalidate | Index | Set the state of the cache block at the specified index to invalid. This required encoding may be used by software to invalidate the entire instruction cache by stepping through all valid indices. | Required |
| 0b000 | D | Index Writeback Invalidate / Index Invalidate | Index | For a write-back cache: If the state of the cache block at the specified index is valid and dirty, write the block back to the memory address specified by the cache tag. After that operation is completed, set the state of the cache block to invalid. If the block is valid but not dirty, set the state of the block to invalid. For a write-through cache: Set the state of the cache block at the specified index to invalid. This required encoding may be used by software to invalidate the entire data cache by stepping through all valid indices. The Index Store Tag must be used to initialize the cache at power up. | Required |
| 0b000 | S, T | Index Writeback Invalidate / Index Invalidate | Index | Same as the D entry above. | Required if S, T cache is implemented |
| 0b001 | All | Index Load Tag | Index | Read the tag for the cache block at the specified index into the TagLo and TagHi Coprocessor 0 registers. If the DataLo and DataHi registers are implemented, also read the data corresponding to the byte index into the DataLo and DataHi registers. This operation must not cause a Cache Error Exception. The granularity and alignment of the data read into the DataLo and DataHi registers is implementation-dependent, but is typically the result of an aligned access to the cache, ignoring the appropriate low-order bits of the byte index. | Recommended |
| 0b010 | All | Index Store Tag | Index | Write the tag for the cache block at the specified index from the TagLo and TagHi Coprocessor 0 registers. This operation must not cause a Cache Error Exception. This required encoding may be used by software to initialize the entire instruction or data caches by stepping through all valid indices. Doing so requires that the TagLo and TagHi registers associated with the cache be initialized first. | Required |
| 0b011 | All | Implementation Dependent | Unspecified | Available for implementation-dependent operation. | Optional |
| 0b100 | I, D | Hit Invalidate | Address | If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the instruction cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system. | Required (Instruction Cache Encoding Only), Recommended otherwise |
| 0b100 | S, T | Hit Invalidate | Address | Same as the I, D entry above. | Optional; if Hit_Invalidate_D is implemented, the S and T variants are recommended. |
| 0b101 | I | Fill | Address | Fill the cache from the specified address. | Recommended |
| 0b101 | D | Hit Writeback Invalidate / Hit Invalidate | Address | For a write-back cache: If the cache block contains the specified address and it is valid and dirty, write the contents back to memory. After that operation is completed, set the state of the cache block to invalid. If the block is valid but not dirty, set the state of the block to invalid. For a write-through cache: If the cache block contains the specified address, set the state of the cache block to invalid. This required encoding may be used by software to invalidate a range of addresses from the data cache by stepping through the address range by the line size of the cache. In multiprocessor implementations with coherent caches, the operation may optionally be broadcast to all coherent caches within the system. | Required |
| 0b101 | S, T | Hit Writeback Invalidate / Hit Invalidate | Address | Same as the D entry above. | Required if S, T cache is implemented |
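The two encoding tables combine into the 5-bit op field of the instruction: bits [20:18] (the operation) form the upper three bits and bits [17:16] (the cache) form the lower two. A minimal sketch in C follows; the enum and function names are hypothetical labels chosen for illustration, and only the encodings listed in the tables above are named.

```c
#include <assert.h>
#include <stdint.h>

/* Cache selector, bits [17:16] of the instruction (Table 3.4). */
enum cache_sel { CACHE_I = 0, CACHE_D = 1, CACHE_T = 2, CACHE_S = 3 };

/* Operation, bits [20:18] of the instruction (Table 3.5).
 * The meaning of some codes depends on the selected cache. */
enum cache_oper {
    OPER_INDEX_INV_OR_WB_INV   = 0, /* 0b000 */
    OPER_INDEX_LOAD_TAG        = 1, /* 0b001 */
    OPER_INDEX_STORE_TAG       = 2, /* 0b010 */
    OPER_IMPL_DEPENDENT        = 3, /* 0b011 */
    OPER_HIT_INV               = 4, /* 0b100 */
    OPER_FILL_OR_HIT_WB_INV    = 5  /* 0b101: Fill on I, Hit Writeback Invalidate on D, S, T */
};

/* Build the 5-bit op field: operation in op[4:2], cache in op[1:0]. */
static uint32_t cache_op_field(enum cache_oper oper, enum cache_sel sel)
{
    assert((unsigned)oper <= 7u && (unsigned)sel <= 3u);
    return ((uint32_t)oper << 2) | (uint32_t)sel;
}
```

For instance, Index Store Tag on the primary data cache yields an op field of 0x09, and Hit Writeback Invalidate on the primary data cache yields 0x15.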
The operation of this instruction is UNDEFINED for any operation/cache combination that is not implemented. In Release 6, the instruction in this case should perform no operation.
The operation of this instruction is UNDEFINED if the operation requires an address, and that address is uncacheable. In Release 6, the instruction in this case should perform no operation.
The operation of the instruction is UNPREDICTABLE if the cache line that contains the CACHE instruction is the target of an invalidate or a writeback invalidate.
If this instruction is used to lock all ways of a cache at a specific cache index, the behavior of that cache to subsequent cache misses to that cache index is UNDEFINED.
If access to Coprocessor 0 is not enabled, a Coprocessor Unusable Exception is signaled.
Any use of this instruction that can cause cacheline writebacks should be followed by a subsequent SYNC instruction to avoid hazards where the writeback data is not yet visible at the next level of the memory hierarchy.
This instruction does not produce an exception for a misaligned memory address, since it has no memory access size.
This instruction has been recoded for Release 6.
vAddr = GPR[base] + sign_extend(offset)
(pAddr, uncached) = AddressTranslation(vAddr, DataReadReference)
CacheOp(op, vAddr, pAddr)
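The first step of the operation, forming vAddr, can be modeled in C as below. This sketch assumes 32-bit addresses with wraparound arithmetic; `sign_extend32` and `cache_vaddr` are hypothetical helper names, and AddressTranslation and CacheOp are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 'width' bits of 'field' to a 32-bit two's-complement pattern. */
static uint32_t sign_extend32(uint32_t field, unsigned width)
{
    assert(width >= 1 && width <= 32);
    uint32_t sign = (uint32_t)1 << (width - 1);
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : (((uint32_t)1 << width) - 1u);
    field &= mask;
    return (field ^ sign) - sign; /* unsigned arithmetic wraps modulo 2^32 */
}

/* vAddr = GPR[base] + sign_extend(offset), with a 9-bit (Release 6)
 * or 16-bit (pre-Release 6) offset field. */
static uint32_t cache_vaddr(uint32_t gpr_base, uint32_t offset_field, unsigned width)
{
    return gpr_base + sign_extend32(offset_field, width);
}
```

With a 9-bit offset field of 0x1F0 (that is, -16) and a base register holding 0x80001000, the effective address is 0x80000FF0.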
TLB Refill Exception
TLB Invalid Exception
Coprocessor Unusable Exception
Address Error Exception
Cache Error Exception
Bus Error Exception
The Release 6 architecture implements a 9-bit offset, whereas all earlier release levels implement a 16-bit offset.
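The practical effect of the narrower offset can be seen by assembling the two instruction formats by hand. The following C sketch packs the fields in the order and widths given in the encoding diagrams at the top of this section; the function names are hypothetical, and the sketch is meant only to illustrate the field layout and the offset ranges (-32768..32767 before Release 6, -256..255 in Release 6).

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if 'offset' is representable in the offset field of the given release. */
static int cache_offset_fits(int32_t offset, int release6)
{
    return release6 ? (offset >= -256 && offset <= 255)
                    : (offset >= -32768 && offset <= 32767);
}

/* pre-Release 6: CACHE(101111) | base | op | offset[15:0] */
static uint32_t encode_cache_pre_r6(unsigned base, unsigned op, int32_t offset)
{
    assert(base < 32u && op < 32u && cache_offset_fits(offset, 0));
    return (0x2Fu << 26) | ((uint32_t)base << 21) | ((uint32_t)op << 16)
         | ((uint32_t)offset & 0xFFFFu);
}

/* Release 6: SPECIAL3(011111) | base | op | offset[8:0] | 0 | CACHE(100101) */
static uint32_t encode_cache_r6(unsigned base, unsigned op, int32_t offset)
{
    assert(base < 32u && op < 32u && cache_offset_fits(offset, 1));
    return (0x1Fu << 26) | ((uint32_t)base << 21) | ((uint32_t)op << 16)
         | (((uint32_t)offset & 0x1FFu) << 7) | 0x25u;
}
```

Code that relies on offsets outside -256..255 therefore has to materialize the address in the base register when targeting Release 6.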
For cache operations that require an index, it is implementation dependent whether the effective address or the translated physical address is used as the cache index. Therefore, the index value should always be converted to an unmapped address (such as a kseg0 address, formed by ORing the index with 0x80000000) before being used by the CACHE instruction. For example, the following code sequence performs a data cache Index Store Tag operation using the index passed in GPR a0:
    li    a1, 0x80000000       /* Base of kseg0 segment */
    or    a0, a0, a1           /* Convert index to kseg0 address */
    cache DCIndexStTag, 0(a0)  /* Perform the index store tag operation */
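When software steps through all valid indices, the same conversion applies to each step. The C sketch below computes the kseg0 addresses such a loop would use. It is an illustration under assumptions: the helper names are hypothetical, the step size is taken to be the bytes per tag (BPT), and the associativity is assumed to be a power of two so that stepping through CS bytes visits every way and index exactly once.

```c
#include <assert.h>
#include <stdint.h>

#define KSEG0_BASE 0x80000000u /* base of the unmapped kseg0 segment */

/* Convert an index value to an unmapped kseg0 address, as in the sequence above. */
static uint32_t index_to_kseg0(uint32_t index_value)
{
    return index_value | KSEG0_BASE;
}

/* Number of index operations needed to visit every block: CS / BPT. */
static uint32_t index_op_count(uint32_t cs, uint32_t bpt)
{
    assert(bpt != 0 && cs % bpt == 0);
    return cs / bpt;
}

/* Address operand for the n-th index operation (n counts from 0). */
static uint32_t nth_index_addr(uint32_t n, uint32_t bpt)
{
    return index_to_kseg0(n * bpt);
}
```

For a 32 KB cache with 32 bytes per tag, this gives 1024 operations with address operands from 0x80000000 through 0x80007FE0.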