Opcode/ Instruction |
Op/ En |
64/32 bit Mode Support |
CPUID Feature Flag |
Description |
66 0F 38 F8 /r MOVDIR64B r16/r32/r64, m512 |
A |
V/V |
MOVDIR64B |
Move 64-bytes as direct-store with guaranteed 64- byte write atomicity from the source memory operand address to destination memory address specified as offset to ES segment in the register operand. |
Op/En |
Tuple |
Operand 1 |
Operand 2 |
Operand 3 |
Operand 4 |
A |
N/A |
ModRM:reg (w) |
ModRM:r/m (r) |
N/A |
N/A |
Moves 64-bytes as direct-store with 64-byte write atomicity from source memory address to destination memory address. The source operand is a normal memory operand. The destination operand is a memory location specified in a general-purpose register. The register content is interpreted as an offset into ES segment without any segment override. In 64-bit mode, the register operand width is 64-bits (32-bits with 67H prefix). Outside of 64-bit mode, the register width is 32-bits when CS.D=1 (16-bits with 67H prefix), and 16-bits when CS.D=0 (32-bits with 67H prefix). MOVDIR64B requires the destination address to be 64-byte aligned. No alignment restriction is enforced for source operand.
MOVDIR64B first reads 64-bytes from the source memory address. It then performs a 64-byte direct-store opera- tion to the destination address. The load operation follows normal read ordering based on source address memory- type. The direct-store is implemented by using the write combining (WC) memory type protocol for writing data. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corre- sponding cache line from memory into the cache hierarchy. If the destination address is cached, the line is written- back (if modified) and invalidated from the cache, before the direct-store.
Unlike stores with non-temporal hint which allow UC/WP memory-type for destination to override the non-temporal hint, direct-stores always follow WC memory type protocol irrespective of destination address memory type (including UC/WP types). Unlike WC stores and stores with non-temporal hint, direct-stores are eligible for imme- diate eviction from the write-combining buffer, and thus not combined with younger stores (including direct-stores) to the same address. Older WC and non-temporal stores held in the write-combing buffer may be combined with younger direct stores to the same address. Direct stores are weakly ordered relative to other stores. Software that desires stronger ordering should use a fencing instruction (MFENCE or SFENCE) before or after a direct store to enforce the ordering desired.
There is no atomicity guarantee provided for the 64-byte load operation from source address, and processor imple- mentations may use multiple load operations to read the 64-bytes. The 64-byte direct-store issued by MOVDIR64B guarantees 64-byte write-completion atomicity. This means that the data arrives at the destination in a single undivided 64-byte write transaction.
Availability of the MOVDIR64B instruction is indicated by the presence of the CPUID feature flag MOVDIR64B (bit
28 of the ECX register in leaf 07H, see "CPUID - CPU Identification" in the Intel® 64 and IA-32 Architectures Soft- ware Developer's Manual, Volume 2A).
DEST := SRC;
MOVDIR64B void _movdir64b(void *dst, const void* src)
1. The Mod field of the ModR/M byte cannot have value 11B.
#GP(0) |
For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments. If address in destination (register) operand is not aligned to a 64-byte boundary. |
#SS(0) |
For an illegal address in the SS segment. |
#PF (fault-code) |
For a page fault. |
#UD |
If CPUID.07H.0H:ECX.MOVDIR64B[bit 28] = 0. If LOCK prefix is used. |
#GP |
If any part of the operand lies outside the effective address space from 0 to FFFFH. If address in destination (register) operand is not aligned to a 64-byte boundary. |
#UD |
If CPUID.07H.0H:ECX.MOVDIR64B[bit 28] = 0. If LOCK prefix is used. |
Same exceptions as in real address mode. |
|
#PF (fault-code) |
For a page fault. |
Same exceptions as in protected mode.
#SS(0) |
If memory address referencing the SS segment is in non-canonical form. |
#GP(0) |
If the memory address is in non-canonical form. If address in destination (register) operand is not aligned to a 64-byte boundary. |
#PF (fault-code) |
For a page fault. |
#UD |
If CPUID.07H.0H:ECX.MOVDIR64B[bit 28] = 0. If LOCK prefix is used. |