dest = source
The movups instruction moves 4 single-precision floating-point values (32 bits each) from the source (second operand) to the destination (first operand). The source and the destination can each be an XMM register or a memory location, but they cannot both be memory locations. The vmovups instruction can also move 8 floats between YMM registers, or between a YMM register and memory.
movups (move unaligned) does not require its memory operand to be aligned. In the original designs movaps (move aligned) was faster than movups. When in doubt, use movups rather than movaps.
movups moves the values without inspection or conversion.
An XMM register is 128 bits wide. On CPUs supporting AVX instructions, each XMM register is the low half of a 256-bit YMM register, which adds 128 upper bits.
movups xmm1, xmm2 ; moves 4 floats from xmm2 to xmm1
; leaves the upper 128 bits of ymm1 unchanged
movups xmm2, [x] ; moves 4 floats from variable x to xmm2
; leaves the upper 128 bits of ymm2 unchanged
movups [y], xmm0 ; moves 4 floats from xmm0 to variable y
; moves precisely 128 bits
vmovups xmm1, xmm2 ; moves 4 floats from xmm2 to xmm1
; zeroes the upper 128 bits of ymm1
vmovups ymm1, ymm2 ; moves 8 floats from ymm2 to ymm1
vmovups ymm2, [x] ; moves 8 floats from variable x to ymm2
vmovups [y], ymm0 ; moves 8 floats from ymm0 to variable y
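The same unaligned load and store can be tried from C. The sketch below assumes an x86 compiler that provides the SSE header xmmintrin.h. The intrinsics _mm_loadu_ps and _mm_storeu_ps correspond to movups, although a compiler is free to pick a different encoding (for example vmovups when AVX is enabled). The function names copy4_unaligned and movups_copy_is_bit_exact are hypothetical helpers made up for this illustration. The second function checks that the 16 bytes are copied without inspection or conversion, even from an address that is not 16-byte aligned.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_storeu_ps */

/* Hypothetical helper: copies 4 floats (16 bytes) from src to dst.
   Neither pointer needs to be 16-byte aligned. */
void copy4_unaligned(float *dst, const float *src)
{
    assert(dst != NULL && src != NULL);
    __m128 v = _mm_loadu_ps(src);   /* typically: movups xmm, [src] */
    _mm_storeu_ps(dst, v);          /* typically: movups [dst], xmm */
}

/* Hypothetical self-check: returns 1 if a copy from a deliberately
   unaligned address reproduces all 16 bytes exactly. */
int movups_copy_is_bit_exact(void)
{
    /* buf is 16-byte aligned, so buf + 1 is NOT 16-byte aligned. */
    _Alignas(16) float buf[8] = {0};
    /* Bit patterns for 1.0f, a NaN with a payload, -0.0f, and a denormal. */
    const uint32_t bits[4] = {0x3F800000u, 0x7FC00001u,
                              0x80000000u, 0x00000001u};
    float out[4];

    memcpy(buf + 1, bits, sizeof bits);   /* place raw bits at buf[1..4] */
    copy4_unaligned(out, buf + 1);        /* unaligned 128-bit load + store */
    return memcmp(out, bits, sizeof bits) == 0;
}
```

Calling movups_copy_is_bit_exact() from main should return 1 on an x86 target. Replacing _mm_loadu_ps with the aligned _mm_load_ps (movaps) on the same buf + 1 address would be expected to fault, which is the alignment requirement that movups avoids.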