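When I replace unsafe vectorized code with its safe, Span-based equivalent, I quite often see non-contained addressing modes: the address is computed by a separate lea instead of being folded into the memory operand of the load. Example: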
    int Sum(int[] arr)
    {
        Span<int> span = arr;
        Vector128<int> sum = default;
        while (span.Length >= Vector128<int>.Count * 4)
        {
            var v1 = Vector128.Create(span.Slice(Vector128<int>.Count * 0));
            var v2 = Vector128.Create(span.Slice(Vector128<int>.Count * 1));
            var v3 = Vector128.Create(span.Slice(Vector128<int>.Count * 2));
            var v4 = Vector128.Create(span.Slice(Vector128<int>.Count * 3));
            sum += v1 + v2 + v3 + v4;
            span = span.Slice(Vector128<int>.Count * 4);
        }
        return Vector128.Sum(sum);
    }
Loop codegen:
    G_M19912_IG05:  ;; offset=0x0015
           vmovups  xmm1, xmmword ptr [rax]
           lea      rdx, bword ptr [rax+0x10]    ;; <---
           vmovups  xmm2, xmmword ptr [rdx]
           lea      rdx, bword ptr [rax+0x20]    ;; <---
           vmovups  xmm3, xmmword ptr [rdx]
           lea      rdx, bword ptr [rax+0x30]    ;; <---
           vpaddd   xmm1, xmm1, xmm2
           vpaddd   xmm1, xmm1, xmm3
           vpaddd   xmm1, xmm1, xmmword ptr [rdx]
           vpaddd   xmm0, xmm1, xmm0
           add      rax, 64
           add      ecx, -16
           cmp      ecx, 16
           jge      SHORT G_M19912_IG05
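For comparison, here is a rough sketch of the kind of unsafe code the safe version above might replace. It is not the original code: the `SumBaseline` class and `SumUnsafe` method are hypothetical names, and it assumes .NET 7 or later for `Vector128.LoadUnsafe`. Each load is written as a base reference plus a constant element offset, which is the shape one would hope to see folded into the load's memory operand.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

static class SumBaseline
{
    // Hypothetical unsafe counterpart of Sum, shown only for comparison.
    // Like the safe version, it ignores any tail shorter than 4 vectors.
    // Unlike the safe version, it throws on a null array.
    public static int SumUnsafe(int[] arr)
    {
        ref int p = ref MemoryMarshal.GetArrayDataReference(arr);
        nuint length = (nuint)arr.Length;
        nuint step = (nuint)(Vector128<int>.Count * 4);
        nuint i = 0;
        Vector128<int> sum = default;

        // i never exceeds length, so length - i cannot underflow.
        while (length - i >= step)
        {
            // No bounds checks here: the loop condition is the only guard.
            var v1 = Vector128.LoadUnsafe(ref p, i);
            var v2 = Vector128.LoadUnsafe(ref p, i + (nuint)(Vector128<int>.Count * 1));
            var v3 = Vector128.LoadUnsafe(ref p, i + (nuint)(Vector128<int>.Count * 2));
            var v4 = Vector128.LoadUnsafe(ref p, i + (nuint)(Vector128<int>.Count * 3));
            sum += v1 + v2 + v3 + v4;
            i += step;
        }
        return Vector128.Sum(sum);
    }
}
```

Both versions read the same 64 bytes per iteration, so any difference in the loop body comes from how the addresses are formed rather than from the work being done.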