newgre | 11 months ago
Why did the compiler even choose to fetch DWORDs only in the first place? It's unclear to me why the accumulator (apparently) determines the vectorization width.
TinkersW | 11 months ago
The accumulator is a vector type. With a 64-bit sum, you can only fit 4 accumulator lanes into a 256-bit register, so the loop processes 4 elements per iteration. After the loop it will do a horizontal add across the vector register to produce the final scalar result.
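A minimal scalar sketch of the shape described above, not the compiler's actual output. It models the 256-bit register as an array of four 64-bit lanes, adds four elements per iteration, and finishes with a horizontal add. The function name `sum_lanes` and the `uint8_t` element type are assumptions made for illustration; the thread does not state the input type. Under that assumption, four 8-bit elements per iteration would amount to one 32-bit (DWORD) load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 256-bit register / 64-bit accumulator = 4 lanes. */
enum { LANES = 4 };

/* Hypothetical helper: sums n bytes into a 64-bit total using
 * four independent lane accumulators, mimicking a vectorized loop.
 * The uint8_t input type is an assumption, not taken from the thread. */
uint64_t sum_lanes(const uint8_t *data, size_t n)
{
    assert(data != NULL || n == 0);

    uint64_t acc[LANES] = {0, 0, 0, 0};
    size_t i = 0;

    /* Main loop: each iteration consumes LANES elements, widens each
     * one to 64 bits, and adds it to its own lane. */
    for (; i + LANES <= n; i += LANES) {
        for (size_t lane = 0; lane < LANES; ++lane) {
            acc[lane] += (uint64_t)data[i + lane];
        }
    }

    /* Horizontal add: collapse the four lanes into one scalar. */
    uint64_t total = acc[0] + acc[1] + acc[2] + acc[3];

    /* Scalar tail for the leftover elements when n is not a multiple of 4. */
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

The number of elements consumed per iteration follows from the lane count, which is why the accumulator width ends up dictating how much data each iteration loads.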