Please profile tricks like this, as they may actually be significantly slower than their naive counterparts on modern hardware. The prefetcher knows what a linked list looks like, and it knows how to get it somewhere closer than main memory before the nodes are needed.
jacquesm|14 years ago
That's some pretty advanced magic.
Even the compiler has very limited insight into what your code is actually doing without simulating it, the prefetcher might be able to look a bit ahead in the execution stream and do branch prediction but absolutely no way does that extend to knowing stuff about your data structures.
Unless I have just been transported by a time warp I really think this is fiction.
If you're thinking of cache pre-fetching that actually has a really hard time dealing with stuff like linked lists because it has absolutely no idea at all about the data structure it is looking at. The 'next' and 'previous' pointers in the linked list might actually simply be values without any significance at all. And if they are dereferenced as pointers then that memory could be just about anywhere within the valid address range.
For arrays on the other hand such pre-fetching can be useful.
exDM69|14 years ago
Nope, the Intel guys and gals do this kinda magic day in, day out. Or at least the chips they manufacture do. I'm not familiar with the internals of the prefetcher of any CPU at this level, but let me wave hands here. This is what the prefetcher could do:
All it takes is for the prefetcher to get a cache line when requested, and then observe what is inside and look at pointer-sized and aligned values that look like pointers and are pretty close to the original cache line's (virtual) address. Now if these values happen to be sane virtual addresses in the current process, the prefetcher might as well fetch them one cache closer to the CPU. If it hits, it might yield big performance boost in real world apps. If it misses, it's just a little wasted electricity.
All modern CPUs do dirty little tricks like this if it helps them outshine their competitors.
Btw. you can add prefetch instructions in your code manually if you do linked list traversals or similar. In GCC you can use __builtin_prefetch() compiler intrinsic.
ssp|14 years ago
That seems unlikely to me. Which processors do this?
idunno246|14 years ago
forgotusername|14 years ago
In no circumstance will the CPU magically understand the semantics of some data structure.
zaptheimpaler|14 years ago
Hence, they have no capability to understand a structure like a linked list and will not cache it intelligently. All you can rely on is that they will cache arrays and traverse them in a way that efficiently uses the cache.
noselasd|14 years ago