deletes | 12 years ago

Comment on the blog that got deleted:

I did a similar test in C and got very similar results. When N is around 4000, the thrashing version starts to differ substantially. A 3x difference can already be seen when N is 1000.
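For anyone who wants to try this, here is a minimal sketch of that kind of test. It assumes the comparison is between walking an N x N matrix along rows and walking it along columns; the function names (`sum_row_major`, `sum_col_major`, `make_matrix`, `time_traversal`) are made up for illustration and are not taken from the blog post.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Sum an n x n matrix stored in row-major order, walking along rows.
   Consecutive accesses touch adjacent addresses, so each cache line
   that gets loaded is fully used. */
long sum_row_major(const int *m, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            sum += m[i * n + j];
    return sum;
}

/* Same sum, walking along columns. Each access jumps n ints ahead,
   so for large n almost every access lands on a different cache line. */
long sum_col_major(const int *m, size_t n) {
    long sum = 0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            sum += m[i * n + j];
    return sum;
}

/* Allocate an n x n matrix filled with 1s; returns NULL on failure. */
int *make_matrix(size_t n) {
    int *m = malloc(n * n * sizeof *m);
    if (m != NULL)
        for (size_t k = 0; k < n * n; k++)
            m[k] = 1;
    return m;
}

/* Run one traversal, store its result in *out, and return the CPU
   time it took in seconds (hypothetical helper for this sketch). */
double time_traversal(long (*f)(const int *, size_t),
                      const int *m, size_t n, long *out) {
    clock_t start = clock();
    *out = f(m, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_traversal` on both functions for growing n (say 1000 and 4000) should show the gap. The exact ratio will depend on the compiler, optimization flags, and cache sizes, and an optimizing compiler may reorder the loops, so check the results before trusting the timings.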

This means if your program is running on two threads over different parts of the matrix, every single iteration requires a request to RAM.

I'm skeptical of this part. I tried to replicate this behavior but was unsuccessful. Even though the cores share L3, I doubt that one thread will overwrite the entire cache on every iteration.

lettergram|12 years ago

Author here. I have to approve comments manually (there is way too much spam not to), and I did not delete this one, sorry. Different compilers and architectures will give different results; I explained that in a previous post.

Either way, you should see a noticeable difference as the matrix size increases, which was the point.