> However, the developers were soon to clarify that the 100x claim applies to just a single function, “not the whole of FFmpeg.”
And it's probable that the developer is comparing code compiled at -O0 (no optimization) against hand-coded assembler, like they did the last time they claimed a 90x speed up.
> The 100x speed up is according to some misleading micro benchmark.
vlovich123|7 months ago
So OP is correct. The 100x speedup comes from a misleading microbenchmark. The reason is that the transform is a huge amount of code and, as OP said, it will blow out the code cache, while the amount of data you’re processing blows out the data cache. The net overall improvement might be 1%, if even that.
rerdavies|7 months ago
So just to summarize: either a 100x or a 100% speedup (depending on which source)
- comparing hand-coded assembler vs. unoptimized C code.
- on a function that was poorly written in the first place.
- in code that's so rarely used that nobody could be bothered to fix it for decades.
- and even then, a tiny function that accounted for only about 2% of the CPU cost of an obsolete task that nobody cared about enough to fix.
- so basically code that fails the "profile before you optimize" rule, and should never have been optimized in the first place.
I think that covers it.
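For anyone who wants the arithmetic behind that, here is a rough sketch using Amdahl's law. The 2% share is just the figure quoted in this thread, not something measured, and the function names are made up for illustration.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the
    runtime is made `local_speedup` times faster."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def time_saved_percent(fraction: float, local_speedup: float) -> float:
    """Percentage of total runtime removed by the same optimization."""
    return 100.0 * fraction * (1.0 - 1.0 / local_speedup)


if __name__ == "__main__":
    share = 0.02  # assumption: the ~2% share claimed in the thread
    # 100.0 is the "100x" reading, 2.0 is the "100%" reading
    for local in (100.0, 2.0):
        print(
            f"{local:g}x on {share:.0%} of runtime: "
            f"{overall_speedup(share, local):.4f}x overall, "
            f"{time_saved_percent(share, local):.2f}% of runtime saved"
        )
```

Under that assumption, even an infinitely fast version of the function could never save more than the 2% it accounted for in the first place.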
majewsky|7 months ago
Honestly though, nobody who has any idea how anything works would have expected ffmpeg to suddenly unearth a 100x speedup for everything. That's why the devs did not clarify this right away. It's too laughable of an assumption.
sgarland|7 months ago
fuzztester|7 months ago
thus sayeth the lord.
praise the lord!