Yeah, I wrote my CS dissertation on this. It started as me writing a GPGPU video codec (for a simplified H.264), and turned into me writing an explanation of why that wouldn't work. I did get somewhere with a hybrid approach (a first pass on the GPU without intra-frame knowledge, followed by a CPU SIMD pass to refine the result), but it wasn't much better than a pure CPU SIMD implementation and used a lot more power.
astrange|1 year ago