How to make PNG encoding much faster? I'm working with large medical images, and after a bit of work we can do all the needed processing in under a second (numpy/scipy methods). But then the encoding to PNG takes 9-15 seconds. As a result we have to pre-render all possible configurations and put them on S3, because we can't do the processing on demand in a web request.

Is there a way to use multiple threads or the GPU to encode PNGs? I haven't been able to find anything. The images are 3500x3500 px and compress from roughly 50 MB to 15 MB with maximum compression (so don't say to use lower compression).
gred|4 years ago
Now, zlib specifically is focused on correctness and stability, to the point of ignoring some fairly obvious opportunities to improve performance. This has led to frustration, and that frustration has led to performance-focused zlib forks. The folks at AWS published a performance-focused survey [1] of the zlib fork landscape fairly recently. If your stack uses zlib, you may be able to find a way to swap in a different (faster) fork. If your stack does not use zlib, you may at least find a few ideas for next steps.
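Before swapping libraries, it's worth confirming that the DEFLATE step really is where the time goes. A minimal timing sketch (synthetic bytes stand in for real pixel data; the buffer size and levels here are illustrative, not from the AWS survey):

```python
import time
import zlib

# Synthetic stand-in for a raw pixel buffer (~2.5 MB of semi-repetitive
# bytes); substitute your real image data here.
raw = bytes(range(256)) * 10_000

for level in (1, 6, 9):
    t0 = time.perf_counter()
    out = zlib.compress(raw, level)
    dt = time.perf_counter() - t0
    print(f"level {level}: {len(out) / 1e6:.2f} MB in {dt * 1000:.0f} ms")

# Whatever level you pick, the stream must round-trip losslessly.
assert zlib.decompress(out) == raw
```

If most of the wall time lands inside `zlib.compress`, a faster fork (or a lower level) attacks the right bottleneck.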
[1] https://aws.amazon.com/blogs/opensource/improving-zlib-cloud...
lightcatcher|4 years ago
Although these don't directly solve the PNG encoding performance problem, maybe some of these ideas could help?
* If users will be using the app in an environment with plenty of bandwidth and you don't mind paying for server bandwidth, could you serve the PNGs with less compression? Max compression takes 15 s and saves 35 MB. If the users have a 50 Mbit connection, it only takes 5.6 s to transmit the extra 35 MB, so you could come out about 10 s ahead by not compressing. (Yes, I see your comment about "don't say to use lower compression", but there's no reason to be killed by compression CPU cost if the bandwidth is available.)
* Initially show the user a lossy image (could be a downsized PNG) that can be generated quickly. You could then upgrade to the full-quality image once you finish encoding the PNG, or, if server bandwidth/CPU usage is an issue, only upgrade when the user clicks a "high-quality" button or something. If server CPU is the constraint, the low-then-high-quality approach would also let you turn down the compression setting and save some CPU at the cost of bandwidth and user latency.
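The preview idea can be sketched with numpy and zlib (zlib standing in for the PNG IDAT compression step; the 1000x1000 uint16 frame and 4x decimation factor are illustrative assumptions):

```python
import zlib
import numpy as np

# Hypothetical grayscale frame (random uint16 as a placeholder; a real
# medical image would be 3500x3500 and far more compressible).
rng = np.random.default_rng(0)
img = rng.integers(0, 2**16, size=(1000, 1000), dtype=np.uint16)

# Quick 4x decimation: 16x fewer pixels, so the preview both encodes
# and transfers an order of magnitude faster than the full frame.
preview = img[::4, ::4].copy()

# zlib stands in for the PNG IDAT compression step here.
full_payload = zlib.compress(img.tobytes(), 1)         # fast, larger
preview_payload = zlib.compress(preview.tobytes(), 6)  # tiny either way

print(len(full_payload), len(preview_payload))
```

The preview can go out immediately while the full-quality encode runs in the background.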
minhmeoke|4 years ago
[1] https://blender.stackexchange.com/questions/148231/what-imag...
[2] https://github.com/brion/mtpng
[3] https://emmaliu.info/15418-Final-Project/
Const-me|4 years ago
Not terribly hard if you only need 1-2 formats supported, e.g. RGBA8 only. You don't need to port the complete codec, only the initial portion of the pipeline, then stream the data back from the GPU; the last step, lossless compression of the stream, isn't a good fit for GPUs.
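As a sketch of which stage is GPU-friendly: PNG's per-row filters are embarrassingly parallel, while the DEFLATE stage that follows is not. Here's the "Up" filter vectorized with numpy as a stand-in for a GPU kernel (assumes 8-bit grayscale):

```python
import numpy as np

def up_filter(img: np.ndarray) -> np.ndarray:
    """PNG 'Up' filter: each row minus the row above, modulo 256.

    Each output row depends on only two input rows, so this stage is
    embarrassingly parallel -- the part worth moving to a GPU. The
    DEFLATE stage that follows is sequential and stays on the CPU.
    """
    out = img.copy()
    out[1:] = img[1:] - img[:-1]  # uint8 arithmetic wraps mod 256
    return out

def up_unfilter(filtered: np.ndarray) -> np.ndarray:
    # Inverse: cumulative sum down the rows, again mod 256.
    return np.cumsum(filtered, axis=0, dtype=np.int64).astype(np.uint8)

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)
assert np.array_equal(up_unfilter(up_filter(img)), img)
```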
If you want the code to run on a web server, then after you've debugged the encoder your next problem is where to deploy it. NVidia Teslas are frickin' expensive. If you want to run on public clouds, I'd consider their VMs with AMD GPUs.
physicles|4 years ago
Here's what that custom format might look like.
(I'm guessing these images are grayscale, so the "raw" format is uint16 or uint32.)
First, take the raw data and delta encode it. This is similar to PNG's concept of "filters" -- little processors that massage the data a bit to make it more compressible. Then, since most of the compression algorithms operate on unsigned ints, you'll need to apply zigzag encoding (this is superior to allowing integer underflow, as benchmarks will show).
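A minimal sketch of the delta + zigzag step for uint16 grayscale samples (numpy; the function names are my own):

```python
import numpy as np

def delta_zigzag(samples: np.ndarray) -> np.ndarray:
    """Delta-encode uint16 samples, then zigzag-map the signed deltas
    onto unsigned ints (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...) so small
    magnitudes get small codes."""
    s = samples.view(np.int16)           # reinterpret; arithmetic wraps mod 2^16
    d = np.diff(s, prepend=np.int16(0))  # d[0] is the first sample itself
    return ((d << 1) ^ (d >> 15)).view(np.uint16)

def undelta_unzigzag(encoded: np.ndarray) -> np.ndarray:
    # Inverse zigzag, then prefix-sum to undo the delta step.
    d = (encoded >> 1).astype(np.int16) ^ -(encoded & 1).astype(np.int16)
    return np.cumsum(d, dtype=np.int64).astype(np.uint16)

rng = np.random.default_rng(2)
samples = rng.integers(0, 2**16, size=4096, dtype=np.uint16)
assert np.array_equal(undelta_unzigzag(delta_zigzag(samples)), samples)
```

Because neighboring pixels are usually close in value, the zigzagged deltas cluster near zero, which is exactly what the integer compressors below exploit.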
Then, take a look at some of the dedicated integer compression algorithms. Examples: FastPFor (or TurboPFor), BP32, snappy, simple8b, and good ol' run length encoding. These are blazing fast compared to gzip.
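Good ol' run-length encoding is easy to sketch in vectorized numpy (assumes a non-empty 1-D array):

```python
import numpy as np

def rle_encode(a: np.ndarray):
    """Plain run-length encoding of a non-empty 1-D array:
    returns (run_values, run_lengths)."""
    change = np.flatnonzero(np.diff(a)) + 1  # indices where a new run starts
    starts = np.concatenate(([0], change))
    lengths = np.diff(np.concatenate((starts, [a.size])))
    return a[starts], lengths

def rle_decode(values: np.ndarray, lengths: np.ndarray) -> np.ndarray:
    return np.repeat(values, lengths)

a = np.array([5, 5, 7, 7, 7, 1], dtype=np.uint16)
values, lengths = rle_encode(a)
assert np.array_equal(rle_decode(values, lengths), a)
```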
In my use case, I didn't care how slow compression was, so I wrote an adaptive compressor that would try all compression profiles and select the smallest one.
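An adaptive "try everything, keep the smallest" compressor can be sketched with stdlib codecs (zlib/bz2/lzma here are stand-ins for whatever profiles you actually benchmark):

```python
import bz2
import lzma
import zlib

DECODERS = {
    "zlib9": zlib.decompress,
    "bz2": bz2.decompress,
    "lzma": lzma.decompress,
    "raw": lambda b: b,  # stored as-is if nothing helps
}

def smallest_payload(raw: bytes) -> tuple[str, bytes]:
    """Compress with every profile and keep the smallest result.
    Encoding is slow (it runs every codec), but the decode side only
    needs the tag to pick the right decompressor."""
    candidates = {
        "zlib9": zlib.compress(raw, 9),
        "bz2": bz2.compress(raw),
        "lzma": lzma.compress(raw),
        "raw": raw,
    }
    name = min(candidates, key=lambda k: len(candidates[k]))
    return name, candidates[name]

data = bytes(range(64)) * 1000
tag, payload = smallest_payload(data)
assert DECODERS[tag](payload) == data
```

Shipping the winning tag alongside the payload keeps the client-side decoder a simple dictionary lookup.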
Of course, benchmark everything.
bjornlouser|4 years ago
Maybe you could write the PNG without compression, compress chunks of the image in parallel using 7z, then reconstitute and decompress them on the client side.
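A sketch of the parallel-chunk idea, using stdlib zlib in place of 7z (`zlib.compress` releases the GIL on large buffers, so plain threads parallelize):

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def compress_chunks(raw: bytes, n_chunks: int = 8) -> list[bytes]:
    """DEFLATE fixed-size slices of the buffer in parallel.
    zlib.compress releases the GIL on large inputs, so plain threads
    give real parallelism; each slice becomes its own zlib stream."""
    step = -(-len(raw) // n_chunks)  # ceiling division
    chunks = [raw[i:i + step] for i in range(0, len(raw), step)]
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return list(pool.map(lambda c: zlib.compress(c, 9), chunks))

def decompress_chunks(parts: list[bytes]) -> bytes:
    # Client side: decompress each stream and concatenate.
    return b"".join(zlib.decompress(p) for p in parts)

data = bytes(range(256)) * 4096  # ~1 MB stand-in buffer
parts = compress_chunks(data)
assert decompress_chunks(parts) == data
```

Splitting costs a little compression ratio (each chunk starts with a fresh dictionary), which is the usual tradeoff for parallel DEFLATE.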
yboris|4 years ago
Another neat feature is that it's designed to be progressive: you could host a single 10 MB original file, and the client can download just the first 1 MB (up to whatever quality they're comfortable with).
Take a look: https://jpegxl.info/