emcq|1 year ago
If you account for the time spent loading from disk, the C implementation would take closer to 5s, as reported in the blog post [1]. Speculating that their laptop's SSD is in the 3 GB/s range, there is perhaps another second or so of optimization left there (which would roughly work out to the 1.4s in-memory time).
Because the input consists of many variable-width rows, this will be harder on a GPU than on a CPU.
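As a rough sanity check on the disk-bound part of that estimate: the floor on load time is just file size divided by sequential read bandwidth. A minimal sketch, where the 3 GB/s figure is the speculated SSD speed and the file size is a made-up placeholder (no size is stated here), so plug in your own numbers:

```python
def min_read_seconds(file_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on the time to stream a file from disk once."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return file_bytes / bandwidth_bytes_per_s


# Both values are assumptions for illustration only.
ASSUMED_FILE_BYTES = 12e9      # hypothetical input size, not a measured value
ASSUMED_SSD_BANDWIDTH = 3e9    # the speculated ~3 GB/s sequential read speed

floor_s = min_read_seconds(ASSUMED_FILE_BYTES, ASSUMED_SSD_BANDWIDTH)
print(f"I/O floor: {floor_s:.1f}s")
```

Whatever the end-to-end time is, the gap between it and this floor is roughly what is left to optimize once the run is disk-bound.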
pama|1 year ago
emcq|1 year ago
Let's say I create a "cache" that stores the min/mean/max output for each city, mmap it, and read it at least once to make sure it is in RAM. On the first run I compute the output with whatever method I like, persist it to disk, and mmap it; that run could take 20 hours and gets discarded. On later runs, if the cache is available, I simply write it to standard out.
Technically it might fit the rules of the original request, but it isn't an interesting solution. Feel free to submit it :)
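To make the trick concrete, here is a minimal sketch in Python. The cache file name, the `city;value` input format, and the output format are all assumptions for illustration, not anything specified above:

```python
import mmap
import os

CACHE_PATH = "results.cache"  # hypothetical cache file name


def compute_results(input_path: str) -> bytes:
    """Slow first run: min/mean/max per city. Assumes 'city;value' lines."""
    stats = {}  # city -> [min, max, sum, count]
    with open(input_path, encoding="utf-8") as f:
        for line in f:
            city, _, raw = line.rstrip("\n").partition(";")
            if not raw:
                continue  # skip blank or malformed lines
            v = float(raw)
            s = stats.get(city)
            if s is None:
                stats[city] = [v, v, v, 1]
            else:
                s[0] = min(s[0], v)
                s[1] = max(s[1], v)
                s[2] += v
                s[3] += 1
    lines = [
        f"{city}={s[0]:.1f}/{s[2] / s[3]:.1f}/{s[1]:.1f}"
        for city, s in sorted(stats.items())
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


def cached_results(input_path: str, cache_path: str = CACHE_PATH) -> bytes:
    """Return the output, computing and persisting it only if no cache exists."""
    if not os.path.exists(cache_path):
        with open(cache_path, "wb") as f:
            f.write(compute_results(input_path))
    if os.path.getsize(cache_path) == 0:
        return b""  # mmap cannot map an empty file
    with open(cache_path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:]  # reading the mapping pulls its pages into RAM
```

Writing `cached_results(path)` to `sys.stdout.buffer` completes the "solution". Note that the cache is never invalidated: once it exists, the input file is not even opened, which is exactly why the timed run measures nothing interesting.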
ww520|1 year ago
pama|1 year ago