(no title)
emcq | 1 year ago
Let's say I create a "cache" where I store the min/mean/max output for each city, mmap it, and read it at least once to make sure it is in RAM. If the cache is available I simply write it to standard out. I use whatever method to compute the first run, and I persist it to disk and then mmap it. The first run could take 20 hours and gets discarded.
By technicality it might fit the rules of the original request but it isn't an interesting solution. Feel free to submit it :)
gunnarmorling|1 year ago
pama|1 year ago
This is a fun project for learning CUDA and I enjoyed reading about it——I just wanted to point out that the performance tuning in this case is really on the parsing, hashing, memory transfers, and IO. Taking IO out of the picture using specialized hardware or Linux kernel caching still leaves an interesting problem to solve and the focus should be on minimizing the memory transfers, parsing, and hashing.