I ran across https://dl.acm.org/doi/pdf/10.1145/72935.72950 a few weeks ago, it seems like a potential non-sweepline highly-parallel method. I've had some promising results for first doing a higher-dimensional Hilbert-sort (giving spatial locality), and then being able to prune a very large percentage of the quadratic search space. It might still be too slow on the GPU. I'm curious if you have any write-ups on things that have been explored, or if I'd be able to pick your brain some time!
No comments yet.