(no title)
shihab | 1 month ago
That is, where does it truly make a difference to dispatch the non-parallel parts, syscalls, etc. from the GPU to the CPU, instead of dispatching the parallel parts of the code from the CPU to the GPU?
From the "Announcing VectorWare" page:
> Even after opting in, the CPU is in control and orchestrates work on the GPU.
Isn't it better to let CPUs be in control and orchestrate things as GPUs have much smaller, dumber cores?
> Furthermore, if you look at the software kernels that run on the GPU they are simplistic with low cyclomatic complexity.
Again, there's an obvious reason why people don't put branch-y code on the GPU.
Genuinely curious what I'm missing.
ukoki|1 month ago
I need the heights on the GPU so I can modify the terrain meshes to fit the terrain. I need the heights on the CPU so I can know when the player is clicking the terrain and where to place things.
Rather than generating a heightmap on the CPU and passing a large heightmap texture to the GPU, I have implemented identical height-generating functions in Rust (CPU) and WebGL (GPU). As you might imagine, it's very easy for these to diverge, so I have to maintain a large set of tests that verify that the generated heights are identical between implementations.
Being able to write this implementation once and run it on both the CPU and the GPU would give me much better guarantees that the results will be the same. (Although because of architecture differences and floating-point handling the results will never match perfectly, I just need them to be within an acceptable tolerance.)
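For illustration, the kind of tolerance test described above might look roughly like this in Rust. This is only a sketch: `height_cpu`, `heights_match`, `first_mismatch`, and the tolerance values are hypothetical names and numbers, not the actual code, and the GPU heights are assumed to have already been read back into a slice of `f32`.

```rust
/// Hypothetical stand-in for the CPU height function (not the real terrain code).
fn height_cpu(x: f32, z: f32) -> f32 {
    (x * 0.1).sin() * 4.0 + (z * 0.07).cos() * 2.5
}

/// True when two heights agree within an absolute OR a relative tolerance.
/// Any NaN input makes every comparison false, so NaN counts as a mismatch.
fn heights_match(a: f32, b: f32, abs_tol: f32, rel_tol: f32) -> bool {
    let diff = (a - b).abs();
    diff <= abs_tol || diff <= rel_tol * a.abs().max(b.abs())
}

/// Index of the first sample where the CPU and GPU heights disagree,
/// or `None` if every sample is within tolerance.
fn first_mismatch(cpu: &[f32], gpu: &[f32], abs_tol: f32, rel_tol: f32) -> Option<usize> {
    assert_eq!(cpu.len(), gpu.len(), "sample counts must match");
    cpu.iter()
        .zip(gpu)
        .position(|(&a, &b)| !heights_match(a, b, abs_tol, rel_tol))
}
```

A test would sample `height_cpu` on a grid, read the same grid back from the GPU, and assert that `first_mismatch` returns `None`. Combining an absolute and a relative tolerance keeps the check meaningful both near zero and for large heights.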
xmcqdpt2|1 month ago
If you wrote it in OpenCL, or via Intel libraries, or via torch or ArrayFire or whatever, you could dispatch it to both CPU and GPU at will.
moron4hire|1 month ago
nicman23|1 month ago
One example is PME decomposition in GROMACS.
storystarling|1 month ago
radarsat1|1 month ago
tucnak|1 month ago