top | item 46792306

codedokode | 1 month ago

I think it is possible to run CPU code on a GPU (including a whole OS), because a GPU has registers, memory, arithmetic and branch instructions, and that should be enough. However, it would be able to use only a few cores out of many thousands, because GPU cores are effectively wide SIMD cores grouped into clusters, and CPU-style code would use only a single SIMD lane. Am I wrong?

dancek|1 month ago

This seems correct to me. Of course you'd need to build a CPU emulator to run CPU code. A single GPU core is apparently about 100x slower than a single CPU core. With emulation on top, a 1000x slowdown overall might be expected. So with a lot of handwaving, expect performance similar to a 4 MHz processor (roughly a 4 GHz core divided by 1000).
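The back-of-envelope estimate can be written out as a minimal sketch. The 4 GHz baseline clock is an assumption (the comment only gives the slowdown factor and the 4 MHz result), and `effective_clock_hz` is a hypothetical helper name.

```python
def effective_clock_hz(cpu_clock_hz: float, slowdown: float) -> float:
    """Clock rate of a CPU that would roughly match the slowed-down execution."""
    return cpu_clock_hz / slowdown


# Assumed baseline: a 4 GHz desktop CPU core (not stated in the thread).
CPU_CLOCK_HZ = 4e9
# Overall slowdown with emulation, as estimated in the comment.
EMULATION_SLOWDOWN = 1000

equivalent_mhz = effective_clock_hz(CPU_CLOCK_HZ, EMULATION_SLOWDOWN) / 1e6
print(f"roughly {equivalent_mhz:g} MHz equivalent")
```

This ignores everything except the single slowdown factor (memory latency, divergence, and so on), so it is only as good as the handwaving behind the 1000x figure.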

Obviously code designed for a GPU is much faster. You could probably build a reasonable OS that runs on the GPU.

codedokode|1 month ago

You don't need an emulator; you can compile to GPU machine code.

fulafel|1 month ago

GPUs having thousands of cores is just silly marketing newspeak.

They rebranded SIMD lanes as "cores". For example, Nvidia 5000 series GPUs have 50-170 SMs, which are the equivalent of CPU cores there. So more than desktops, less than bigger server CPUs. By this math each AVX-512 CPU core has 16-64 "GPU cores".
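One plausible reading of the 16-64 range, as a minimal sketch: count how many elements fit in one 512-bit register. The comment does not say how it arrived at 64, so the 8-bit case below is an assumption, and `simd_lanes` is a hypothetical helper name.

```python
def simd_lanes(register_bits: int, element_bits: int) -> int:
    """Number of elements of a given width that fit in one SIMD register."""
    return register_bits // element_bits


AVX512_REGISTER_BITS = 512

# 32-bit elements (e.g. float32): the low end of the quoted range.
lanes_32bit = simd_lanes(AVX512_REGISTER_BITS, 32)
# 8-bit elements: assumed here to be the source of the high end of the range.
lanes_8bit = simd_lanes(AVX512_REGISTER_BITS, 8)

print(lanes_32bit, lanes_8bit)
```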

zozbot234|1 month ago

170 compute units is still a crapload of them for a non-server platform with non-server requirements. So the broad "lots of cores" point is still true, just highly overstated, as you said. Plus those cores are running the equivalent of n-way SMT, which gives you an even higher crapload of logical threads. AND these logical threads can also access very wide SIMD when relevant, which even early Intel E-cores couldn't. All of that absolutely matters.

saagarjha|1 month ago

Each SM can typically schedule 4 warps so it’s more like 400 “cores” each with 1024-bit SIMD instructions. If you look at it this way, they clearly outclass CPU architectures.
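A rough sketch of where these numbers can come from, assuming a 32-thread warp operating on 32-bit values and 4 warps scheduled per SM. The 100-SM figure is an assumption picked only to reproduce the "400" in the comment, and the function names are hypothetical.

```python
WARP_THREADS = 32   # threads per warp on NVIDIA GPUs
ELEMENT_BITS = 32   # assumed: each thread operates on a 32-bit value
WARPS_PER_SM = 4    # warps scheduled per SM, per the comment


def warp_simd_bits() -> int:
    """Effective SIMD width of one warp, in bits."""
    return WARP_THREADS * ELEMENT_BITS


def warp_level_cores(sm_count: int) -> int:
    """Count of CPU-core-like units if each scheduled warp counts as one."""
    return sm_count * WARPS_PER_SM


print(warp_simd_bits())        # SIMD width per warp, in bits
print(warp_level_cores(100))   # assumed ~100 SMs
print(warp_level_cores(170))   # top of the 50-170 SM range quoted upthread
```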

JonChesterfield|1 month ago

Merely misled by marketing. The x64 arch has 512-bit registers and a hundred or so cores. The GPU arch has 1024-bit registers and a few hundred SMs or CUs, which are the thing equivalent to an x64 core.

The software stacks running on them are very different but the silicon has been converging for years.