Intel is offering customer silicon on its latest architecture, so you could be ambitious and design your own block with as much SRAM as you can get by sacrificing other accelerator units, depending on routing latency.
Edit: customer, not custom. A small difference, but an important one if you're asking.
Custom / semi-custom is probably an old Altera FPGA kind of thing, which means Verilog (or a similar hardware description language), which is much more difficult to use in my experience.
I generally see the "compute" hierarchy as:
1. General Purpose CPUs -- easiest.
2. CPU acceleration -- AVX512, NEON, SVE, etc. Specialized CPU instructions that only exist on certain versions of hardware.
3. General Purpose GPU -- OpenCL, DirectCompute, ROCm, CUDA. General purpose GPU instructions that work on a wide variety of hardware.
4. GPU accelerated units -- Raytracing, matrix multiplication, bfloat16 support, wave intrinsics. I guess DirectX 12 Ultimate has these, but you definitely need to check that your hardware supports them before using them.
5. FPGAs -- Over here baby, but much much harder than #4 in practice.
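To make the "check before you use it" point from #2 and #4 concrete, here is a rough sketch of runtime feature detection for the CPU case. It assumes a Linux system where `/proc/cpuinfo` lists capabilities on a `flags` line (x86) or a `Features` line (ARM). The function names and the tier table are made up for illustration, and real code would more likely rely on a library or compiler builtin for this.

```python
from pathlib import Path

# Hypothetical preference order: (human-readable tier, /proc/cpuinfo flag).
# The flag spellings are the ones the Linux kernel reports.
SIMD_FLAGS = [
    ("AVX-512", "avx512f"),  # x86: AVX-512 Foundation
    ("SVE", "sve"),          # AArch64: Scalable Vector Extension
    ("NEON", "asimd"),       # AArch64 reports NEON as "asimd"
    ("NEON", "neon"),        # 32-bit ARM reports "neon"
]


def parse_cpu_flags(cpuinfo_text):
    """Collect every token from the 'flags' / 'Features' lines."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def pick_simd_path(flags):
    """Return the first tier whose flag is present, else plain scalar code."""
    for name, flag in SIMD_FLAGS:
        if flag in flags:
            return name
    return "scalar"


def detect_simd_path(cpuinfo_path="/proc/cpuinfo"):
    """Fall back to the general purpose path when detection is impossible."""
    path = Path(cpuinfo_path)
    if not path.exists():
        return "scalar"
    return pick_simd_path(parse_cpu_flags(path.read_text()))
```

The scalar fallback is tier #1 in the list: if you can't prove the specialized hardware is there, you run the general purpose code.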
-------
I know Intel has FPGA + Intel core chips, and that Microsoft has used them before for Bing search and other such projects. But no one ever told me it was easy.
dragontamer|3 years ago