I'm skeptical as well. The primary reason, IMO, is the software. How do you easily reconfigure your FPGA to efficiently run whatever computationally intensive and/or specialized algorithm you have?
It is doable. I've seen it during my Computer Engineering courses 14 years ago.
Basically you analyze the code for candidates, select a candidate, upload your custom hardware design, run your operation on the hardware, and repeat.
The difficult part is that uploading your hardware design to the FPGA takes on the order of tenths of a second, which is ages compared to the nanoseconds and microseconds your CPU works in.
So your specific operation must be worthwhile to upload.
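To put a rough number on "worthwhile", here is a minimal break-even sketch. The timings are made up for illustration (200 ms to load the design, 10 µs per call on the CPU, 1 µs per call on the FPGA), and the function names are hypothetical; real figures depend entirely on the device and the operation.

```python
def break_even_calls(cpu_ns_per_call, fpga_ns_per_call, reconfig_ns):
    """Smallest number of calls for which offloading is strictly faster.

    All arguments are integer nanoseconds. Returns None if the FPGA version
    is not faster per call, since reconfiguration can then never pay off.
    """
    saving_ns = cpu_ns_per_call - fpga_ns_per_call
    if saving_ns <= 0:
        return None
    # Need calls * saving_ns > reconfig_ns, so take the floor and add one.
    return reconfig_ns // saving_ns + 1


def offload_is_worthwhile(calls, cpu_ns_per_call, fpga_ns_per_call, reconfig_ns):
    """True if reconfiguring plus running on the FPGA beats staying on the CPU."""
    cpu_total = calls * cpu_ns_per_call
    fpga_total = reconfig_ns + calls * fpga_ns_per_call
    return fpga_total < cpu_total


# Hypothetical numbers, not measurements.
RECONFIG_NS = 200_000_000  # 200 ms to upload the hardware design
CPU_NS = 10_000            # 10 us per call in software
FPGA_NS = 1_000            # 1 us per call in hardware

n = break_even_calls(CPU_NS, FPGA_NS, RECONFIG_NS)
print(n)                                                            # 22223
print(offload_is_worthwhile(n - 1, CPU_NS, FPGA_NS, RECONFIG_NS))   # False
print(offload_is_worthwhile(n, CPU_NS, FPGA_NS, RECONFIG_NS))       # True
```

With these assumed numbers the operation has to run a bit over 22,000 times before the upload pays for itself.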
A bit of FPGA on your CPU makes it more flexible. For example, you could set a profile such as 'crypto' or 'video' to add some specific hardware acceleration to your general-purpose CPU.
Imagine your CPU being able to switch your embedded GPU into another CPU core.
Let's say the current Zen 2 had an FPGA onboard. AMD could sell you an upgraded design with AV1 support for a few dollars. Most people aren't going to buy a new CPU on the basis of a video decoder, but they'll buy an upgrade to the chip that auto "installs" itself. That's a sale AMD otherwise wouldn't have made.
Also, for the way most modern CPUs are used: how do you task switch? If the hardware is large enough, you can deploy multiple configurations at a time, but does software support that? Is it possible to have relocatable configurations?
In theory, you could even page out code, but I guess that would be slow. Also, paging in would probably be challenging because the logical units aren’t uniform (if only because not all of them will be connected to external wires).
This could work with a client-server model: if there are enough free cells and I/O available on the FPGA, it could install the configuration, and then any application could communicate with it concurrently, maybe with some basic auth.
Even GPUs multitask all the time, even though it's less obvious. Cooperative multitasking in this context means setting up and executing different shaders/kernels. The overhead involved in this is quite manageable.
Repurposing FPGAs to different tasks means loading a new bitstream into the device every time. So it is much more efficient to grant exclusive access to each user of the device for long stretches of time. The proper pattern for that is more like a job queue.
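A rough sketch of why the queue ordering matters, assuming each job names the bitstream it needs. The job names, the 'crypto'/'video' labels, and the batching policy are all hypothetical; batching by bitstream is just one possible policy, and it trades fairness and latency for fewer reloads.

```python
def count_reloads(bitstreams):
    """Count how often the device must load a different bitstream."""
    loaded = None
    reloads = 0
    for bitstream in bitstreams:
        if bitstream != loaded:
            reloads += 1
            loaded = bitstream
    return reloads


def batch_by_bitstream(jobs):
    """Reorder (job_name, bitstream) pairs so jobs sharing a bitstream run together.

    Bitstreams keep the order in which they first appear in the queue, and
    jobs keep their relative order within each batch.
    """
    batches = {}
    for job_name, bitstream in jobs:
        batches.setdefault(bitstream, []).append(job_name)
    return [(name, bitstream)
            for bitstream, names in batches.items()
            for name in names]


# Hypothetical queue with interleaved users.
queue = [
    ("hash-a", "crypto"),
    ("decode-a", "video"),
    ("hash-b", "crypto"),
    ("decode-b", "video"),
]

fifo_reloads = count_reloads(b for _, b in queue)
batched = batch_by_bitstream(queue)
batched_reloads = count_reloads(b for _, b in batched)

print(fifo_reloads)     # 4
print(batched_reloads)  # 2
```

Running the interleaved queue in arrival order reloads the device for every job, while handing the device to one configuration at a time halves the reloads in this toy case.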
I believe there is some amount of support in OpenCL for FPGAs. If only we could get companies to properly support OpenCL, we'd have a nice software interface to pretty much any kind of compute resource on a machine.
nomercy400|5 years ago
hajile|5 years ago
Someone|5 years ago
varispeed|5 years ago
rjsw|5 years ago
threatripper|5 years ago
gmueckl|5 years ago
dragontamer|5 years ago
Where FPGAs win is new architectures, like systolic engines: entirely different computer designs from the ground up.
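For anyone who hasn't met the idea, here is a small software model of an output-stationary systolic array doing a matrix multiply. It is only an illustrative simulation under my own assumptions (the function name and the cycle-by-cycle register model are hypothetical, not taken from any particular FPGA design): each processing element (PE) keeps one accumulator, values of A march right, values of B march down, and every PE does one multiply-accumulate per cycle using only what its neighbours hand it.

```python
def systolic_matmul(A, B):
    """Simulate an n x m grid of PEs computing C = A @ B, cycle by cycle."""
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per PE
    a_reg = [[0] * m for _ in range(n)]  # A values travelling left to right
    b_reg = [[0] * m for _ in range(n)]  # B values travelling top to bottom

    # The last product reaches the bottom-right PE at cycle n + m + k - 3.
    for t in range(n + m + k - 2):
        new_a = [[0] * m for _ in range(n)]
        new_b = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                if j == 0:
                    # Left edge: row i of A is fed in, delayed by i cycles.
                    s = t - i
                    new_a[i][j] = A[i][s] if 0 <= s < k else 0
                else:
                    new_a[i][j] = a_reg[i][j - 1]
                if i == 0:
                    # Top edge: column j of B is fed in, delayed by j cycles.
                    s = t - j
                    new_b[i][j] = B[s][j] if 0 <= s < k else 0
                else:
                    new_b[i][j] = b_reg[i - 1][j]
        a_reg, b_reg = new_a, new_b

        # Every PE performs one multiply-accumulate on its current inputs.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc


print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```

The point of the layout is that no PE ever touches a shared memory; data only moves between neighbours, which maps naturally onto FPGA fabric.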
wtetzner|5 years ago
SSLy|5 years ago
numpad0|5 years ago