The barrier to adoption of FPGAs is not so much a hardware issue as a toolchain issue. The toolchains are closed, frequently Windows-only, slow, and not friendly to newbies.
(Not that shader languages or CUDA are that accessible, but you can play with shaders in a browser now. The time to "hello world" or its equivalent isn't too bad.)
I argue the barrier to FPGA adoption is the actual use case, not the tools. People who need them can use the tools just fine. FPGA designers are not idiots; they do not walk around scratching their heads wondering, "How come nobody is using our products? If only they were easier to use!" Every couple of years a new open-source attempt at replacing the incumbent tools is begun and dropped.
The use cases for FPGAs are a much harder impediment to adoption. Many people get "FPGA boners" when they even hear the word, fancying themselves "chip designers," but practical use cases are much rarer. As evidence, notice that FPGAs predominate in the military world, where budget is less of an issue than in the commercial world.
The technical issue with FPGAs is that they are still one level removed from any CPU. They are only valuable for problems where some algorithm or task can be done faster in dedicated logic than on the CPU, despite (1) the performance hit of reduced real estate, (2) a lower clock speed than a CPU, and (3) a higher price than a CPU.
Further diminishing their value, any function important enough to require an FPGA can more economically be absorbed into the nearest silicon. For example, consider the serialization/deserialization in audio/video codecs. That used to be done in FPGAs, but it moved onto a standard bus (SPI) and into dedicated codec chips and CPUs.
Because of this rarity, experienced engineers know that when an FPGA is introduced to a problem in practice, it's a temporary solution (most often to hit a time-to-market window). This confers a degree of honor, which is why people get so emotionally aroused about FPGAs.
You can bet, though, that if whatever search function Microsoft is running on those FPGAs proves useful, it will soon be absorbed into a more economical form: an ASIC or, more likely, additional CPU instructions.
Really, an install on 1,600 servers, as this article reports, is not that impressive and is certainly only a pilot rollout.
Actually, the build chains for both Xilinx and Altera work quite well on Linux; in fact, I have heard that some consider the Linux versions superior because of memory-management issues.
ModelSim is available on Linux as well, but only in their more expensive SE product; they charge a premium for the Linux platform.
This gets even bigger if Intel throws its IP muscle behind it, as it does with ICC. If you can get (for pay or free) fast matrix-multiply, FFT, crypto, etc. cores for the FPGA, you will see even faster adoption.
If they're clever enough to make some of those IP cores available to, say, MATLAB, adoption will be faster still.
Nothing sells hardware more easily than "do no extra work, spend another couple of grand, and see your application speed up significantly."
I don't know how much competition they're giving Xilinx, but Altera is doing the same thing with the same high-performance ARM core: http://www.altera.com/devices/processor/soc-fpga/overview/pr... As the fine article notes, Intel is now doing some fabbing for Altera.
I've gotten the impression that putting a general purpose CPU in the corner of an FPGA was a pretty standard thing.
One thing that should differentiate this new effort from Intel is the FPGA's "direct access to the Xeon cache hierarchy and system memory," per Diane Bryant, general manager of Intel's data center group.
I have always wanted to learn Verilog. However, I find it quite different from typical programming languages such as C or Java. What is the best way for someone with programming experience to learn Verilog?
The first thing to know is that Verilog is not a programming language. It is a hardware description language. This may sound picky, but it fundamentally changes the way you need to think about using the language. With Verilog/VHDL, you describe a circuit, which requires very different thinking from programming languages, where you describe a sequence of instructions.
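To make the contrast concrete, here's a minimal Verilog sketch (module and signal names are my own invention, not from the thread). The three intermediate `assign` statements describe gates that all exist and operate simultaneously; reordering the lines describes the exact same circuit, which is nothing like reordering statements in C or Java.

```verilog
// A 3-input majority voter, described as concurrent hardware.
// Every `assign` is a piece of logic that is always "running";
// textual order carries no meaning.
module majority3 (
    input  wire a, b, c,
    output wire y
);
    wire ab, bc, ca;

    assign ab = a & b;
    assign bc = b & c;
    assign ca = c & a;
    assign y  = ab | bc | ca;  // y is 1 when at least two inputs are 1
endmodule
```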
The other thing to know is that HDLs are mostly the domain of electrical engineers and hence have suffered from a lack of any "computer science" in them. The languages and all of the tools are clunky and reminiscent of 1970s/1980s-style programming, from when CS and EE diverged. Hence, do not expect to find decent online tutorials or free source code; it's all locked up and proprietary, as with all other EE tools.
To make either of these useful, you will need a hardware platform and some tools to play with. The DE1/DE2 is a reasonably priced entry board with plenty of lights, switches, and peripherals, and is well matched with the textbooks recommended in this thread.
The best advice I ever got when I started to play with Verilog and FPGAs in undergrad was that one should think of the circuit first, then write the Verilog to describe it. As another poster said, this isn't programming; there are no usual sequential semantics (first compute this, then assign that value) even though code samples may look that way. The tricky (slash insanely cool) thing about HDLs is that they infer a lot of things -- latches, MUXes, ALUs -- out of a high-level description. But the abstraction is leaky, so you need to understand digital logic (state machines, latches, pipelines, ...) and then work up from there.
I guess what I'm really trying to say is, study digital logic first, then imagine the circuit you want to build, then write the Verilog that infers that circuit :-)
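A hedged example of that inference at work (my own sketch, with invented names): the first process below covers every branch, so synthesis infers a plain 2:1 MUX; the second omits the `else`, so `q` must hold its old value when `en` is 0, and the tool infers a transparent latch, a classic accident of thinking in sequential-code terms.

```verilog
// Illustration of inference: similar-looking code, very different hardware.
module infer_demo (
    input  wire sel, a, b,   // MUX inputs
    input  wire en, d,       // latch inputs
    output reg  y, q
);
    // Complete if/else: y is assigned in every case, so this
    // infers purely combinational logic (a 2:1 MUX).
    always @(*) begin
        if (sel) y = a;
        else     y = b;
    end

    // Missing else: q is unassigned when en == 0, so it must
    // remember its previous value; synthesis infers a level-sensitive
    // latch (usually a bug, and most tools warn about it).
    always @(*) begin
        if (en) q = d;
    end
endmodule
```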
This may be huge. People will use the FPGA as they use the GPU now, but the FPGA has the potential to greatly reduce the programming complexity associated with GPUs.
In the end, success will boil down to how easy development is and how well designed the libraries are. If the framework can automatically reconfigure the hardware to offload CPU-intensive tasks, this has real potential for widespread adoption, not just in the datacenter.
Can you elaborate on how an FPGA has the potential to reduce the programming complexity associated with GPUs? I personally think it is harder to program in Verilog than in CUDA.
If I recall correctly, Intel tried a few years ago to sell a system-on-chip combining an Atom CPU with an FPGA from Altera. I believe it didn't work very well, especially with regard to communication and synchronization between the two cores.
It didn't work very well, but there is a good reason: nobody wanted a slow, comparatively low-performance chip paired with a small FPGA connected over a (slow) PCI Express link. There are already hundreds of big FPGA boards with PCIe connectors that can be tied to big CPUs. It was a non-product from the get-go.
Unlikely. Custom-built application-specific integrated circuits (ASICs) (e.g. bitcoin-mining chips: http://www.butterflylabs.com/) will always be faster than FPGAs (which are comparatively slow) and CPUs (which are fast, but general).
FPGAs are typically used in ASIC development to emulate the ASIC being developed. I've seen boards with 20 FPGAs emulate an ASIC design at less than ~1/10th of the speed and well over 10x the power. While FPGAs are programmable hardware, they are far less efficient than custom hardware for various reasons. Naturally, ASIC emulation is an application where FPGAs have a very large advantage over software... At volume they're also a lot more expensive, and good tools are very expensive too (virtually no mass-produced commercial product uses FPGAs). Now, obviously, if the FPGA is inside the Xeon you're not really paying much more for it (except you lose whatever other function could have been crammed in there).
Companies like Microsoft, Facebook, and Google have enough servers to make a custom block inside Intel's CPU more attractive than an FPGA in terms of price/power/performance (and they can get that from ARM vendors, which is probably scaring Intel).
CPU vendors have spent the last several decades moving more and more applications that used to be in the realm of custom hardware into the realm of software. There are certainly niches of highly parallelizable operations, but a lot of general-purpose compute is very well served by CPUs (and a lot of it is often memory-bandwidth bound, not compute bound). Some of these niches have already been semi-filled by GPUs, special instructions, etc.
The FPGA on the Xeon is almost certainly not going to have access to all the same interfaces that either a GPU or the CPU has and is only going to be useful for a relatively narrow range of applications.
I think what's going on here is that as the process size goes down, simply cramming more and more cores into the chip makes less and less sense; things don't scale linearly in general. So the first thing we saw was cramming a GPU in there, which eventually also stops scaling (and also isn't really a server thing). Now they basically have extra space and don't really know what to put in it. Also, each of the current blocks (GPU, CPU) is so complicated that trying to evolve them is very expensive.
EDIT: Just to explain a little where I'm coming from here. I worked for a startup designing an ASIC where FPGAs were used to validate the ASIC design. I also worked on commercial products that included FPGAs for custom functions where the volume was not high enough to justify an ASIC and the problem couldn't be solved by software. I worked with DSPs, CPUs, various forms of programmable logic, SoCs with lots of different HW blocks etc. over a long long time so I'm trying to share some of my observations... If you think they're absolutely wrong I'd be happy to debate them.
EDIT2: Re-reading what I wrote it may sound like I am saying I am an ASIC designer. I'm not. I'm a software developer who has dabbled in hardware design and has worked in hardware design environments (i.e. the startup I worked for was designing ASICs but I was mostly working on related software).
FPGAs are terrible at emulating ASICs, but CPUs are even worse. FPGAs do excel, though, at certain problems that can be expressed as programmable logic operating in a massively parallel manner.
What if the Intel FPGA did have access to the same resources as a GPU? This isn't inconceivable; it's in the same socket as the CPU.
This gives you the ability to implement specialized algorithms related to compression, encryption, or stream manipulation in a manner that's way more flexible than a GPU can provide, and way more parallel than a CPU can handle.
YZF, why can't we start from an optimized FPGA, i.e. small memory blocks spread all around with massive bandwidth and low latency, and find a way to give the CPU decent enough access to all that memory?
And yes, I know the CPU will be the bottleneck, but it will be the bottleneck anyway.
kristoffer | 11 years ago
But closed-source, expensive tools are a problem.
wiredfool | 11 years ago
[0] http://www.anandtech.com/show/7334/a-look-at-alteras-opencl-...
rdrdss23 | 11 years ago
MATLAB already has MATLAB->HDL, which works very well. We have a team that uses it exclusively for FPGA programming.
peterwwillis | 11 years ago
"Scattered reports of maniacal cackling amid driving rain and lightning at Chipzilla's lab"
Is this just a Register thing, or do all UK rags use this kind of unprofessional hyperbole? It's literally the most annoying thing in the world.
rthomas6 | 11 years ago
Zynq has been out and working in industry for a couple of years now.
deadgrey19 | 11 years ago
The best place to start is with a textbook; this one (http://www.amazon.com/Fundamentals-Digital-Logic-Verilog-Des...) is a nice introduction to digital design with examples in Verilog.
Personally, I prefer VHDL, and this fantastic introduction: http://www.amazon.com/Circuit-Design-VHDL-Volnei-Pedroni/dp/...
Boards to go with them, such as the DE1/DE2 mentioned above, are available from Terasic: http://www.terasic.com.tw/cgi-bin/page/archive.pl?Language=E...