IBM scientists say radical new ‘in-memory’ architecture will speed up computers

[+] otakucode|8 years ago|reply

This is the promise of memristors. Despite innumerable articles being written about neuromorphic architectures like they'll be something miraculous, this ability to change from functioning as a bit of memory to being a bunch of functional logic on the fly at the speed of a memory read? That's going to be crazy. It will open up possibilities that we probably can't even imagine right now.

I've never understood why people don't get more excited about memristors. They could replace basically everything. Assuming someone can master their manufacture, they should be more successful than transistors. Of course, I'm still waiting to be able to buy a 2000 ppi display like IBM's R&D announced creating back in the late 1990s or so... so I guess I'd best not hold my breath.

[+] nyolfen|8 years ago|reply

> I've never understood why people don't get more excited about memristors.

personally, mostly because i've been seeing articles about how they're just about to totally upend computing for the last ten years

[+] farresito|8 years ago|reply

I can't talk for other people, but I personally don't try to get over-excited over anything unless I see a working proof of concept where the advantages are clear. It's kinda unfortunate, because there's great work being done in research that I should probably feel much more excited about, but I'm continuously bombarded with new discoveries that I never hear about again.

[+] tasty_freeze|8 years ago|reply

Transistors provide something that resistors, capacitors, inductors and memristors don't, and thus they can't be replaced. That thing is gain, ie, amplification (of either voltage or current). All the others are lossy, so if you imagine some building block, then cascade N of them in a row, eventually the signal will become too weak (at some N) to be useful.

[+] deepnotderp|8 years ago|reply

Well, first off, their endurance is currently not even good enough to replace DRAM, let alone replacing critical path transistors. They also take significantly more power to switch and are much slower. They also tend to be highly unreliable (they even have variable write time!) and hard to manufacture. Really, the only useful advantage they do have is their theoretically incredible density, but even most of this comes from the possibility of 3D stacking (also possible with transistors) and from the possibility of a crossbar memory, but crossbars have epic sneak path issues.

[+] coldtea|8 years ago|reply

>I've never understood why people don't get more excited about memristors

For the same reason they aren't excited about flying cars?

Because they wait until something is actually delivered that in a form they can use?

[+] BugsJustFindMe|8 years ago|reply

> I'm still waiting to be able to buy a 2000 ppi display like IBM's R&D announced creating back in the late 1990s or so...

You're not thinking of this and mistakenly adding an extra 0, are you? https://en.wikipedia.org/wiki/IBM_T220/T221_LCD_monitors#His...

[+] sliverstorm|8 years ago|reply

this ability to change from functioning as a bit of memory to being a bunch of functional logic on the fly at the speed of a memory read? That's going to be crazy. It will open up possibilities that we probably can't even imagine right now.

There might be clever ways to make a particular circuit dual purpose or something (Storage element that can do math on itself? A math pipeline that doesn't need pipeline flops?)

But arbitrary reprogrammability is already here. It's called an FPGA, and it sees only niche use, not in the least because developing a bitstream to program it with is a huge chunk of work. So the days of a chip that constantly morphs like a chameleon to unrecognizable new forms is probably a long way in the future.

I like to remind myself the industry is called VLSI- very large scale integration. In other words, the job is not about transistors. More than anything else, it's about managing the complexity of billions of transistors. Digital CMOS is a great example; transistors can already do so much more than CMOS, but it's a massively simplifying design scheme.

[+] xellisx|8 years ago|reply

Well with most claims that generalize computing power, such as the title of this, it's mostly aimed at something more specialized.

"The researchers believe this new prototype technology will enable ultra-dense, low-power, and massively parallel computing systems that are especially useful for AI applications."

So it seems that would benefit AI/Machine Learning more than GP processing, from that statement.

[+] NegativeK|8 years ago|reply

That's why. Memristor articles were all over the news for a bit, but have died down as either companies realized they're too hard to manufacturer or companies have quietly plodded toward making them competitive.

[+] jerf|8 years ago|reply

"this ability to change from functioning as a bit of memory to being a bunch of functional logic on the fly at the speed of a memory read?"

My excitement is tempered by considering the areal demands on the silicon for the putative "smart memory". Suppose, just for the sake of argument, you want your smart memory to be able to take a 4K block of 64-bit integers and add them together. It happens incredible quickly, sure, though you'd have to get an expert to tell me what the fanin can be to guess how many cycles this will take. But you're now looking at massively more silicon to arrange all that. And adding blocks of numbers isn't really that interesting, we really want to speed up matrix math. Assuming hard-coded matrix sizes, that's a whackload of silicon per operation you want to support; it's at least another logarithmic whackload factor to support a selection of matrix sizes and yet more again to support arbitrary sizes. In general, it's a whackload of silicon for anything you'd want to do, and the more flexible you make your "active memory" the more whackloads of silicon you're putting in there.

It may sound weird to describe a single addition set of gates as a "whackload", but remember you need to lay at least one down for each thing you want to support. If you want to be able to do these operations from arbitrary locations in RAM, it means every single RAM cell is going to need its own gates that implement every single "smart" feature you want to offer. Even just control silicon for the neural nets is going to add up. (It may not be doing digital stuff and it may be easier to mentally overlook, but it's certainly going to be some sort of silicon-per-cell.)

Even if you were somehow handed a tech stack that worked this way already, you'd find yourself pressured to head back to the architecture we already have, because the first thing you'd want to do is take all that custom silicon and isolate it to a single computation unit on the chip that the rest of the chip feeds, because then you've got a lot more space for RAM, and you can implement a lot more functionality without paying per memory cell. And with that, you come perilously close to what is already a GPU today. All that silicon dedicated to the tasks you aren't doing right now is silicon you're going to want to be memory instead, and anyone making the smart memory is going to have a hard time competing with people who offer an order of magnitude or three more RAM for the same price and tell you to just throw more CPUs/GPUs at the problem.

RAM that persists without power is much more interesting than smart memory.

[+] deepsun|8 years ago|reply

Maybe for the same reason as people don't get more excited about quantum computers -- they can replace basically everything as well, yet we don't see them around.

[+] AndrewKemendo|8 years ago|reply

I've never understood why people don't get more excited about memristors.

Probably because they have been promising them too long. I remember watching a video lecture in 2008 about how IBM was on the verge of revolutionizing everything with them.

Meanwhile in the same timespan GPU/DNN have upended the computing landscape.

[+] Illniyar|8 years ago|reply

I thought memristors is RAM that doesn't lose state when losing electricity.

Is this article about memristors? That wasn't clear. Can memristors perform computations?

[+] the_jeremy|8 years ago|reply

I was not technologically inclined in high school, but I distinctly remember a peer talking about how cool they were and how much they could be used to do. This article is the second time I've ever heard the word. So... I guess the fact that it's been 7 years and they're still just "going to do awesome things" is probably a good reason why.

[+] kensai|8 years ago|reply

The best thing about memristors is that actually this is how the brain really works (at an electronical equivalence). There is no von Neumann architecture in our cortex, rather neurons in networks which both compute and store memories.

[+] blennon|8 years ago|reply

"The result of the computation is also stored in the memory devices, and in this sense the concept is loosely inspired by how the brain computes."

For anyone who is interested in a simple model of how the brain does this, check out "associative memories". The basic idea is that networks of neurons both store memory (in their synapses) and perform the computations to retrieve or recall those memories.

A simple example is the Hopfield network, a single layer of neurons that are recurrently connected with a specific update function: https://en.wikipedia.org/wiki/Hopfield_network

Another is two layers of neurons that are reciprocally connected called a Bidirectional Associative Memory (BAM): https://en.wikipedia.org/wiki/Bidirectional_associative_memo...

edit: grammar

[+] ngold|8 years ago|reply

Really fascinating, the low energy aspect is intriguing.

[+] shmerl|8 years ago|reply

Weren't HP working on the similar idea? What happened to it?

UPDATE: Looks like memristors production didn't work out for them: https://www.extremetech.com/extreme/207897-hp-kills-the-mach...

[+] moh_maya|8 years ago|reply

https://arstechnica.com/science/2017/10/who-needs-a-cpu-phas...

I found this article very useful in understanding what the work was about..

They are using phase change memory to store data and perform computations.

[+] MikeDoesCode|8 years ago|reply

Didn't HP design and I think prototype something like this with memristors, calling it 'The Machine'?

edit So HP built one with 160Tb of memory, I remember it being proposed with memristors but haven't been able to check if the prototype used them... Does anyone know what is different about IBM's that let's them claim this as a first though?

[+] cmiles74|8 years ago|reply

As shmerl noted, The Machine turned out to be vaporware and they released something significantly different under the same name. HP couldn't mass produce memsistors. :-(

https://www.extremetech.com/extreme/207897-hp-kills-the-mach...

[+] blattimwind|8 years ago|reply

The prototype didn't even have non-volatile memory, because NV-DIMMs weren't available.

[+] unknown|8 years ago|reply

[deleted]

[+] PeachPlum|8 years ago|reply

Sounds very much like Content Addressable Parallel Processors such as the one that powered the Staran air traffic control system.

Caxton Foster's book is the major text I know on the subject.

https://en.wikipedia.org/wiki/Content_Addressable_Parallel_P...

[+] amelius|8 years ago|reply

What they describe sounds like a pipeline architecture or a systolic array, or a network of interconnected computers. None of these are new concepts from an architectural point of view, but the actual dimensioning could be new.

[+] naasking|8 years ago|reply

> None of these are new concepts from an architectural point of view, but the actual dimensioning could be new.

Depends what you mean by "architecture". I disagree that a network of computers is comparable to colocating a computing unit with its memory. There are orders of magnitude in difference in communication costs and failure modes, so at some point you just have to acknowledge that the models are fundamentally different, and should be treated as such.

Certainly they are Turing equivalent, so they aren't more "powerful" in a computability sense, we but what's more interesting is the tradeoffs in computational complexity.

[+] tboyd47|8 years ago|reply

Are these benchmarks unusual to anyone else? Things like changing the color of all the black pixels in a bitmap simultaneously and performing correlations on historical rainfall data. Is it because this technology is more suitable for certain types of computations?

[+] scientistem|8 years ago|reply

I think this is most analagous to simd in terms of issueing a single computation over multiple words of data, which would work well for both zeroing data and operating on scalar arrays.

[+] tim333|8 years ago|reply

I got a little confused by the

>scientists have developed the first “in-memory computing”

as your normal GPUs have register and cache memory mixed with the processors. I think the novel feature is they are mixing the processors with non volatile flash like memory rather than with RAM. Which I guess is interesting but the "will speed up computers by 200 times" presumable refers to an old school architecture rather than something like the 15/125 TFLOP Volta GPU which I'd imagine is faster than their thing. (https://news.ycombinator.com/item?id=14309756).

[+] YeGoblynQueenne|8 years ago|reply

Regarding the application of memristors to AI, here's a bit of a dissenting opinion:

  Unfortunately for neuromorphics, just about everyone else in the semiconductor
  industry—including big players like Intel and Nvidia—also wants in on the
  deep-learning market. And that market might turn out to be one of the rare cases
  in which the incumbents, rather than the innovators, have the strategic
  advantage. That’s because deep learning, arguably the most advanced software on
  the planet, generally runs on extremely simple hardware.

  Karl Freund, an analyst with Moor Insights & Strategy who specializes in deep
  learning, said the key bit of computation involved in running a deep-learning
  system—known as matrix multiplication—can easily be handled with 16-bit and even
  8-bit CPU components, as opposed to the 32- and 64-bit circuits of an advanced
  desktop processor. In fact, most deep-learning systems use traditional silicon,
  especially the graphics coprocessors found in the video cards best known for
  powering video games. Graphics coprocessors can have thousands of cores, all
  working in tandem, and the more cores there are, the more efficient the
  deep-learning network.

From:

https://spectrum.ieee.org/semiconductors/design/neuromorphic...

[+] unknown|8 years ago|reply

[deleted]

[+] rasz|8 years ago|reply

Micron had eval ram modules with buildin cellular automata for what, 10 years now?

http://www.micronautomata.com/research

[+] agumonkey|8 years ago|reply

For god's sake. I've been pitching this for years .. I can't sigh hard enough.

[+] zackmorris|8 years ago|reply

That's how I feel. I used FPGAs in the late 90s and wanted to try making a parallel chip with say 1024 cores and a few K of RAM per core and then program it with something like Erlang. Then the dot bomb happened, then the housing bomb, the Great Recession, and so on and so forth. The big players got more entrenched so everything was evolutionary instead of revolutionary and I'd say computer engineering (my major, never used) got set back 10-15 years.

But that said, I'm excited that 90s technology is finally being adopted by the AI world. I'm also hopeful that all these languages like Elixir, Go and MATLAB/Octave will let us do concurrent programming in ways that are less painful than say OpenCL/CUDA.

[+] samueloph|8 years ago|reply

There was a talk about this on this year's debconf

Delivering Software for Memory Driven Computing

https://debconf17.debconf.org/talks/206/

[+] em3rgent0rdr|8 years ago|reply

The concept of "Smart Memory" has been around for a while...from 2000 at least: https://dl.acm.org/citation.cfm?id=339673

[+] cmrdporcupine|8 years ago|reply

Watched the video, read the article, but I'm not entirely clear how 'in memory' components differ in principle from just having a CPU with very large register sets?

[+] m3kw9|8 years ago|reply

This will catch up to Moore’s law when it does come out one day

[+] partycoder|8 years ago|reply

Would this be classified as MIMD?

https://en.wikipedia.org/wiki/MIMD

[+] digi_owl|8 years ago|reply

Didn't early Palm Pilots have something they called run in place memory?

[+] yigitdemirag|8 years ago|reply

So sorry to find this thread on HN while intoxication. I need to write a blogpost about it in a couple of days. In short, these devices are more capable and realizable than most can imagine.

Disclaimer: Neuromorphic computing with PCRAM devices is my MSc and future PhD thesis topic.

[+] mhb|8 years ago|reply

If only there was room in the margin.

133 comments