
trsohmers | 1 year ago

Founder of REX Computing here; I highly recommend checking out my interview on the Microarch Club podcast linked elsewhere in the thread. I'll also answer questions here if anyone has them.


crest|1 year ago

The teaser reminds me a lot of other *failed* high-performance/high-efficiency architecture redesigns, ones that failed because of the unreasonable effort required to squeeze out a useful fraction of the promised gains, e.g. Transputer and Cell. Can you link to written documentation of how existing code can be ported? I doubt you can just recompile ffmpeg or libx264, but what level of toolchain support can early adopters expect? Does it require manually partitioning code+data and mapping it to the on-chip network topology?

trsohmers|1 year ago

We had a basic LLVM backend that supported a slightly modified clang frontend and a basic ABI. We tried to make it drastically easier for both the programmer and compiler to handle memory by having all memory (code+data) be part of a global flat address space across the chip, with guarantees being made to the compiler by the NoC on the latency of all memory accesses across one or multiple chips. We tested this with very small programs that could fit in the local memory of up to two chips (128KB of memory), but in theory it could have scaled up to the 64 bit address space limit. Compilation time for programs was long, but fully automated, specifically to improve upon problems faced by Cell and other scratchpad memory architectures… some of our original funding in 2015 from DARPA was actually for automated scratchpad memory management techniques on Texas Instruments DSPs and Cell (our paper: https://dl.acm.org/doi/pdf/10.1145/2818950.2818966)
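The flat global address space with deterministic NoC latencies can be sketched roughly as follows. Note this is a minimal illustrative sketch, not the actual REX Neo encoding: the field widths, the 4x4 mesh, and the single-cycle-per-hop latency model are all assumptions made up for this example.

```python
# Hypothetical sketch: every core's scratchpad is a window in one flat
# 64-bit address space, so the compiler can compute where data lives --
# and how far away it is -- from the address alone.
# Field widths and grid size are illustrative assumptions, not REX's actual layout.

CORE_MEM_BITS = 17          # assume 128 KB of scratchpad per core (2**17 bytes)
GRID_DIM = 4                # assume a 4x4 mesh of cores per chip

def global_addr(chip, x, y, offset):
    """Pack (chip, core x/y position, local offset) into one flat address."""
    assert 0 <= offset < (1 << CORE_MEM_BITS)
    core = y * GRID_DIM + x
    return (chip << 32) | (core << CORE_MEM_BITS) | offset

def decode(addr):
    """Unpack a flat address back into (chip, x, y, offset)."""
    offset = addr & ((1 << CORE_MEM_BITS) - 1)
    core = (addr >> CORE_MEM_BITS) & 0xFF
    chip = addr >> 32
    return chip, core % GRID_DIM, core // GRID_DIM, offset

def mesh_latency(src, dst, hop_cycles=1):
    """On a deterministic mesh NoC, access latency is just the Manhattan
    hop count between cores -- a guarantee a compiler can schedule around."""
    _, sx, sy, _ = decode(src)
    _, dx, dy, _ = decode(dst)
    return (abs(sx - dx) + abs(sy - dy)) * hop_cycles

a = global_addr(0, 0, 0, 100)   # core (0,0), local offset 100
b = global_addr(0, 3, 2, 0)     # core (3,2), local offset 0
print(decode(a))                # -> (0, 0, 0, 100)
print(mesh_latency(a, b))       # -> 5 hops: 3 in x, 2 in y
```

The point of the sketch is the last function: because the network guarantees latency as a pure function of the addresses involved, a VLIW compiler can statically schedule remote loads instead of relying on caches.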

This was all designed a decade ago, and REX has been effectively in hibernation since the end of 2017; we successfully taped out our 16 core test chip back in 2016 but were unable to raise additional funding to continue. I have continued to work on architectures that leverage scratchpad memories in different ways, including cryptocurrency and machine learning ASICs, most recently at my current startup, Positron AI (https://positron.ai)

suranyami|1 year ago

This is inspiring stuff.

Was extremely interested in the Inmos Transputer in the 80s. Seems like an idea way ahead of its time… a bit like REX.

I find the design parallels between the Actor concurrency model of Erlang/Elixir and the Transputer/REX very compelling.

Really hope something happens with this project or some spinoff from it.

The current interest in RISC-V is testament to the fact that such architectural ideas may still be viable.

I wish you great success. Wish there was a way to sponsor or crowd-source fund this.

CalChris|1 year ago

VLIW has low latency. Why is that important for an inference engine?