elseless's comments
elseless | 7 months ago | on: Efficient Computer's Electron E1 CPU – 100x more efficient than Arm?
The Efficient architecture is a CGRA (coarse-grained reconfigurable array), which means that it executes instructions in space instead of time.[1] At compile time, the Efficient compiler looks at a graph made up of all the “unrolled” instructions (and data) in the program, and decides how to map it all spatially onto the hardware units. Of course, the graph may not all fit onto the hardware at once, in which case it must also be split up to run in batches over time. But the key difference is that there’s this sort of spatial unrolling that goes on.
This means that a lot of the work of fetching and decoding instructions and data can be eliminated, which is good. However, it also means that the program must be mostly, if not completely, static, meaning there’s a very limited ability for data-dependent branching, looping, etc. to occur compared to a CPU. So even if the compiler claims to support C++/Rust/etc., it probably does not support, e.g., pointers or dynamically-allocated objects as we usually think of them.
[1] Most modern CPUs don’t actually execute instructions one-at-a-time — that’s just an abstraction to make programming them easier. Under the hood, even in a single-core CPU, there is all sorts of reordering and concurrent execution going on, mostly to hide the fact that memory is much slower to access than on-chip registers and caches.
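The spatial-unrolling idea above can be sketched in a few lines. This is a toy illustration only — the graph, the PE count, and the greedy placement are all made up for the example, and Efficient's actual compiler is certainly far more sophisticated (real placement must account for routing, data dependencies, and operator latencies):

```python
# Toy sketch of mapping an "unrolled" dataflow graph onto a CGRA-like
# grid of processing elements (PEs). Names and numbers are illustrative,
# not Efficient's actual toolchain.

# Each node is one unrolled instruction: (name, op, inputs).
graph = [
    ("a",   "load",  []),
    ("b",   "load",  []),
    ("t0",  "mul",   ["a", "b"]),
    ("t1",  "add",   ["t0", "a"]),
    ("out", "store", ["t1"]),
]

NUM_PES = 4  # pretend the fabric has 4 processing elements


def map_spatially(graph, num_pes):
    """Assign each instruction its own PE. When the graph doesn't fit
    on the fabric at once, the overflow becomes a new configuration
    "batch" that runs later in time — the time/space split described
    above."""
    batches = []
    for i in range(0, len(graph), num_pes):
        chunk = graph[i:i + num_pes]
        placement = {node[0]: pe for pe, node in enumerate(chunk)}
        batches.append(placement)
    return batches


batches = map_spatially(graph, NUM_PES)
for t, placement in enumerate(batches):
    print(f"batch {t}: {placement}")
```

With five instructions and four PEs, the graph splits into two batches: the fabric is configured once for the first four ops, then reconfigured for the remainder. The per-instruction fetch/decode work a CPU would repeat every cycle happens once, at configuration time.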
elseless | 1 year ago | on: ZFS 2.3 released with ZFS raidz expansion
elseless | 2 years ago | on: Apple unveils M3, M3 Pro, and M3 Max
elseless | 2 years ago | on: Ask HN: What is an A.I. chip and how does it work?
elseless | 3 years ago | on: MIT researchers uncover ‘unpatchable’ flaw in Apple M1 chips
elseless | 4 years ago | on: Sandwell Bitcoin mine found stealing electricity
elseless | 4 years ago | on: SSD makers start warning that mining products like ChiaCoin will void warranty
See Andrew Poelstra ‘15: https://nakamotoinstitute.org/static/docs/on-stake-and-conse...
elseless | 6 years ago | on: Raspberry Pi 4
It sets up systemd and iptables, generates all certs and keys, and wraps them up into tidy, per-client .ovpn files.
elseless | 7 years ago | on: Apple Turns Its Back on Customers and Nvidia with MacOS Mojave
Never assume, even several weeks after a macOS release, that working Nvidia drivers will be available!
elseless | 7 years ago | on: ARM Details “Project Trillium” Machine Learning Processor Architecture
NVDLA is fairly permissively licensed (free for commercial use), but of course Nvidia will steer the greater ecosystem around it. Perhaps ARM can be the Red Hat to NVDLA's Linux, or something like that. Still seems a bit strange to me.
elseless | 8 years ago | on: CS349D Cloud Computing Technology, Autumn 2017
Maybe half of this year's final projects focused on serverless/AWS Lambda, with another significant portion on hardware acceleration for cloud workloads (AI, databases, SDN).
elseless | 11 years ago | on: More from the Sony Pictures hack
elseless | 11 years ago | on: Time Travel Simulation Resolves “Grandfather Paradox”
Apparently, in the presence of CTCs (closed timelike curves), quantum computers are no more powerful than classical ones.
elseless | 12 years ago | on: Kern Type, The Kerning Game
Re: pointers, I should clarify that it’s not the indirection per se that causes problems — it’s the fact that, with (traditional) dynamic memory allocation, the data’s physical location isn’t known ahead of time. It could be cached nearby, or way off in main memory. That makes dataflow operator latencies unpredictable, so you either have to (1) leave a lot more slack in your schedule to tolerate misses, or (2) build more-complicated logic into each CGRA core to handle the asynchrony. And with (2), you run the risk that the small, lightweight CGRA slices will effectively just turn into CPU cores.
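A toy numerical illustration of option (1), the slack problem. The cycle counts here are invented for the example, but the shape of the tradeoff is real: when a load's latency is known at compile time, the scheduler can compute an exact firing time for every downstream operator; when the location isn't known, a static schedule has to budget for the worst case on every load:

```python
# Illustrative only — made-up latencies, not from any real CGRA.
FIXED_LAT = {"add": 1, "mul": 2}   # latencies known at compile time
LOAD_LAT_HIT, LOAD_LAT_MISS = 2, 20  # nearby cache vs. main memory


def static_schedule(ops, load_latency):
    """Schedule a simple dependency chain of ops, assuming every load
    takes `load_latency` cycles. Returns the cycle at which each op
    completes."""
    t = 0
    completion = []
    for op in ops:
        t += load_latency if op == "load" else FIXED_LAT[op]
        completion.append(t)
    return completion


ops = ["load", "mul", "add"]
optimistic = static_schedule(ops, LOAD_LAT_HIT)   # data happens to be close
worst_case = static_schedule(ops, LOAD_LAT_MISS)  # slack a static schedule must reserve
print("optimistic:", optimistic)
print("worst-case:", worst_case)
```

With statically known locations, every schedule looks like the optimistic one. With dynamic allocation, a purely static scheduler must reserve the worst-case slot for each load, idling the fabric whenever the data was actually nearby — which is what pushes designs toward option (2), per-core asynchrony handling.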