elseless's comments
elseless | 7 months ago | on: Efficient Computer's Electron E1 CPU – 100x more efficient than Arm?
The Efficient architecture is a CGRA (coarse-grained reconfigurable array), which means that it executes instructions in space instead of time.[1] At compile time, the Efficient compiler looks at a graph made up of all the “unrolled” instructions (and data) in the program, and decides how to map it all spatially onto the hardware units. Of course, the graph may not all fit onto the hardware at once, in which case it must also be split up to run in batches over time. But the key difference is that there’s this sort of spatial unrolling that goes on.
This means that a lot of the work of fetching and decoding instructions and data can be eliminated, which is good. However, it also means that the program must be mostly, if not completely, static, meaning there’s a very limited ability for data-dependent branching, looping, etc. to occur compared to a CPU. So even if the compiler claims to support C++/Rust/etc., it probably does not support, e.g., pointers or dynamically-allocated objects as we usually think of them.
[1] Most modern CPUs don’t actually execute instructions one-at-a-time — that’s just an abstraction to make programming them easier. Under the hood, even in a single-core CPU, there is all sorts of reordering and concurrent execution going on, mostly to hide the fact that memory is much slower to access than on-chip registers and caches.
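The spatial-unrolling idea above can be sketched in a few lines. This is a toy illustration only — the graph, the PE count, and the greedy placement are all made up for the example, and Efficient's actual compiler is certainly far more sophisticated (real placement must account for routing, data dependencies, and operator latencies):

```python
# Toy sketch of mapping an "unrolled" dataflow graph onto a CGRA-like
# grid of processing elements (PEs). Names and numbers are illustrative,
# not Efficient's actual toolchain.

# Each node is one unrolled instruction: (name, op, inputs).
graph = [
    ("a",   "load",  []),
    ("b",   "load",  []),
    ("t0",  "mul",   ["a", "b"]),
    ("t1",  "add",   ["t0", "a"]),
    ("out", "store", ["t1"]),
]

NUM_PES = 4  # pretend the fabric has 4 processing elements


def map_spatially(graph, num_pes):
    """Assign each instruction its own PE. When the graph doesn't fit
    on the fabric at once, the overflow becomes a new configuration
    "batch" that runs later in time — the time/space split described
    above."""
    batches = []
    for i in range(0, len(graph), num_pes):
        chunk = graph[i:i + num_pes]
        placement = {node[0]: pe for pe, node in enumerate(chunk)}
        batches.append(placement)
    return batches


batches = map_spatially(graph, NUM_PES)
for t, placement in enumerate(batches):
    print(f"batch {t}: {placement}")
```

With five instructions and four PEs, the graph splits into two batches: the fabric is configured once for the first four ops, then reconfigured for the remainder. The per-instruction fetch/decode work a CPU would repeat every cycle happens once, at configuration time.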
elseless | 1 year ago | on: ZFS 2.3 released with ZFS raidz expansion
elseless | 2 years ago | on: Apple unveils M3, M3 Pro, and M3 Max
elseless | 2 years ago | on: Ask HN: What is an A.I. chip and how does it work?
elseless | 3 years ago | on: MIT researchers uncover ‘unpatchable’ flaw in Apple M1 chips
elseless | 4 years ago | on: Sandwell Bitcoin mine found stealing electricity
elseless | 4 years ago | on: SSD makers start warning that mining products like ChiaCoin will void warranty
See Andrew Poelstra ‘15: https://nakamotoinstitute.org/static/docs/on-stake-and-conse...
elseless | 6 years ago | on: Raspberry Pi 4
It sets up systemd and iptables, generates all certs and keys, and wraps them up into tidy, per-client .ovpn files.
elseless | 7 years ago | on: Apple Turns Its Back on Customers and Nvidia with MacOS Mojave
Never assume, even several weeks after a macOS release, that working Nvidia drivers will be available!
elseless | 7 years ago | on: ARM Details “Project Trillium” Machine Learning Processor Architecture
NVDLA is fairly permissively licensed (free for commercial use), but of course Nvidia will steer the greater ecosystem around it. Perhaps ARM can be the Red Hat to NVDLA's Linux, or something like that. Still seems a bit strange to me.
elseless | 8 years ago | on: CS349D Cloud Computing Technology, Autumn 2017
Maybe half of this year's final projects focused on serverless/AWS Lambda, with another significant portion on hardware acceleration for cloud workloads (AI, databases, SDN).
elseless | 11 years ago | on: More from the Sony Pictures hack
elseless | 11 years ago | on: Time Travel Simulation Resolves “Grandfather Paradox”
Apparently, in the presence of CTCs (closed timelike curves), quantum computers are no more powerful than classical ones.
elseless | 12 years ago | on: Kern Type, The Kerning Game
Re: pointers, I should clarify that it’s not the indirection per se that causes problems — it’s the fact that, with (traditional) dynamic memory allocation, the data’s physical location isn’t known ahead of time. It could be cached nearby, or way off in main memory. That makes dataflow operator latencies unpredictable, so you either have to (1) leave a lot more slack in your schedule to tolerate misses, or (2) build more-complicated logic into each CGRA core to handle the asynchrony. And with (2), you run the risk that the small, lightweight CGRA slices will effectively just turn into CPU cores.
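A toy numerical illustration of option (1), the slack problem. The cycle counts here are invented for the example, but the shape of the tradeoff is real: when a load's latency is known at compile time, the scheduler can compute an exact firing time for every downstream operator; when the location isn't known, a static schedule has to budget for the worst case on every load:

```python
# Illustrative only — made-up latencies, not from any real CGRA.
FIXED_LAT = {"add": 1, "mul": 2}   # latencies known at compile time
LOAD_LAT_HIT, LOAD_LAT_MISS = 2, 20  # nearby cache vs. main memory


def static_schedule(ops, load_latency):
    """Schedule a simple dependency chain of ops, assuming every load
    takes `load_latency` cycles. Returns the cycle at which each op
    completes."""
    t = 0
    completion = []
    for op in ops:
        t += load_latency if op == "load" else FIXED_LAT[op]
        completion.append(t)
    return completion


ops = ["load", "mul", "add"]
optimistic = static_schedule(ops, LOAD_LAT_HIT)   # data happens to be close
worst_case = static_schedule(ops, LOAD_LAT_MISS)  # slack a static schedule must reserve
print("optimistic:", optimistic)
print("worst-case:", worst_case)
```

With statically known locations, every schedule looks like the optimistic one. With dynamic allocation, a purely static scheduler must reserve the worst-case slot for each load, idling the fabric whenever the data was actually nearby — which is what pushes designs toward option (2), per-core asynchrony handling.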