top | item 40140180

fpgamlirfanboy | 1 year ago

1024 fp16 MACs is pretty good, but the 128b vector datapath is weak sauce. on the other hand, 2MB of SRAM is legit. i wonder how many tiles there are (i don't think it's in the post).

ein0p|1 year ago

I think the main advantage of Movidius was its memory architecture. They had to disclose the HW arch for some sort of tender in the EU, which is how I stumbled upon their HW documentation years ago. Basically, IIRC, even though you’re right that 128 bit is weak sauce, the strength of that arch was that the memory was much “closer” to the cores, and it was partitioned so that multiple cores could access it at the same time, boosting overall available bandwidth. The weakness was that it was (IMO) overly complicated for no good reason, and it imposed a rather inflexible programming model: if their software layer didn’t do certain things, you couldn’t do them at all. That was a problem before transformers, because models tended to use a greater variety of ops.
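
To make the partitioned-memory point concrete, here is a toy model. It is not Movidius's actual design: the bank count, the word-interleaved address mapping, and the one-access-per-bank-per-cycle rule are all assumptions made for illustration, and the function names are hypothetical. It only shows why splitting memory into banks lets several cores pull data in the same cycle, and why that benefit disappears when they collide on one bank.

```python
from collections import Counter

def cycles_to_serve(requests, num_banks):
    """Cycles needed to serve one request per core.

    Assumptions (illustrative only): each bank serves one access per
    cycle, and addresses are word-interleaved, so bank = addr % num_banks.
    """
    per_bank = Counter(addr % num_banks for addr in requests)
    # The busiest bank determines how long the whole batch takes.
    return max(per_bank.values())

def effective_bandwidth(requests, num_banks, bytes_per_access=16):
    """Bytes delivered per cycle across all cores.

    16 bytes corresponds to one 128-bit access, the vector width
    mentioned in the thread.
    """
    cycles = cycles_to_serve(requests, num_banks)
    return len(requests) * bytes_per_access / cycles

# 8 cores hitting 8 distinct banks: all served in a single cycle.
spread = [0, 1, 2, 3, 4, 5, 6, 7]
# 8 cores hitting the same bank: accesses are fully serialized.
clash = [0, 8, 16, 24, 32, 40, 48, 56]

print(effective_bandwidth(spread, 8))  # 128.0 bytes/cycle
print(effective_bandwidth(clash, 8))   # 16.0 bytes/cycle
```

In this sketch the narrow 128-bit port is the same in both cases; the 8x difference comes entirely from how the accesses are spread across banks, which is roughly the trade-off described above.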