item 45696688 (no title)

Mehvix | 4 months ago

> None of that is specialized to run only transformers at this point

Isn't this what [etched](https://www.etched.com/) is doing?

imtringued|4 months ago

Only being able to run transformers is a silly concept, because attention consists of two matrix multiplications, which are the standard operation in feed forward and convolutional layers. Basically, you get transformers for free.

kadushka|4 months ago

devil is in the details
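
For reference, here is a minimal sketch of the "two matrix multiplications" claim, assuming standard single-head scaled dot-product attention. The function names (`matmul`, `softmax_rows`, `attention`) are made up for illustration and are not from any library or from the thread. Note the scaling and row-wise softmax sitting between the two products, which is arguably part of the "details".

```python
import math


def matmul(a, b):
    """Plain matrix product of two matrices stored as nested lists (rows)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Numerically stable softmax applied independently to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention(q, k, v):
    """Hypothetical single-head attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]          # transpose K
    scores = matmul(q, k_t)                       # first matmul: Q K^T
    scaled = [[s / math.sqrt(d) for s in row] for row in scores]
    weights = softmax_rows(scaled)                # not a matmul
    return matmul(weights, v)                     # second matmul: weights V


# One query against two identical keys: the weights are uniform,
# so the output is the average of the two value rows.
out = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]])
```

Both `matmul` calls are the same operation a feed forward layer uses; only the softmax and scaling in between are attention-specific.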