Greenpants | 2 years ago
If someone here on HN has a link to a page that has helped them get to the Eureka-point of fully grasping attention layers, feel free to share!
juliangoldsmith | 2 years ago
The short version (as I understand it) is that you use a neural network to weight pairs of inputs by how relevant they are to each other. That lets the model downweight unimportant information while keeping what actually matters.
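
For anyone who finds code easier to follow, here is a rough sketch of that pairwise weighting, written as a toy version of scaled dot-product self-attention in plain Python. It rests on some assumptions: real attention layers first pass the inputs through learned linear projections (queries, keys, values), which are skipped here so the raw inputs play all three roles, and the names `self_attention` and `tokens` are made up for illustration. Treat it as one way to picture the idea, not as how any particular library implements it.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(inputs):
    """Toy self-attention over a list of equal-length vectors.

    Assumption: no learned projections, so each input acts as its own
    query, key, and value.
    """
    d = len(inputs[0])
    outputs, all_weights = [], []
    for q in inputs:
        # Score this input against every input (the "pairs"),
        # scaled by sqrt(d) as in scaled dot-product attention.
        scores = [dot(q, k) / math.sqrt(d) for k in inputs]
        weights = softmax(scores)
        # The output is a weighted average of all inputs: low-weight
        # inputs contribute little, high-weight inputs dominate.
        out = [sum(w * v[i] for w, v in zip(weights, inputs)) for i in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Three hypothetical 2-d "tokens": the first two are similar, the third is not.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
```

With these made-up inputs, the first token puts more weight on the second (similar) token than on the third (dissimilar) one. Note that the softmax never drops an input entirely; it only shrinks its contribution.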