Native Sparse Attention
139 points | CalmStorm | 8 months ago | aclanthology.org
Here is the awards page: https://cspaper.org/topic/116/record-breaking-acl-2025-crown...
noosphr | 7 months ago
Given how quiet all the major players went in the two weeks after DeepSeek R1 was released, I suspect they were reading and implementing everything in the papers that came with it as fast as humanly possible.
Art9681 | 7 months ago
I applaud their open efforts. But being "altruistic" and being the best are two different things.
sabaimran | 7 months ago
Isn't it very notable that the latency improvement came without a performance loss? I'm not super familiar with all the technical aspects, but that seems like it should be one of the main focuses of the paper.
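For intuition, here is a toy numpy sketch of the general idea (my own illustration, not the paper's actual NSA algorithm or kernels): each query scores coarse summaries of key blocks, keeps only the top-k blocks, and runs softmax attention over just those tokens, so per-query cost scales with topk * block rather than the full sequence length.

    # Toy block-sparse attention (illustration only, not the NSA kernel).
    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def block_sparse_attention(q, k, v, block=16, topk=4):
        # q, k, v: (seq_len, d); seq_len assumed divisible by block in this toy version.
        n, d = k.shape
        nb = n // block
        k_blocks = k.reshape(nb, block, d).mean(axis=1)    # coarse per-block key summaries
        scores = q @ k_blocks.T / np.sqrt(d)               # (seq_len, nb) query-to-block relevance
        keep = np.argsort(scores, axis=-1)[:, -topk:]      # top-k block ids per query
        out = np.zeros_like(q)
        for i in range(q.shape[0]):
            idx = np.concatenate([np.arange(b * block, (b + 1) * block) for b in keep[i]])
            att = softmax(q[i] @ k[idx].T / np.sqrt(d))     # softmax only over selected tokens
            out[i] = att @ v[idx]
        return out

    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((128, 64)) for _ in range(3))
    print(block_sparse_attention(q, k, v).shape)  # (128, 64), but each query only touched 64 keys

The quality question is whether the block-selection step keeps the tokens that matter; as I understand it, the paper's point is that the sparsity is learned during training rather than bolted on afterwards, which is why performance holds up.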
tony_borlini | 7 months ago
https://deep.liveblog365.com/en/index-en.html?post=50
gnabgib | 7 months ago
The awards page for ACL seems to disagree with this editorialized title: https://2025.aclweb.org/program/awards/
ninjin | 7 months ago
https://aclanthology.org/2025.acl-long.1126