(no title)
mfalcon | 2 months ago
The first time I tried chatgpt that was the thing that surprised me most, the way it understood my queries.
I think that the spotlight is on the "generative" side of this technology and we're not giving the query understanding the deserved credit. I'm also not sure we're fully taking advantage of this funcionality.
ivansavz|2 months ago
I've tried several times to understand the "multi-head attention" mechanism that powers this understanding, but I'm yet to build a deep intuition.
Is there any research or expository papers that talk about this "understanding" aspect specifically? How could we measure understand without generation? Are there benchmarks out there specifically designed to test deep/nuanced understanding skills?
Any pointers or recommended reading would be much appreciated.