They explicitly address this on page 11 of the report. Basically perfect recall for up to 1M tokens; way better than GPT-4.

westoncb|2 years ago

I don't think recall really addresses it sufficiently: the main issue I see is answers getting "muddy". Like it's getting pulled in too many directions and averaging.

a_wild_dandan|2 years ago

I'd urge caution in extending generalizations about "muddiness" to a new context architecture. Let's use the thing first.

andy_ppp|2 years ago

Did you think the extraction of information from the Buster Keaton film was muddy? I thought it was incredibly impressive to be this precise.

tcdent|2 years ago

Page 8 of the technical paper [1] is especially informative.

The first chart (Cumulative Average NLL for Long Documents) shows a deviation from the trend and an increase in accuracy when working with >=1M tokens. The 1.0 graph is overlaid and supports the experience of 'muddiness'.

[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...

moffkalast|2 years ago

[deleted]