grej|7 months ago
Related to this, is anyone aware of a benchmark for this kind of thing, maybe broadly the category of “context rot”? It would track two things: content that is not germane to the current question adversely affecting the responses, and a large volume of germane but deep context leaving models unable to follow the conversation. I’ve definitely experienced the latter with coding models.
energy123|7 months ago
nijave|7 months ago