
lordofmoria | 1 year ago

The fundamental problem here is lack of context - a human at your company reading that text would immediately know that Gorilla was not an insider term, and it’d stick out like a sore thumb.

But picture a new employee eager to please - you could easily imagine them OK’ing the document and making the same assumption the LLM did: “why would you randomly throw in that word if it wasn’t relevant?” Maybe they would ask about it, though…

Google search has the same problem as LLMs - some meanings of a query can’t be disambiguated from the text of the search alone, but the algo has to best-guess anyway.

The cheaper input tokens for LLMs get, and the larger the context windows grow, the more context you can throw into the prompt, and the more often these ambiguities can be resolved.

Imagine, in your gorilla-in-the-steps example, that the LLM was given the steps but you also included the full text of your Slack/Notion and Confluence as a reference in the prompt. It might succeed. I do think this is a weak point in LLMs though - they seem to really, really not like correcting you unless you display a high degree of skepticism, and then they go to the opposite extreme and make up problems just to please you. I’m not sure how the labs are planning to solve this…
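
Roughly what I mean by “throw the full text in as a reference” is something like this sketch - the function name, the exported_docs/ directory, and the assumption that you’ve already dumped the reference docs to plain-text files are all hypothetical, and you’d still have to send the result to whatever model/API you use:

    from pathlib import Path

    def build_prompt(steps: str, reference_dir: str, question: str) -> str:
        """Assemble a prompt that pairs the task steps with the full text of
        exported reference docs (e.g. Slack/Notion/Confluence dumps), so the
        model has the surrounding context it needs to spot an odd term."""
        reference_text = "\n\n".join(
            f"--- {p.name} ---\n{p.read_text()}"
            for p in sorted(Path(reference_dir).glob("*.txt"))
        )
        return (
            "You are reviewing an internal procedure. Flag anything that looks "
            "out of place or is not an established internal term.\n\n"
            f"Reference documents:\n{reference_text}\n\n"
            f"Procedure steps:\n{steps}\n\n"
            f"Question: {question}\n"
        )

    # prompt = build_prompt(steps_text, "exported_docs/", "Is 'Gorilla' a real step here?")
    # ...then pass `prompt` to the model of your choice.

Whether the model then actually uses that context to push back is the open question above.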
