top | item 36869182


nomand|2 years ago

Does what you're saying mean you can only ask questions and get answers in a single step, and that having a long discussion where refinement of output is arrived at through conversation isn't possible?


krisoft|2 years ago

My understanding is that at a high level you can look at this model as a black box which accepts a string and outputs a string.

If you want it to “remember” things you do that by appending all the previous conversations together and supply it in the input string.
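That "append everything and resend it" loop can be sketched in a few lines of Python. This is only an illustration of the black-box view described above; `fake_model` is a stand-in for whatever actually calls the model:

```python
# Minimal sketch of the "black box" view: the model is a function from
# string to string, so "memory" is faked by replaying the transcript.

def fake_model(prompt: str) -> str:
    # Stand-in: a real implementation would call an LLM API here.
    return f"(reply to a prompt of {len(prompt)} characters)"

def chat_turn(history: list[tuple[str, str]], user_message: str) -> str:
    # Rebuild the full transcript, then append the new message.
    prompt = ""
    for user, assistant in history:
        prompt += f"User: {user}\nAssistant: {assistant}\n"
    prompt += f"User: {user_message}\nAssistant:"
    reply = fake_model(prompt)
    history.append((user_message, reply))
    return reply

history: list[tuple[str, str]] = []
chat_turn(history, "Hello")
chat_turn(history, "What did I just say?")  # first turn rides along in the prompt
```

Note that every turn resends the whole transcript, which is exactly why cost and latency grow as the conversation does.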

In an ideal world this would work perfectly. It would read through the whole conversation and would provide the right output you expect, exactly as if it “remembered” the conversation. In reality there are all kinds of issues which can crop up as the input grows longer and longer. One is that it takes more and more processing power and time for it to “read through” everything previously said. And there are things like what jmiskovic said, where the output quality can also degrade in perhaps unexpected ways.

But that also doesn’t mean that “refinement of output arrived at through conversation isn't possible”. It is not that black and white; it's just that you can run into trouble as the length of the discussion grows.

I don’t have direct experience with long conversations, so I can’t tell you how long is definitely too long, and how long is still safe. There are probably also tricks one can use to work around these limits, especially if one unpacks that “black box” understanding of the process. But even without that, you could imagine a “consolidation” process where the AI is instructed to write short notes about a given length of conversation, and those shorter notes are then copied into the next input instead of the full previous conversation. All of this is possible, but you won’t have a turn-key solution for it just yet.
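The consolidation idea can be sketched as follows. This is a hypothetical sketch, not a turn-key solution: `summarize` is a stand-in, and a real system would ask the model itself to write the notes.

```python
# Sketch of "consolidation": once the transcript grows past a budget,
# replace the oldest turns with a short note and keep only the most
# recent turns verbatim.

def summarize(turns: list[str]) -> str:
    # Stand-in: a real system would prompt the LLM to write these notes.
    return f"[Notes covering {len(turns)} earlier messages]"

def consolidate(transcript: list[str], keep_recent: int = 4) -> list[str]:
    if len(transcript) <= keep_recent:
        return transcript  # short enough, nothing to fold away
    old, recent = transcript[:-keep_recent], transcript[-keep_recent:]
    return [summarize(old)] + recent

transcript = [f"message {i}" for i in range(10)]
compact = consolidate(transcript)  # one note plus the last four messages
```

The trade-off is lossiness: the note keeps the gist of the old turns but discards their exact wording, which is usually acceptable for long chats.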

cjbprime|2 years ago

The limit here is the "context window" length of the model, measured in tokens, which will quickly become too short to contain all of your previous conversations, which will mean it has to answer questions without access to all of that text. And within a single conversation, it will mean that it starts forgetting the text from the start of the conversation, once the [conversation + new prompt] reaches the context length.
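Mechanically, that "forgetting the start" looks something like the sketch below. The window size and word-splitting "tokenizer" are toy stand-ins; real models use proper tokenizers and windows of thousands of tokens.

```python
# Sketch of context-window truncation: with a fixed window (measured
# here in words, a crude proxy for tokens), the oldest messages are
# dropped until [conversation + new prompt] fits.

CONTEXT_WINDOW = 20  # toy limit; real windows are thousands of tokens

def count_tokens(text: str) -> int:
    return len(text.split())  # crude stand-in for a real tokenizer

def fit_to_window(messages: list[str], new_prompt: str) -> list[str]:
    kept = list(messages)
    budget = CONTEXT_WINDOW - count_tokens(new_prompt)
    while kept and sum(count_tokens(m) for m in kept) > budget:
        kept.pop(0)  # drop the oldest message first
    return kept + [new_prompt]

msgs = ["one two three four five"] * 5      # 25 "tokens" of history
window = fit_to_window(msgs, "final question here")  # 3-token prompt
```

Whatever gets popped off the front is simply never seen by the model on that turn, which is what "forgetting" means here.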

The kinds of hacks that work around this are to encode the past conversations as vectors, and then rely on similarity in that embedding space to pull the right (lossy) data back out of the model (or a separate database) later, based on its similarity to your question, and include it (or a summary of it, since summaries are smaller) within the context window for your new conversation, combined with your prompt. This is what people are talking about when they use the term "embeddings".
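A toy version of that retrieval step, to make the idea concrete: the word-count `embed` below is a deliberately dumb stand-in for a real embedding model (only the cosine-similarity ranking is the actual technique), and the snippets and vocabulary are made up for illustration.

```python
# Sketch of embedding-based retrieval: past snippets are stored as
# vectors, and the ones most similar to the new question are pulled
# back into the prompt.

import math

VOCAB = ["cat", "dog", "site", "generator", "python"]

def embed(text: str) -> list[float]:
    # Toy embedding: word counts over a tiny vocabulary. A real system
    # would use a learned embedding model here.
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

snippets = [
    "my dog likes walks",
    "the static site generator config",
    "cats sleep a lot",
]
index = [(s, embed(s)) for s in snippets]  # precomputed vector store

def retrieve(question: str, k: int = 1) -> list[str]:
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [s for s, _ in ranked[:k]]

retrieve("how do I configure the site generator")
```

The retrieval is lossy in exactly the sense described above: you get back the most similar stored text, not a faithful replay of the whole conversation.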

nomand|2 years ago

My benchmark is a pair programming session spanning days and dozens of queries with ChatGPT, where we co-created a custom static site generator that works really well for my requirements. It was able to hold context for a while and not "forget" what code it provided me dozens of messages earlier, it was able to "remember" corrections and refactors that I gave it, and overall it was incredibly useful for working out things like recursion over folder hierarchies and building data trees. This and similar use cases, where memory is important, are what I mean by using the model as a genuine assistant.