(no title)
neatze|2 years ago
GPT has no sense of, or concern for, whether it is wrong or right; such a sense is only (arguably) instilled by humans through prompt interaction and throughout the training of the model. Humans and other animals, by contrast, are able to update their internal state from a single observation or interaction, and to integrate future information with that single observation for a very long time.
mewpmewp2|2 years ago
1. Take light input. Video/images.
2. Take sound input.
3. Touch, heat input.
And other inputs from the environment. Then there would be mechanisms, which could themselves be neural networks, that transform this data into a form more digestible for GPT, and GPT would additionally be trained specifically to act on this input.
Then it would run in cycles: it gets this input and produces output describing how it plans to react to the data, maybe every 100ms.
It could also have storage it can use, where it stores data as part of its output and later retrieves it again.
So it would be a set of modules that is controlled and interpreted by GPT.
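The module-and-loop setup described above can be sketched roughly as follows. Everything here is hypothetical scaffolding: `perceive`, `llm`, and `act` are stand-ins for the perception networks, the GPT call, and the actuators the comment imagines, not real APIs.

```python
import time

memory = []  # long-term store the model can write facts into

def perceive():
    """Stand-in: other networks would turn camera frames, audio,
    and touch/heat sensors into short text descriptions."""
    return "camera: red cup on table; audio: silence; touch: none"

def llm(prompt):
    """Stand-in for a GPT call; returns an action string."""
    return "STORE fact: red cup is on the table"

def act(action):
    """Dispatch the model's chosen action to the relevant module."""
    print("executing:", action)

def run_cycle():
    observation = perceive()
    action = llm(f"Observation: {observation}\nWhat do you do?")
    if action.startswith("STORE "):
        memory.append(action[len("STORE "):])  # write to long-term storage
    else:
        act(action)

# roughly the ~100ms cycle the comment suggests
for _ in range(3):
    run_cycle()
    time.sleep(0.1)
```

In practice the cycle time would be dominated by model latency rather than the sleep, which is the reaction-time concern raised below.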
It could then do all of the above, no? And all of it should be just a matter of implementation. The only near-term challenges may be certain kinds of inaccuracy, and that producing tokens might in some cases take too long for a fast reaction time.
So basically you would try to run cycles as frequently as possible with the inputs mentioned above, with other neural networks identifying objects and supplying context about the environment in many different ways, unless a new version of GPT becomes completely multi-modal.
And as you run those loops, GPT outputs what it wishes to do, e.g. store some fact for later use, move there, move here, etc., or retrieve some information using embeddings and then decide again. Short-term memory would just be the context window, and if it needs more, it looks into its own memory via embeddings.
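The embedding-based memory the comment describes might look like this in miniature. A real system would use a learned embedding model and a vector index; here a bag-of-words vector and cosine similarity stand in for both, purely to illustrate the store-then-retrieve pattern.

```python
import math
from collections import Counter

store = []  # list of (text, vector) pairs

def embed(text):
    """Toy embedder: bag-of-words counts stand in for a real model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def remember(text):
    """Store a fact together with its embedding."""
    store.append((text, embed(text)))

def recall(query, k=1):
    """Return the k stored facts most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

remember("the charger is in the kitchen drawer")
remember("the meeting starts at 3pm")
print(recall("where is the charger"))  # → ['the charger is in the kitchen drawer']
```

The context window plays the role of short-term memory; `recall` is what the model would invoke when it "looks into its own memory for embeddings".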
neatze|2 years ago