top | item 44457288

(no title)

ryeguy_24 | 8 months ago

Agree. Also, with respect to training, what is the goal that we are maximizing? LLMs are easy, predicting the next word and we have lots of training data. But what are we training for in real world? Modeling the next spatial photograph to predict things that will happen next? It’s not intuitive to me what that objective function would be in spatial intelligence.

discuss

kadushka|8 months ago

Why wouldn’t predicting the next frame in a video stream be as effective as predicting the next word?

curiouscavalier|8 months ago

Or that there is a sufficiently generalizable objective function for all “spatial intelligence.”