apienx|1 year ago
Processing visual input is the current bottleneck for robots that want to make sense of the physical world. Glad somebody's looking into it (no pun intended). I just hope their plan is more sophisticated than throwing more computational power at the problem.
Jensson|1 year ago
Then you realize the limitation isn't the training data but the base model, which was shaped by hundreds of millions of years of evolution, and you start to see the real hurdle we have to clear.
thfuran|1 year ago
hn_acker|1 year ago
Unfortunate typo. You meant 10^15 bytes at the end.
Thanks to your citation I was able to find a podcast transcript [1] with Yann LeCun's explanation:
> If you talk to developmental psychologists and they tell you a four-year-old has been awake for 16,000 hours in his or her life, and the amount of information that has reached the visual cortex of that child in four years is about 10 to 15 bytes.
The transcript is missing "the" (10 to the 15 bytes). The corresponding timestamp in the podcast on YouTube is 4:48.
[1] https://lexfridman.com/yann-lecun-3-transcript
[2] https://www.youtube.com/watch?v=5t1vTLU7s40
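The 10^15 figure is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, assuming (my assumption, not from the thread) an aggregate optic-nerve bandwidth on the order of 2×10^7 bytes/s:

```python
# Back-of-envelope check of LeCun's ~10^15-byte figure.
# Assumption (not from the thread): the two optic nerves together
# carry roughly 2e7 bytes/s of visual data.
hours_awake = 16_000            # four-year-old's waking hours (from the quote)
seconds = hours_awake * 3600    # ~5.8e7 seconds
bytes_per_second = 2e7          # assumed aggregate optic-nerve bandwidth
total_bytes = seconds * bytes_per_second
print(f"{total_bytes:.1e}")     # on the order of 10^15 bytes
```

With these assumptions the total comes out to roughly 1.2×10^15 bytes, consistent with "10 to the 15."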
tomjakubowski|1 year ago
HeyLaughingBoy|1 year ago
apinstein|1 year ago
So the lack of that sensor will cause the brain to develop poor representations of motion in 3D space.
How the lack of those representations would affect other representations is less clear, because the fusion between the LLM (which similarly doesn't have an embodied world model) and the robot AI (which presumably does) evidently works quite well.
It's possible that the two models are simply inter-communicating via their own features (apple the concept and apple the image/object) and connecting them together. If so, there could be benefits to separate training followed by a post-training connection that bridges any gaps in the learned representations.
However, I'd expect that ultimately a model that can train simultaneously on more sensory input, rather than less, will build a better and more efficient world model, with more useful and interesting cross-connections between that space and applied uses in non-physical domains.
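The "post-training connection" idea can be made concrete: one common way to bridge two independently trained embedding spaces is to fit a linear map on paired examples. A minimal sketch with synthetic stand-in features (my illustration; the model names and dimensions are made up, not from the thread):

```python
import numpy as np

# Sketch of "post-training connection": align two separately trained
# embedding spaces with a linear map fit on paired examples
# ("apple" the concept vs. "apple" the image/object).
rng = np.random.default_rng(0)

d_text, d_vision, n_pairs = 8, 6, 100
text_emb = rng.normal(size=(n_pairs, d_text))   # stand-in LLM features
true_map = rng.normal(size=(d_text, d_vision))
vision_emb = text_emb @ true_map                # stand-in robot/vision features

# Fit the bridge by least squares: W minimizes ||text_emb @ W - vision_emb||.
W, *_ = np.linalg.lstsq(text_emb, vision_emb, rcond=None)

# A new text embedding can now be projected into the vision feature space.
err = np.abs(text_emb @ W - vision_emb).max()
print(err < 1e-8)  # on this synthetic data the linear bridge recovers the pairing
```

In practice the bridge is often a small trained adapter rather than a closed-form least-squares fit, but the principle is the same: the two models keep their own representations and only the mapping between them is learned afterwards.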
phlipski|1 year ago
iambateman|1 year ago
Regulators need to get ahead of this and establish a federal framework for safe robotic entrepreneurship.
For example: does the Second Amendment give me the right to have a drone that is capable of autonomously shooting a deer? Tens of millions of people will disagree on that point alone.
And then we need international agreements - much like nuclear - governing what is “fair game” for the public to have access to.
We must pursue a robot-enhanced future, carefully.
anon291|1 year ago
IANAL, but it seems this would fall under running a human-controlled robot with a gun, which I believe is illegal.
gibsonf1|1 year ago
ben_w|1 year ago
If we could model visual streams accurately, fast, and at low compute cost, I think self-driving cars and autonomous mobile robots would be much more widely available.
TeMPOraL|1 year ago
smokel|1 year ago
Invictus0|1 year ago