(no title)
neerajsi | 6 months ago
I dispute the search space problem for something like folding clothes. Like a lot of human actions in space, folding clothes and other motor tasks are hierarchical sequences of smaller tasks that are strung together, similar to a sentence or paragraph of text.
We can probably learn things from each other from few examples because we are leaning on a large library of subtasks that all have learned or which are innate, and the actual novel learning of sequencing and ordering is relatively small to get to the new reward.
I expect soon we'll get AIs that have part of their training be unsupervised rl in a physics simulation, if it's not being done already.
hexhowells|6 months ago
I disagree, you can model those tasks as hiearchical sequences of smaller tasks. But the terminal goal of folding clothes is to turn a pile of unfolded clothes into a neat pile of folded clothes.
The reason you would break down the task is because getting between those two states with the only reward signal being "the clothes are now folded" takes a lot of steps, and given the possible actions the robot can take, results in a large search space.