Learning Dexterity

[+] pmohun|7 years ago|reply

Fascinating:

"We observed that for precision grasps, such as the Tip Pinch grasp, Dactyl uses the thumb and little finger. Humans tend to use the thumb and either the index or middle finger instead. However, the robot hand’s little finger is more flexible due to an extra degree of freedom, which may explain why Dactyl prefers it. This means that Dactyl can rediscover grasps found in humans, but adapt them to better fit the limitations and abilities of its own body."

The learning of "emergent" behavior, specifically when it creates improvements to natural human motion is one of the main reasons why this type of work is so important. Similar to the way that we imitate design from nature (e.g. wings, suction cups), we can now accelerate development by observing how the bots perform the task in a variety of environments

[+] sgillen|7 years ago|reply

It's a really cool phenomena, it also means we have to make our simulations better. These sorts of RL algorithms are so good at finding "exploits" in the physics engine they are in that help them "cheat" sometimes, compared to what the researcher wanted.

[+] stcredzero|7 years ago|reply

This means that Dactyl can rediscover grasps found in humans, but adapt them to better fit the limitations and abilities of its own body.

Then why not evolve commensurate dexterity with mechanically simpler manipulators than a human hand? I'd bet that a robot with 4 arms with 4 different specialist manipulators and a few specialist tools (like a generalized screw-threading tool) could eventually be more efficient than a human with 2 human arms. You can even see this happening in real life, powered by human brains. The kind of dexterity people can get out of a crude instrument like a backhoe is very impressive.

The simpler device will always have an economic advantage.

[+] dokem|7 years ago|reply

Does anyone know a good graduate program/route for this kind of work? My undergrad was CS with some experience in (dumb) robotics and mechanical design but no ML. I am interested in applying ML/CV to physical systems like this however am I bit weary of going back to a CS program. I have seen some Mechanical programs with an emphasis on control that let you 'build your own degree'. If I could take a mix of ML/CV, control systems, kinematics I would be happy. Just looking for some input from people in this field.

[+] gdb|7 years ago|reply

(I work at OpenAI.)

Worth noting: it's a well-supported route to join OpenAI without any special graduate training. Many of our teams (including our robotics team!) hire experienced software engineers, teaching them whatever ML they need to know, or our Fellows program lets people do a more formal curriculum (https://blog.openai.com/openai-fellows/). We also have a number of software engineers who focus on what looks like traditional software engineering: see for example https://www.youtube.com/watch?v=UdIPveR__jw.

See our open positions here: http://openai.com/jobs!

[+] andreyk|7 years ago|reply

I am a 2nd year Phd at Stanford and you could definitely do such work here! Also at CMU, Georgia Tech, Berkeley, U Washington, and others. You can enter PhD via EE/MechE or CS - once you are focused on research it does not matter much. The FAIR/Google Residency programs may also be of interest.

[+] chubot|7 years ago|reply

Honest question: In the video, it looks like it works, but performs worse than about 90% of humans at the task of rotating a cube.

On the other hand, Alpha Go or even a rudimentary chess program does better than 99.99% of all humans.

So is it fair to say that deep learning is fundamentally missing something that humans do? Or that chess and Go are "easy" problems in some sense?

(It seems like with "unlimited" training hours it could eventually be better than a human? Or is that a hardware issue?)

[+] dgreensp|7 years ago|reply

In 2015, it was commonly thought that it would still be decades before a computer could beat a top human player at Go. Now, you are calling it “easy,” because it’s been done.

The first chess program was written by Alan Turing on paper between 1948 and 1950. He didn’t have a computer to run it, but he could still play a game with it by stepping through the algorithm by hand. In 1997, Deep Blue beat Kasparov, using traditional algorithms and not deep learning.

Clearly there are differences between these problems and dexterity. Chess, for example, can be described relatively simply using logic, and there is no dynamic or physical element; a rudimentary player can be written using pencil and paper; a winning player just needs enough compute power, apparently.

More importantly, there is a technology curve. You are asking about the ultimate limits of a technique moments after its first success puts it at the low end of the spectrum of human ability. Give it a decade or two.

I am just shocked the video was real-time and not sped up like so many of these videos are (eg watch a robot arm fold a shirt in thirty seconds when you play it at 5x speed).

[+] YeGoblynQueenne|7 years ago|reply

>> So is it fair to say that deep learning is fundamentally missing something that humans do?

Yes, it's missing the ability to generalise from its training examples to unseen data and to transfer acquired knowledge between tasks.

Like you say, the article describes an experiment where a robot hand learned to manipulate a cube. A human child that had learned to manipulate a cube that well would also be able to manipulate a ball, a pyramid, a disk and, really, any other physical object of any shape or dimensions (respecting the limits of its own size).

By contrast, a robot that has learned to manipulate cubes via deep learning, can only manipulate cubes and will never be able to manipulate anything but cubes, unless it's trained to manipulate something else, at which point it will forget how to manipulate cubes.

That's the fundamental ability that deep learning is missing, that humans have.

[+] QML|7 years ago|reply

(Before beginning, I want to note that these are solely my opinions, and therefore are probably wrong.)

In the space of possible problems solvable by computers, there are those of which are "easy" and those of which are "hard".

Arbitrarily defined, an "easy" problem is any problem that be solved by throwing more resources at it -- whether it'd be more data, or more compute. A "hard" problem on the other hand is the opposite: solvable only by a major, intellectual breakthrough; the benefit of solving a hard problem is that it allows us to do "more" with "less".

Now, the question is: which type of problems are being looked at by today's AI practitioners. I'd argue it is the former. Chess, Go, Dota 2 -- these are all "easy" problems. Why? Because it is easy to find or generate more data, to use more CPUs and GPUs, and to get better results.

Hell, I might even add self-driving cars to that list since they, along with neural networks, existed since the 1980s [1]. The only difference, it seems, is more compute.

All and all, I think these recent achievements only qualify themselves as engineering achievements -- not as theoretical or scientific breakthroughs. One way to put it: have we, not the computers and machines, learned something fundamentally different?

Maybe another approach to current ML / AI is needed? I remember a couple weeks ago there was a post on HN, about Judas Pearl advocating causality as an alternative [2]. Intuitively it makes sense: baby humans don't only perform glorified pattern matching, but they are able to discern cause-and-effect. Perhaps that is what today's AI practitioners are missing.

[1] https://en.wikipedia.org/wiki/History_of_autonomous_cars#198...

[2] https://news.ycombinator.com/item?id=17108179

[+] Jarwain|7 years ago|reply

Simulating the manipulation of a cube, and physics in general, is more complex than simulating a board game.

[+] oblio|7 years ago|reply

I can't find it right now, but there was a nice quote on Wikipedia about AI where they were saying that AI optimism stemmed from underestimating ordinary tasks. The AI researchers, being all from a STEM background, assumed that the hard problems were solving chess, go or math theorems, when in reality threading a needle or brushing your teeth requires a much, much more complicated model.

[+] yters|7 years ago|reply

Moravec's paradox.

[+] olingern|7 years ago|reply

I would say that the number of permutations (while many) in chess and go are finite, while this has to adapt to anything "at hand."

addendum: there are most definitely a greater number of environmental factors (x, y, z axes) involved in solving such problems.

[+] kingbirdy|7 years ago|reply

> Learning to rotate an object in simulation without randomizations requires about 3 years of simulated experience

It's interesting to me that this is about the same amount of time it takes humans to develop similar levels of motor control. I don't know enough about AI or neuroscience to say whether it's likely to be a coincidence or not, though.

[+] mxwsn|7 years ago|reply

It's not really meaningful.

Humans learn with entirely different stimuli and experience (can you imagine subjecting a person to learning this object manipulation task in the same way as this robot?).

In addition, Alpha Go Zero demonstrated an order of magnitude better efficiency in training time than Alpha Go due to only algorithmic differences. Humans are about converged to 3 year training time for babies learning these skills, but I doubt that these learning algorithms are as efficient as they will ever be.

[+] Ajedi32|7 years ago|reply

Interesting observation. I suspect it's probably coincidence though. Other tasks which humans are able to learn (such as [playing Dota][1]) have taken OpenAI much longer to master. OpenAI Five spends 180 years of training per day, per hero in order to learn Dota, and it still isn't at the level of professional players (though that may change soon).

Though I suppose you could argue that Dota benefits more from high-level reasoning, whereas basic motor control is a more intuitive skill. (And therefore better suited for this type of AI.)

[1]: https://blog.openai.com/openai-five/

[+] andreyk|7 years ago|reply

Basically coincidence; the models here are far simpler and do not at all mimic the human nervous system.

[+] runesoerensen|7 years ago|reply

Very cool. There's also a Times article about Dactyl: https://www.nytimes.com/interactive/2018/07/30/technology/ro...

[+] 0x8BADF00D|7 years ago|reply

> Rapid used 6144 CPU cores and 8 GPUs to train our policy, collecting about one hundred years of experience in 50 hours.

That seemed an order of magnitude higher than I expected. Is training usually this computationally expensive?

[+] boulos|7 years ago|reply

Disclosure: I work on Google Cloud.

Is it the cores that throws you off? (Those are used for the simulation, not the training). Second, I believe those were preemptible cores, so that's $60/hr for the cores and then $20/hr for V100s (which is what I think they used). $80/hr isn't bad considering how much a (small!) team of researchers costs.

[+] GlenTheMachine|7 years ago|reply

The notable thing about this is how little computation it needed. We have a long way to go, but this is a big step in the right direction.

[+] LeanderK|7 years ago|reply

haha I just read it and thought that's a magnitude less than I expected. Pretty often it is. A lot of papers from high profile institutions have a lot of computing power availiable.

First, it seems to be a lot and be really expensive, but think of it in man-hours. It quickly diminishes.

[+] cyberpunk0|7 years ago|reply

Not always but a lot of machine learning techniques now are effectively just brute forcing and highly expensive band wasteful to compute. Machine learning is getting close but I don't believe we will get there with our current methods

[+] sgillen|7 years ago|reply

Well if you look at the plot in the "Learning progress" section, you'll see that they did require almost two orders of magnitude more training time due to the randomizations they were adding to the simulation. Without these the policy isn't very robust but also takes a lot less time to train.

[+] sigi45|7 years ago|reply

Yeah but it doesn't matter; One model to rule them all.

[+] Animats|7 years ago|reply

Nice.

Take a look at position 44, where it seems to get stuck, with no move to make forward progress, and two fingers straight out. Did it lack image recognition to tell it what block rotation was needed?

It doesn't seem to work by discovering strategies for rotating the block one face at a time, then combining those. It's solving the problem as a whole. That has both good and bad implications.

[+] YeGoblynQueenne|7 years ago|reply

>> We’ve trained a human-like robot hand to manipulate physical objects with unprecedented dexterity.

To be precise, the "physical objects" appear to invariably be cubes of the same dimensions. Not arbitrary "physical objects". Which is probably the best that can be done by training only in a simulated environment.

[+] aeleos|7 years ago|reply

I am continually impressed by OpenAI, whenever we think that something is too difficult for our currently understanding of AI. With their Dota AI and this they have shown that more can be done with a lot less than previously thought.

[+] andreyk|7 years ago|reply

Not to be too negative, it's cool work, but I'd argue unlike the OpenAI result it is not so surprising this was doable with the techniques they used ; see eg this paper from Google http://www.roboticsproceedings.org/rss14/p10.pdf and this one from Stanford/DeepMind http://www.roboticsproceedings.org/rss14/p09.pdf . Yes there is the additional aspects of an object in hand, but fundamentally the techniques are the same.

Of course these works are cited in related works of paper as they should be; perhaps the OpenAI blog should also provide more context on where this stands wrt prior work, as many non-researchers may read this is may be quite misleading...

[+] hellofunk|7 years ago|reply

Holy cow, the robots are definitely coming. We really are at the ground floor of a technology that is going to change humanity, I am certain of that. Changes greater than any changes we've seen before.

[+] tomxor|7 years ago|reply

I guess someone has to be the negative one: I can't help feeling it's route to the correct face looks entirely accidental (and I don't mean that in a good way)... I'm sure it's "learned" some methods, but they don't look that efficient, reliable, purposeful or controlled. In a more noisy and dynamic environment I'd expect them to fail. Granted is possible these could be more due to training conditions than an inherent limitation of the underlying model.

[+] superfx|7 years ago|reply

It looks that way because they're moving rapidly from one face configuration to another. But there's no way that's happening by random. I would guess that even just holding the cube constant in a dynamic grip is quite difficult.

[+] toxik|7 years ago|reply

Agreed, it looks really uncoordinated. A lot of reinforcement learning algorithms have this problem, in my experience.

[+] dgreensp|7 years ago|reply

I agree it looks sloppy, but that doesn’t mean it isn’t reliable. All it has to do in any given moment is make progress towards the goal of having the cube in the proper orientation, on average. It may be that it can do that very reliably even with noisy inputs and outputs.

[+] poppingtonic|7 years ago|reply

Maybe if they randomized to n<=20 n-gons. I'd love to see Dactyl tackle a dodecahedron.

[+] andreyk|7 years ago|reply

Link to paper ( why no Arxiv :/ ): https://d4mucfpksywv.cloudfront.net/research-covers/learning...

TLDR (quick-ish skim, feel free to correct) they train a deep neural network to control a robot hand to choose desired joints state changes (binned into 11 discrete values; eg rotate this joint by 10 degrees) for a 20-joint hand given low-level (non-visual; so, current and desired 3D orientation of the object and exact numeric state of the joints) input of the state of a particular object and the hand. They also train a network to extract the 3D pose of a given object given RGB input. All this training is done in simulation with a ton of computation, and they use a technique called domain randomization (changing colors and textures and so friction coefficient and so on) to make these learned models pretty much work in the real world despite being trained only in simulation.

It's pretty cool work, but if I may pull my reviewer hat on not that interesting in terms of new ideas - still, it's cool OpenAI is continuing to demonstrate what can be achieved today with established RL techniques and nice distributed compute.

[+] bambax|7 years ago|reply

> a human-like robot hand

But why?? Why should robots' hands resemble human hands? They could have any number of fingers, or tentacles, or magnets, why should they be like human hands??

It seems "AI" really means "as close as possible to human behavior", even if we're not really that clever in said behavior.

Also, human intelligence being at least debatable, it's not obvious that the obsessive imitation of humans is the best way to attain "AI".

[+] dclowd9901|7 years ago|reply

The hand itself is an incredible piece of machinery.

[+] shady-lady|7 years ago|reply

Has the pricing come down on these robotic hands? Anybody have ballpark cost for the Shadow Dexterous Hand - 100k, 300k?

[+] elsewhen|7 years ago|reply

Pricing on robotic hands: http://www.androidworld.com/prod76.htm

[+] preparedzebra|7 years ago|reply

This is a great example of why AI innovation is not moving at the pace we are told to believe. This is using the same basic algorithms we've known about for decades, just more compute and differently formulated problems. We need a paradigm shift!

[+] unknown|7 years ago|reply

[deleted]

[+] habosa|7 years ago|reply

Any comments on why it seems to basically not use the middle finger at all?

[+] Symmetry|7 years ago|reply

That's very impressive. Robotic grasping is getting pretty good[1] but in-hand manipulation is a whole 'nother kettle of fish and this is really exciting.

[1] He said, tooting his employer's horn.

[+] Giho|7 years ago|reply

They should set up an accelerometer and gyroscope in each fingertip instead of pressure sensors. Could then maybe control without a camera.

[+] kylek|7 years ago|reply

I'd like to see it roll a coin on its knuckles. Or maybe some card tricks.

135 comments