Good question. It's not just ibm14: everything people outside Google have tried (NVDLA, BlackParrot, etc.) shows that RL is much worse than prior methods. There is a strong possibility that Google pre-trained RL on certain TPU designs, then tested on those same designs, and submitted the results to Nature.