ericjang | 2 years ago
When I joined Brain in 2016, I thought the idea of training billion/trillion-parameter sparsely gated mixtures of experts was a huge waste of resources and incredibly naive. But it turns out Jeff was right, and it took ~6 more years before that was abundantly obvious to the rest of the research community.
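For readers unfamiliar with the idea: the core of sparsely gated mixture-of-experts is that a small gating network picks the top-k of n experts per input, so compute scales with k rather than n even as parameter count grows huge. A minimal toy sketch in numpy (all names, shapes, and the linear "experts" here are illustrative assumptions, not anyone's actual implementation):

```python
# Toy sketch of sparsely gated mixture-of-experts routing.
# Shapes, expert definitions, and constants are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is just a linear map in this sketch.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
w_gate = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(x):
    """Route a single token x of shape (d_model,) to its top-k experts."""
    logits = x @ w_gate                    # gating scores, shape (n_experts,)
    top = np.argsort(logits)[-top_k:]      # indices of the k largest gates
    # Softmax over only the selected experts (the "sparse" part).
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()
    # Only k of n experts ever run, so compute cost scales with k.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(d_model))
```

The payoff is that total parameters grow with `n_experts` while per-token FLOPs stay fixed by `top_k`, which is what makes billion/trillion-parameter models tractable.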
Here's his scholar page (h-index of 94) https://scholar.google.com/citations?hl=en&user=NMS69lQAAAAJ...
As a leader, he also managed the development of TensorFlow and TPU. Consider the context / time frame - the year is 2014/2015 and a lot of academics still don't believe deep learning works. Jeff pivots a >100-person org to go all-in on deep learning, invests in an upgraded version of Theano (TF) and then gives it away to the community for free, and develops Google's own training chip to compete with Nvidia. These are highly non-obvious ideas that show much more spine & vision than most tech leaders. Not to mention he designed & coded large parts of TF himself!
And before that, he was doing systems engineering on non-ML stuff. It's rare to pivot as a very senior-level engineer to a completely new field and then do what he did.
Jeff certainly has made mistakes as a leader (failing to translate Google Brain's numerous fundamental breakthroughs into more ambitious AI products, and to consolidate the redundant big-model efforts in Google Research), but I would consider his high-level directional bets to be incredibly prescient.
HarHarVeryFunny | 2 years ago
I wonder if you know any of the history of exactly how TF's predecessor DistBelief came into being, given that this was during Andrew Ng's time at Google - whose idea was it?
The Pathways architecture is very interesting... what is the current status of this project? Is it still going to be a focus after the reorg, or is it too early to tell?
ericjang | 2 years ago
DistBelief was tricky to program because it was written entirely in C++ and Protobufs, IIRC. The development of TFv1 preceded my time at Google, so I can't comment on who contributed what.
panabee | 2 years ago
1. what was the reasoning behind thinking billion/trillion parameters would be naive and wasteful? perhaps parts of it were right and could inform improvements today.
2. can you elaborate on the failure to translate research breakthroughs, of which there are many, into ambitious AI products? do you mean commercialize them, or pursue something like alphafold? this question is especially relevant. everyone is watching to see if recent changes can bring google to its rightful place at the forefront of applied AI.