The amount of capital needed to train these high-quality models is eye-watering (not to mention the cost of acquiring the data). Does anyone know of any well-capitalized startups exploring this space?
The data here is effectively free. I don't think they would exhaust The Pile, which you can download for free. This is also true for text2image models like DALL-E 2: while OpenAI may have invested in its own datasets, everyone else can just download LAION-400M (or, if they are really ambitious, LAION-5B: https://laion.ai/laion-5b-a-new-era-of-open-large-scale-mult... ).
OpenAI would be the best example. However, these large language models also have limited business value today, making a startup a speculative bet that the team will both beat Google/FB/AI/Academics at making a language model and find a viable business model for the resulting model.
I'd take one of those bets or the other; both are tough to pull off. Considering that the first task of such a startup would be to hand ~100-500MM to a hardware or cloud vendor, I'd be hesitant to invest.
[+] [-] vackosar|4 years ago|reply
[+] [-] phoe18|4 years ago|reply
[+] [-] azinman2|4 years ago|reply
[+] [-] vackosar|4 years ago|reply
[+] [-] simulate-me|4 years ago|reply
[+] [-] visarga|4 years ago|reply
It's relative. It would cost more to open a 40-room hotel (about 320k/room), and hotels can't be copied like software.
[+] [-] gwern|4 years ago|reply
[+] [-] lumost|4 years ago|reply
[+] [-] rafaelero|4 years ago|reply
[+] [-] vackosar|4 years ago|reply
[+] [-] PaulHoule|4 years ago|reply
[+] [-] vackosar|4 years ago|reply
[+] [-] imranq|4 years ago|reply