A new state-of-the-art open source chatbot

36 points| olibaw | 5 years ago |ai.facebook.com

7 comments

drusepth|5 years ago

According to the "Get the code" link [1], it looks like these models need pretty huge GPUs to even interact with the pre-trained models. Is that abnormal? I was under the impression that training the model is generally what takes the beefy GPU, and then using that model requires more consumer-adjacent hardware. A P100 GPU is $3000 [2].

[1] https://parl.ai/projects/blender/

[2] https://www.amazon.com/dp/B06WV7HFWV/

rahimnathwani|5 years ago

These are very big models, like 100x to 300x the number of parameters of ResNet-50.

2.7bn parameters (for the smaller model) means you have to do roughly 2.7bn multiply-add calculations for a single step of the model. You could fit the model in main memory, but how long is it going to take you to run all those calculations on a CPU? And the full model will need to run multiple times to output a single sentence, once per generated token.
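
To put rough numbers on the point above, here is a back-of-envelope sketch in Python. Only the 2.7bn parameter count comes from the thread. The throughput figures, the 20-token sentence length, and the rule of thumb of about 2 floating-point operations (one multiply, one add) per parameter per generated token are assumptions for illustration, not measurements of this model.

```python
# Back-of-envelope estimate of memory and compute for a dense model.
# All hardware numbers below are hypothetical placeholders.

def model_memory_gb(n_params: int, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in gigabytes (1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def seconds_per_token(n_params: int, flops_per_second: float) -> float:
    """Compute-bound time for one generation step.

    Assumes ~2 floating-point operations (multiply + add) per parameter,
    and ignores memory bandwidth, which is often the real bottleneck.
    """
    flops_per_token = 2 * n_params
    return flops_per_token / flops_per_second


N_PARAMS = 2_700_000_000          # smaller model size quoted in the thread
ASSUMED_CPU_FLOPS = 100e9         # hypothetical sustained CPU throughput
ASSUMED_GPU_FLOPS = 10e12         # hypothetical sustained GPU throughput
ASSUMED_TOKENS_PER_SENTENCE = 20  # hypothetical reply length

if __name__ == "__main__":
    print(f"fp32 weights: {model_memory_gb(N_PARAMS, 4):.1f} GB")
    print(f"fp16 weights: {model_memory_gb(N_PARAMS, 2):.1f} GB")
    for name, flops in [("CPU", ASSUMED_CPU_FLOPS), ("GPU", ASSUMED_GPU_FLOPS)]:
        per_token = seconds_per_token(N_PARAMS, flops)
        per_sentence = per_token * ASSUMED_TOKENS_PER_SENTENCE
        print(f"{name}: {per_token:.4f} s/token, {per_sentence:.2f} s/sentence")
```

Under these assumptions the weights alone take several gigabytes, which is why GPU memory, not just raw speed, determines which cards can run the model at all.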

shermanmccoy|5 years ago

Boiling it all down, when prompted, these models just regurgitate a similar sentence to what is observed in the training data for loosely that same input, using some glorified curve fitting. This does not necessarily imply the model understands the meaning of what it is spitting out. So the uninitiated will be really impressed with this kind of toy.

The researchers here appear to have placed particular emphasis on cleaning up what the model is spitting out, but I think it's lipstick on a pig. The area begging for more research is parsing out the meaning of anything but the simplest sentences.

AndrewKemendo|5 years ago

>Boiling it all down, when prompted, these models just regurgitate a similar sentence to what is observed in the training data for loosely that same input, using some glorified curve fitting

This is not that much different than what you do.

What criteria would you use to determine if something understands the meaning of a word/phrase/concept that isn't a string of definitions and metaphors? And what level of understanding is sufficient?

Attempting to prove that something "understands the meaning" is a fruitless task with no quantifiable criteria - much like proving something is "conscious."