It's hard to tell whether they're telling the truth about how many GPUs they have. They open-sourced the model, and its inference is much more efficient than that of the best American models, so it's not implausible that training was also much more efficient.