arikrak | 2 months ago
I wouldn't have expected there to be enough text from before 1913 to properly train a model; it seemed like the first successful LLMs needed an internet's worth of text to train on.

alansaber | 2 months ago
This model is more comparable to GPT-2 than anything we use now.