top | item 47127402

kgeist | 7 days ago

Were those 16 million sessions used only for alignment, chat format, reasoning, etc.? Or is it possible to train a base model on them too? If a single session is at least 32k tokens, that's already 0.5 trillion tokens to train on, interesting.
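The back-of-the-envelope figure in the comment checks out; a quick sketch (assuming exactly 32,000 tokens per session, the lower bound the comment states):

```python
# Rough token-count estimate: 16 million sessions x ~32k tokens each.
sessions = 16_000_000
tokens_per_session = 32_000  # "at least 32k"; use 32_768 if k means 2**10

total_tokens = sessions * tokens_per_session
print(f"{total_tokens:,} tokens (~{total_tokens / 1e12:.2f} trillion)")
# 512,000,000,000 tokens (~0.51 trillion)
```

With the binary interpretation (32,768 tokens per session) the total rises slightly, to about 0.52 trillion; either way, roughly half a trillion tokens.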
