mlpro | 2 months ago

They are not trained on the same data; even a skim of the paper shows the datasets are largely disjoint.

The LLMs are finetuned on very different corpora. I checked: some are finetuned on Chinese-language data, others on math. The shared pretrained model just provides a good initialization. I'm convinced.
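
For anyone picturing the setup: a minimal toy sketch of that training scheme, assuming only shared pretrained weights and two disjoint finetuning corpora. Everything here (the stand-in model, the synthetic corpora) is hypothetical, not the paper's actual setup.

    import copy
    import torch
    import torch.nn as nn

    # Toy stand-in for a pretrained LM: the point is only that both
    # finetunes start from the *same* initialization, then see disjoint data.
    torch.manual_seed(0)
    base = nn.Sequential(nn.Embedding(100, 32), nn.Flatten(), nn.Linear(32 * 8, 100))

    def finetune(model, sample_batch, steps=50, lr=1e-3):
        # Standard supervised finetuning loop over batches from one corpus.
        opt = torch.optim.AdamW(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(steps):
            x, y = sample_batch()
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        return model

    # Disjoint "corpora": token ids drawn from non-overlapping vocabulary ranges.
    def corpus_a():
        return torch.randint(0, 50, (16, 8)), torch.randint(0, 50, (16,))

    def corpus_b():
        return torch.randint(50, 100, (16, 8)), torch.randint(50, 100, (16,))

    model_a = finetune(copy.deepcopy(base), corpus_a)  # e.g. the "Chinese" finetune
    model_b = finetune(copy.deepcopy(base), corpus_b)  # e.g. the "math" finetune

Each finetune diverges from the same checkpoint, so any similarity between model_a and model_b comes from the initialization, not the finetuning data.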
