johnsutor | 1 year ago

Seems like this is already being answered:

valine|1 year ago

Not really. The first paper is just fine-tuning on synthetic data. The second paper doesn’t optimize the model weights.