Why does the pretraining question matter for the ISPD 2023 paper? As the ISPD 2023 authors note in their rebuttal of the rebuttal, the circuit_training repo itself claims that training from scratch is "comparable or better" than fine-tuning the pre-trained model. So whatever your opinion on the importance of the pretraining step, that claimed result isn't replicable with what's public, at which point the ball is in Google's court to release code/checkpoints showing otherwise.
negativeonehalf|1 year ago
marcinzm|1 year ago
> Results
> Ariane RISC-V CPU
> View the full details of the Ariane experiment on our details page. With this code we are able to get comparable or better results training from scratch as fine-tuning a pre-trained model.
The paper includes a graph showing that it takes longer for Ariane to train without pre-training; however, the results in the end are the same.
anna-gabriella|1 year ago