top | item 40825731

(no title)

hughleat | 1 year ago

Yep! No GCC on this one. And yep, that's not far off how the pretraining data was gathered - but with random optimisations to give it a bit of variety.

discuss

order

boomanaiden154|1 year ago

Do you have more information on how the dataset was constructed?

It seems like somehow build systems were invoked given the different targets present in the final version?

Was it mostly C/C++ (if so, how did you resolve missing includes/build flags), or something else?

hughleat|1 year ago

We plan to have a peer reviewed version of the paper where we will probably have more details on that. Otherwise we can't give anymore details than in the paper or post, etc. without going through legal which takes ages. Science is getting harder to do :-(