(no title)
cec | 2 years ago
We haven’t experimented with model size yet; we just used the same configuration as the smallest Code Llama. We did play with dataset size and found that performance tracks the usual scaling laws. Details are in the paper.
quadrature | 2 years ago