
primordialsoup | 2 years ago

This is very interesting work, but it's not really an LLM. It doesn't have language abilities. They should have called it a seq2seq model, but I think that term is not in vogue these days :)


cec | 2 years ago

We use the same architecture as other LLMs, but we include no natural language in our pretraining. We figured a single-domain training corpus would make evaluation easier. We'll be looking at layering this on top of something like Code Llama next.