top | item 39698258


occamrazor | 1 year ago

Note that the model is based on RoBERTa and has only 125M parameters. It is not competing against any of the new popular models, not even small ones like Phi or Gemma.


jerpint|1 year ago

It’s also not meant to be a generative model, only an encoder model (they list retrieval as a potential use case).
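The retrieval use case mentioned above typically works by embedding the query and documents with the encoder, then ranking documents by cosine similarity. A minimal sketch of that ranking step, using stand-in NumPy vectors in place of actual encoder outputs:

```python
import numpy as np

def cosine_rank(query_vec, doc_vecs):
    """Rank documents by cosine similarity to a query embedding.

    query_vec: 1-D array, the query embedding.
    doc_vecs:  2-D array, one document embedding per row.
    Returns document indices sorted from most to least similar.
    """
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q  # cosine similarity of each doc to the query
    return np.argsort(-sims)

# Stand-in embeddings; a real pipeline would produce these by
# feeding text through the encoder and pooling its hidden states.
docs = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [0.9, 0.1]])
query = np.array([1.0, 0.0])
print(cosine_rank(query, docs))  # → [0 2 1]
```

The encoder itself never generates text in this setup; it only maps strings to vectors that make similarity search possible.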

3abiton|1 year ago

Given the current state of LLMs, I am not even sure this qualifies to be called an LLM.

mistrial9|1 year ago

Second opinion: the BERT family are transformer-based, and that is a big threshold right there. Also, I am not sure that two one-minute comments could capture what exactly went on with fine-tuning, graph-based constraint methods, or whatnot, with respect to the fitness of the production tools for their intended purposes.