lol1lol | 3 years ago

I think you are missing the point.

The hackability quotient of AllenNLP is way low.

I'll give you specific examples where AllenNLP overdid it, while HuggingFace was better just by keeping it simple.

Vocabulary class. HuggingFace just used a Python dictionary. I can't think of one person who said they needed a higher-level abstraction. It turns out a Python dictionary is picklable and saving it to a text file is one line of code, while the AbstractSingletonProxyVocabulary is not, and no one wants to care about it in the first place.
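To make the point concrete, here is a minimal sketch of a vocabulary kept as a plain dict. The tokens, ids, and file name are made up for illustration; this is not code from either library.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical toy vocabulary: a plain dict mapping token -> integer id.
vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 2, "world": 3}

# A plain dict survives a pickle round trip unchanged.
restored = pickle.loads(pickle.dumps(vocab))

# Saving to a text file is a single line: one token per line, in id order.
vocab_path = Path(tempfile.mkdtemp()) / "vocab.txt"
vocab_path.write_text("\n".join(sorted(vocab, key=vocab.get)), encoding="utf-8")

# Loading it back: the line index is the id.
lines = vocab_path.read_text(encoding="utf-8").splitlines()
reloaded = {token: idx for idx, token in enumerate(lines)}
```

Nothing here needs a class hierarchy, and anyone can open the text file and edit it by hand.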

Tokenizer class. HuggingFace just used a Python dictionary to return strings and integers. I can't think of one person frustrated by it. It's printable, picklable, and easy to fiddle with in every other way. And boy, where do I even start on AllenNLP's overdone Tokenizers.
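Roughly what I mean by "a dictionary of strings and integers", as a toy sketch. The `encode` function and the vocabulary are hypothetical and only mimic the shape of the output; this is not the HuggingFace API.

```python
import pickle

# Hypothetical vocabulary, for illustration only.
vocab = {"[UNK]": 0, "the": 1, "cat": 2, "sat": 3}

def encode(text: str) -> dict:
    """Split on whitespace and return tokens plus their integer ids in a plain dict."""
    tokens = text.lower().split()
    return {
        "tokens": tokens,
        # Unknown tokens fall back to the [UNK] id.
        "input_ids": [vocab.get(tok, vocab["[UNK]"]) for tok in tokens],
    }

encoded = encode("The cat sat down")
print(encoded)  # a plain dict prints as-is, no custom __repr__ needed

# And it pickles without any extra work.
roundtrip = pickle.loads(pickle.dumps(encoded))
```

Because the result is an ordinary dict, you can print it, slice the lists, or drop into a debugger without reading any library internals.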

Trainer class vs. HuggingFace example scripts. The scripts are just much more readable, tweakable, debuggable, etc. HF didn't bother with AbstractBaseTrainer class bs.

It just shows they never understood the playing field.

- First, I don't think anyone thought AllenNLP was a good choice for high-performance production systems. Again, HuggingFace clearly understood the problem and built a fast tokenizer in Rust.

- A math, physics, linguistics, or even CS PhD student who knows the basics of coding would prefer bare-bones scripts. They just want to hack something together and focus on research. Writing good code is not their objective.

Just my opinion.

marvinalone|3 years ago

AllenNLP was written for research, not for production. Many of the design choices reflect that.

As far as the vocabulary goes, a lot of AllenNLP components are about experimenting with ways to turn text into vectors. Constructing the vocabulary is part of that. When pre-trained transformers became a thing, this wasn't needed anymore. That's part of why we decided to deprecate the library: Very few people experiment with how to construct vocabularies anymore, so we don't want to live with the complexity anymore.

mountainriver|3 years ago

Hugging Face's APIs really aren't that great; I hear lots of people complain about them. All HF did was make transformers very accessible and shareable with a neat UI.

lol1lol|3 years ago

Last night I was running the run_translator.py script and found that their scripts do not actually allow people to train models from scratch.

But hey, I was able to read the code, fix the one small thing I needed for my case, and run my experiment.

I could never do that in AllenNLP. Go figure.