napsternxg | 1 year ago
I had a similar experience a few years back when participating in ML competitions [1, 2] for detecting and typing phrases in text. I submitted an approach based on Named Entity Recognition using a Conditional Random Field (CRF), which has been quite robust and well known in the community, and my solution beat most of the tuned deep learning solutions by quite a large margin [1].
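CRF taggers for NER lean on hand-crafted token features rather than learned representations. A minimal sketch of the kind of feature template typically used (the specific features here are illustrative, not the author's actual template):

```python
# Illustrative hand-crafted token features for CRF-based NER.
# CRF libraries (e.g. sklearn-crfsuite) consume one feature dict per token.
def token_features(sentence, i):
    word = sentence[i]
    feats = {
        "word.lower": word.lower(),
        "word.isupper": word.isupper(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],  # crude morphology signal
    }
    if i > 0:
        feats["prev.lower"] = sentence[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        feats["next.lower"] = sentence[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sentence
    return feats

sentence = ["Apple", "acquired", "Shazam"]
X = [token_features(sentence, i) for i in range(len(sentence))]
```

Capitalization, affix, and neighboring-word features like these are a large part of why CRF baselines stay competitive on entity tasks.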
I think a lot of folks underestimate the complexity of using some of these models (DL, LLMs) and just throw them at the problem, or don't compare them properly against well-established baselines.
[1] https://scholar.google.com/citations?view_op=view_citation&h... [2] https://scholar.google.com/citations?view_op=view_citation&h...
PaulHoule | 1 year ago
I have a BERT + SVM + Logistic Regression (for calibration) model that can train 20 models for automatic model selection and calibration in about 3 minutes. I feel like I understand the behavior of it really well.
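That pipeline shape can be sketched roughly as follows, with random vectors standing in for frozen BERT embeddings (this is an assumed reconstruction, not the author's exact code):

```python
# Sketch: frozen embeddings -> linear SVM -> logistic (Platt) calibration,
# mirroring the BERT + SVM + Logistic Regression setup described above.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic 768-d features standing in for BERT sentence embeddings.
X, y = make_classification(n_samples=400, n_features=768, random_state=0)

svm = LinearSVC(C=1.0, max_iter=5000)
# method="sigmoid" fits a logistic regression on the SVM margins
# (Platt scaling), turning raw decision scores into calibrated probabilities.
clf = CalibratedClassifierCV(svm, method="sigmoid", cv=3)
clf.fit(X, y)

proba = clf.predict_proba(X[:5])
```

Because only the cheap SVM head and calibrator are trained per model, sweeping 20 candidate configurations in a few minutes is plausible on CPU alone.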
I've tried fine-tuning a BERT for the same task: the shortest model builds take 30 minutes, and the training curves make no sense (back in the day I used to be able to train networks with early stopping and get a good one every time). If I look at arXiv papers, it is rare for anyone to have a model selection process with any discipline at all; mainly people use a recipe that sorta-kinda seemed to work in some other paper. People scoff at you if you ask the engineering-oriented question: "What training procedure can I use to get a good model consistently?"
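The classical early-stopping discipline referenced here is simple to state: stop once validation loss hasn't improved for a fixed patience, and keep the best checkpoint. A minimal sketch (the toy loss curve and names are illustrative):

```python
# Classical early stopping: track best validation loss, stop after
# `patience` epochs without improvement, return the best checkpoint.
def train_with_early_stopping(val_losses, patience=3):
    best_loss, best_epoch, waited = float("inf"), -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # no improvement for `patience` epochs in a row
    return best_epoch, best_loss

# Toy validation curve: improves, then plateaus and starts overfitting.
curve = [0.9, 0.7, 0.55, 0.50, 0.52, 0.53, 0.54, 0.60]
best_epoch, best_loss = train_with_early_stopping(curve)  # -> (3, 0.50)
```

With a well-behaved loss curve this procedure reliably picks the pre-overfitting checkpoint; the complaint above is that fine-tuning curves are often too noisy for it to behave this predictably.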
Because of that I like classical ML.
korkybuchek | 1 year ago
abhgh | 1 year ago
If I am building a set of models for a domain, I might fine-tune the representation layer. On a per-model basis I typically just train the SVM and calibrate it. For the amount of time this whole pipeline takes (not counting the occasions when I fine-tune), it works amazingly well.
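The workflow described, embed once with a (possibly fine-tuned) representation layer, then train a cheap calibrated head per model, might look like this (the `embed` function is a hypothetical stand-in for the frozen encoder, and the tasks and labels are synthetic):

```python
# Sketch: compute embeddings once, reuse them to train a fast
# calibrated SVM per task. `embed` stands in for a frozen encoder.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

def embed(texts):
    # Placeholder: a real pipeline would run the frozen/fine-tuned
    # representation layer here.
    return rng.normal(size=(len(texts), 32))

texts = [f"doc {i}" for i in range(200)]
X = embed(texts)  # computed once, shared across all downstream models

task_labels = {
    "spam": rng.integers(0, 2, size=200),
    "urgent": rng.integers(0, 2, size=200),
}
models = {}
for task, y in task_labels.items():
    # Per task, only the SVM + calibrator is (re)trained, which is fast.
    models[task] = CalibratedClassifierCV(
        LinearSVC(max_iter=5000), method="sigmoid", cv=3
    ).fit(X, y)
```

Amortizing the expensive representation step across many cheap heads is what keeps the per-model cost low.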
shortrounddev2 | 1 year ago
When I presented the proposal, nobody had read it, and the meeting immediately turned into the VP of engineering and the CEO discussing neural networks and some other ML system they had read about on HN the day before. When I tried to bring up collaborative filtering again, the VP said "I don't know what that is," so obviously he hadn't read the doc I had been assigned to write over the previous week.
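For reference, since the term never got a hearing in the meeting: collaborative filtering recommends items by exploiting similarity patterns in a users-by-items ratings matrix. A minimal item-based sketch (the toy ratings are illustrative):

```python
# Minimal item-based collaborative filtering: score items for a user
# by weighting their existing ratings with item-item cosine similarity.
import numpy as np

R = np.array([        # rows: users, cols: items; 0 = unrated
    [5, 4, 0, 0],
    [4, 5, 1, 1],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)

# Item-item cosine similarity matrix.
norms = np.linalg.norm(R, axis=0)
sim = (R.T @ R) / np.outer(norms, norms)

def scores(user):
    """Predicted affinity for every item from the user's existing ratings."""
    return R[user] @ sim

user = 0
unrated = np.where(R[user] == 0)[0]
best = unrated[np.argmax(scores(user)[unrated])]  # top recommendation
```

No neural network required: the whole recommender is one similarity matrix and a dot product, which is exactly why it makes a strong, explainable baseline.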
sieabahlpark | 1 year ago
[deleted]