vjsc | 7 years ago

So we had this idea of a new feature for our product. The only way to quickly do it was to somehow implement a machine learning algo and that would give us the result that we wanted. Viola!! It seemed simple.

Now our company doesn't have any machine learning expert or a data science genius. Hiring one would take time. Taking someone on contract would be very expensive (our CEO wasn't ready to shell out that kinda money). So the task fell on me. They asked me to go through the multitudes of machine learning MOOCs out there and get a working prototype ready in 2 weeks.

I had already done Andrew Ng's course back when it came out for the first time. But my memory had faded for the lack of practice.

I went through the course again. I went over a couple of online ML books too.

Then I started thinking of the problem at hand. Unfortunately, it turned out to be a chicken and egg problem. For the feature to work perfectly we needed a large amount of training data to train our models. But without the feature actually deployed, we didn't have any way to collect any training data.

So we ultimately fell back to a simple algo that made its decisions based on a few hard-coded rules. Things have been working fine so far.

hellogoodbyeeee|7 years ago

They gave you two weeks to become a data scientist and implement a working solution? That's nuts. I'm still pretty early career, but I have done data science work for about four years now and I would've quoted at least two months to figure out the data, clean it, engineer features, run models, compare results, and then deliver the best performing solution.

tedivm|7 years ago

And they didn't even have data!

speby|7 years ago

> They gave you two weeks to become a data scientist and implement a working solution? That's nuts.

Oh c'mon. At any large company today, the expectation or deadline for practically anything is "asap" or measured in a few weeks at most. Short-term thinking is a major player in publicly traded companies. That is what opens the door for startups to play the long game.

ellisv|7 years ago

> Unfortunately, it turned out to be a chicken and egg problem. For the feature to work perfectly we needed a large amount of training data to train our models. But without the feature actually deployed, we didn't have any way to collect any training data.

Everyone outside of data science seems really surprised by this and I can't count the number of times someone has asked me to build an algorithm for X but has none of the data to support doing so. It doesn't mean the feature/product can't be built but they often want a supervised learning solution without the cost (and time) of acquiring the ground truth data.

superflyguy|7 years ago

"The only way to quickly do it was to somehow implement a machine learning algo and that would give us the result that we wanted. Viola!!"

Designing the perfect viola using machine learning doesn't sound like it's something for beginners.

MasterScrat|7 years ago

Yes, sorry to be "that guy" as well, but it's voila ("voilà" if you want to be pedantic).

"Viola" either refers to a stringed instrument, or means "raped" in the sense "he raped" ("il viola"). So please don't use it as an interjection.

zwieback|7 years ago

Just take a violin and scale by a factor of 1.2 or so.

dragandj|7 years ago

> The only way to quickly do it was to somehow implement a machine learning algo and that would give us the result that we wanted.

Since none of you had any experience with ML, how did you know that an ML algo (which one?), implemented "somehow", would give you the results you wanted? (Not a cynical comment; I am really interested in hearing about this).

jimcsharp|7 years ago

Not OP, but I went through a similar situation, and the feature was 'alert us about unintuitive correlations in our data so we can invent new KPIs'.

e_ameisen|7 years ago

Co-author here. This is a surprisingly common situation. In fact, starting with the simplest algo is usually the best way to prove the validity of your approach and gather initial data to build a more complex model later.

In addition, getting the feature to “work perfectly” from the get-go is usually quite hard, even with lots of data.

probably_wrong|7 years ago

Maybe it's an instance of "when all you have is a hammer...", because I'm learning about it right now, but you could look into transfer learning: you train an ML model on a similar, easier task, and then you tweak it with your data.

That said, there's a good chance that your current algorithm is all you will ever need. Many times an ML project is too much, and you already have good results.
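For anyone unfamiliar with the idea, here is a toy sketch of the pretrain-then-tweak pattern described above. It is pure Python with a tiny logistic regression, and the two tasks, `make_task`, and every other name are made up for illustration; a real project would more likely start from a published pretrained model in the same domain.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, x):
    # weights[0] is the bias; the rest pair up with the features in x.
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return sigmoid(z)

def train(data, weights, epochs, lr):
    # Plain stochastic gradient descent on the log loss; returns new weights.
    weights = list(weights)
    for _ in range(epochs):
        for x, y in data:
            err = predict(weights, x) - y
            weights[0] -= lr * err
            for i, xi in enumerate(x):
                weights[i + 1] -= lr * err * xi
    return weights

def accuracy(weights, data):
    # Fraction of examples whose thresholded prediction matches the label.
    hits = sum((predict(weights, x) >= 0.5) == (y == 1) for x, y in data)
    return hits / len(data)

def make_task(rng, n, offset):
    # Hypothetical toy task: the label is 1 when x0 + x1 > offset.
    points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [((a, b), 1 if a + b > offset else 0) for a, b in points]

rng = random.Random(0)
source = make_task(rng, 500, 0.0)       # plentiful data for the related task
target = make_task(rng, 10, 0.2)        # only a handful of labels for our task
target_test = make_task(rng, 200, 0.2)  # held-out data for our task

# Step 1: train on the easier, data-rich task.
pretrained = train(source, [0.0, 0.0, 0.0], epochs=30, lr=0.1)
# Step 2: start from those weights and tweak them with the few target labels.
fine_tuned = train(target, pretrained, epochs=30, lr=0.1)
# Baseline for comparison: same few labels, no pretraining.
from_scratch = train(target, [0.0, 0.0, 0.0], epochs=30, lr=0.1)

print("pretrained on source task:", accuracy(pretrained, source))
print("fine-tuned on target test:", accuracy(fine_tuned, target_test))
print("from scratch on target test:", accuracy(from_scratch, target_test))
```

The only "transfer" here is that `fine_tuned` starts from `pretrained` instead of zeros. Whether that helps on a real problem depends on how close the two tasks actually are.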

minimaxir|7 years ago

Transfer learning only works if the original model is in the same domain (e.g. ImageNet for images, GloVe for text). A bespoke problem likely won't have a widely-available original model.

bonniemuffin|7 years ago

That seems fine to me. It's a good practice to start with hard-coded business rules instead of any kind of model, just to test the waters, collect some data, and see if a new feature even makes sense, before diving into building even the simplest model.
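A minimal sketch of what that can look like, assuming a made-up feature: the thread never says what the product decision was, so the `Request` fields, the thresholds, and the outcome labels below are all hypothetical. The point is only that the rules make the decision today while every input and eventual outcome gets logged for a possible model later.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Request:
    # Hypothetical input fields, invented for this example.
    amount: float
    is_new_customer: bool

def decide(req):
    # Hard-coded business rules standing in for a model (thresholds are made up).
    if req.amount > 1000:
        return "review"
    if req.is_new_customer and req.amount > 200:
        return "review"
    return "approve"

def decide_and_log(req, log):
    # Record the inputs and the rule's decision so they can become training rows.
    decision = decide(req)
    log.append({"features": asdict(req), "decision": decision, "outcome": None})
    return decision

def record_outcome(log, index, outcome):
    # Fill in the ground truth once it is known (e.g. after a manual review).
    log[index]["outcome"] = outcome

def training_rows(log):
    # Only entries with a known outcome are usable as labeled examples.
    return [(e["features"], e["outcome"]) for e in log if e["outcome"] is not None]

log = []
decide_and_log(Request(amount=50.0, is_new_customer=False), log)
decide_and_log(Request(amount=500.0, is_new_customer=True), log)
record_outcome(log, 1, "fraud")
print(json.dumps(training_rows(log)))
```

Once enough labeled rows pile up, they are the ground truth a supervised model would need, and the rules remain as a baseline to compare it against.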

chippy|7 years ago

I've been talking to academic neural net / ML experts in computer vision and OCR / NLP and the thing they try to stress is that for almost all cases an algorithmic approach works better.

yonkshi|7 years ago

I don't think most ML experts would agree with that. A big reason DL became popular is the huge improvements it brought to the CV and NLP fields.

In many ways, traditional approaches were harder because you need a huge amount of domain expertise in CV & NLP, whereas an ML expert can solve simple CV problems with almost no domain knowledge.

Now, for a lot of business data, especially time series data, I agree that an algorithmic/heuristic approach is easier and more robust. E.g. recommendation systems.

aaronblohowiak|7 years ago

While not ML, it still would have been considered a form of AI back in the day: “expert systems”, they used to call it :)