[+] [-] gerdesj|4 years ago|reply
Twenty-odd years ago I worked on someone else's helpdesk doing application and PC support. Sometimes the calls that came in were a bit more interesting than "Extra Term is blinking at me" or "Sophos ate my Word doc".
A call (ticket) came in along the lines of "Need help with a neural network I am developing in Excel". I bit. I showed the end user (who was orders of magnitude cleverer than me) a bit about VBA, and some reasonably good habits: Option Explicit, making sure all eventualities are covered in If clauses, keeping your inputs, working and outputs separate, and summing north/south and east/west to compare. Document the bloody thing! If I recall correctly, it was a Hopfield thingie, so I'm pretty off topic here.
The next call was for help with an Access database with fifty-odd tables, all linked to each other in a way that must surely have been designed to invoke Cthulhu. I deleted it and we started again! My call notes were odd enough to get a mention at the next ops meeting.
The product is still flying, so all good.
Convoluted ... what?
[+] [-] sillysaurusx|4 years ago|reply
Actually, you’re spot on. Hopfield networks were once popular, and are now popular again: https://arxiv.org/abs/2008.02217
(Hopularity leads to popularity, as they say.)
I wish there were some way to track down that spreadsheet. :) It'd be neat to read. Thanks for sharing.
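For anyone curious what a spreadsheet-sized Hopfield network actually involves: a minimal sketch of the classic binary version, with Hebbian storage and iterated sign updates. This is an illustrative toy (NumPy, one stored pattern, all names invented here), not the modern variant from the linked paper.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian storage: W is the sum of outer products of +/-1 patterns, zero diagonal."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W / n

def recall(W, state, steps=10):
    """Synchronous sign updates until a fixed point (or the step limit)."""
    for _ in range(steps):
        new = np.sign(W @ state)
        new[new == 0] = 1  # break ties toward +1
        if np.array_equal(new, state):
            break
        state = new
    return state

# Store one 8-bit pattern and recover it from a corrupted copy.
pattern = np.array([1, -1, 1, 1, -1, -1, 1, -1])
W = train_hopfield(pattern[None, :])
noisy = pattern.copy()
noisy[0] *= -1  # flip one bit
print(recall(W, noisy))  # recovers the stored pattern
```

With a single stored pattern, one flipped bit is corrected in a single update step; capacity degrades quickly as more patterns are stored.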
[+] [-] woliveirajr|4 years ago|reply
Please add a warning that this makes people spit out their coffee
[+] [-] amznbyebyebye|4 years ago|reply
Multi-head self-attention seems to be the new trendy architectural primitive.
[+] [-] optimalsolver|4 years ago|reply
We've been using these methods for years, and I doubt the first major development in NN vision (convolution) is the best possible method.
Consider the extreme case of searching over all (differentiable) mathematical operations to see if something really novel can be discovered.
How feasible would this be?
[+] [-] IdiocyInAction|4 years ago|reply
I don't know how feasible it would be - I guess you could take a set of base operations (matrix multiplication, softmax, etc.), randomly generate feature transformations, and check whether any of them yield good features (stick a linear readout at the end and test the performance on some downstream tasks).
That would be an unguided search - you could try something like a genetic algorithm. Also, it uses neural network training as an inner-loop step, so it would probably be too expensive. Better would be if you could somehow get the gradient w.r.t. the tentative operation.
The problem is that training NNs is nontrivial and you might need things like BatchNorm and residual connections to make things stable, so you'd have to search for good architectures for each operation as well.
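The unguided-search idea can be sketched in a few lines. Everything here is an illustrative assumption (the op set, the depth, the least-squares readout, the toy two-blob task), not a known-good recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def make_random_op(d):
    """Sample one primitive op acting on (n, d) feature matrices."""
    choice = rng.integers(0, 3)
    if choice == 0:
        return lambda x: np.maximum(x, 0)            # ReLU
    if choice == 1:
        return softmax
    W = rng.standard_normal((d, d)) / np.sqrt(d)     # fixed random projection
    return lambda x: x @ W

def random_transform(d, depth=3):
    """Compose a few randomly chosen primitive ops into one candidate transformation."""
    ops = [make_random_op(d) for _ in range(depth)]
    def f(x):
        for op in ops:
            x = op(x)
        return x
    return f

def readout_score(feats, y):
    """Fit a linear readout by least squares and report training accuracy."""
    A = np.hstack([feats, np.ones((len(feats), 1))])  # bias column
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return ((A @ w > 0.5) == (y > 0.5)).mean()

# Toy task: two Gaussian blobs in 8 dimensions.
X = np.vstack([rng.normal(-1.0, 1.0, (50, 8)), rng.normal(1.0, 1.0, (50, 8))])
y = np.array([0.0] * 50 + [1.0] * 50)

scores = [readout_score(random_transform(8)(X), y) for _ in range(20)]
best = max(scores)
print(f"best readout accuracy over 20 random transforms: {best:.2f}")
```

A real version would need held-out evaluation, harder tasks, and a guided search (GA or gradients) instead of blind sampling, but the structure is the same: propose an operation, score it with a cheap readout, keep the best.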
[+] [-] jellyksong|4 years ago|reply
[+] [-] gspr|4 years ago|reply
At least there is work on greatly generalizing convolutions. They're much more broadly applicable (in neural networks) to very differently structured data than they appear to be in their standard form. (The "in neural networks" qualifier is there because quite a bit of this has been understood about the mathematical operation of convolution for a long, long time.)
Some recent developments:
* https://arxiv.org/abs/2010.03633 (disclosure: I'm one of the authors)
* https://arxiv.org/abs/2012.06333
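One way to see why convolution generalizes so readily: a circular convolution is just multiplication by a circulant matrix, i.e., a dense linear map with tied, shift-structured weights; the generalizations change what structure the weights are tied to. A quick NumPy check of that equivalence (illustrative only, 1D case):

```python
import numpy as np

def circulant(c):
    """n x n circulant matrix with first column c: C[i, j] = c[(i - j) % n]."""
    n = len(c)
    return np.stack([np.roll(c, j) for j in range(n)], axis=1)

rng = np.random.default_rng(1)
x = rng.standard_normal(8)
kernel = np.zeros(8)
kernel[:3] = [1.0, 2.0, 3.0]  # a 3-tap filter, zero-padded to length 8

C = circulant(kernel)
via_matrix = C @ x  # convolution as one structured matrix multiply
via_fft = np.fft.ifft(np.fft.fft(kernel) * np.fft.fft(x)).real  # convolution theorem

print(np.allclose(via_matrix, via_fft))  # prints: True
```

The FFT identity is what ties convolution to harmonic analysis, which is roughly the direction the generalizations to graphs and manifolds take.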
[+] [-] xiaodai|4 years ago|reply
Another "beginner" intro that starts by describing fully connected layers and neurons, and doesn't explain why we need NNs in the first place.
[+] [-] tracyhenry|4 years ago|reply
Although deep NNs are not very interpretable, there are good intuitions behind their designs. Articles like this one only make deep learning more mysterious.
[+] [-] giu|4 years ago|reply
I was thinking the same; there are so many articles explaining the basics.
For me it would be more helpful to start off with a real-life scenario where the method can be applied and might even excel compared to other methods; bonus points if you also explain what properties of the method make it so well-suited to that specific scenario.
There are so many methods in data science / machine learning, and from what I remember from my university days, one of the difficult tasks was knowing when to use which method, depending on the properties of your data and on what you want to achieve. Additionally, you sometimes also need to optimize the method's hyperparameters, and that's almost a whole separate discipline in itself.
Nonetheless, the posted article contains a lot of valuable information for a beginner, so it's definitely a good start.
[+] [-] mnky9800n|4 years ago|reply
Yes. In most problems I face, I find gradient boosting is at least as good as, if not better than, any neural network, and much easier to implement and explain.
[+] [-] antman|4 years ago|reply
[+] [-] hmwhy|4 years ago|reply
1. Some companies found a way to promote this type of blog article past a certain threshold so it stays on the front page long enough for... profits?
2. The demographics of HN have changed substantially in recent times, so copypasta articles that don't add anything new to existing, better sources are actually valuable to them.
Edit: typo.
[+] [-] atum47|4 years ago|reply
I'm inclined to agree. I've been meaning to take the time to implement a CNN from scratch, so I went straight to this article hoping it would have some code, but no: just the same content over again.
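For the from-scratch itch, here is a minimal sketch of what a single convolution layer's forward pass computes, in plain NumPy (shapes, the single-channel input, and the example kernel are all illustrative; no bias, nonlinearity, or training loop):

```python
import numpy as np

def conv2d(image, kernels, stride=1):
    """Valid cross-correlation of an (H, W) image with (K, kh, kw) kernels,
    returning (K, H_out, W_out) feature maps -- what a conv layer computes
    before adding a bias and applying a nonlinearity."""
    H, W = image.shape
    K, kh, kw = kernels.shape
    H_out = (H - kh) // stride + 1
    W_out = (W - kw) // stride + 1
    out = np.zeros((K, H_out, W_out))
    for k in range(K):
        for i in range(H_out):
            for j in range(W_out):
                patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
                out[k, i, j] = np.sum(patch * kernels[k])
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
edge = np.array([[[1.0, -1.0], [1.0, -1.0]]])  # one 2x2 vertical-edge kernel
feat = conv2d(img, edge)
print(feat.shape)  # one 7x7 feature map
```

In a trained CNN the kernel entries are learned rather than hand-set, and real layers add channels, padding, and vectorized patch extraction, but the sliding dot product above is the whole core idea.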
[+] [-] jonathanhild|4 years ago|reply