top | item 14217275

(no title)

I'll have to watch this later, but I'd argue the issue, at least for me, isn't really surface level understanding. (At least, the kind I think could plausibly be imparted in 4 minutes. :))

The basic idea of deep learning has always seemed straightforward to me[0]. However, at least my perception is that it feels like there's a lot of deep magic going on in the details at the level that Google/Microsoft/Amazon/researchers are doing deep learning. That's honestly true of most active research areas[1], but since those results are also the results that keep getting a lot of attention, the "it's a black box" feeling makes sense to me. :)

[0] Having done both some moderately high level math and having a CS background, I feel like most ideas in CS fit this description, though. Our devil is the details.

[1] For instance: fairly recent results in weird applications of type theory are also super cool, and require some serious wizardry, but those get much less attention. (And are, I think, more taken for granted, since who doesn't understand a type system? /s)

discuss

trevyn|8 years ago

Having until very recently worked in deep learning at Google, I can assure you that if you read and watch enough recent public papers and talks, you will be very, very close to the latest thinking of researchers at these companies.

You're right that it can take some time to do this edification work and develop the understanding for yourself -- the research is broader and more specialized than it appears at first glance -- and it does help to be surrounded by smart people puzzling over the same types of problems, but there's very little secret magic here. It is, however, of benefit to these companies to develop a public image of exclusivity and wizardry in their research; I fell into this trap too, before I saw how the sausage is made.

If you want to make your own fundamental innovations in deep learning, it can be very resource-intensive, both computationally and otherwise. However, it is easy to apply the current state-of-the-art to a broad spectrum of applications in novel ways.

One of the reasons I left is that I think there is a big opportunity in applying these powerful basic principles and approaches to more domains. The research companies are, IMO, focused on businesses that are or have the potential to become very, very large, and that can take advantage of their ability to leverage massive amounts of capital. This leaves many openings for new medium-sized businesses. Of course, as you grow, you can take stabs at progressively larger problems.

iheartmemcache|8 years ago

I'm with you 100%. RF has been around for ages, but it still is "black magic" to most EEs (after most people finish the standard 100 level courses describing op-amps, people tend to go into the digital domain and leave analog work to that small demographic). One EE will be able to design a fantastic 7 layers of poly, multiprocessing chip in his garage using Cadence and the TSMC 65nm libs, while someone else will be able to design a flawless cavity filter at 16 ghz. People have specific domains of expertise, even when they hold the same "EE" or "CS" or "Math" degree from the same university, largely based on which courses they elected to take in their 3rd and 4th year.

Likewise, fields advance quickly. I can grok how an z80 or 6502 works from NAND to Tetris, but even a mediocre second year grad student would wipe the floor with me. I, too, went pretty far down the road of mathematics, but watching MSRI lectures from the last few years leaves me struggling to keep up, in the field (algebraic topology) where I once felt comfortable. If you don't keep up with your field you're going to be lost.

The reason I think the 'black magic' trope keeps on being bandied about is because most people reading the articles describing ImageNet et al just don't have the background necessary to grok it[1]. If you had asked them a year ago what the convolution operator was, they'd have scratched their head. When they try to go and read that ImageNet paper they'll be left even more confused because the last time they thought about linear algebra was in their freshman year of uni. It'd be analogous to trying to write some computational fluid dynamics modeling software after not having taken/not touching diff eqs for a decade.

[1] This isn't to disparage those who didn't- everyone has their domain of expertise. I'm just trying to emphasize why the conception of 'black magic' exists. It's quite simple - when one has a tenuous grasp on the foundational knowledge upon some theory is built, you will have difficulty learning abstractions built upon said foundations.

trevyn|8 years ago

Ah, this is interesting, because I've recently dabbled a bit in RF. My path went like this:

1) Interested in doing something with RF, don't know much about it, know that people say it's black magic.

2) Do some research... Ah, this is a pretty deep topic, and it might take a while to develop the necessary intuition.

3) Become competent enough to solve my immediate problem, recognize that it is a extensive field in which there is a lot of specialized practical knowledge that could be acquired.

4) Accept that I have higher life priorities than to go down the RF rabbit hole, but feel that I could learn it if I wanted to invest the time. No longer feels like black magic.

I think there is a distinction between fields like deep learning and RF, where most of the information is public if you know where to look, and say, cryptanalysis or nuclear weapon design or even stage magic, where the details and practical knowledge of the state-of-the-art are more locked behind closed doors. And for a field that you're not familiar with, it can be initially unclear which category it falls into. I think the existence of public conferences on the topic is a good indicator, though.

CarlsJrMints|8 years ago

I would love to hear more about these "weird applications of type theory". Any references?

electronvolt|8 years ago

So it turns out you can basically use type theory to encode a surprisingly large number of desirable traits about your program. (Caveat being that as you get more restrictive, you reject more "good" programs at compile time--no free lunch with Rice's theorem.)

For example: In this paper, they basically use types (with an inference algorithm) to catch kernel/user pointer confusion in the Linux kernel. (https://www.usenix.org/legacy/event/sec04/tech/johnson.html)

It turns out you can encode a lot of other interesting properties in a type system (esp. if you're building on top of the existing type system), though--you can ensure that a java program has no null-dereference checks (https://checkerframework.org/ has a system that does this), and Coq uses its type system to ensure that every program halts (as a consequence, though, it isn't actually Turing complete).

There's also cool things like Lackwit (http://ieeexplore.ieee.org/document/610284/) which basically (ab)used type inference algorithms to answer questions about a program ("does this pointer ever alias?", etc.).