I put a lot of effort into my undergraduate thesis, but none of the professors on my committee had much interest in advising me. After my defense, the only professor who really gave me his undivided attention came to me and said, “I’m glad you’re not staying here for grad school; you’re way too good for this place.”
An underappreciated aspect of this is finding an academic department that would allow you to submit something this concise as a senior thesis.
My experience, mostly in grad school, was that anyone editing my work wanted more verbiage. If you only needed a short, one-sentence paragraph to say something, it just wasn’t accepted. There had to be more.
Jeff Dean is an uncommonly good communicator. But he also benefited from being allowed, perhaps even encouraged, to prioritize effective and concise communication.
Most people aren’t so lucky, and end up learning that this type of concision will not go over well. People presume you’re writing like a know-it-all, or that you didn’t do due diligence on prior work.
I _never_ got that feedback. My mentors all emphasized economy of language and nobody cared how "thick" my thesis was.
This is a pretty amusing story about verbiage.
Back in the old days, you would send a manuscript/research article to colleagues/friends by _snail-mail_ to get their feedback. You'd wait a month, and maybe they would mail a 'red-inked' copy of your manuscript back to you.
My Ph.D. advisor sent out a draft to a colleague who was famous for being harsh with the red-ink.
After a month, my advisor receives the manuscript in the mail.
* He turns to page 1. No red ink!
* He turns to page 2. STILL no red ink! [He must looove the paper]
* Keeps turning pages (no red ink!!).
* On page 10--in red ink--is written, "Start here."
HN doesn't really encourage memes and other jokes, but this is printed on the wall of the UBC grad lounge, and it seems depressingly accurate: https://imgur.com/gallery/wM7udMU
It's kind of remarkable. There really is no literature review in this paper. As a supervisor I would have no problem with a content part of this length, but I would also insist on doing the scholarly work that is not demonstrated here. Don't just throw out some code and describe it; put it into the context of what exists. Give credit where ideas originate.
That shouldn't add too much. No more than a few pages. It would still be concise, but then also a scientific work.
My undergrad project report had to meet a minimum page count (about 300, as I recall). I remember filling the report with the W3C specification of HTTP, Wikipedia articles, and whatnot to convince the professor that I had done some "work" to build the project (it was based on using an interactive genetic algorithm to generate CSS files for webpages).
Also, I had to submit 3 identical hard-bound copies of that bullshit report.
I had to write a 10k word undergrad final paper for law school (in Europe, law school is a regular university study, with a 3 year LLB and 1 or 2 year LLM). At around 8k or so, I went to my supervisor and said 'look, I've said everything I wanted to say, and in a drawn out way already, which I'm not happy about. If I have to add more, I will need to start another topic, and I'd rather keep this paper focused and continue that other subject in another paper. What do you want me to do?'. Then my supervisor said 'I'll let you in on a dirty secret in legal writing. When you need to hit a word count, you play with the precedent citations here and there. Go back to your desk, cite an extra sentence before and after every citation you have in there, and tadaa you're done'. Turned out that I had to include an extra paragraph here or there but still, golden advice :) And somewhat applicable to other fields as well, if you build in this sort of safety net from the start...
I'm not sure what a senior thesis is but my undergraduate thesis was I think 34 pages long. (Excluding the source code listing)
I had a friend whose advisor made them make everything longer the way you describe; theirs was in excess of 100 pages. (IIRC this advisor had suggested that while the guidelines say ~50 pages, this was the bare minimum sufficient for a pass.)
I have in the past been subadvisor to various bacc. theses.
I value conciseness dearly, and prefer quality over quantity in scientific writing, i.e. I would accept incredibly short theses, if the content is sufficiently presented (reproducible and comprehensive), and most of all, contains a valuable contribution.
The reason I typically have to request "more verbiage" and a dedicated section on the state of the art is that I need to force my students to confront their sitcom ideas with the history of "what has been done before, and what the actual current problems are".
Unfortunately, the approaches of most students are neither new nor particularly interesting in this regard.
I'm writing my master's thesis, and I have been strongly encouraged to keep it short so that it can be submitted as a paper. So mine will end up at around 14 pages, which I still feel is somewhat long.
I guess it's not totally surprising that Dean's undergrad thesis was on training neural networks and the main choice was between-graph or in-graph replication. This is still one of the big issues with TensorFlow today.
One thing most people don't get is that Dean is basically a computer scientist with expertise in compiler optimizations, and TF is basically an attempt at turning neural network speedups into problems related to compiler optimization.
I'd like to thank my undergrad university for hosting my undergrad thesis for 25 years with only 1-2 URL changes. Some interesting details include: Latex2Html held up, mostly, for 25 years and several URL changes. The underlying topic is still relevant (training the weight coefficients of a binary classifier to maximize performance) to my work today, even if I didn't understand gradient descent or softmax at the time.
I wonder who his advisor was back then, because I don't think it's mentioned in the thesis. Or he did this on his own, which would not be surprising, by the way.
They were very much in vogue at the time. This was just after backprop was coming into its own, and before ANNs were totally surpassed by SVMs, boosting, ensembles, etc.
This was just before the second AI winter. It involved neural networks, Prolog, Lisp, fuzzy logic, Japan overtaking the US in AI, etc.
Lots of good work with neural networks was done back then:
A learning algorithm for Boltzmann machines
DH Ackley, GE Hinton, TJ Sejnowski - Cognitive Science, 1985
Learning representations by back-propagating errors
DE Rumelhart, GE Hinton, RJ Williams - Nature, 1986
Phoneme recognition using time-delay neural networks
A Waibel, T Hanazawa, G Hinton, K Shikano, KJ Lang - Readings in Speech Recognition, 1990
As all the other responses point out, NNs were red hot back then.
The interest in NNs was ignited (in part) by this double volume collection of essays called "Parallel Distributed Processing" edited by Rumelhart and McClelland.
Dean even cites them. And, if you read the contributors, it contains many (though not all) of the heavy hitters.
Reading back on it, it will sound very familiar. All the amazing breakthroughs: object recognition, handwriting recognition etc all seemed to be there. But all that rapid progress just seemed to stop. There was this quantum leap and then you were back to grinding out for even 0.1% improvement.
For those who stuck through the second winter, things obviously paid off.
From my perspective, neural networks were a big thing in the late 1980s, when I was on a DARPA neural networks tools panel for a year and wrote the initial version of the SAIC Ansim neural network project. We had some great results using simple backprop networks. Good times.
The early 90s were an interesting time for NNs and other machine learning systems. I remember getting really interested, but being told that "NNs with more than 1 layer can't really be trained", so I went into simulation rather than training. It's really great that GPUs and deep backprop arose to recover the stature of NNs.
Not that incredible. Just about every CS / Psych / Cognitive Science Dept back then was into them. I did a project on NNs in my undergrad. Programmed in C. I’m sure thousands of others did as well.
Really interesting and innovative early work, and I think it also explains why TensorFlow does not support within-layer model parallelism. It's amazing how much our early experiences shape us down the road.
My entire career has consisted of reimplementing bits and pieces of things I've previously built all the way back to high school and then reimplementing whatever was new on the previous round in the next one.
As a side note, I already have a draft of my essay (not published yet) that replaces the mention of storage costs with a mention of Ruth Porat. The point is why Ruth Porat was hired in the first place.
To my eyes this seemed like a completely normal amount of whitespace. The only reduction I would personally prefer is moving the left block delimiter off its own line (though opening braces on their own line are fairly common in C/C++ projects, afaik).
The flagship university in each state actually always has some very very bright students. The very elite schools don't accept many students and the admissions criteria are far from perfect so a ton of students who would be good enough for top tier schools end up going to state schools. I know a couple different people who were top of their high school class and within a question or two of perfect on their SATs that happily went to state schools.
In the U.S., one's undergraduate institution does not correlate to success as much as it does in certain countries like France or Japan, where universities are a pipeline for elite selection and grooming.
Also, not all intelligent American kids can or want to go to elite schools, even if they are academically qualified. In the U.S., you often hear stories of kids turning down really good schools for ones they felt were a better "fit" (financially, culturally, etc.). And unlike the rest of the world, elite colleges in the U.S. are often private and expensive. Despite need-blind admissions, not everyone can afford them without going into heavy debt. (many middle-class parents make just enough money for their kid to not qualify for substantial financial aid).
So kids go to schools they can afford.
One of my college professors (who attended Princeton and MIT) once told me that in his observation, the top 5 percentile students in (good) state schools aren't that different from the kids who went to Princeton or MIT. I didn't believe him at the time, but having worked with different folks over the years, my experience inclines me to believe that there's some truth in that observation.
Owing to its population and economy, the U.S. has a large enough talent pool that the top percentile students at large, well-funded state schools (of which UMN is an example) are plenty smart. If you were to meet the really smart top-5-percentile kids from such state colleges (I have), you'd have no doubt that many of them could have attended MIT or CMU.
To be sure, good colleges can give you a headstart in life -- but it's what you do with that advantage that counts.
--
Examples of smart computer folk who went to decent, but non-elite schools for undergrad:
Doug Crockford (JavaScript), SFSU
JJ Allaire (ColdFusion, RStudio, etc.), Macalester College
Ward Cunningham (Wikis), Purdue
Rich Hickey (Clojure), SUNY Empire State (though he did go to Berklee College of Music)
John Carmack (Doom, Quake), U. Missouri Kansas City
Sergey Brin (Google), U. Maryland College Park (before Stanford)
Larry Page (Google), U. Michigan (before Stanford)
Dave Cutler (VMS, Windows NT), Olivet College
Bram Cohen (BitTorrent), U at Buffalo
Ryan Dahl (Node.js), UCSD, then U Rochester
Larry Wall (Perl), Seattle Pacific U (before Berkeley)
Alan Kay (Smalltalk, windowing GUIs), U Colorado, then U Utah.
Brendan Eich (JavaScript, Mozilla), Santa Clara U (before UIUC)
Looking back, I should have gone to my in-state school. There might be an incremental difference in quality, but I would not have rushed to get it done a year early, and would have actually done meaningful research and probably a PhD.
The so-called Ivy Leagues miss out on people who peak in academic ability after high school, plus there are more excellent people than there are spots at Ivy Leagues. The upper echelons of pretty much every standard university are going to have equally competent people.
halflings|7 years ago
Tackling a complex problem (still relevant today) at an early age, getting great results and describing the solution clearly/concisely.
My master's thesis was ~60 pages long, and was probably about 1/1000 as useful as this one.
GuiA|7 years ago
¯\_(ツ)_/¯
Comparing your journey to others’ is pointless.
mlthoughts2018|7 years ago
busyant|7 years ago
nikanj|7 years ago
Certhas|7 years ago
pulkitsh1234|7 years ago
roel_v|7 years ago
sleepychu|7 years ago
I guess it depends a lot on your advisor.
bipson|7 years ago
arbie|7 years ago
GLjEI4YbnGD27LB|7 years ago
dekhn|7 years ago
mi_lk|7 years ago
russtrpkovski|7 years ago
mcilai|7 years ago
totoglazer|7 years ago
nabla9|7 years ago
projectramo|7 years ago
The intro essay is online:
https://stanford.edu/~jlmcc/papers/PDP/Chapter1.pdf
mark_l_watson|7 years ago
pimmen|7 years ago
Then when the data explosion started during the 00s, it laid the groundwork for the NN comeback.
coldsauce|7 years ago
silverlake|7 years ago
dekhn|7 years ago
plg|7 years ago
scottlegrand2|7 years ago
slyrus|7 years ago
aquamo|7 years ago
ahuibers|7 years ago
lukeh|7 years ago
yuhong|7 years ago
elvinyung|7 years ago
pknerd|7 years ago
sigjuice|7 years ago
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
https://github.com/apple/darwin-xnu/blob/master/libsyscall/m...
benzoate|7 years ago
fjsolwmv|7 years ago
akhilcacharya|7 years ago
tdb7893|7 years ago
wenc|7 years ago
jl2718|7 years ago
EchoAce|7 years ago
anonytrary|7 years ago
hazz99|7 years ago
mlevental|7 years ago