
The Missing Semester of Your CS Education

1163 points| anishathalye | 6 years ago |missing.csail.mit.edu

196 comments

[+] Jonhoo|6 years ago|reply
Over the years, we (@anishathalye, @jjgo, @jonhoo) have helped teach several classes at MIT, and over and over we have seen that many students have limited knowledge of the tools available to them. Computers were built to automate manual tasks, yet students often perform repetitive tasks by hand or fail to take full advantage of powerful tools such as version control and text editors. Common examples include holding the down arrow key for 30 seconds to scroll to the bottom of a large file in Vim, or using the nuclear approach to fix a Git repository (https://xkcd.com/1597/).

At least at MIT, these topics are not taught as part of the university curriculum: students are never shown how to use these tools, or at least not how to use them efficiently, and thus waste time and effort on tasks that should be simple. The standard CS curriculum is missing critical topics about the computing ecosystem that could make students’ lives significantly easier.

To help mitigate this, we ran a short lecture series during MIT’s Independent Activities Period (IAP) that covered all the topics we consider crucial to be an effective computer scientist and programmer. We’ve published lecture notes and videos in the hopes that people outside MIT find these resources useful.

To offer a bit of historical perspective on the class: we taught this class for the first time last year, when we called it “Hacker Tools” (there was some great discussion about last year’s class here: https://news.ycombinator.com/item?id=19078281). We found the feedback from here and elsewhere incredibly helpful. Taking that into account, we changed the lecture topics a bit, spent more lecture time on some of the core topics, wrote better exercises, and recorded high-quality lecture videos using a fancy lecture capture system (and this hacky DSL for editing multi-track lecture videos, which we thought some of you would find amusing: https://github.com/missing-semester/videos).

We’d love to hear any insights or feedback you may have, so that we can run an even better class next year!

-- Anish, Jose, and Jon

[+] fuzzy2|6 years ago|reply
That’s cool. Now, on to my possibly unpopular opinion: This isn’t what computer science is about. In fact, you don’t even need to use a computer to do computer science.

Sure, some stuff you learn in CS can make you a better software engineer. CS cannot make you a software engineer.

CS can definitely not make you adept at using computers and neither should it. That’s something earlier education institutions must tackle.

It’s always good to have optional courses for various topics of interest. _Requiring_ students to learn, say, MS Office (I had to), is just plain ridiculous.

[+] ip26|6 years ago|reply
"I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it." -- Bill Gates

Some people just accept absurd drudgery interacting with computers; it never even occurs to them to seek an easier way. But if you can teach the students to be "lazy" when it comes to that kind of repetitive drudgery, they'll be set for life.

[+] foobarian|6 years ago|reply
Since you mention scrolling using key repeat - it truly is painful to watch someone do it on default settings. And there are usually better ways to do that sort of thing. But sometimes, there is no substitute: going to the right spot in the middle of a word/line, moving around a page, browser title bars, etc. Here's the kicker though: most keyboard repeat rate settings around have a maximum that is pretty much unusable! But you can fix it thusly:

xset r rate 180 60

When I work on my computer it's like driving a Porsche. When I sit at someone else's it's like I tripped over a door threshold.

There are ways to adjust this on OSX too but it's a lot more touchy. Haven't attempted on Windows.

[+] dybber|6 years ago|reply
We previously had the same problem at the University of Copenhagen, where it was called "the hidden curriculum" among students.

When the undergraduate programme was reformed a few years ago, these subjects were integrated into various courses so they could be taught in a learning-by-doing fashion. As part of the first programming and problem-solving class (F#), we also teach LaTeX, Emacs and basic use of the command line; as part of a project-based software engineering course (second semester), Git is used extensively; and so on.

[+] floatrock|6 years ago|reply
Love this. I remember it wasn't until junior year when I was reading about the theoretical underpinnings of the unix kernel that I learned what the pipe operator I'd been copy-pasting on psets really did. I mean, it's great to learn those underpinnings, but most course 6 classes assumed you already knew all these tools... if it wasn't theory, it wasn't their responsibility to teach it.
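
For anyone in the same spot, here is a minimal Python sketch of what sits underneath the pipe operator. It assumes nothing beyond the standard library; the name `pipe_roundtrip` is made up for illustration. A pipe is a kernel buffer with a write end and a read end, and in `a | b` the shell connects a's stdout to the write end and b's stdin to the read end. The sketch plays both roles in one process.

```python
import os


def pipe_roundtrip(payload: bytes) -> bytes:
    """Push bytes through an OS pipe and read them back out.

    Assumption: payload is small (a few KB). A single process that writes
    more than the pipe buffer holds would block, because nobody is reading
    yet.
    """
    read_fd, write_fd = os.pipe()  # two file descriptors, one kernel buffer
    try:
        os.write(write_fd, payload)  # what the left-hand program does via stdout
    finally:
        os.close(write_fd)  # closing the write end is how the reader sees EOF

    chunks = []
    try:
        while True:
            chunk = os.read(read_fd, 4096)  # what the right-hand program does via stdin
            if not chunk:  # empty read means EOF
                break
            chunks.append(chunk)
    finally:
        os.close(read_fd)
    return b"".join(chunks)
```

Calling `pipe_roundtrip(b"hello\n")` returns the same bytes that went in; the interesting part is that they travelled through the kernel rather than through a Python variable.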

"Missing Semester" describes it perfectly. Wish there had been something like this back in my day... I remember I felt as if I had learned all the theory behind fluid mechanics but didn't know the first thing about fixing a leaky faucet in my kitchen. Keep up the good work!

[+] brlewis|6 years ago|reply
My feedback is to keep doing what you're doing. Some people will talk about how this kind of mundane training might not be relevant some years later. By the same reasoning students shouldn't learn the layout of their campus or how to add and drop courses, because those skills won't be relevant after they graduate. Keep helping students not waste time now on tasks that should be simple.
[+] tikiman163|6 years ago|reply
I read through the article, and a lot of what you've got looks really useful. Your explanation of Git and source control is especially valuable. These are all pretty much skills a good programmer should have, but I would make one exception.

Personally, I like and have frequently used VIM. It's a useful tool to know how to use if you need to edit text files on a GUI-less system. However, I have yet to meet a single programmer with any ability to work as a team member who chooses to use VIM on an OS with a desktop environment. VIM does have an interesting and valuable ideology, but that ideology isn't perfect, and I worry that exposing new programmers to only one command line text editor and its specific ideology might provide too narrow a perspective. It might be a good idea to also present the ideologies behind other command line text editing approaches, such as Perl scripting and regex for updating many files, to widen their perspective and get them to consider what they need to do when selecting a tool.

[+] choppaface|6 years ago|reply
I’d recommend Docker / virtual machines as its own top-level module. There’s essentially no better way than containers to make projects portable and reproducible, and containers are now essential for pretty much any production system. I’ve taught docker to a few dozen people and while they hate it at first (the learning curve can be steep) they love it in the end.

Also wish this course covered bazel, maybe in potpourri. Make is important and bazel isn’t standard but bazel is pretty important for large C++ projects.

Docker and bazel both have fairly complicated interfaces and the time set aside for a course like this is the perfect time to play with them.

[+] ericd|6 years ago|reply
Ha some MIT alum friends and I were just talking about how great it would be if something like this existed, rather than having it be left to one’s first years as a junior engineer in a company. Thanks for doing this, it’s sorely needed.
[+] inson|6 years ago|reply
I'd recommend making a course on how to create a proper Makefile, from the beginning to the advanced level.
[+] godshatter|6 years ago|reply
I would go one step even more basic and teach touch typing. It amazes me when I watch someone code by laboriously copying some other code from somewhere else, from a full code snippet to a single variable name, paste it in and begrudgingly type what needs to be typed, slowly.

The thought that always goes through my mind when I see this is "this is your interface to code, people, shouldn't you spend a little bit of time at least trying to master it?"

Forget vim, if people just learned to type they could be twice as productive in simple notepad. Vim (which I use) raises this to another level still.

[+] kamaal|6 years ago|reply
Learning to use editor macros well could save anyone lots of typing effort, and should definitely be in your course.
[+] huijzer|6 years ago|reply
I think this is an amazing course and indeed missing. I learned the things presented in the course only after graduating CS and working at a company! They (rightly so) thought I was somewhat of a fool because I preferred to use Windows.

One thing on the data wrangling. I do think that the Linux tools are powerful, but would like to give some credit to R here. For example, merging tables (similar to SQL joins) is available in the standard library. This in combination with R Markdown for visualization makes it much easier to use than Linux CLI tools.
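
To make "merging tables (similar to SQL joins)" concrete, here is a small sketch in plain Python rather than R, so it is not R's own merge and only illustrates the idea. The function `inner_join` and the sample rows are hypothetical.

```python
def inner_join(left, right, key):
    """Join two lists of dict rows on `key`, keeping only matching rows."""
    # Index the right-hand table by key so each lookup is O(1).
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            # On a column-name clash the right-hand value wins.
            joined.append({**left_row, **right_row})
    return joined


# Hypothetical sample data.
PEOPLE = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
SCORES = [{"id": 1, "score": 90}, {"id": 3, "score": 70}]
```

`inner_join(PEOPLE, SCORES, "id")` keeps only the row with id 1, since that is the only key present in both tables.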

[+] msaharia|6 years ago|reply
Hi, how are you getting the keystrokes you type in to overlay on the video? Is there a tool you're using for this? Would be very helpful if you could make it.
[+] dirtydroog|6 years ago|reply
We shouldn't really be encouraging people to program in vim. I mean, an 80 char line length limit is very 1995. People have massive monitors now.
[+] slumdev|6 years ago|reply
You use Vim in your example. Get your students out of it and into something like IntelliJ IDEA or VS Code or (at a bare minimum) Eclipse.
[+] xzel|6 years ago|reply
I took a Unix half-credit course randomly where you basically did bash scripting, a huge bunch of command line tools, and then eventually used all that to build your own Linux distribution. I swear I learned more in that half-credit class, and way more if you count only useful information, than in 90% of my other CS courses.

Edit: And since this got some traction, here is the current version of the class: https://www.cis.upenn.edu/~cis191/ it looks pretty similar to what I took but they added a little bit of Python.

[+] kbenson|6 years ago|reply
Going through Linux From Scratch[1] manually and reading each component, while allowing yourself the time to look into interesting bits of each, is essentially this. I did it back in the early 2000's, and view it as one of the most useful educations I've ever gotten on how a Unix-like system functions (even if a lot of the knowledge is somewhat obsoleted now for many Linux distros with the introduction of systemd). I was doing it as research for how to ship a custom distro for a small purpose-built device, and while that never panned out, I've never once regretted spending the time to go through the process.

1: http://www.linuxfromscratch.org/

[+] commandersaki|6 years ago|reply
I did a similar Unix course. It was on quarter based system so it only lasted 3 months. We learned Unix from first principles. There were programming assignments that started off as shell scripts. We wrote manpages for our shell scripts. Then we learned C with the expectation you figure out the hard parts really really quickly (we were a Java school). We implemented various different C and Unix programs. We even reimplemented some of our shell scripts. We implemented a custom version of ls(1) with built-in sorting features.

In the last 3 weeks we built a full-blown Bourne shell in C including background jobs, shell/environment variables, pipelines, i/o redirection, lex/yacc parsing, etc. A lot of the features were extra credit (e.g. single pipeline a | b, versus infinite pipeline a | b | ...).
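
For a feel of the "infinite pipeline" part, here is a rough Python sketch, not the C implementation from the class. It leans on `subprocess` instead of raw fork/exec/dup2, and `run_pipeline`, `py` and `STAGES` are made-up names. Each stage is a tiny Python one-liner so the example does not depend on which Unix tools are installed.

```python
import subprocess
import sys


def run_pipeline(stages, input_bytes=b""):
    """Run argv lists as `stage0 | stage1 | ...` and return the final stdout.

    Assumption: input and output are small. Large data would need
    concurrent reading and writing to avoid a deadlock.
    """
    if not stages:
        raise ValueError("need at least one stage")

    procs = []
    prev_stdout = None
    for i, argv in enumerate(stages):
        stdin = subprocess.PIPE if i == 0 else prev_stdout
        proc = subprocess.Popen(argv, stdin=stdin, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Drop the parent's copy so the upstream stage notices
            # if the downstream stage exits early.
            prev_stdout.close()
        procs.append(proc)
        prev_stdout = proc.stdout

    procs[0].stdin.write(input_bytes)
    procs[0].stdin.close()  # EOF for the first stage
    output = procs[-1].stdout.read()
    for proc in procs:
        proc.wait()
    return output


def py(code):
    """Build an argv that runs a Python one-liner as a pipeline stage."""
    return [sys.executable, "-c", code]


# Roughly: upper-case | sort | first two lines
STAGES = [
    py("import sys; sys.stdout.write(sys.stdin.read().upper())"),
    py("import sys; sys.stdout.writelines(sorted(sys.stdin))"),
    py("import sys, itertools; sys.stdout.writelines(itertools.islice(sys.stdin, 2))"),
]
```

Feeding `b"pear\napple\nfig\n"` through `run_pipeline(STAGES, ...)` yields the lines APPLE and FIG. The loop is the whole trick: the single pipeline `a | b` is just the two-element case.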

The class is unforgettable in my mind. I now mentally parse everything I do in the shell and imagine system call tracing every program I run.

[+] awbraunstein|6 years ago|reply
I took this class as well and agree that it was the most useful course I took at Penn. I still use all of the things that I learned in this class every day (especially emacs). Luckily, I took it early in school and it made subsequent classes so much easier. Thank you Perry Metzger!
[+] lkbm|6 years ago|reply
In the lab for my second or third CS course, the professor was walking us through intro Unix usage, but he didn't take attendance so it ended up being just 2-4 of us showing up. After a few weeks, he cancelled the lectures and told us he'd just be around to answer questions, help with homework, etc.

The last lecture before he cancelled things was an intro to Vim. The next would've been Emacs.

And that's the story of how I became a life-long Vim user. :-)

[+] _sdfo|6 years ago|reply
There are two sides to this kind of teaching. If you're MIT, you can afford to hire great lecturers who know to teach the students fundamental and deep truths about the tools they're using and not trivia. That way, they can generalize onto other tools, so anyone who studied git can quickly adapt to using cvs or svn. My university (in a developing country) is on the other side. Last semester was a disaster.

We had an AWS course, half of which was memorizing information about pricing and S3 tiers. If I were going into a job as an AWS guy, I'd definitely have to know that, but this is just third year of undergrad in CS :-/ and not a training course. The quizzes also had deliberately deceiving questions, which is the worst type of trivia!

Even better example. The Windows Server course was also compulsory (just like the AWS course) and mainly consisted of memorizing trivial information about context menus, which buttons to click and the licensing terms/durations/prices for different Windows Server versions. I'm jaded from the experience. Got my first two Cs in both since I spent time learning stuff described in the post instead of that nonsense.

[+] gen220|6 years ago|reply
I graduated from a top 50 school that gave us an “engineering” degree but had no such course. There were no more than 20 students, out of 300ish, who understood the material of this course well enough to teach it. From my perspective, this knowledge was generally inversely correlated with GPA and strongly correlated with postgrad performance.

Take my anecdata for what it is, but I think these skills are strongly underrated by universities and students alike. Kudos to MIT for publishing this online; I know I would have benefitted from exposure to these topics in my first couple of years.

[+] ameixaseca|6 years ago|reply
Metaprogramming has an actual meaning and it means something completely different from what it is used for on this website:

"Metaprogramming is a programming technique in which computer programs have the ability to treat other programs as their data." [1]

i.e.: it's a programming technique and not something related to any process related to building code.
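
A minimal sketch of that definition in Python, using the standard `ast` module: one program reads another program's source as data and inspects it without running it. `SOURCE` and `top_level_functions` are hypothetical names chosen for the example.

```python
import ast

# Another "program", held as plain data (a string).
SOURCE = '''
def add(a, b):
    return a + b

def sub(a, b):
    return a - b
'''


def top_level_functions(source):
    """Return the names of functions defined at the top level of `source`."""
    tree = ast.parse(source)  # parse only; nothing in SOURCE is executed
    return [node.name for node in tree.body if isinstance(node, ast.FunctionDef)]
```

Here `top_level_functions(SOURCE)` gives `['add', 'sub']`. Build systems, by contrast, run programs but do not treat them as data in this sense.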

I understand the idea of giving some more practical side of using computers in a "missing semester", but please pay attention to the nomenclature. It can be really confusing for someone studying CS that does not yet grasp the concepts that well.

[1] https://en.wikipedia.org/wiki/Metaprogramming

[+] wes1350|6 years ago|reply
This course is wonderful! I've read through all the material and watched all the lectures and I can say it has helped me tremendously thus far. I'm still trying to master all the tools they've mentioned but I already feel much more proficient with e.g. version control, vim, using the command line, etc. If you're an experienced dev then you might already know all of these things, but if you feel that you have some gaps in your knowledge with some of these tools, this course will likely point you in the right direction.
[+] arman_ashrafian|6 years ago|reply
At my school there is a 2 unit class that you must take along with the intro DS&A course that teaches bash, vim, git, testing, etc. It was definitely the class that helped me the most in my internship and also made me a Vim user.

http://ieng6.ucsd.edu/~cs15x/

[+] jsd1982|6 years ago|reply
I think the reason you don't see this kind of course offered is that it is primarily concerned with training, not education.

Imagine if your training course 25 years ago focused on the Turbo C IDE and that's all the University offered. You would be amazingly proficient at Turbo C and know all its keyboard shortcuts but that wouldn't be too relevant in today's market.

Keeping such training material up to date with the latest trends is exhausting work and not worth maintaining, especially given how difficult it is to predict what may be the next tool du jour. Contrast this with more timeless educational topics and it starts to be clearer why this sort of thing is explicitly not taught.

[+] jedberg|6 years ago|reply
It's funny, when I was in school, I was always told the difference between a good CS school and an ok school was that the good school only taught you theory and left the practical application to the reader. The ok school had courses on tools and language syntax.

It's kind of awesome to see this coming out of MIT.

[+] xrd|6 years ago|reply
My book about the GitHub API from O'Reilly had a similar idea: thinking in and about tools is an important concept. O'Reilly permitted me to release it under creative commons so you can find it here for free:

https://buildingtoolswithgithub.teddyhyde.io/

[+] dahfizz|6 years ago|reply
I really wish this practical stuff was more emphasized. I graduated with a lot of very smart people who could write great code - but they could not compile, run, test, or check it into VCS to save their lives.

It made group projects hell.

[+] cosmotic|6 years ago|reply
Learning about a command-line debugger and command-line profiling tools would be helpful for those that find themselves in the past. IDEA and Visual Studio have had these things integrated for decades. I find the likelihood of knowing how to use a debugger or profiler is inversely proportional to the amount of time someone spends in the terminal.

It's astonishing how many developers rely on logging/print statements for debugging. The learning curve of using a tool like PDB or GDB is just too steep for most people.

[+] kyralis|6 years ago|reply
Logging and print statements are phenomenally useful debugging aids. I think people get too caught up in what a debugger can give them and forget how much value you can get out of logging as well.

I'm perfectly happy to fire up LLDB and step through a program, but my go-to first step is frequently to add logs, and one of the reasons is that it's often just plain faster. It's kinda like the binary search of debugging: Sure, I could fire up the debugger and step through the long, complicated, probably multi-threaded algorithm... or I could add some log statements, reproduce, and rapidly narrow down exactly where I should spend my time in the debugger.

If you've got a project with a particularly painful compile time for simple changes (which seems like a whole different issue...) or an issue where setting up a reproduction environment is difficult, then sure, fire up your debugger and set up your breakpoints appropriately. But I think debugging via logging gets a bad rap when it's frequently a completely rational first step.
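
As a small illustration of the "add some log statements, reproduce, narrow down" approach, here is a sketch using Python's standard `logging` module. The function `normalize` and the logger name are invented for the example; the point is only where the log lines sit.

```python
import logging

logger = logging.getLogger("normalize_demo")  # hypothetical logger name


def normalize(values):
    """Scale values so they sum to 1, logging each intermediate step."""
    logger.debug("input: %r", values)
    total = sum(values)
    logger.debug("total: %r", total)
    if total == 0:
        # The suspicious branch gets its own log line, so a run shows
        # at a glance whether it was taken.
        logger.debug("total is zero; returning zeros")
        return [0.0 for _ in values]
    result = [v / total for v in values]
    logger.debug("output: %r", result)
    return result
```

Turning the output on is one line, `logging.basicConfig(level=logging.DEBUG)`, and the statements can stay in the code at a quieter level afterwards, which is the property the debugger session does not have.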

[+] bachmeier|6 years ago|reply
> It's astonishing how many developers rely on logging/print statements for debugging.

One reason is that you don't switch contexts if you insert a print statement in your code. I personally find that to be the least distracting way to debug.

[+] mistahenry|6 years ago|reply
> It's astonishing how many developers rely on logging/print statements for debugging. The learning curve of using a tool like PDB or GDB is just too steep for most people.

On the other hand, since I can’t hook up a debugger to production, it’s also very useful to be able to understand my application's log output (especially tracing/debug statements) to triage the situation as quickly as possible.

Also any time I’m debugging USB or Bluetooth LE protocols, I’m relying on some kind of Wiretrace-like packet logger.

Debuggers have their place but they aren’t the end-all be-all of debugging.

[+] gowld|6 years ago|reply
> It's astonishing how many developers rely on logging/print statements for debugging

http://taint.org/2007/01/08/155838a.html

> While reading the log4j manual, I came across this excellent quote from Brian W. Kernighan and Rob Pike’s “The Practice of Programming”:

>> As personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugging sessions are transient.

[+] pvg|6 years ago|reply
The browser dev tools of various browsers are debuggers and are very widely used - I'd guess completely independently of amount of time spent in a terminal.

Other factors, post-C, are probably things like: lots of languages don't have good debuggers, or good debuggers take a while to materialize; debuggers are less necessary in managed-runtime languages; etc.

[+] veeralpatel979|6 years ago|reply
Ruby has a great debugging story.

You just type "binding.pry" wherever you'd like to stop the application and in your terminal, it will open a Ruby shell where you can access the program state at that point in your code.

https://github.com/pry/pry
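
For comparison only, a rough Python analogue (Python 3.7 or later) is the built-in `breakpoint()`, which by default drops into pdb at that line. The function `average` and its `debug` flag are made up for this sketch; the flag just keeps the pause opt-in.

```python
def average(values, debug=False):
    """Return the arithmetic mean of a non-empty list of numbers."""
    total = sum(values)
    if debug:
        # Pauses here in an interactive pdb prompt where `values` and
        # `total` can be inspected; type `c` to continue.
        breakpoint()
    return total / len(values)
```

Running `average([2, 4, 6], debug=True)` from a terminal stops at the marked line; without the flag the function simply returns 4.0.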

[+] duxup|6 years ago|reply
Testing and debugging is still a weak spot for me.

I did a bit of dorking around with web development on my own and decided to change careers. Needless to say, the boot camp didn't cover any of that, and at the 3-dev company I ended up with we're still trying to wrangle together good practices and pull old code into the future, to the point that we're not doing nearly as much testing as we should... so I'm left to my own devices.

Even my usual web resources don't really cover much in the way of JavaScript / Node debugging outside "here is how you set something up ... k bye!".

[+] ken|6 years ago|reply
The quality of debuggers varies hugely, especially since different languages support different debugging features (e.g., conditions vs exceptions vs neither). Logging works pretty much the same everywhere.
[+] Aperocky|6 years ago|reply
Non CS graduate here, funny how my learning curve has been basically what y'all are saying.

I had one java class before I officially kick started my programming career by wiping windows off my laptop and installing ubuntu. Then proceeded to force myself to do everything from the command line, not that there were many other options. It escalated quickly from there.

Starting from the terminal is much more intuitive than writing 'int main/ public static void main' in an IDE.

[+] woodrowbarlow|6 years ago|reply
> Starting from the terminal is much more intuitive than writing 'int main/ public static void main' in an IDE.

i would take this a step further: starting from _any_ REPL is an advantage in learning programming. the tight feedback loop fosters experimentation. python is another good starting point in this regard.

[+] jacques_chester|6 years ago|reply
In other engineering disciplines this used to be called "shop class" or similar. In his day my dad was taught as much about carpentry, metal work, plumbing etc as he was taught about radios and circuits.

As an educational technique, "sink or swim" is about as efficient as spinning a wheel and handing out degrees.

[+] saboot|6 years ago|reply
This is not only useful for CS people, but also for those of us in the hard sciences. Coding is a mandatory prereq, which is usually one class on C or C++. Then we're expected to collaborate on a project with hundreds of developers. This is a great resource for those of us who are a bit lost, thank you.
[+] haileris|6 years ago|reply
Good as a primer for those that aren't naturally hackers but decided to become computer science majors and have little to no experience with a unix-like operating system.

I learned Linux, the shell, basic scripting, and the terminal environment in high school out of necessity and then began to thoroughly enjoy it. Planning to enter university as a CS student I took the time to learn Vim, though I didn't start using it regularly until much later.

I can't exactly articulate why, but I'm fairly upset these sorts of things comprise an entire course. What happened to RTFM? Where is the general curiosity? Even if you have no prior experience with a majority of these things, these are the kinds of things you figure out over the weekend while doing your regular courseload.

[+] sn41|6 years ago|reply
When someone comes to me these days for knowledge about basic Unix/MacOS/Linux CLI tools, I direct them to the GNU core utilities documentation, which is very nicely organised according to task. I also demonstrate how the CLI can do read-based operations as fast as SQL on medium-size databases (just take an SQL dump and pipe):

https://www.gnu.org/software/coreutils/manual/html_node/inde...

For various sorting and filtering followed by report generation in latex etc., a knowledge of awk also helps. I use the mawk interpreter since I have found it to be slightly faster with ASCII format files.

[+] sumoboy|6 years ago|reply
It's sad to see freshman CS students getting thrown into this unknown world of programming with zero understanding of basic programming principles such as simple logic, and without tools or even a book. I just witnessed this last semester, where the students' first class was basically 'intro to C++'. No books or resources supplied or referenced.

It gets even better: the instructor implies students shouldn't get help from the internet or use examples for assignments; in his eyes everything on the internet is bad or wrong. I gathered he basically wanted the students to fail, what a superhero. Glad to see the effort here to help CS students, great job.