The Imposter's Handbook

[+] eksemplar|9 years ago|reply

I have a degree in CS and I've never found myself in a situation where anyone would discuss bouble sort vs merge sort. Neither have I been in a situation where big-o was relevant beyond the basic concept of not doing obviously stupid shit.

What you've really missed is things like best practices, design patterns and concepts like SOLID, but a lot of people with CS degrees missed some of those as well.

If the book covers this, excellent, but why wouldn't it sell itself on valid points?

[+] jacobsenscott|9 years ago|reply

I have a CS degree, and while nobody sits around talking about data structures and complexity that's not the point. It gives you a foundation of knowledge that you automatically and subconsciously apply to every job you do.

A CS degree prevent you from making a lot of obvious (if you have a CS degree) and costly mistakes. It sort of gives you a crystal ball. You can see that some code isn't going to work when a db table gets to 100,000 records, or that some code is making the wrong space/time tradeoff, or that some code is using the wrong data structure from the standard library.

When your performance monitoring tool is telling you something is slow or leaking memory you have the foundational knowledge to understand why and fix it, rather than spending money on 10 more dynos or whatever.

Computers are so fast and cheap today a lot of this doesn't matter most of the time. The naive solution does just fine. But when the core dump hits the fan you better have one or two CS grads on staff.

[+] frobozz|9 years ago|reply

I disagree. The things that come up every day in practical, real-world professional software development, such as design patterns and SOLID are the things that professional autodidacts normally have plenty of experience in and knowledge of.

The things that they have missed by not taking a Computer Science degree are precisely those things that don't tend to come up, such as big-O and the behaviour of various sorting algorithms.

Anyway, here are two paragraphs on the linked page that you might like to see:

"More than just theory, this book covers many practical areas of the industry as well, such as: Database design, SOLID, How a compiler works, sorting and searching algorithms, Big-O notation, Lambda Calculus, TDD and BDD."

"One of the more subjective parts of the book, but I was asked by many people to write about these things. Specifically: SOLID, structural design, TDD, BDD, and design patterns."

[+] marcosdumay|9 years ago|reply

> Neither have I been in a situation where big-o was relevant beyond the basic concept of not doing obviously stupid shit.

How do you know you are doing stupid shit if you don't know about complexity and don't know your algorithms?

Really, I do work on CRUD applications from time to time, and I often have to select algorithms based on complexity. Yeah, didn't have to implement one of them for ages¹, but I do have to tell coworkers things like "here you use a set", "here you use a list", "this sorting algorithm isn't stable", or "nah, just use brute force and get done" once in a while.

1 - Or, better, did implement a B-tree for a side project just a couple of months ago. Ended up just throwing it away, but I didn't know I wouldn't use it at the beginning.

[+] teekert|9 years ago|reply

I have no degree in CS and I see these terms (Big O, np vs p, etc) regularly, mostly here on HN. No idea what they mean, this books sounds great to me.

[+] Periodic|9 years ago|reply

What I find most illuminating about the deep software theory is it helps me understand why things are best practices. There are a lot of things that are good to do, but understanding deeply how the compiler things and how the languages are designed gives me insight into why we choose to do the things we do.

The best example is taking a few options someone has for their function interfaces and reframing it in terms of type algebras which helps expose where it's overly complicated. The two best examples I have are making IO and mutations of input explicit and localized in single a location so the user doesn't have to guess which functions are changing things and which ones aren't. Once you have a good framework for describing these abstract boundaries a lot of other principles of design fall out from keeping it simple and readable.

[+] ryandrake|9 years ago|reply

There are 1. "I get it to compile" programmers, 2. "I get it to work" programmers, and 3. "I make it great" programmers. A solid computer science foundation, knowledge of the literature, and experience can (but not necessarily will) get you from 2 to 3. Sadly, I've encountered quite a few professional programmers, complete with CS degrees, who year after year barely scrape by at 1, so it's not a given.

[+] drak3|9 years ago|reply

The excerpt on the Boolean Satisfiability Problem reads

> The basic concept that people have figured out, so far, is that a number of NP-complete problems can likely be solved if we crack the Boolean Satisfiability problem.

And

> If NP-Complete problems get resolved, it is likely (though nobody knows for sure) that we'll crack every NP-Problem

Isn't the definition of a NP-Complete problem exactly that it is in NP _and_ every other problem in NP can be reduced to it in polynomial time. So we know _for sure_ ([Cook71]), that as soon we have a polynomial algorithm for SAT _every_ problem in NP can be solved in polynomial time, and not just some of them as the excerpt claims.

Am I missing something? Because this seems like a very confusing, if not downright wrong, way to explain NP-completeness and its link to SAT.

[Cook71] Cook, S.A. (1971). "The complexity of theorem proving procedures". Proceedings, Third Annual ACM Symposium on the Theory of Computing, ACM, New York. pp. 151–158.

[+] acs5|9 years ago|reply

Also, the description of the boolean satisfiability problem isn't the boolean satisfiability problem at all, but just what we might call the boolean evaluation problem, or a version of the circuit value problem, which is certainly in P.

I have no idea what's going on in the lambda calculus excerpt further down the page, in particular substituting (λx.x x) with (x x)? There seems to be a fairly big misunderstanding here. And lambda calculus isn't reduced in any particular order -- there are many ways to reduce the same term.

[+] chrisseaton|9 years ago|reply

In my opinion there are also serious errors with the description of currying.

https://github.com/imposters-handbook/feedback/issues/50

[+] lorenzhs|9 years ago|reply

Yes, you're absolutely right. The other excerpt paragraph right next to it is correct but terribly misleading:

> That "undecidable" part is what makes this problem [the Halting Problem] NP-hard.

I mean, that's true - but it's very misleading. The most common context in which NP-hardness is discussed is when talking about NP-complete problems - those that are NP-hard and in NP. The way you go about showing that a problem X is NP-complete is showing that it's in NP (most of the time, this is the easy bit) and then showing that it's at least as hard as some NP-complete problem Y. This is usually done by transforming an arbitrary instance of Y into an instance of X in (deterministic) polynomial time, and showing that the X-instance is satisfiable iff (<=>) the Y-instance is.

Then there are problems for which NP-hardness has been shown but it's unclear whether they're in NP. Those are often of a continuous nature. I think deciding whether a level of Super Mario World is doable falls into this category.

[+] marcosdumay|9 years ago|reply

You aren't missing anything. It's wrong.

[+] apeace|9 years ago|reply

I have a funny anecdote about CS.

I am a CS dropout who has been working in startups for a few years. About once a year, I see another programmer making a common mistake, and I draw on my CS knowledge to help them out.

The mistake is parsing HTML with regular expressions. It is so tempting to write a good ole' regex to grab that attribute value off of that element. And it works on the 5-6 samples you write your unit tests with. If you have run into this before, you may know that zalgo[0] tends to appear in this situation. Parsing HTML with a regex will always fail eventually.

Of course the reason is that HTML is not a regular language. It is a context-free grammar, and thus requires a parser to parse it.

The funny part is, I failed CS Theory once, and was in the middle of taking it again when I decided to drop out. But this tidbit of knowledge has always stuck with me, and I've used it again and again to fix or prevent bugs in real software.

Takeaway: CS knowledge does have real-world value to everyday programming. You just have to know what you're looking for. And of course, always use an HTML parser to parse HTML.

(I'll add one more note to anticipate a common response. If you look at any HTML parser, you will see regular expressions in the code. These regex are used for chunking the HTML, and from that point the chunks are parsed.)

[0] https://stackoverflow.com/questions/1732348/regex-match-open...

EDIT: Formatting

[+] brassic|9 years ago|reply

I had to scrape the comments out of a blog. As you might expect, there was a div for comments, with comments nested within it. This made it easy to grab the comments using an HTML parser with xpath support.

Unfortunately, the blog software had a bug and it was possible for markup to leak out of the comments. Sometimes a spurious </div> would close the comments div and the scraper would miss comments that came after it. However, the HTML did contain helpful HTML comments, something like "comments start here" and "comments end here". The reliable solution was to use a literal string search on these HTML comments to pull out the entire comments section, and then to use regexes to pull out the comments' content.

The only unbreakable rule is that there are no unbreakable rules.

[+] legulere|9 years ago|reply

I think the HTML/XML/JSON is not a regular language story is a bad one, because subsets of them are indeed regular. Most of the time the data you're trying to parse doesn't contain arbitrarily deep nesting and could actually be parsed with a regex. Further regexes of languages like perl can parse a superset of regular languages.

The real problem with regexes is that they are hard to maintain and are extremely hard to get right to neither have false positives nor false negatives.

[+] GFK_of_xmaspast|9 years ago|reply

This is a case where a little knowledge can be a dangerous thing, because "regexes" are often strictly stronger than just "regular expressions".

(That said, the moral of this story is still correct, i.e. use a parser, but mostly because html-in-the-wild is uniformly awful and it's better to let someone else worry about that).

[+] anexprogrammer|9 years ago|reply

The main thing people miss out on not having a degree is not getting past silly HR "must have degree" filtration.

Never once found a CS degree a worthwhile indicator of ability.

It may be a superb book, but not even giving a sample chapter out to judge writing style, quality of explanations, depth and so on?

[+] Clubber|9 years ago|reply

I think there is more to a degree than that, but you make a good point. Now a days, instead of trying to figure out a problem, coders try to figure out which framework has already figured out the problem. It allows for less experienced people to get things done, but limits you to being able to mortar together bricks rather than make bricks.

The problem arises when you need to make bricks. You don't necessarily need a CS degree to make bricks, but you need to learn most things taught with a CS degree.

[+] 6DM|9 years ago|reply

Even funnier, coming across companies that don't accept anyone who doesn't have a degree from xyz university.

[+] continuational|9 years ago|reply

Comments like these can only really have two causes:

- You've bought into the anti-intellectualism wave that's going on in the US.

- The universities near you are really poor.

Is it really the case that you learn so little on these universities that you literally have no advantage over those who didn't attend?

[+] cmrdporcupine|9 years ago|reply

Having a CS degree gets you through a Google-style interview -- where they will hammer you for hours on your ability to recall the skills you needed to pass your algorithms classes in university, and grade you almost solely on that.

And then you will start working there and almost never use those skills again, because you'll be using a vast library of datastructures and algorithms in the language of your choice, and the skill you will need the most is the ability to analyze why you might need one vs the other.

Unless you're on the team which is writing said libraries, which is rare.

[+] datguacdoh|9 years ago|reply

I've interviewed somewhere around 200 people at this point and from my perspective, the degree, university or GPA have had no correlation with how well the candidate does. Now, I interview for Ops roles and not software engineering, so this might not apply across the board for other roles, but I have my doubts.

[+] galdosdi|9 years ago|reply

It may even be useful to lack a degree, since the job market is otherwise good and you automatically filter out the lamest companies that way ;-)

(No, not a perfect heuristic, but I think it may not hurt)

[+] isp|9 years ago|reply

Via patio11's tweet: "Pretty brilliant idea: 'Imposter Handbook', for teaching self-taught devs what they missed by not doing a CS degree." - https://twitter.com/patio11/status/767204505578409984 (with replies)

[+] squeaky-clean|9 years ago|reply

I'm always pretty skeptical of anything that offers "Learn X Quick!". Is there any reason I should trust the author of this as a good source? They even state they only started learning this stuff a year ago. I wouldn't take a course or buy a book from anyone claiming only 1 year's study in any subject, why should CS be any different?

edit: It's also hard to find the author's name to find out who they are. It's not in text anywhere on the page, only in the image of the book cover.

[+] haddr|9 years ago|reply

Yep, people are really forgetting that most of the time it's just being clever rather than having (or using) in-depth knowledge is what allows to bring some good solution to the table.

An example from today: we have a very slow pattern matching code, that starts to be a bottleneck for the application. What can you consider? Well you can dive into sexy bloom filters, experiment with some Trie-based structures. But then when you analyse the problem it results that simple word lookup with a simple hashtable is the fastest solution for given constraints. No big deal, no rocket science.

Probably the same goes for rocket scientists, but one level higher ;)

[+] scaleout1|9 years ago|reply

when I was college I thought all the data structure and algorithm courses were complete waste of time. Primarily because I was working as an intern on the side, making crud applications, slinging xml, writing DAOs etc.

For first five years of my career I never had to touch any of the stuff I learned in school and I was particuarly happy that I mostly mailed it in in those classes. Eventually my career evovled into dealing with data at massive scale and working on some of biggest services on the planet and the way I have taught myself to program completely changed. No longer it was possible to just sling code and hope that it will just scale to million of users. All the stuff that I slept around in class was relevant again and I had to go back to coursera and take those classes all over again. So moral of story, if you will be slinging webapps rest of your life you probably dont need to know Big O, different search algo, linear algebra and statistics etc but if you think you will be working on stuff thats coming around like automanous cars, IoT, augmented reality etc, you should definitely read up on it

[+] jasonjei|9 years ago|reply

I think one of things that "millennial" self-taught programmers have trouble understanding is OS. While the need to implement OS functionality is now a niche (like process scheduling), I have noticed that "millennial" self-taught programmers have gaps in knowledge with respect to threading, mutex/semaphore, consumer-producer pattern, synchronization vs. lock-free (blocking vs. non-blocking), concurrency and parallelism. I still feel these are essential topics to understand since these issues often come up in most programming languages (JS, Ruby, Go, etc). When deciding between Nginx or Apache, for example, knowing the difference in philosophy is useful (asynchronous vs. thread-based). One of things that CS trained me very hard was to think about cost; when I programmed before my CS education, the old adage of everything looking like a hammer was particularly true for me. Also, many of the self-taught guys need to use fewer libraries/gems and more stdlib and primitives because every additional include is not always necessary and may unnecessarily increase complexity :)

Pipelining is another concept that I feel self-taught programmers have weaker foundations--many of whom I have worked with write code that waits for all results to become available, while an operation that's blocked by I/O doesn't necessarily mean we can't do stuff with the CPU while we're waiting for the next batch of I/O to come through.

I have also seen self-taught programmers accidentally write O(n!) or O(2* * n) functions and not realize it. I think data structures is definitely a good chapter to have. Especially when writing queries to a data store.

I think explaining how a hash table works would be excellent since it is such a useful and fast data structure. A lot of set-taught programmers sort of treat them like magical black boxes when it's not a very complex data structure yet it's practically O(1) for most insertion/reads/deletions.

Memory management is fortunately something we don't really need to worry as much about. With languages and interpreters that do a very good job of cleaning up after our code and now that memory is relatively cheap, we can afford to ignore it until we need to scale.

Of course, if you have a good product, you can get away with inefficiency and hire CS guys when you have built a unicorn. ;)

[+] nspriego|9 years ago|reply

Just curious, why the "millennial" descriptor? Is having trouble understanding the OS something unique to only "millennial" self-taught programmers?

[+] collyw|9 years ago|reply

Everyone has gaps in their knowledge. There are two devs where I work, and we are both better than average in my opinion. I know Django inside out and deal with server deployment, while the other guy knows more front end and HTTP / web stuff. We both know how to use a database properly and not to over engineer stuff. We compliment each other pretty well.

[+] xenihn|9 years ago|reply

I appreciate this post. I definitely lack knowledge on the stuff you described, sadly...

[+] unknown|9 years ago|reply

[deleted]

[+] fitzwatermellow|9 years ago|reply

As an alternative, I would recommend something like Michael Kerrisk's The Linux Programming Interface in lieu of a survey of foundational CS. Gaining a deep understanding of memory, files, processes, threads, signals, sockets, etc. As well as strong Emacs-Fu and bash scripting ability. These are the first steps on the path to mastery ;)

[+] riskable|9 years ago|reply

What's interesting is that what you're talking about is computing fundamentals. Stuff that kids should be learning about in high school before they go to college. At the very least they should teach kids what happens when they type a character in a text editor and then save that as a file. Kids should know that the key switch state change gets detected by the keyboard hardware, sent as a signal over the wire, detected/handled by the OS/kernel/driver, sent to the program as an actual keystroke which decides to "display" it to the user by updating the interface, etc etc.

Discussions surrounding what happens when two programs try to write to the same file. How to detect when a file changes. Stuff like that. These things just don't seem to come up in high school education and I can't help but wonder why. It'd go a long way to giving people explanations as to what's wrong with their computer when it's running slow or a basic means of interpreting error messages/conditions.

[+] anoother|9 years ago|reply

Proofreading error near the 'buy' button.

> I've learned more in this last year since I started programming over 25 years ago.

Should be '... this last year than since ...'

Personally, I think '... the past year than since ...' reads better also.

Do I get a free copy of the book for pointing that out? ;)

[+] mkchandler|9 years ago|reply

There is a repo for posting feedback (such as proofreading errors) here: https://github.com/imposters-handbook/feedback/issues

[+] bogomipz|9 years ago|reply

I am curious, do people actually pay $30 for an ebook without even being able to see a(propsed) table of contents of a sample of the writing?

This is unusua. With both Amazon and LeanPub you can at least gauge the writer's style or get a feel for writing quality by looking at a sample chapter and a table of contents.

I'm skeptical that all of those people praising the book bought the book site unseen.

[+] ljk|9 years ago|reply

apparently this is the ToC for "pre-release #2" http://i.imgur.com/ssYr5ki.png

seems like pretty simple stuff; couldn't all the information be found with one internet search away?

[+] cookiecaper|9 years ago|reply

This thread is long and this will probably get buried, but I'll leave it here anyway. I'm really excited about this kind of thing and have sometimes thought of writing one myself, primarily as a means for me to hammer out all the theoretical areas that are still foggy for me as an autodidact. If this can make it easy to pass impractical textbook-style interview questions and give a good, reliable foundation of CS knowledge that won't go away ten seconds after closing the Wikipedia page, I'd love to buy it (and I still may write my own some day just for good measure :P). I think autodidact programmers is a rapidly-growing and under-served market (though, unfortunately, I don't think it'll be allowed to go on much longer; I expect professional licensing organizations similar to the ABA to show up on the horizon soon).

[+] pjc50|9 years ago|reply

I like the approach of hammering out knowledge through writing it down. I taught myself to fill in a lot of knowledge gaps by answering questions on electronics.stackexchange.

[+] brians|9 years ago|reply

The example text about the y combinator looks mistaken to me. It says Y can "find" a fixpoint, and sketches an example of a fixpoint in a numerical function. It implies Y is doing something like convergence.

But that's not what Y is at all. It's called the fix point combinator, yes, but with the assumption you're going to use it in some curried lazy evaluation scheme with higher order functions.

All this on ycombinator.com, too!

[+] unknown|9 years ago|reply

[deleted]

[+] LeonM|9 years ago|reply

Anyone knows if there will be a hardcopy available? I live a paper free live, except for books, I just hate reading from a screen...

[+] RawData|9 years ago|reply

What programming language is generally used throughout the book?

[+] Jayakumark|9 years ago|reply

He also wrote another book http://www.redfour.io/ take off with elixir. Anyone have read that ?

[+] cheeseprocedure|9 years ago|reply

I went through the video version. I found it worthwhile overall, but grew frustrated later in the tutorial as the code displayed in the video (and in the associated GitHub repo) drifted substantially from what it had actually been guiding me to build.

[+] outworlder|9 years ago|reply

Is there a book like this for Calculus? My CS is fine, but I had horrible Calculus teachers and was unable to cross the chasm myself.

[+] camel_Snake|9 years ago|reply

Honestly the best resource is probably Khan Academy.

[+] awkward_yeti|9 years ago|reply

Where is the ToC ?

[+] internals|9 years ago|reply

I didn't find it online either, but did just pick up the book. This is ToC of pre-release #2 (~500 pages): http://i.imgur.com/ssYr5ki.png

227 comments