
Ask HN: Why do many CS graduates lack foundational knowledge?

386 points| platzhirsch | 3 years ago

Recently, I have started interviewing interns in their final semester for an internship and to my surprise I frequently encounter a lack of what I would call foundational computer science knowledge. I don't mean data structures and algorithms, but for example:

* Database Systems (relational algebra, SQL)

* Concurrent Programming

* Network Programming

It seems most are exposed to them partially through project work but without the base knowledge.

Is this typical for CS undergraduate degrees because you get to pick your own classes?

725 comments

[+] dathinab|3 years ago|reply
CS is Computer Science not software development or programming or computer engineering.

People/companies commonly treating both the same is IMHO one of the major problems of the current industry.

None of the topics you mentioned are fundamental to CS.

They are fundamentals of software development.

Wrt. computer science they are at most specializations, and even then what you might do with them in a science context might differ largely from what you would need them for in production-focused software development. Though they do contain some fundamentals, e.g. set theory in relational databases and graph theory in network programming and concurrent programming.

You can (rightfully) have a master of computer science _without having ever written a single line of code_. And going back ~20 years that wasn't even that uncommon.

Now today a lot of universities have realized that this mismatch causes problems and are also teaching the fundamentals of software development in addition to the fundamentals of computer science. Additionally, a lot of computer science today requires the use of tooling which requires some programming and SQL.

Still, what the "fundamentals of software development" are is a much less clear topic than the "fundamentals of computer science" (and even there people disagree all the time). And for example "relational databases/SQL" is one of the things people can strongly disagree on whether it's foundational to software development or not (anymore).

[+] WalterBright|3 years ago|reply
CS is about Computer Science, not programming. The foundational courses should be in math, physics, statistics, calculus, proofs, algorithms, complexity, type systems, theorems, computability, etc.

Learning how to use SQL is more of a trade school course.

I have a BS, Bachelor of Science. The foundational classes were math, math, math, math, and more math. There were no classes in how to operate a machine tool. I befriended the guy who ran the machine shop that built apparatus for the scientists, and he taught me how to run the machines. But that wasn't a class, it was just something I did on my own initiative.

[+] bruce511|3 years ago|reply
>> Learning how to use SQL is more of a trade school course.

While SQL the language is more "trade school", the original poster mentioned:

>> Database Systems (relational algebra, SQL)

Certainly I would expect a CS course to cover databases, in the sense of 3rd-normal-form etc, sorting, searching, indexing and so on. I wouldn't expect them to necessarily be proficient in any one database product, but I would expect them to understand different ways of storing data, and how to design a "data layout" based on good practices (again 3rd normal form etc).
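
To make the "3rd normal form" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (customers, orders, city) are hypothetical and not from the thread. The only idea shown is that a fact which depends on the customer is stored once, in the customer table, rather than repeated on every order row.

```python
import sqlite3


def build_normalized_db() -> sqlite3.Connection:
    """Create an in-memory database with a small normalized schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        -- City depends only on the customer, so it lives here and
        -- nowhere else (hypothetical schema for illustration).
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            city        TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            item        TEXT NOT NULL
        );
        -- Index the foreign key so the join below does not scan all orders.
        CREATE INDEX idx_orders_customer ON orders(customer_id);
        """
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'London')")
    conn.executemany(
        "INSERT INTO orders (customer_id, item) VALUES (?, ?)",
        [(1, "keyboard"), (1, "monitor")],
    )
    return conn


def orders_with_city(conn: sqlite3.Connection) -> list:
    """Join orders back to their customer's city."""
    return conn.execute(
        "SELECT o.item, c.city FROM orders AS o "
        "JOIN customers AS c ON c.customer_id = o.customer_id "
        "ORDER BY o.order_id"
    ).fetchall()


conn = build_normalized_db()
# The city is stored in exactly one row, so a move is a single-row update.
conn.execute("UPDATE customers SET city = 'Paris' WHERE customer_id = 1")
rows = orders_with_city(conn)
```

Because the city lives in one place, the update cannot leave some orders pointing at the old value, which is the kind of anomaly normalization is meant to prevent.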

[+] herewulf|3 years ago|reply
It needs to be more like a trade school. The vast majority of jobs call for workers who can create and maintain programs that actually do things, not just theory.

I've encountered far too many CS interns and grads who couldn't actually write reliable, non-spaghetti code. They also typically write commit messages that consist entirely of "Updated $THING" (no shit, I can see that from your diff), nothing about the why.

My perspective may be slightly skewed from being called in as a consultant to fix the software these "computer scientists" have been hacking on, but I encountered a lot of these people when I was an undergrad student also.

[+] dgb23|3 years ago|reply
Relational algebra is math. SQL is a practical application of relational bags.

There’s math and science behind concurrency and parallelism, and the coordination thereof.

Network programming is a practical application of several parts, each having foundations in math and science. From coding/decoding, to state machines. Wires are all about math and physics too.

There’s more that is often left out. Specifically holistic systems and their research. Lisp, Smalltalk, Self etc. These systems reach into math, HCI, compilers, JITs…
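
The "relational bags" remark can be seen in a few lines: relational algebra is defined over sets, while a plain SQL SELECT keeps duplicate rows. This is a small sketch with Python's sqlite3 module; the visits table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (visitor TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("ann", "Oslo"), ("bob", "Oslo"), ("cy", "Rome")],
)

# Projection in SQL keeps duplicates: the result is a bag (multiset).
bag = [row[0] for row in conn.execute("SELECT city FROM visits ORDER BY city")]

# DISTINCT recovers the set semantics of textbook relational algebra.
as_set = [
    row[0]
    for row in conn.execute("SELECT DISTINCT city FROM visits ORDER BY city")
]
```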

[+] richrichardsson|3 years ago|reply
> I have a BS, Bachelor of Science.

Mildly off-topic, but is "BS" the usual shortening of this type of degree where you hail from? If it's United States I would be even more surprised that this would be the case, since who wants a "BS Degree"? In the UK at least we normally refer to them as BSc degrees and there's never really a need to expand.

[+] cmrdporcupine|3 years ago|reply
The relational data model is first order or propositional logic, and set theory. It's math & logic. It is intrinsic to computer science, since Codd's foundational papers.

SQL itself is full of problems, but the foundations behind it are definitely not; database theory and database internals is one of the deepest domains of computer science, up there with operating systems. And dismissing this knowledge as "trade school course"work is... sad.

[+] AnimalMuppet|3 years ago|reply
In the real world, though, 95% of those with CS degrees are going to work as software engineers, not as computer scientists. So what we really need is software engineering programs, and far fewer people in CS programs.

We need, essentially, the divide between chemistry degrees and chemical engineering degrees. That hasn't happened yet for CS, but it needs to.

And that's what the OP is asking for - people who have software engineering knowledge, not CS knowledge.

[+] 7952|3 years ago|reply
It is common in science to do experiments and learn lab skills to be able to do those experiments. Surely programming languages are just a way of doing practical work.
[+] gosh400|3 years ago|reply
That sort of erroneous thinking (about SQL) was the reason Sun Microsystems (Java, NeWS, Sparc processors, etc), didn't have a Relational DBMS, and then had to buy MySQL for $1B, in order to make itself an attractive purchase (to save the company) - which Oracle then did for $7B.
[+] mhh__|3 years ago|reply
Databases aren't about SQL as much as a taste of distributed systems.
[+] eurasiantiger|3 years ago|reply
Programming is just applied discrete mathematics.
[+] hardware2win|3 years ago|reply
>CS is about Computer Science, not programming.

Then why does almost every semester have some heavy programming course?

[+] raincole|3 years ago|reply
SQL is a trade school course? Do you think

1) We should teach relational databases purely from relational algebra without a practical example.

or

2) Relational databases are a trade school course too.

If 1), maybe it makes sense in an academic sense, but I personally would rather work with an intern who doesn't know databases at all than someone who knows a bit of relational algebra but not SQL.

If 2), how about operating systems? Compilers? Networking? Are these all just implementation details that do not belong in a CS curriculum?

[+] alxmng|3 years ago|reply
How is computer science not programming? The vast majority of CS research papers are about programming.
[+] jiriknesl|3 years ago|reply
I think exactly the opposite. 95% of people studying CompSci are going to work as developers, system administrators or IT managers.

They should know systems programming, databases, operating systems, web app development, security, basic logic, data structures, algorithms, JavaScript, and some backend languages. Add some ability to interview users, design at least basic UI, organize tasks, release your product and monitor it.

Math, physics, statistics, calculus, proofs, theorems, and computability should be the interest when you decide to pursue a PhD. For someone who's going to build ERP systems in Java, Oracle and Angular these things are completely useless. Most of the developers are like this.

[+] glitchc|3 years ago|reply
The answer is simple: Those are typically optional topics in a CS curriculum. You might as well ask why an undergrad doesn't know compression algorithms, compilers, robotics, software reverse-engineering, cryptography, computer vision... All of the above are optional and will only be taken by those who are interested in them.

The more important question is: Are you explicitly mentioning databases, concurrency and networks in your posting? If not, then it explains why candidates are not filtering themselves out.

[+] platzhirsch|3 years ago|reply
This is insightful because in my degree those classes were mandatory as part of a three year degree.

To answer your question: we don't. I have no expectation around this. Primarily asking for my own curiosity in terms of differences in CS degrees.

[+] jakevoytko|3 years ago|reply
I think you're forgetting how computer science programs operate. You're listing a bunch of specialty skills that are likely covered once in a topic class and never again. You said "I don't mean data structures and algorithms," which excludes the actual foundational skill that they had to build on in every class, in addition to raw programming chops.

Why don't they know database systems? They might have taken a database course for 4 months 3 years ago and never touched a database again because it's not a trade school. School just validates that you can learn a series of related skills over a few months when necessary.

What's the last thing you started and dropped after a few months just before the pandemic? How comfortable would you be if you interviewed for a job exclusively on that skill?

[+] urthor|3 years ago|reply
University interns are mostly 21 years old.

21 year olds won't know very much of anything in general.

A 21 year old 3rd year college intern is... 21 years old

Three quarters of a four year computer science degree doesn't change the fact they're a 21 year old.

Even in the topics they have covered, the knowledge won't be very deep.

A true mental model of concurrent programming is not something easily obtained.

Frankly, most 35 y/o engineers don't truly appreciate the intricacies of intra-thread concurrent algorithms, unless it's their specialized area.

Frankly, most engineers in the industry are too lazy to learn SQL well.

Lower your expectations of 21 year olds. Lower your expectations of the workforce in general. Hackernews is a self selecting community of tech workers who study their job in their spare time as a hobby.

Most people I've worked with go home and watch football after 5pm.

[+] colinwilyb|3 years ago|reply
This is a really great answer.

It's worth noting that if you're in the position of interviewing then you probably have 10+ /years/ of industry specific experience.

The applicant's prior ten years included 2 years of covid prison, a fractured high school experience, and learning basic life skills.

[+] jwmoz|3 years ago|reply
You know I used to be so busy with learning things when I was younger but now I've started watching football at the weekends. Finally experiencing the normie life :D
[+] ilyt|3 years ago|reply
> Frankly, most 35 y/o engineers don't truly appreciate the intricacies of intra-thread concurrent algorithms, unless it's their specialized area.

Frankly those who do would try to avoid it anyway unless it is truly the best/only solution to the problem.

The best way to not have concurrency bugs is to have as little of it as possible. If I have a choice of making a single job faster vs just running 64 jobs in parallel on CPU I'd pick the second every time I can because the code will be simpler and less buggy every single time.

[+] LegoZombieKing|3 years ago|reply
As a 21 yr old, 3 years into my computer science degree, I (for the most part) agree.

If I were just following the course material I wouldn’t expect too much from what you can learn from course work.

I have even become slightly addicted to learning everything I can about building software, from reading (3 of Robert C. Martin’s books, and a few other popular ones from Hacker News’ top 100 reads) to taking Udemy courses[0] in my free time.

I also have been working at a small teacher resource website for the past few years, to get some on the job knowledge.

Even then I still don’t feel like I’ve got the best understanding of SQL, networking or concurrency. I spent most of my time learning specific languages and practices and principles like Agile and spec docs. I’m working on building my own web app and have been creating a dev log for it[1], as well as building an arguably crappy personal website[2].

[0] https://www.udemy.com/user/legozombieking/ [1] https://youtube.com/playlist?list=PLeFBzv7SGgs903uH6Mfm34J94... [2] https://www.cadleta.dev/

[+] techwizrd|3 years ago|reply
You hit the nail on the head in your post—these are things _you_ consider foundational knowledge.

I graduated with my Bachelors in CS in 2016, and those classes were optional senior electives. You were required to take a certain number as well as some required ones (i.e., Computer Architecture, Analysis of Algorithms). I chose to take Database Systems, Data Mining & Machine Learning, Robotics, Computer Vision, etc. as electives but not Concurrent Programming or Network Programming because I already felt comfortable with those topics. Others chose classes in topics like mobile application programming or programming language theory.

Those topics may be foundational for you, but not for others.

[+] urthor|3 years ago|reply
I'd argue that knowledge about concurrency, a model of CPU threading, and race conditions/data sharing between threads is foundational.

Database transaction locks, data (form of race condition), SQL (declarative graph traversal combined with a simple projection), slightly derivative.

Compilers and SQL are technically not the foundations IMO.

Jumping/reading/evaluating/copying data, binary trees/log base 2 hierarchies, state machines, set theory, functional programming, Von Neumann model plus knowledge of multiple pipelines for integer adding are the basics.

...But, studying compilers and SQL is highly advised. Compiling code, and an understanding of database transactions locks are incredibly important practical skills.
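
For anyone who has not seen the race condition described above, here is a minimal sketch in Python. The Counter class and the thread and iteration counts are hypothetical choices for illustration. The unsafe method splits the read and the write, so two threads can read the same value and lose an update; the safe one does the same work under a lock.

```python
import threading


class Counter:
    """A shared counter with one racy and one lock-protected increment."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment_unsafe(self) -> None:
        # Read-modify-write with no lock: another thread may run between
        # the read and the write, and one of the two updates is lost.
        current = self.value
        self.value = current + 1

    def increment_safe(self) -> None:
        # The lock makes the read-modify-write a critical section.
        with self._lock:
            self.value += 1


def run(method_name: str, threads: int = 8, per_thread: int = 10_000) -> int:
    """Hammer one increment method from several threads; return the total."""
    counter = Counter()
    method = getattr(counter, method_name)

    def worker() -> None:
        for _ in range(per_thread):
            method()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


safe_total = run("increment_safe")      # always threads * per_thread
unsafe_total = run("increment_unsafe")  # may come out lower on some runs
```

Whether the unsafe version actually loses updates on a given run depends on the interpreter and scheduler, which is part of what makes this class of bug hard to reproduce.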

[+] PossiblyKyle|3 years ago|reply
The most surprising part is ML being an elective course in 2016. ML is mandatory in my university, but DL is elective
[+] sophiabits|3 years ago|reply
I think there's a tendency in our industry to forget just how hard it is to get decent at programming. All three of those topics you've listed are big, complicated, and hard to get right. I have coworkers today who struggle to write code that isn't susceptible to race conditions. Before you even get to concurrent programming, there are a thousand other things you need to learn beforehand.

Interns / juniors have little to no practical experience, and practical experience is where you _really_ learn how to program.

I think there is room for innovation in CS/SE education. Imo some sort of "code review" class where students analyze and report on a bit of code would do wonders for interns' / juniors' ability to onboard quickly into their first job. I've written about this in the past [1]

[1] https://sophiabits.com/blog/the-one-change-id-make-to-se

[+] throwaway2037|3 years ago|reply

    "just how hard it is to get decent at programming"
We can improve that with: "to get (and stay) decent". I'm getting older, and it is so hard to keep up with changes. The async (coroutine) revolution in several languages in the last 5-10 years was one of the hardest hills to climb ("WTF? How do you debug this sh-t!?").

Your comment makes me think about a blog post that I once read (I cannot find it now). Roughly, it was a list of things you need to learn as a computer programmer, but are not explicitly taught in classrooms.

One nice example: What is "staging" in Git? Hell, I've used Git for years now, and that part of my mental model is _still_ fuzzy, but the model is good enough to do my job well. I cannot know everything crystal clear.

I would say the same for concurrent programming. Not looking down upon anybody, but the vast majority of programmers will never go deeper than: "Oh, just use a thread/worker pool with a runnable to do work." I've gone deeper, but blew off both my feet several times! (Hat tip to in/famous C++ quote.) And I only need to write concurrent code a couple times a year. I'm always rusty when I get back into it.
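
The "thread/worker pool with a runnable" pattern mentioned above looks roughly like this in Python, using the standard concurrent.futures module. The word_count task and the sample inputs are made up for illustration; any function with no shared mutable state would do.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text: str) -> int:
    """The 'runnable': a self-contained task that touches no shared state."""
    return len(text.split())


def count_all(documents: list, max_workers: int = 4) -> list:
    """Run word_count over every document on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() hands tasks to the workers and yields results in input
        # order, regardless of which task finishes first.
        return list(pool.map(word_count, documents))


counts = count_all(["a b c", "hello world", ""])
```

The pool hides thread creation, scheduling and joining, which is why it is usually as deep as most application code needs to go.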

[+] bobthepanda|3 years ago|reply
I think it would be hard to do, in a way that wouldn't end up sometimes being "the blind leading the blind."

I remember in university we had peer review for writing courses, yet that didn't really seem to elevate anybody's writing more than it had been before. The people who did well continued to do so and the people who did not also continued to do so.

[+] simonsarris|3 years ago|reply
When I was a student representative on the Computer Science curriculum committee (over ten years ago now, at an engineering college, Rensselaer Polytechnic Institute) the concerns were not about how we can ensure students learn more about database systems or concurrent programming. They were solely based around an almost exasperated set of older professors who were shocked that new students didn't seem to know what they knew before, and were unable to handle the curriculum as it had been.

For example several first year students not only had no familiarity with calculus, many were having a hard time with algebraic concepts. Concern was around finding some way to add a remedial math course before Calc 1. This pushed everything else out farther.

CS grads may lack foundational knowledge simply because high school grads lack foundational knowledge. What it took to "pass" high school 30-50 years ago seems substantially more rigorous than today.

The administration didn't care. They didn't even care if the students dropped out. Butts in seats = more money. That's it.

[+] ben7799|3 years ago|reply
I went to Rensselaer as well, graduated 24 years ago. I am mostly shocked at the stuff it is obvious other schools are not interested in teaching these days in terms of core computer science curriculum.

Maybe the issues around high school math are part of it, but this was kind of obvious in the 1990s as well, I remember sharing a cube as an intern with a graduate student at another well known school and they were taking graduate classes in the summer and were using the same book for a supposed graduate class that I had for a class my freshman year at RPI.

I think RPI's core program requiring Data Structures & Analysis, Fundamentals of CS/Models of Computation, Programming Language Design, study of Grammars, etc.. is an outlier for undergraduate CS curriculum.

It seems rare I come across someone who can analyze the complexity of an algorithm, check if something can be parsed by a Regular Expression, really understands recursive algorithms, etc.. unless they went to graduate school, and the industry doesn't seem to expect people to understand this stuff.

But you can't necessarily generalize. Not all high schools are the same and they never were. And old professors at RPI are just mirroring the historical reputation of RPI as a tough school. In the past they probably just had more freedom to fail all those students out of the school.

[+] thwayunion|3 years ago|reply
This is it -- colleges and universities are desperate for butts in seats ~> admissions standards slip ~> first year is remedial ~> lots of stuff gets dropped in later years.

Another major consideration: "CS degree" is a misnomer in the US. We have thousands of colleges and universities that vary from "best in the world" to "literally fraudulent". I've even come across a few small colleges where the CS department is staffed entirely by people with a few years of industry experience and an MS, who likely wouldn't even clear the bar for a senior eng role.

[+] ohdannyboy|3 years ago|reply
Databases, network programming and concurrency (beyond basic constructs like mutexes) are not fundamental. SQL is certainly not fundamental to the major, that's a specialty only needed by a small percent of programmers.

I would suggest looking at the CS department's page from whatever university you mainly recruit from to get an idea what the core of their program looks like.

[+] alkonaut|3 years ago|reply
Databases and networks have very little to do with CS. I think the problem is that we confuse "Computer Science" with software and programming. Computer science is to programming as physics is to the mechanics of repairing a car. You are basically a car mechanic hiring an intern physics student and wondering why they don't know anything about timing belts.

Perhaps the problem is that we train so many computer scientists to do programming, but that's the learn-on-the-job part I guess.

I met version control and unit tests on the first day of my first real programming job, because I studied physics, math and computer science. And I'm still (by far) the go to person in the office for all matters related to plasmas or cosmology.

[+] lordnacho|3 years ago|reply
I think the only thing you can really expect them to have done is DS&A plus a bunch of math.

All the things you mention are fairly big topics, and not only that, you only really understand them by doing a bunch of coding over several years. You can get introduced to them in university, but a degree course is only so many topics and they all need an introduction. Chances are if a student has done these things it's superficially, in practicals that are similar to practicals in other sciences: you don't really understand it, you write it up, and then you don't rely on what you did for further studies.

I studied a bunch of things at university, leaving without being particularly good at any of them. For instance I built a radio and a bridge in my first year on the engineering degree, but I couldn't just become an EE or civil engineer from that. I wrote a thesis about early WiFi for the business school, but that doesn't mean I could just be a product manager.

Similarly, a student may have done a bit of joining tables in SQL, a bit of multithreading, and a bit of routing during practicals, but you wouldn't think they really understood any of those things in the way someone with a couple of years on the job would.

[+] Mizoguchi|3 years ago|reply
I get co ops from first and second tier schools and they usually join lacking a lot of fundamentals of CS.

However in my experience most of them pick things up relatively quickly and end up becoming good engineers and reliable team players.

At a BSc level with no experience we should be looking for genuine curiosity, motivation and interest in learning and solving hard problems.

Everything else can be taught.

We have even expanded our program to include interns with degrees in other areas of science such as mathematics, physics, materials and mechanical engineering, some with very little programming experience, with great results.

Our objective is always to hire them and retain them, so we do invest plenty of resources in training them well.

[+] busterarm|3 years ago|reply
We almost exclusively hire new grads and have coops/interns from first tier schools. We kind of have this stupid bias against everyone else in our hiring process.

These engineers all turn out to be okay but we don't end up with any new or advanced ideas out of this pool. We cargo cult every behavior that Google does because a few of our senior engineers were ex-early-Googlers.

It feels like working with the blind. There's no interest from our engineering teams about new developments in the field and they don't even recognize when the work that they're doing is a fit for new patterns.

99% of these people don't know what a CRDT is and don't recognize when they're accidentally building one. Once every six months or so someone will post on Slack about their eureka moment of just discovering what a Bloom filter is and how it might apply to long-lived problems that we have.

To someone entirely from the practical/self-taught/trade side of things it's a kick in the teeth knowing what I bring to the table and how my org depends on me but doesn't respect me enough to hire other people with the same background.
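
For readers who have not met the Bloom filter mentioned above, here is a minimal sketch in Python. The sizes, the number of hash functions and the SHA-256-based hashing are assumptions chosen for readability, not a tuned design; real implementations usually use faster non-cryptographic hashes and size the bit array from an expected item count and target error rate.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch: no false negatives, some false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting one hash function
        # with the hash index (an illustrative choice).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("alice@example.com")
seen.add("bob@example.com")
```

A typical use is as a cheap pre-check in front of an expensive lookup: if might_contain returns False the lookup can be skipped, and if it returns True the real store still has to be consulted.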

[+] dboreham|3 years ago|reply
Because it takes much more than an undergraduate degree to learn the field properly. If you find a graduate who is clueful that's probably because they've been hobby programming since age 12, or working in the summers somewhere they could get good experience.
[+] notafraudster|3 years ago|reply
I did a CS degree 20 years ago. Databases, concurrent programming, and network programming were all 4th year electives (I took network, AI, and image processing personally). But something not being included in a core curriculum doesn't mean someone can't learn it.

My curriculum was: Year 1: Intro Comp Sci / Year 2: 2 courses in Logic, 2 courses in data structures and algorithms / Year 3: 2 courses in processor design, 1 course in finite state automata, 1 course in parsers / Year 4: a course in ethics, a course in team programming (which covered UML and version control), and two electives.

I believe a major was 14 courses so I'm missing one, or it may have been it was three electives. I didn't take databases because I was already a paid sysadmin before I started college and mostly at the time database courses were just ten tedious weeks of normalization crap.

Also, treat your interns better. The reason to hire interns is because you plan to devote some of your resources to help them in their professional development. Stop asking what you can get out of your interns and start asking how you can best give something to them.

[+] kypro|3 years ago|reply
When I learnt CS in university, in my opinion 99% of my classmates were unemployable after graduating. They simply didn't have the depth or hands-on experience to build or understand anything of any complexity.

This may be different in different universities and it may have changed today, but the tests we took were largely about remembering lecture talking points and being able to regurgitate them with or without any real understanding.

For example, you might learn a bit about relational databases, but your understanding will be limited to the talking points of the lecture. E.g., you might get a question asking you to explain the use of primary keys, but if you asked students how they might design a relational database with normalized tables for some data, they'd have no idea, because they'd never actually put the talking points into use.

It upset me because by the time I had finished university I had launched two startups and worked professionally as a developer for 3 years. I was consistently helping students with practical exercises while at uni given I was one of the most capable on the course, but none of my experience really helped me in the tests because I discovered so little of it was about practical understanding, and mostly just an extended English exam tested mostly on writing ability and being able to regurgitate talking points in lecture slides.

It's not that the students weren't smart or capable individuals, it's just that the course didn't incentivise obtaining depth of knowledge in what was taught, so no one did.

[+] SamoyedFurFluff|3 years ago|reply
I wouldn’t consider network programming to be foundational to all computer science, tbh. Are these interns expected to be writing their own sockets? Similarly, are the intern projects involving writing out the operational trees of your sql? I would consider these to be great if the intern knew of these things but I wouldn’t expect an intern to be functional in any of it. (I would put kernel writing, compiler programming, security, mobile development, etc. in the same category. If they know some and it applies to the intern project cool, but I brought an intern on expecting them to be worse than useless lol. The goal is to get them towards useful for when the company hires them.)
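
As a reference point for what "writing their own sockets" involves at its most basic, here is a small sketch of a TCP echo server and client on the loopback interface, using Python's standard socket module. The function names and the one-connection-then-exit design are assumptions for illustration only.

```python
import socket
import threading


def start_echo_server():
    """Start a one-shot echo server on localhost; return (port, thread)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))  # port 0 asks the OS for a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve_one() -> None:
        conn, _addr = server.accept()
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:  # empty read: the peer finished sending
                    break
                conn.sendall(data)
        server.close()

    thread = threading.Thread(target=serve_one, daemon=True)
    thread.start()
    return port, thread


def echo_once(port: int, message: bytes) -> bytes:
    """Send one message and collect everything the server sends back."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # tell the server we are done
        chunks = []
        while True:
            data = client.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


port, server_thread = start_echo_server()
reply = echo_once(port, b"hello")
server_thread.join(timeout=5)
```

Even this toy has to deal with partial reads and an explicit end-of-message signal, which hints at why the topic tends to be an elective rather than something every intern has practiced.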
[+] MattGaiser|3 years ago|reply
As an undergrad, my priority was learning what was most frequently on interviews and what most jobs cared about. None of this fit the bill.

SQL? Yes. Database theory? Have never discussed it beyond "what is an index"? So I never looked at that topic again.

Concurrent programming? Never dealt with it outside of courses and jobs that care about it mention it, so I self selected out. So I never looked at that topic again after the course.

Network programming? Took a course in it, but outside of a few devops use cases, I have never had any reason to recall that knowledge. I just memorized 5 versions of that test and went in to it with that.

My advice to my undergrad self would be to basically abandon anything that is not fun projects (so you get familiar with the languages themselves), hackathons (so you have culture fit), and leetcode.

I can't imagine the average ROI on learning these things is great.

[+] friendzis|3 years ago|reply
It's money and competition. First, universities/colleges want to maximize their profits and that means maximum number of students at minimum cost. Unless the whole schtick of a particular university is "superior quality" it usually makes financial sense to not offer highest possible quality and rather increase throughput.

Second, I don't know what part of the world you are in, but usually formal education courses have certain requirements to them, roughly x1 hours of social, x2 hours of humanities and so on. Then there are basic prerequisites like math. In the end, the final number of hours for subject is not that high as it would seem at first glance.

Finally, there is competition among education providers and their "tiers". Universities/colleges compete not only among themselves, but against codecamps too. The premise of code camp is to help somewhat computer literate people memorize a bunch of text macros that yield certain result on the screen. Colleges must adapt to compete, dropping the quality floor even lower.

In the end, unless you have graduates from "general" college/university you can expect that deep foundational understanding will be replaced with quick factoids on how to produce certain result in certain specific context without understanding said context or even being aware of said context.

Greybeards looked the same at "us", by the way.

[+] hardware2win|3 years ago|reply
Concurrent programming is a topic that many actual SEs struggle with, do not expect students to be proficient with it

Also, what does network programming mean to you?

Basic networking knowledge or actually writing low lvl network code?

Besides that: higher edu institutions suck, unless it is something like top3 then do not expect a lot just because it is a degree, everything is up to the person.