top | item 39720388

Great ideas in theoretical computer science

288 points | __rito__ | 2 years ago | cs251.com | reply

93 comments

[+] diebeforei485|2 years ago|reply
This class is great because it introduces a lot of concepts but most importantly really sharpens your problem-solving skills.

> The method we tend to use in 251 is throwing you deep into the water without a paddle (i.e., giving you very basic definitions for a new topic you haven't seen and expecting you to solve things that you might be given in a course dedicated to that topic in say the third week). We do this over and over, pretty much starting from scratch on a new topic every week. As a result, it can be very frustrating—which, of course, is the intention. If you're constantly trying to solve things that are just outside of your reach, you develop better strategies for thinking about problems you're given.

[+] VoidWhisperer|2 years ago|reply
I'm not sure I necessarily agree with this strategy. In my CS undergrad, I had an algorithms class where the professor did this and expected the students to be able to do homework assignments each week that were worth a significant portion of our grade with basically no explanation on how to solve these extremely complicated optimization problems. When comparing our assignments to that of another student taking the same class with a different professor, theirs were much, much easier.

I don't think this teaching methodology made me gain any more from the course. If anything, it made me gain less, because I constantly had the stress of trying to do these ridiculous problems on top of all my other work every week.

[+] cloogshicer|2 years ago|reply
Honestly, this just sounds like a terrible way to teach.
[+] tzs|2 years ago|reply
Theoretical computer science can be fun. But it can also be really annoying!

Around the time most schools were starting the 2014-15 school year, I saw a post on Reddit in either a math or CS group from someone asking how to solve a particular theoretical CS problem. The poster didn't say where the problem was from or why they needed a solution, but others quickly figured out it was someone trying to cheat on the first homework set from COMS 331 (Theory of Computing) at Iowa State University.

No one helped and the poster deleted their post. I thought it was an interesting problem and gave it a try, expecting it to be a short but interesting diversion.

I didn't expect it to take too long because, although I was not a CS major (I was a math major), I did take all of the undergraduate theoretical CS courses at my school and got decent grades in them. That had been ~35 years earlier, so I had forgotten a lot, but the problem was from the first COMS 331 homework set, so it shouldn't require anything past the first week or so of that class, which should all be fairly basic stuff I would still remember.

I spent a couple days on it and got absolutely nowhere. Several times since then I've remembered it, thought about it for a few hours or a day and have continued to completely fail.

If anyone is curious, here is the problem:

Define a 2-coloring of {0, 1}∗ to be a function χ : {0, 1}∗ → {red, blue}. (For example, if χ(1101) = red, we say that 1101 is red in the coloring χ.)

Prove: For every 2-coloring χ of {0, 1}∗ and every (infinite) binary sequence s ∈ {0, 1}∞, there is a sequence

w₀,w₁,w₂,···

of strings wₙ ∈ {0, 1}∗ such that

(i) s = w₀w₁w₂ ···, and

(ii) w₁, w₂, w₃, · · · are all the same color. (The string w₀ may or may not be this color.)
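For anyone who wants to poke at the problem concretely, here is a small brute-force sketch in Python. The coloring χ used here (parity of the number of 1s) is just an illustrative choice, not part of the problem; the script enumerates every way to split a short string and keeps the splits where w₁, w₂, ... all share a color (the theorem itself is about infinite sequences, where the finite case's trivial two-piece split no longer works):

```python
from itertools import combinations

def chi(w):
    # Example 2-coloring (an arbitrary choice for illustration):
    # a string is red iff it contains an even number of 1s.
    return "red" if w.count("1") % 2 == 0 else "blue"

def decompositions(s):
    """Yield every way to split s into non-empty pieces w0, w1, ..., wk."""
    n = len(s)
    for k in range(n):  # choose k cut points among positions 1..n-1
        for cuts in combinations(range(1, n), k):
            bounds = (0,) + cuts + (n,)
            yield [s[a:b] for a, b in zip(bounds, bounds[1:])]

s = "1101001"
# Keep splits with at least 4 pieces where everything after w0 is one color.
good = [d for d in decompositions(s)
        if len(d) >= 4 and len({chi(w) for w in d[1:]}) == 1]
print(len(good), good[0])
```

For example, ["1", "10", "100", "1"] shows up: the three pieces after w₀ each contain exactly one 1, so they are all the same color under this χ.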

[+] suyjuris|2 years ago|reply
Let us say that an index i is bad if every finite substring of s starting at i is red (i.e., for every j ≥ i we have χ(s_i ... s_j) = red). Two cases:

Case 1: there are infinitely many bad indices. Here we go to the first bad index then the second, and so on. The colour of w₀ does not matter, and since subsequent words start at a bad index, they will all be red.

Case 2: there are finitely many bad indices. Then there is some k which is larger than all bad indices. We start by going to k (again, the colour of w₀ does not matter). Since k is not bad, there must be some blue word starting at k. We take that one and move to a larger index. Again, that index is not bad. We repeat this process to find our sequence.
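The Case 2 construction can be sketched on a finite prefix. The parity coloring below is my own example (under it, no index is bad as long as 1s keep appearing, so the Case 2 greedy applies from the start); a real proof of course has to work with the infinite sequence:

```python
def chi(w):
    # Example coloring (an illustrative choice): red iff an even number of 1s.
    return "red" if w.count("1") % 2 == 0 else "blue"

def greedy_words(s, start, colour, max_len=50):
    """From each index, grow a word until it has the target colour,
    then continue from the next index (finite-prefix demo of Case 2)."""
    words, i = [], start
    while i < len(s):
        for j in range(i + 1, min(i + max_len, len(s)) + 1):
            if chi(s[i:j]) == colour:
                words.append(s[i:j])
                i = j
                break
        else:
            break  # ran off the finite prefix; at a non-bad index of an
                   # infinite s a word of the target colour always exists
    return words

s = "110100110101110010" * 3   # finite stand-in for an infinite sequence
w0 = s[:1]                     # the colour of w0 does not matter
rest = greedy_words(s, 1, "blue")
print(w0, rest[:6])
```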

[+] georgecmu|2 years ago|reply
I don't feel like doing a formal proof, but it looks like it should be a direct consequence of the pumping lemma: "Informally, it says that all sufficiently long strings in a regular language may be pumped—that is, have a middle section of the string repeated an arbitrary number of times—to produce a new string that is also part of the language." [1]

Needless to say, this exercise would be trivial if you just covered the pumping lemma and its applications in class, and next to impossible if you never heard of it.

[1] https://en.wikipedia.org/wiki/Pumping_lemma_for_regular_lang...

PS. I took 15-251 back when it was 15-299: a brand new class without a regular number assignment. Honestly, I would have enjoyed it a lot more now than I did back then. But several assignments still stand out for me, in particular "Building from scratch" [2]. Trying to get some of that feeling now, working through Turing Complete[3] with my daughter.

[2] https://www.cs.cmu.edu/afs/cs/academic/class/15299/handouts/...

[3] https://turingcomplete.game/

[+] danbruc|2 years ago|reply
My first idea is to think of this as the following game. Initially we consider all words uncolored and I start by picking an initial set of red words that can color any infinite binary sequence in red. Then you get to take away one of my words and make it blue. After that I get to pick replacement words from the remaining uncolored words to fix my set. Rinse and repeat.

I start with { 0, 1 } which will obviously do. You take - I guess without loss of generality - my 0. I pick 00 and 01 as replacements and end up with { 00, 01, 1 } which will work again. You take away my 01, I replace it with 010 and 011 and { 00, 010, 011, 1 } will still work, I guess, but it is certainly becoming less obvious. In general I will pick x0 and x1 as replacements if you take x away which will allow me to still color x, I just have to choose between x0 and x1 depending on the following bit.

If I am not mistaken, you will have to take away an infinite set of words from me in order to prevent me from being able to color any infinite binary sequence in red; if you take only finitely many, I will be left with a set of red words that still works. My guess is now that if you take away that infinite set in order to stop me, you will accidentally assemble a set of blue words that will be able to color any infinite binary sequence blue. Unfortunately I do not have the time right now to properly think this through.

I am also not too confident because I do not clearly see how being allowed to have the first word in the wrong color comes into play. On the other hand, maybe, maybe you need my 1 for sequences starting with one or something like that.

EDIT: Just had an additional thought; this might actually work. If you take my 0, then you cannot also take my 1 later on, or you will end up with { 0, 1 }. If you take 01, then you cannot also take 10 later. So you always have to take one of my two latest replacements and leave the other one to me forever. We will build complementary sets: you have 0, I have 1; you have 01, I have 10; you maybe 011, I 010, ... Now it seems quite plausible that this will work out and we end up with two complementary sets that both can color all infinite binary sequences in [mostly] a single color.
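A few rounds of this game are easy to simulate. The sketch below (with a deliberately simple adversary of my own choosing, and a finite-depth stand-in for "can color any infinite sequence") checks after each replacement that the red set still covers everything:

```python
from itertools import product

def covers(words, depth=10):
    """True if every binary string of length `depth` has a prefix in `words`
    (a finite stand-in for 'can colour any infinite sequence red')."""
    return all(any("".join(bits).startswith(w) for w in words)
               for bits in product("01", repeat=depth))

red = {"0", "1"}
taken = []
for _ in range(5):
    x = min(red)                   # adversary takes a word (here: lexicographically first)
    red.remove(x)
    taken.append(x)
    red |= {x + "0", x + "1"}      # red player repairs with the two one-bit extensions
    assert covers(red)             # the red set still covers every sequence

print(sorted(red), taken)
```

As the comment predicts, finitely many removals never break the red set; the interesting question is what the adversary's infinite set of removed (blue) words ends up covering.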

[+] scubbo|2 years ago|reply
Interesting problem! I've never formally studied computer science, and my math degree is far behind me, so I probably won't be the one to help you solve this, but I'm commenting to at least remind myself to check back here.

Some thoughts that immediately spring to mind (apologies if you've already thought of these - just getting the brain flowing!):

* We only need consider colorings for which 0 and 1 are differently coloured. If 0 and 1 are the same colour, then we're trivially done - let w1, w2, w3, etc. all have length 1

* this smells like it'll be some sort of combination of induction and contradiction? E.g. "assume this property is true for w1w2w3...wn-1, and that wlog they are red. Assume there is no wn that can continue the sequence of red substrings - thus, all strings starting from the end of wn are blue. If that's the case _and_ if we can convert the string [w1...wn-1] into blue substrings, then we're done. So show that there is some contradiction proving that impossible (maybe using the observation I made above that we only care about cases where 1 and 0 are different colours)"

[+] akoboldfrying|2 years ago|reply
The following proof is much tighter than the meandering process I followed to get to it. Let me know if the meandering would be interesting!

Call a position i in s "hard-red" (respectively, "hard-blue") if every positive-length substring of s beginning at i is red (respectively, blue); otherwise call i "soft". A position is "hard" if it is hard-red or hard-blue.

There are 3 possible cases:

1. s has no hard positions.

2. s has a positive but finite number of hard positions, with the last being i.

3. s has an infinite number of hard positions.

Case 1

Every position is soft, meaning that if we start a substring of s at that position and continue to grow it by appending digits from s, eventually (after a finite number of steps) the colour of the substring will change. So we can set w₀ arbitrarily to the first digit of s, then grow w₁ from position 2 of s until it has the same colour as w₀. Then we can grow w₂ from the next digit in s until it has the same colour, and so on. This results in all words having the same colour, including the first.

Case 2

Set w₀ to the first i-1 digits of s, and w₁ to the i-th digit of s. All positions > i are soft, meaning that, as for case 1, we can repeatedly grow substrings by appending digits from s until the substring turns the same colour as w₁.

Case 3

Since s has an infinite number of hard positions, it must have an infinite number of hard-red positions, an infinite number of hard-blue positions, or both. Suppose w.l.o.g. that it has an infinite number of hard-red positions (it may or may not also have an infinite number of hard-blue positions). Define p(k) to be the k-th hard-red position in s. Set w₀ to the first p(1)-1 digits of s, and for k >= 1 set word w_k to the substring of s beginning at p(k) and ending at p(k+1)-1. w₁, w₂, w₃, ... all begin at hard-red positions, so are all red. □
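Case 3 can be illustrated concretely with a coloring under which every position is hard. The first-bit coloring below is my own example, not part of the problem: the color of any word starting at i is fixed by s[i] alone, so every position is hard-red or hard-blue, and the words simply run from one hard-red position to just before the next.

```python
def chi(w):
    # Example coloring (an illustrative choice): a word is red iff its
    # first bit is 1. Under this coloring every position of s is "hard".
    return "red" if w[0] == "1" else "blue"

s = "0010110100101101"   # finite prefix standing in for an infinite sequence
hard_red = [i for i, c in enumerate(s) if c == "1"]   # positions p(1), p(2), ...

w0 = s[:hard_red[0]]     # everything before the first hard-red position
words = [s[a:b] for a, b in zip(hard_red, hard_red[1:])]
print(w0, words)

assert w0 + "".join(words) == s[:hard_red[-1]]
assert all(chi(w) == "red" for w in words)
```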

[+] ironSkillet|2 years ago|reply
The cardinality of the set of possible infinite binary sequences is uncountable. My gut tells me there must be a way to assume the negation and construct a bijection with the naturals, leading to a contradiction.
[+] melagonster|2 years ago|reply
I know that I would have no chance in a real computer science department. But is this question really for freshmen?
[+] sudosysgen|2 years ago|reply
Can this be solved with a proof by contradiction? Let's assume that this is impossible for our sequence. There must then be at least one (sub)word of the wrong color. For this to be fatal, it must be impossible for any word containing it to be of the correct color while having the rest of the sequence (after and before that word) continue to be the same color, no matter how large we make this word to the right, because if this wasn't true, we could then simply use this bigger word and there would be a contradiction.

If this is true for more than one word, the problem can be solved by simply inverting the chosen color and using two larger words containing both subwords, which will always be of the same, originally wrong color, which leads to a contradiction.

Otherwise, if this is true for only one subword, so that no word of the correct color can contain it while the rest of the sequence is covered in that color, but the rest of the sequence can be covered with words of the same color, we can simply include this problematic subword inside w₀.

In each of the three cases (0, 1, or 2+ "toxic" subwords), it is always possible to find words of the same color covering the sequence. Therefore there can be no sequence for which it is impossible to find a suitable w₀w₁w₂···, and the original proposition is proven.

Please tell me if you find any issue with this approach!

[+] rramadass|2 years ago|reply
Nice!

If folks would like to learn these ideas by hand via programming, i highly recommend Tom Stuart's Understanding Computation From Simple Machines to Impossible Programs - https://computationbook.com/

[+] NlightNFotis|2 years ago|reply
My favourite computing book.

Very highly recommended.

[+] __rito__|2 years ago|reply
Wow, thanks for sharing!
[+] riedel|2 years ago|reply
Thanks. To me the first part linked above is rather "foundations of computer science". I remember this being the content of my first classes in CS (Turing machines, lambda calculus, finite automata, ...). (Actually, that course in 1999 at Uni Karlsruhe was unfortunately also the last time I implemented things using Haskell.)

The full list, I guess, goes beyond that and looks interesting.

[+] soganess|2 years ago|reply
I don't know where, but there is another version of this class that includes "The Probabilistic Method"[1]. While not exactly the same thing, I can't imagine doing modern existential (vs constructive) proofs of topological solution space obstructions without that style of thinking.

[1]: https://math.bme.hu/~gabor/oktatas/SztoM/AlonSpencer.ProbMet...

Bonus points for those interested: I think this paper (sadly, not mine) has a concise collection/history of obstruction proofs in its background sections (1.2/1.3), and since it is a paper, all of them are cited!: https://arxiv.org/abs/2309.09913

[+] pushedx|2 years ago|reply
You may be referring to the version of this course that was taught by Luis von Ahn (the creator of reCAPTCHA and the founder of Duolingo), when he was a professor at CMU.

https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15251-s...

He liked to call the course "SOME Theoretical Ideas FOR Computer Science", and it was known to be a very popular (and difficult) course.

[+] dogcomplex|2 years ago|reply
Damn. Impressed to see something this fundamental that never made it into my computer science postgrad education (and judging by its depth and difficulty, it might not for a while!). Seems like a powerful technique that scales to cover quite a lot of other problems.
[+] abdullahkhalids|2 years ago|reply
What would versions of this course look like for other fields?

Great Ideas in Theoretical Physics?

Great Ideas in Experimental Physics?

Great Ideas in Economics?

etc

I did teach a course once called, Inventing the Information Age, in which we discussed all the inventions and ideas (starting from writing) all the way to modern computing infrastructure needed for a civilization to replicate our information age. This was not a single-field course, because the ideas/inventions were in language, physics, mathematics, and computer science. That made it more fun.

[+] josh-sematic|2 years ago|reply
Sean Carroll has a series he's working on called "The Biggest Ideas in the Universe", which is on physics. Only the first one is out now (Space, Time, and Motion), but the second (on Quanta and Fields) is due in May, I think. Its intended audience is anyone with a high school education. It introduces basic ideas in calculus and then does use equations to motivate understanding, but not at the level you'd need if you were actually studying physics. I have a bachelor's in physics but still deepened my understanding of General Relativity from the first one. Really looking forward to part 2. https://www.preposterousuniverse.com/biggestideas/
[+] osti|2 years ago|reply
The CMU class is for freshmen, so students without advanced background can still comprehend most of it. But for physics I feel that without at least some good physics and math background, you can't really understand and appreciate the "great ideas".
[+] shanusmagnus|2 years ago|reply
That sounds awesome and very creative. Do you have a link to an old class site, or a syllabus, or anything that would give a flavor?
[+] trenchgun|2 years ago|reply
Your course sounds awesome. Do you have the materials somewhere available? Would love to read more about it.
[+] assbuttbuttass|2 years ago|reply
I TA'd for this course back in undergrad. Despite being in the CS department, it's structured like a math class, and all of the homework assignments are proof-based. I don't think I wrote code for that course a single time, except for pseudo-code for a Turing machine
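For flavour, a toy Turing-machine simulator is only a few lines of Python. The machine below (a binary incrementer reading the tape least-significant-bit first) is my own example rather than anything from the course, but it is roughly the size of such a pseudocode exercise:

```python
# A minimal Turing-machine simulator. The transition table maps
# (state, read symbol) -> (next state, symbol to write, head move).
def run(delta, tape, state="carry", blank="_", steps=1000):
    cells, head = dict(enumerate(tape)), 0
    for _ in range(steps):
        if state == "halt":
            break
        sym = cells.get(head, blank)
        state, write, move = delta[(state, sym)]
        cells[head] = write
        head += {"R": 1, "L": -1, "S": 0}[move]
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)

# Increment a binary number written LSB-first, starting in state "carry".
delta = {
    ("carry", "0"): ("halt", "1", "S"),   # 0 + carry -> 1, done
    ("carry", "1"): ("carry", "0", "R"),  # 1 + carry -> 0, propagate carry
    ("carry", "_"): ("halt", "1", "S"),   # ran past the end: new high bit
}

print(run(delta, "111"))   # 7 + 1 = 8, i.e. "0001" LSB-first
```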
[+] brcmthrowaway|2 years ago|reply
Computation is a fundamental property of the universe? How is that? How do flora and fauna compute?
[+] ozim|2 years ago|reply
You have to have the right amounts of inputs to get outputs.

The plants I grow are a complex function of CO2, water, energy, and a bunch of other things. But I do know that if I don't water them enough, or water them too much, the plant function will return bad results.

[+] woopsn|2 years ago|reply
With ribosomes, hormones, spores, senses, brains etc., and by processing our genetic "code" into these things.
[+] graycat|2 years ago|reply
Uh, let's take an example: Mars gets a force from the gravity of the Earth. That force has magnitude and direction, so it is a vector. Mars also gets such a vector from each of the Sun, Jupiter, Andromeda, dark matter, .... Then to determine the direction Mars will go, you have to add all those vectors to the current value of the Mars momentum vector. Soooo, the universe adds vectors, that is, does computation. Done?
[+] sporkland|2 years ago|reply
Inspired by the title my two top ideas from theoretical compsci:

1. Move-to-front lists are at most 2x the optimal static list order's search time (and are often much better than any static list order). [1] A similar conjecture exists for splay trees, called the dynamic optimality conjecture, but it is unproven.

2. Randomized algorithms (e.g. quicksort) often have the same worst-case time as their non-randomized variants, but are often much faster in practice.

1: https://www.dgp.toronto.edu/public_user/JamesStewart/378note...
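A quick empirical check of that 2x bound, comparing move-to-front against the best static order (which, for independent accesses, is simply most-frequent-first). A sketch, not a proof; the access distribution below is an arbitrary choice:

```python
from collections import Counter
import random

def mtf_cost(seq):
    """Total search cost of a move-to-front list over an access sequence."""
    lst = sorted(set(seq))            # some initial order
    cost = 0
    for x in seq:
        i = lst.index(x)
        cost += i + 1                 # pay the position searched to
        lst.insert(0, lst.pop(i))     # move the accessed item to the front
    return cost

def best_static_cost(seq):
    """Cost of the optimal static order: most frequent items first."""
    order = [x for x, _ in Counter(seq).most_common()]
    pos = {x: i + 1 for i, x in enumerate(order)}
    return sum(pos[x] for x in seq)

random.seed(1)
seq = random.choices("abcdefgh", weights=[8, 5, 3, 2, 1, 1, 1, 1], k=500)
m, s = mtf_cost(seq), best_static_cost(seq)
print(m, s, m <= 2 * s)
```

On skewed access patterns like this one, move-to-front typically lands well under the 2x bound, since hot items quickly migrate to the front.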

[+] Simon_ORourke|2 years ago|reply
Ahhh, I remember my old college course on time complexity, which was effectively the professor picking on random kids and saying something like, "I have some 4-dimensional matrix multiplication algorithm which then uses a Bloom filter to match on random ids... what is the time complexity of this?", and some poor kid would stutter something like "aaaaahhhh... O(n log n)", only to have the professor shout "No, wrong!" and go pick on someone else.
[+] bo1024|2 years ago|reply
Interesting positioning as "great ideas" -- it seems like the topics are a pretty standard undergrad Theory of Computer Science course.
[+] lupire|2 years ago|reply
I wouldn't expect the theory of a field to be built upon Bad Ideas!

It is a strange course though. It is grad level (2xx at Harvard), but the syllabus is survey topics from a range of undergrad courses in CS. I would guess it's a smattering of advanced topics in those areas, but the description doesn't sell it as having all those topics as prereqs. Maybe it's meant as an extension school / professional master's class for people who do not have CS degrees? Refugees from math, physics, and autodidacts?

[+] shusaku|2 years ago|reply
That’s my impression too, maybe the name just gets undergrads more excited.
[+] Avtomatk|2 years ago|reply
And where is Category Theory and Lambda Calculus?
[+] djaouen|2 years ago|reply
I have nothing to add other than saying that Physicists and Philosophers often make better programmers than Computer Scientists. :)
[+] Simon_ORourke|2 years ago|reply
> Garbage collection: This is a huge Yes now. Every language that's become successful since 1999 has GC and is designed for all normal users to use it to manage all memory. In five years, Rust or D might make that last sentence untrue, but even if that happens, GC will still be in the yes category.

Sorry, I had to guffaw when I read that last part about D. A dumpster fire of a language for garbage collection.

[+] a-dub|2 years ago|reply
"string theory"

i found this stuff to be a lot of fun!

[+] thejazzman|2 years ago|reply
The most painful class I ever took. I know, I'm not a real nerd, or something. I'm okay with it.
[+] russfink|2 years ago|reply
Interesting to read your experiences. Looking over this, it seems to take a highly theoretical approach and work up, rather than a practical/applied approach and work down. I was surprised to see automata covered so early, which is more in the vein of computability theory than that of introductory concepts.
[+] leetrout|2 years ago|reply
There are literally dozens of us making great careers solving problems with computers without CS degrees or any particular interest in the lowest levels of computing or the mathematical backings/abstractions.

Horses for courses.

[+] zitterbewegung|2 years ago|reply
I think there should be a companion class on anti-patterns or bad ideas in computer science. It's much harder to see a horrible idea and then have to argue why not to do something.

Something like variable names being too long or too short, or how to figure out what to do with unrealistic timelines.