>The compiler itself is to be developed in OCaml.

This seems like a misstep, one I've seen in a few other compiler implementation courses. For some reason, these programming language professors always insist on completing the project in their personal favorite language (Haskell, OCaml, Standard ML, etc.).

As a student, this makes the project significantly more difficult. Instead of just learning how to implement a compiler, you're also learning a new (and likely difficult) programming language.

Jtsummers|11 months ago

Learning a new programming language isn't that hard a task; every decent programmer will learn a dozen or maybe even dozens over their career.

Also, neither OCaml nor SML is hard to learn. Haskell is more challenging, but that's because it's become, in a sense, multiple languages. The core of Haskell is no harder to learn than OCaml or SML, except for reasoning about lazy evaluation and some of its consequences. All the things people use on top of Haskell do add more to learn, but what you'd need to reach utility equivalent to SML or OCaml for a compilers course is not that hard to learn.

remexre|11 months ago

Many universities that use OCaml in upper-division courses also use it in lower-division; my university requires all CS majors to take a course that is taught in OCaml, and covers higher-order programming, "advanced" (Hindley-Milner) type systems, equational reasoning, etc., typically in their sophomore year.

The compilers class can then be taught in it without worrying about that problem much.

mattgreenrocks|11 months ago

Algebraic data types are superb for compilers. Case exhaustiveness checks are really helpful in this domain, especially when doing any sort of semantic analysis.
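
To make the exhaustiveness point concrete, here is a minimal OCaml sketch. The mini-language, the `ty` and `expr` types, and the `type_of` function are all invented for illustration; nothing here comes from the course or the thread.

```ocaml
(* Hypothetical types for a tiny expression language. *)
type ty = TInt | TBool

type expr =
  | Int of int
  | Bool of bool
  | Add of expr * expr
  | If of expr * expr * expr

(* A tiny type checker: Some ty for well-typed terms, None otherwise.
   Every constructor of expr is handled, so the match is exhaustive. *)
let rec type_of (e : expr) : ty option =
  match e with
  | Int _ -> Some TInt
  | Bool _ -> Some TBool
  | Add (a, b) ->
    (match type_of a, type_of b with
     | Some TInt, Some TInt -> Some TInt
     | _ -> None)
  | If (c, t, f) ->
    (match type_of c, type_of t, type_of f with
     | Some TBool, Some t1, Some t2 when t1 = t2 -> Some t1
     | _ -> None)
```

If a constructor such as `Mul` were added to `expr` without updating `type_of`, the OCaml compiler would flag the outer `match` as non-exhaustive (warning 8 by default), which is the kind of check the comment above refers to.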

Frankly, I try to avoid languages that don’t have ADTs as much as possible. They are incredibly useful for specifying invariants via your design, and their constraints on inputs lend themselves to easier implementation and maintenance.

sn9|11 months ago

You have to pick some language, and some number of students are likely to be new to it, so you might as well pick one that excels at the task of writing compilers, as ML-family languages do.

DKordic|11 months ago

I have a challenge for You: make applications of Your choice of BlackPill [1] or Radxa ZERO 3E [2].

Hint: You might be interested in Forth Systems and Lisp Machines.

[1]: https://github.com/WeActStudio/WeActStudio.MiniSTM32F4x1

[2]: https://radxa.com/products/zeros/zero3e

linguae|11 months ago

In defense of such compilers professors, part of the purpose of a good undergraduate computer science program is to expose students to different programming language paradigms. Computing professionals are expected to grasp new languages, APIs, and tools quickly. Additionally, certain problems are easier to express in certain paradigms. For example, I would use a language like C or perhaps Rust in an operating systems class since we need to operate at a level that is much closer to the hardware. If I were teaching a course on machine learning, I’d use Python for its excellent ecosystem of libraries, though R and Julia are good alternatives. A course on relational databases should teach basic SQL.

Back in the 2000s there were some CS undergraduate programs that attempted to use Java in the entire curriculum, from introductory courses all the way to senior-level courses such as compilers. There was even an operating systems textbook that had Java examples throughout the text (https://www.amazon.com/Operating-System-Concepts-Abraham-Sil...).

I think using only one language for the entire undergraduate CS curriculum is a mistake. Sure, students don’t have to spend time learning additional languages. However, everything has to fit into that one language, depriving students of the opportunity to see how languages that are better suited to specific types of problems could actually enhance their understanding of the concepts they are learning. In the case of Java, it’s a good general-purpose programming language, but there are classes such as computer organization and operating systems where it’s important to discuss low-level memory management, which conflicts with Java’s automatic memory management.

When it comes to writing compilers, it turns out that functional programming languages with algebraic data types and pattern matching make working with abstract syntax trees much easier. I learned this the hard way when I took compilers in 2009 at Cal Poly. At the beginning of the course, we were given two weeks to write an AST interpreter of a subset of Scheme. My lab partner and I didn’t like Dr. Scheme (now known as Racket), which we “endured” the previous quarter in a class on implementing programming language interpreters, and so we set about writing our interpreter in C++. It turned out to be a big mistake. We got it done, but it took us 40 hours to implement, and we had a complex class hierarchy to implement the AST. We realized that Dr. Scheme’s features were well-suited for manipulating and traversing ADTs. We never complained about Dr. Scheme or functional programming again, and we gladly did our other compiler assignments in that language.
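
For a sense of what that looks like, below is a rough OCaml sketch of an AST interpreter for a tiny Scheme-like expression language. It is not the Cal Poly assignment; the `expr` constructors, the `env` type, and `eval` are hypothetical names chosen for the example.

```ocaml
(* Hypothetical AST for a tiny Scheme-like expression language. *)
type expr =
  | Num of int
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr
  | Let of string * expr * expr   (* bind a name, then evaluate the body *)
  | If0 of expr * expr * expr     (* branch on whether the test equals 0 *)

(* Environments are association lists from names to integer values. *)
type env = (string * int) list

let rec eval (env : env) (e : expr) : int =
  match e with
  | Num n -> n
  | Var x ->
    (match List.assoc_opt x env with
     | Some v -> v
     | None -> failwith ("unbound variable: " ^ x))
  | Add (a, b) -> eval env a + eval env b
  | Mul (a, b) -> eval env a * eval env b
  | Let (x, bound, body) -> eval ((x, eval env bound) :: env) body
  | If0 (c, t, f) -> if eval env c = 0 then eval env t else eval env f

(* Bind x to 3, then compute x + 2 * x. *)
let example = Let ("x", Num 3, Add (Var "x", Mul (Num 2, Var "x")))

let () = Printf.printf "%d\n" (eval [] example)  (* prints 9 *)
```

The whole tree is one type declaration and the whole interpreter is one `match`, with no class hierarchy or visitor plumbing; that is roughly the contrast with the C++ version described above.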

16 years later, I teach Haskell to my sophomore-level students in my discrete mathematics class at a Bay Area community college that uses C++ for the introductory programming course.

DKordic|11 months ago

Interesting arguments.

Lisp Machines? "Separation of Concerns" [1]? Conway's Law?

[1]: http://alarmingdevelopment.org/?p=805

mrkeen|11 months ago

I grew up with "easy" languages. Pascal, VB, C, C++, Java, C#. And frankly I'm getting real sick of them.

I'm porting Dijkstra's algorithm over to C# at the moment, and in the last several hours here are the two most clownish things that have happened:

1) I have:

  if (node is null) {
    ..
  }

My IDE is telling me "Expression is always false according to nullable reference types' annotations". Nevertheless it enters that section every time.

2) I have:

  SortedSet<int> nums = [];
  Console.Out.WriteLine(nums.Min);

You know what this prints?

The minimal element of a set which has no elements is 0.

Yes, every language has its warts, and anecdotes like this aren't going to change anyone's mind. I only wrote these up because they're hitting me right now in one of those languages that companies use "because anyone can learn them". I like harder languages better than easy languages, because they're easy.

neonsunset|11 months ago

This is possible if you disregarded compiler warnings and assigned null to a non-nullable location.

tester756|11 months ago

In general I'd recommend using Min() (from LINQ), which works as expected.

But this property has this remark: "If the SortedSet<T> has no elements, then the Min property returns the default value of T."

Feels like you used compiler hints incorrectly.
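
For comparison, some standard libraries make the empty case explicit in the return type instead of returning a default value. A small OCaml sketch, assuming the standard `Set.Make` functor and its `min_elt_opt` function (the `describe_min` helper is a made-up name for this example):

```ocaml
(* Integer sets built from the standard library functor. *)
module IntSet = Set.Make (Int)

(* min_elt_opt returns an option, so the caller has to handle
   the empty set explicitly instead of receiving a default 0. *)
let describe_min (s : IntSet.t) : string =
  match IntSet.min_elt_opt s with
  | None -> "empty set: no minimum"
  | Some m -> "minimum is " ^ string_of_int m
```

Here `describe_min IntSet.empty` takes the `None` branch, so an empty set can never be mistaken for one whose minimum is 0.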