Little languages are the past and yes, the future. We just don't recognise them.
It was common in the 60s and 70s to have the hardware manufacturer ship all the OS and languages with their hardware. The languages were often designed for specific problem domains. The idea of general purpose languages (FORTRAN, PL/1, etc) was uncommon. You can see this in K&R (the original edition anyway) where they justify the idea of a general purpose language, even though C itself derived from prior general languages (B & BCPL) and they had gotten the idea from their experience on Multics (written in PL/1, a radical idea at the time). So a 20 year old idea was still barely diffused into the computing Zeitgeist.
Most Lisp development (since the early 70s at least) is writing a domain-specific representation (data structures and functions) and then writing your actual problem in it. I used both Lisp and Smalltalk this way at PARC in the early 80s.
More rigid languages (the more modern algolish languages like python, c++, rust, C, js etc -- almost every one of them) don't have these kinds of affordances but instead do the same via APIs. Every API is itself a "little language".
What are called little languages in the Bentley sense are simply direct interfaces for domain experts. And after all, what was a language like, say, Macsyma but a (large) "little language"?
I came to this conclusion early in my career. It went something like this:
A - "To do this, just create this object, fill in these properties, and call these methods."
B - "Okay, I did that, but it crashed."
A - "Yeah, it's because you set the properties in the wrong order. This property relies on this other property under the hood. Set them in this order."
B - "Still crashes."
A - "Yeah, you called the methods in the wrong order. This method relies on that method. Call them in this order and it works."
My conclusion was that the lisp philosophy of building a lot of little sub-languages was equivalent to what people were doing with OO in C#/Java. Either way you have to learn the "right" way to put things together, which is dictated by unseen forces behind the scenes.
Of course, I also concluded that most people work differently than I do. For most people, if the code "looks right" (ie recognizable syntax) then they're able to tell themselves a story that it's familiar and their intuition is able to pick up the slack for finding the right enough way to use most arbitrary APIs (just as long as they don't exceed some level of incomprehensibility). On the other hand, I have to understand the underlying logic or I use the API the wrong way pretty much every time.
So for most people lots of APIs is actually a much better cognitive way for them to work whereas for me API soup and lisp macros are the same conundrum.
I don't think it's so much "little" languages (commonly DSL) that matter. It's more the jumps in expressivity. You don't use a full on Turing-complete language when you need to match strings written in a regular language. Instead, we write the language we want as a regexp, and then use a regexp engine to match it.
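That jump in expressivity, in miniature (a Python sketch, purely illustrative): instead of hand-writing a character-by-character scanner in the Turing-complete host language, you write down the regular language and hand it to the engine.

```python
import re

# A regular language (ISO-style dates, loosely) written as a regexp
# and handed to the regex engine, instead of a hand-rolled matcher
# in the general-purpose host language.
DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

print(bool(DATE.match("2023-01-31")))  # True
print(bool(DATE.match("31/01/2023")))  # False
```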
I agree with much of the problems listed in the article. The author even manages to stumble onto some of the solutions (e.g. Dhall being a total language).
"Expressiveness is co-decidability" is the main theme of these things. The crux of the issue is that in our everyday programming tasks we have many levels of decidability, ranging from regular languages all the way up to things that require full Turing completeness.
The majority of the work, however, lies in the middle. There are so many things that can be done with pushdown automata, or with deterministic finite automata. Most codebases don't actually use those, though. One issue is that there is a dearth of "mini" languages that support these levels.
Another issue is that somehow we are enamoured with the idea that our languages must be able to express everything under the sun (up to TC/Recursively Enumerable). This seems to be more of an industry attitude than anything - there is this chase for the most powerful language (a lisp, clearly... everything else is a blub).
I've recently experimented with embedding an APL into my usual programming language, and it was a very interesting experience. It feels like having the power to do regular expression stuff, but with arrays. I want to do the same for the other levels of expressiveness.
Is the engineering footprint of an organization really better if everything is implemented in twenty different languages, versus just three or four? Everything else aside, quality of the language, scope, etc; just the number. You have to expect everyone to know each language; know the ins, outs, idioms, gotchas, etc. You have to be able to hire for the languages. You need the language runtimes in your environment, everywhere, docker, local dev machines. You have to keep up to date in X times more changelogs, version upgrades, CVEs.
The article pulls in Shell as an early example. Shell did not become the powerhouse it is because it's "great" (though some would argue it is; I'm not here to debate that); or because it's small; or because it's general-purpose; or because it's single-purpose. It became a powerhouse because it's Old and Omnipresent. See, the problem with inventing New Things is that they are, by definition, neither Old nor Omnipresent. New Things have to start somewhere, but you're starting in last place.
> Regular expressions and SQL won’t let you express anything but text search and database operations, respectively.
Oh mylanta. Did you know that with the addition of backreferences, "regular" expressions stopped being regular at all? (Matching with backreferences is NP-hard.) They are, functionally, a real programming language; well, except far more annoying to write. And naturally, SQL "won't let you express anything but database operations", which is to say nothing of "SELECT 1+1"... let alone the little corner of the language called "stored procedures".
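For what it's worth, backreferences really do push matching beyond regular languages; the classic party trick is a unary primality test (a Python sketch; the engine is doing backtracking search here, not "text search"):

```python
import re

# With the backreference \1, this pattern matches exactly the strings
# of composite length: some block of 2+ characters, repeated 2+ times.
# Unary primality is not a regular language, so this is strictly
# beyond textbook regular expressions.
COMPOSITE = re.compile(r"^(..+?)\1+$")

def is_prime(n):
    return n > 1 and not COMPOSITE.match("a" * n)

print([n for n in range(2, 20) if is_prime(n)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```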
There’s a Groovy DSL that will allow you to perform operations on collections using SQL syntax. I’d argue it’s an improvement over the traditional procedural or functional approaches.
On that note: Am I the only one that's constantly surprised by the absence of proper sandboxing solutions when so many programming languages now provide (otherwise pretty useful) means of running code dynamically in a script-like fashion?
In C#, I can pull in Roslyn and compile a string on the fly as a C# script; but the way the .NET standard library is structured makes it pretty much unfeasible to prohibit outside interactions I don't want to allow (in my case, e.g. `DateTime.Now`, while still allowing the handling of `DateTime` values).
It's possible to embed the TypeScript compiler into a website, but running code on the fly with some simple sandboxing is not feasible without a serious pile of hacks.
I've recently read a forum thread about a library for compiling/running Elixir code as a script, but guess what: The runtime (apparently) makes sandboxing really hard.
And so on and so on. I just wish that the Lua approach of "if I don't give you a hook, you cannot do that" were the default. I've seen so many overcomplicated enterprise-y solutions that are basically just a plea for a well-designed, local and small scripting API…
> And so on and so on. I just wish that the Lua approach of "if I don't give you a hook, you cannot do that" were the default.
Yes, because languages are still not capability-secure. Memory-safe languages are inherently secure up until you introduce mutable global state, and that's how they typically leak all kinds of authority. If you had no mutable global state, then you can eval() all the live-long day and you wouldn't be able to escape the sandbox of the parameters passed in.
Examples of mutable global state:
* APIs: you can make any string you like, but somehow you can access any file or directory object using only File.Open(madeUpString). This is called a "rights amplification" pattern, where your runtime permits you to somehow amplify the permissions granted by one object, into a new object that gives you considerably more permissions.
* Mutable global variables: as you point out, eval() can access any mutable global state it likes, thus easily escaping any kind of attempt to sandbox it.
If these holes are closed then memory-safe languages are inherently sandboxed at the granularity of individual objects.
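To make the "no ambient authority" idea concrete, here is a deliberately toy Python sketch. Caveat: CPython's eval() is famously not a real sandbox (guests can climb the object graph via `__class__` and friends), so this only illustrates the shape of the capability discipline; all names are invented.

```python
# Illustrative only -- NOT a real security boundary in CPython.
# The point: strip the ambient authority, then grant explicit hooks,
# so the guest can touch nothing it wasn't handed.
def run_guest(expr, **capabilities):
    env = {"__builtins__": {}}  # no open(), no __import__, etc.
    env.update(capabilities)    # the only authority the guest gets
    return eval(expr, env)

print(run_guest("add(2, 3)", add=lambda a, b: a + b))  # granted hook works
try:
    run_guest("open('/etc/passwd')")  # no hook granted for this
except NameError as e:
    print("blocked:", e)
```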
As far as I know, most in-process sandboxing has been deprecated because it is at odds with maintainability. E.g. Java decided against its Security Manager because it is way too easy to leave the proper checks out of a new feature, leaving the whole thing vulnerable with a false sense of safety. Instead, process-level isolation is recommended.
Ruby used to have the $SAFE feature for sandboxing, but it was removed because it was buggy, added a lot of complexity, and wasn't actually that useful. Linux has all the various isolation features that make Docker work, but people still recommend not running untrusted code in Docker containers because of the potential for oversights/"bugs" in Linux's API. I suspect that programming languages / VMs don't include these features because they are very hard to get right and add a disproportionate amount of complexity for their utility.
I think WASM is filling in that gap to some extent. From the spec:
> Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module
And IIRC, the core instruction set is reasonably compact.
Your best approach is to run untrusted code in a separate process, in a sandbox. Language developers don't normally deal with hostile users in the same way that OS developers do.
A "little language" is just an abstraction over a small part of the domain. Equivalent abstractions can be and are written as libraries. In a Turing-complete host language, the primary difference is that these libraries don't get the privilege of inventing new syntax, but that's almost always a good thing.
We already have major headaches switching between JS, SQL, and {insert backend language here}. Introducing tens of little languages into a codebase may marginally increase readability of each chunk of code in isolation, but the amount of context-switching and required background knowledge it introduces would more than make up the difference.
In an abstraction strategy that's based around libraries, every library agrees on the same basic syntax and semantics. You interact with an embedded DSL via a well-defined interface that all libraries respect. The syntax may not be perfectly ideal for any one part of the project, but it's consistent throughout every part of the project. That has real value.
I think it's also a red herring to argue about lines of code: stringing together a bunch of little languages is not likely to lead to fewer lines of code than pulling in an equivalent number of third-party libraries, and it will almost certainly increase the total amount of code in your distribution, because each little language must not only implement the functionality desired, it also needs its own parser and interpreter. Take McIlroy's shell script: if you add up the C code to implement each of those little languages, you're at about 10k lines of code to make that bit of code golf possible.
I'm a huge fan of DSLs, and I like the analogy of modern programming as pyramid building. I just don't think independent, chained-together DSLs are the answer. I'd rather have a language like Kotlin that is designed around embedded DSLs that all respect the same rules and use the same runtime.
> Take McIlroy's shell script: if you add up the C code to implement each of those little languages, you're at about 10k lines of code to make that bit of code golf possible.
This is a fair point, but part of this has to be how battle tested the language in question is, right?
Bringing in a single language that's been put through its paces (Bourne shell in this case) for text processing seems like a much lower risk than bringing in a dozen different languages and hoping the places where they interface don't blow up (hope someone tried that particular combination before).
> We already have major headaches switching between JS, SQL, and {insert backend language here}.
I used to agree wholeheartedly with this, but now I pretty strongly disagree; in my decade or so of cumulative code-monkeying experience, having to understand the nuances of some "convert $BACKEND_LANGUAGE to JS and SQL" layer has been far more headache-prone than just, you know, writing JS and SQL. All about using the right tool for the job - and I know of very few languages that are the right tool for all three of those jobs (let alone the myriad other jobs that might pop up as soon as you expand beyond a simple CRUD app).
I disagree and would say DSLs should go away if possible.
DSLs like SQL are the norm and you can see the problem of them in basically every project.
You either use ORMs or you end up hand rolling SQL rows into Structs or Classes.
The whole mapping usually looks like crap and contains a bunch of implicit corner cases, which eventually end up being a footgun for someone.
Usually the SQL server runs somewhere else, the ports are wrong, the language version is wrong, or a migration failed and a function is missing, yada yada...
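The hand-rolled-mapping footgun is easy to picture; a minimal Python/sqlite3 sketch (table and class names invented for illustration):

```python
import sqlite3
from dataclasses import dataclass

# Invented names, purely to illustrate the row -> object mapping.
@dataclass
class User:
    id: int
    name: str

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER, name TEXT)")
con.execute("INSERT INTO users VALUES (1, 'ada')")

# The hand-rolled mapping: positional and implicit. Reorder the SELECT
# (or the schema) and it silently maps id into name -- exactly the
# kind of implicit corner case that becomes a footgun later.
users = [User(*row) for row in con.execute("SELECT id, name FROM users")]
print(users)
```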
The same is usually true for Regexp. There are a billion dialects and every single one of them is basically unreadable, incomplete or just weird.
The same is true for microservices with tons of config files for dev, staging, testing and production...
Everything has its own version, can be down, or mutates some random state somewhere while depending on other services.
It always breaks at the seams.
Increasing the amount of DSLs increases the amount of seams and thus makes software worse.
This article claims the problems it's trying to solve are: Hard to onboard new hires, code breaks because of lack of understanding of dependencies, and code changes become harder to manage. In my experience SQL, regexes, unix shell, and listening to Alan Kay, far from solving those problems, are the very things that most exacerbate them. General-purpose languages that are expressive enough to let one write business logic in the language of the domain, but without breaking the rules of the language or requiring new tooling - "internal" rather than "external" DSLs - are a far better way forward.
> SQL is a little language for describing database operations
Yeah, SQL is not a little language anymore.
It started as one, but a lot has been added to it since, and the SQL flavors for Oracle or Postgres are anything but tiny: windowing, nesting, JSON handling...
I think the author is kinda proving with this that a successful little language does not stay little, and hence little languages are not the future.
And don't get me started on DSL in general. Just lookup my username and "DSL" on hackernews for endless rambling.
People have been making this argument since the 80s and possibly even earlier. My experience is often the opposite. Little languages are usually far, far harder than (mis-)using "big" languages for small tasks.
The problem is that your DSL has to be understood by other people, including future you. Programming tasks are vast, combinatorially explosive state spaces full of weird potential interactions between features. Once you get above the complexity and universal familiarity of say, arithmetic, it's difficult for others to understand what's going on just by looking at 1-2 live examples. You have to heavily invest in proper docs and tooling (if your language doesn't provide it for free). By the time you've completed that your "little language" usually isn't such a little effort anymore.
If you don't, you've just made the next CMake. Congrats you monster.
That's why we have languages with functions now, because people didn't want to manually do a register dance in assembly.
That's why we have name spaces, because naming conventions only take you so far.
That's why we have map and filter (or equivalent) because that's what most loops are doing anyway.
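That last jump (most loops are map and filter in disguise) in miniature, as a Python sketch:

```python
nums = [1, 2, 3, 4, 5, 6]

# The hand-rolled loop: what most loops are secretly doing.
doubled_evens = []
for n in nums:
    if n % 2 == 0:
        doubled_evens.append(n * 2)

# The same intent once filter/map (here, a comprehension) is a primitive.
assert doubled_evens == [n * 2 for n in nums if n % 2 == 0]
print(doubled_evens)  # [4, 8, 12]
```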
Generation after generation, we discover that we all use common abstractions. We name them design patterns, then we integrate them in the language, and now they are primitives.
And the languages grow, grow, bigger and bigger. But they are better.
More productive. Less error prone. And code bases become standardized, simple problems have known solutions, jumping to a new project doesn't mean relearning the entire paradigm of the system in place.
Small languages either become big, or are replaced by things that are big, for the same reason most people prefer a car to a horse to go shopping.
Not that horse riding will totally disappear, but it will stay in its optimal niche; it is not "the future".
And even if you do invest in proper documentation and support, you still have to overcome the hurdle that people just _don’t want to spend time learning your one-off language_ - there’s nontrivial opportunity cost in learning something that won’t be useful anywhere else. So people will just do the bare minimum which will lead to misunderstanding and bugs.
I would add that when people move to the next job, the DSL dies; I don't have a need for that DSL at the next company. I could try to implement it there, but IP rights would prevent that, and getting new people on board with my ideas is so much work that it wouldn't be worth it.
That is like learning a SaaS application's ins and outs: you switch jobs, and that specific experience is not useful to you at all.
A general-purpose language, on the other hand, is useful even if you move from one country to another and take a job in a different business niche.
As a developer, there is no upside for me in diving into some DSL I won't use at my next job.
As a business person, there is no upside for me in learning a DSL, or a specific application's interface inside and out, that I won't use in my next job or in a different position.
I have only one experience to share. Back in the mid-90s, I was tasked with developing a webserver that provided targeted advertising. A requirement was providing the marketing team an accessible mechanism for defining rules: basic stuff like encoding a marketing/ad-sales rule such as "show the truck ad if the user is male and in some age group". A little scripting language was developed; nothing fancier than conditional branching was involved on the surface. And the user base immediately got it and started using it, because it was a "little language".
I'd very much like to see someone come up with nice syntax for hierarchical finite state machines and entity component systems, just like people came up with nice syntax for queries in the form of LINQ and nice syntax for HTML generation in the form of JSX.
Doing these things in the vanilla syntax of general-purpose programming languages is not exactly great.
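As a baseline for comparison, the vanilla-syntax version tends to look like this table-driven Python sketch (all states and events invented; flat rather than hierarchical). A dedicated syntax would let you write the transitions directly instead of encoding them in a dict:

```python
# A workable but noisy vanilla-Python state machine -- the "not
# exactly great" baseline. Hierarchy would make this noisier still.
TRANSITIONS = {
    ("idle", "press"): "running",
    ("running", "press"): "paused",
    ("paused", "press"): "running",
    ("running", "finish"): "idle",
}

def step(state, event):
    # Unknown (state, event) pairs stay in the current state.
    return TRANSITIONS.get((state, event), state)

state = "idle"
for event in ["press", "press", "press", "finish"]:
    state = step(state, event)
print(state)  # idle
```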
But little languages could be a nice interface for the non- or semi-programming tasks. Do you really want your domain experts to fiddle with the core of your application or do you want your programmers to do that? A little language could be a great interface to encode specific business rules and domain logic.
The author gives SQL as an example of a little language and we do indeed already provide SQL interfaces to analysts and let them do their thing.
It seems entirely possible to create a tool for making little languages that also supports interfaces for tooling and documentation. Tooling is actually quite abysmal for general purpose languages and as the article points out the tooling for little languages can be much more powerful when there’s a smaller surface area. Also we could build languages that are primarily geared around tooling and documentation instead of languages designed around different manners of defining functions and iterating over lists.
I also don’t think that anyone has ever suggested that making a custom language is a small endeavor.
Whatever the future of programming languages it will definitively not be popular at first and negative criticisms will be the top-rated commentary. And when the new paradigm comes I can almost guarantee that the majority of the HN crowd will be too old and set in its ways to make the transition. Why would the future be any different than the past with regards to paradigms shifts?
That's not the way to see the process. We have been highly successful at little languages already: they are, in essence, why when I write something like "a = a + 1" I can assume it works identically in C, Javascript, and Python. (Semantically, it doesn't! But it is a portable intent.)
You might object and say, "but variable assignment and addition, that's a big language thing." It isn't, though; it's just an infix expression. And infix didn't pop out of nowhere; it had to be invented as part of the gradual creep upwards from machine level "coding" into a more abstract semantics. Infix parsers are small, and while the complete language is larger, what it's presenting is infix-compatible. "Regex" is the same way: there is a general definition of regular expressions, and then there are some common variants of regex, the implemented semantics.
The boundary between "the language needs its own compiler and runtime support" and "the language exists as an API call you pass a string into, which compiles into data structures visible to the host language" is a fluid one. And the most reasonable way of making little languages involves seeing the pattern you're making in your host and "harvesting" it. In the previous eras, there were severe performance penalties to trying to bootstrap in this way, and so generating a binary was essential to success. But nowadays, it's another form of glue overhead. If you define syntactic boundaries on your glue, it actually becomes easier to deal with over time.
Documentation-wise, it's the same: if the language is sufficiently small, it feels like documentation for a library, not a language.
Yes, to add to your point: nobody has managed to use the "outputs" of the STEPS project to do something useful.
There was a cool "wordprocessor-like" app (Franken?) demonstrated, which was created with a small number of lines. It should have been a huge success in the FOSS world, no?
Well, no: nobody managed to make it work.
In addition, “little” languages tend to eventually become turing-complete, because you keep needing that little extra bit of functionality.
And then you want to modularize your code because it becomes too big, and you want to create libraries for code reuse.
You end up wanting static typing for the usual reasons, which eventually leads to needing parametric types and recursively defined types, and the type system becoming turing-complete as well.
Or you keep working around the limitations of the little language, writing code generators and wrapping it in general-purpose language APIs.
Would you (mis)use C to do text processing, or would you use shell tools?
I suppose all this leads me to the suspicion that little languages fill in for shortcomings in big languages. Big languages can absorb the things that work, thus negating the need for small languages in that sphere.
Although how far can that go? Can we keep making ever bigger languages? Or at some point does it crumble under its own weight?
Regarding CMake’s horrific documentation: I will literally be willing to pay money for someone who can show me how/if it’s possible to wire in a different language to CMake! I believe it’s possible, I’ve seen some functions deep in the crappy docs that make it look like it is, but I cannot for the life of me work it out. The language in question produces C object files!
This is a “deepity”. We already do this. We constantly do this in programming:
“The idea is that as you start to find patterns in your application, you can encode them in a little language—this language would then allow you to express these patterns in a more compact manner than would be possible by other means of abstraction. Not only could this buck the trend of ever-growing applications, it would actually allow the code base to shrink during the course of development!”
Functions, frameworks, little languages. It’s all abstractions on top of abstractions. You are shifting the knowledge of the abstraction for the more fundamental knowledge underneath that does the actual work.
You end up just sweeping the codebase growth under some other layer’s rug and blissfully forget about the woes of future maintainers. The code is still there, abstracted and exposed by the “little language”. Hiding this behind a cute moniker doesn’t seduce.
This isn’t the future of programming. This is already programming.
Here's something I don't understand: How are "little languages" different from a bunch of functionality wrapped into a library/module? Is it just that (with some convenient syntax sprinkled on top), or is there more to it?
I would imagine that most of the value comes from being able to "refactor" thought patterns to match the best way to cleave the domain into composable concepts -- and it seems like we do this all the time (and in all programming languages?).
This is what I always heard Lisp was best at. Instead of making totally new languages (with parsers, tooling, etc) you'd create little DSLs within your own code in the form of macros: come up with a "little language" for describing one part of your app, write a macro for it, and then it integrates smoothly with everything around it
Whether or not you agree this philosophy is a good one, and whether or not you like Lisp specifically, I think we can all agree that macros (in whichever language) are a much better way to do it than creating a bunch of tiny languages from scratch. I was surprised not to see the word "macro" appear in the article at all
> I’ve become convinced that “little languages”—small languages designed to solve very specific problems—are the future of programming,
Yes, but over time very specific problems become bigger/different problems which the little language isn't ideal for, the original developers move on leaving someone new to figure out the problem and language which is probably poorly documented and very brittle. Application developers probably aren't suited to writing and maintaining language code.
The only caveat is an external system provided with its own language (like an RDB with SQL) which is proven and well maintained; but it's hard to call SQL a little language.
Join any company or organisation and look at their build and deployment tooling.
Unless they are using Kubernetes and even then, you shall find a very complicated bunch of languages:
- shell scripts
- Dockerfiles
- Kubernetes YAML
- Makefiles
- Bazel
- Ansible
- python scripts
- Jenkins XML
- Groovy scripts
- Ruby scripts
- CloudFormation
- Terraform
- Fabric or other deployment scripts
It's very hard to fit together and understand from a high level.
The last thing they were working on at my previous company was a YAML format to define a server, to go through the organisational structure of the company to manage computer systems.
Some people mentioned LISP in this comment thread. For me LISP is an intermediate language, I would never want to build a large system in LISP. It's not how I think about computation.
It doesn't have to be a big standalone DSL with a separate compiler or preprocessor. It can also be an embedded little language, like when you sprinkle HTML templates throughout your normal general-purpose language, and as only a syntax extension: https://docs.racket-lang.org/html-template/
(Aside: I'm seeing tasteful Racket and Scheme influences in Rust, even though they're very-very different languages. I'm hoping to contribute a little more influences.)
DSLs are not a replacement for but a complement to any existing language, general purpose or specialized. I have come to think of DSLs as programs for writing programs (similar to but not identical to macros). With a DSL, you can specify the grammar of a specific problem/program. Once that has been done, it is often quite straightforward to implement the grammar in any number of target languages. As an application developer, this may not be a huge advantage (though DSLs can also shine in any client/server interactions), but if you are a library author this can be very compelling because your library may be easily portable to most commonly used language runtimes in a generally rote kind of way. The port might not be optimal, but it should be correct, provided the high level logic of the DSL is. Performance optimizations can be done where needed.
What is great about this approach as an individual is that it requires you to tighten your ideas. When you have to implement all of the functionality in a DSL, you really start thinking about what you truly need. A big language nudges you towards using all of its features while a small language challenges you to consider what is truly essential.
Of course DSLs always run the risk of being write only and/or only comprehensible by the original author. Like any powerful tool, DSLs should be used judiciously and responsibly. Often that isn't the case, in part because I don't think the tooling for writing DSLs is generally very good. But I am betting that new tools that make DSL writing easy will have a profound effect on software development.
Calling SQL a "little language" must surely be a joke. Even if you ignore the differences between its many dialects, SQL is only "little" in that it is domain-specific rather than general purpose.
In fact the entire article seems to boil down to "DSLs are the future", which I'm sure I've seen articles about back when Ruby on Rails was dominating web technologies, Cucumber (and its various ports) created "BDD" testing fad and DevOps started gaining traction on top of various "Ruby DSLs" used as configuration formats.
I don't think DSLs are going to go away any time soon. But there is a trade-off between domain-specific "little languages" and general purpose programming languages (or "DSLs" that are actually subsets of the latter). It can be fun to have to work with a little language, it's not so fun to work with dozens of them, each with different rules you have to memorize, instead of just being able to use the same language for all (and in truth, this was the source of the Ruby DSL craze because developers were already using Ruby on Rails).
Hm, I'm a little unhappy about the author comparing Knuth's solution to a handful of shell utilities.
For one, the author says Knuth's program written in WEB was 10 pages long, discounting the fact that those 10 pages are HEAVILY annotated.
my other point is:
- tr has 1917 LOC,
- sort has 4856,
- uniq has 663,
- and sed is in its own package, at around 10 MB including comments and docs.
It's fine and good that you can use composition with shell utilities, but come on: write that example program in C99 and you'll not be a very happy coder at all. In general I find the comparison rather rude. Knuth was supposed to show *his* programming language, WEB, and as a "critique" McIlroy farts out a shellscript like "lmao first".
indeed, you do not often need to count word frequencies.
But what was this article supposed to be really about? Software engineering 101, aka don't-reinvent-the-wheel/DRY?
As you say, Knuth was asked to demonstrate his literate programming... In some ways this is a direct request for the non-pithy, articulated, first principles answer. I would more say Knuth was set up than that he was framed, but tomato-tomato. :)
The size of the language is a red herring. You really just want programs that are well structured, which can be greatly helped by choosing a "perfect language" for each task, but often helped just as well by choosing (or creating) a great library to express the business logic.
Unlikely. On one hand notations should be a commodity and they indeed should be unique to the task. There is no point and no way to try making a unified notation for music and chess. On the other hand there is no point to make multiple notations for the same thing. And programming is indeed the same thing from lowest to highest levels, composable like a Russian doll, or we won't be able to build large systems. So the future of programming is a single notation that actually reflects what programming is. We do not have it yet, this is why we have so many "programming languages".
[+] [-] gumby|3 years ago|reply
It was common in the 60s and 70s to have the hardware manufacturer ship all the OS and languages with their hardware. The languages were often designed for specific problem domains. The idea of general purpose languages (FORTRAN, PL/1, etc) was uncommon. You can see this in K&R (the original edition anyway) where they justify the idea of a general purpose language, even though C itself derived from prior general languages (B & BCPL) and they had gotten the idea from their experience on Multics (written in PL/1, a radical idea at the time). So a 20 year old idea was still barely diffused into the computing Zeitgeist.
Most Lisp development (since the early 70s at least) is writing a domain-specific representation (data structures and functions) and then writing your actual problem in it. I used both Lisp and Smalltalk this way at PARC in the early 80s.
More rigid languages (the more modern algolish languages like python, c++, rust, C, js etc -- almost every one of them) don't have these kinds of affordances but instead do the same via APIs. Every API is itself a "little language".
What are called little languages in the Bentley sense are simply a direct interface for domain experts. And after all, what was a language like, say, Macsyma but a (large) "little language"?
[+] [-] Verdex|3 years ago|reply
I came to this conclusion early in my career. It went something like this:
A - "To do this, just create this object, fill in these properties, and call these methods."
B - "Okay, I did that, but it crashed."
A - "Yeah, it's because you set the properties in the wrong order. This property relies on this other property under the hood. Set them in this order."
B - "Still crashes."
A - "Yeah, you called the methods in the wrong order. This method relies on that method. Call them in this order and it works."
My conclusion was that the lisp philosophy of building a lot of little sub-languages was equivalent to what people were doing with OO in C#/Java. Either way you have to learn the "right" way to put things together, which is dictated by unseen forces behind the scenes.
Of course, I also concluded that most people work differently than I do. For most people, if the code "looks right" (ie recognizable syntax) then they're able to tell themselves a story that it's familiar and their intuition is able to pick up the slack for finding the right enough way to use most arbitrary APIs (just as long as they don't exceed some level of incomprehensibility). On the other hand, I have to understand the underlying logic or I use the API the wrong way pretty much every time.
So for most people, lots of APIs actually suit the way they work much better, whereas for me, API soup and lisp macros pose the same conundrum.
[+] [-] chewxy|3 years ago|reply
I agree with many of the problems listed in the article. The author even manages to stumble onto some of the solutions (e.g. Dhall being a total language).
"Expressiveness is co-decidability" is the main theme of these things. The crux of the issue is that in our everyday programming tasks we have many levels of decidability, ranging from regular languages all the way to things that require full Turing completeness.
The majority of work, however, lies in the middle. There are so many things that can be done with pushdown automata, or with deterministic finite automata. Most codebases don't actually use those, though. One issue is that there is a dearth of "mini" languages that support these things.
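A concrete illustration of that middle level (a sketch of mine, not from the article): balanced-delimiter checking is a context-free task, exactly the pushdown-automaton tier, and the checker is nothing more than an explicit stack:

```python
def balanced(s: str) -> bool:
    """Check matched (), [], {} -- a context-free task: a pushdown
    automaton's stack is all the memory it needs."""
    pairs = {')': '(', ']': '[', '}': '{'}
    stack = []
    for ch in s:
        if ch in '([{':
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack
```

Anything a regex can't count, a stack often can; that gap is the whole distance between the regular and context-free tiers.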
Another issue is that somehow we are enamoured with the idea that our languages must be able to express everything under the sun (up to TC/Recursively Enumerable). This seems to be more of an industry attitude than anything - there is this chase for the most powerful language (a lisp, clearly... everything else is a blub).
I've recently experimented with embedding an APL into my usual programming language, and it was a very interesting experience. It feels like having the power to do regular expression stuff, but with arrays. I want to do the same for the other levels of expressiveness.
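For a flavor of what that array-level expressiveness looks like, here is a hypothetical NumPy sketch (not the commenter's actual embedding) of run-length encoding, a classic APL one-liner, done with whole-array operations instead of an explicit loop:

```python
import numpy as np

def run_length_encode(a: np.ndarray):
    """Return (values, lengths) of consecutive runs, APL-style:
    mark the positions where the value changes, then diff the indices."""
    starts = np.flatnonzero(np.r_[True, a[1:] != a[:-1]])
    values = a[starts]
    lengths = np.diff(np.r_[starts, len(a)])
    return values, lengths
```

The loop disappears into the notation, which is exactly the "regular expressions, but with arrays" feeling.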
[+] [-] tempodox|3 years ago|reply
[+] [-] 015a|3 years ago|reply
The article pulls Shell as an early example. Shell did not become the powerhouse it is because it's "great" (though some would argue it is; I'm not here to debate that); or because it's small; or because it's general-purpose; or because it's single-purpose. It became a powerhouse because it's Old and Omnipresent. See, the problem with inventing New Things is that they are, by definition, not Old, nor Omnipresent. New Things have to start somewhere, but you're starting in last place.
> Regular expressions and SQL won’t let you express anything but text search and database operations, respectively.
Oh mylanta. Did you know that after the addition of backreferences, regular expressions stopped being "regular" at all? (Matching them is NP-hard.) They are, functionally, a real little programming language; well, except far more annoying to write. And naturally, SQL "won't let you express anything but database operations", which is to say nothing of "SELECT 1+1"... let alone the little corner of the language called "Stored Procedures".
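A classic demonstration that backreferences leave regular territory (a well-known party trick, not production code) is primality testing on unary strings:

```python
import re

# '1' * n matches iff n is 0, 1, or composite: the group captures k >= 2
# ones, and the backreference forces the total length to be that group
# repeated at least twice, i.e. n = k * m with k >= 2 and m >= 2.
NON_PRIME = re.compile(r"^1?$|^(11+?)\1+$")

def is_prime(n: int) -> bool:
    """True iff n is prime, decided entirely by the regex above."""
    return NON_PRIME.fullmatch("1" * n) is None
```

No genuinely regular language can recognize the primes; the backreference is doing the arithmetic here.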
[+] [-] gpderetta|3 years ago|reply
[+] [-] 0x445442|3 years ago|reply
https://groovy-lang.org/using-ginq.html
[+] [-] btschaegg|3 years ago|reply
In C#, I can pull in Roslyn and compile a string on the fly as a C# script; but the way the .NET standard library is structured makes it pretty much unfeasible to prohibit outside interactions I don't want to allow (in my case, e.g. `DateTime.Now`, while still allowing the handling of `DateTime` values).
It's possible to embed the TypeScript compiler into a website, but running code on the fly with even simple sandboxing is not feasible without a serious pile of hacks.
I've recently read a forum thread about a library for compiling/running Elixir code as a script, but guess what: The runtime (apparently) makes sandboxing really hard.
And so on and so on. I just wish that the Lua approach of "if I don't give you a hook, you cannot do that" were the default. I've seen so many overcomplicated enterprise-y solutions that are basically just a plea for a well-designed, local and small scripting API…
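The hook-based style can be sketched in a few dozen lines in most hosts. A minimal, illustrative Python evaluator (an assumed design, and not a hardened security boundary) in which embedded code can only reach what the embedder explicitly passes in:

```python
import ast
import operator

# Whitelisted binary operators; everything else is rejected.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def run_script(src: str, hooks: dict):
    """Evaluate an expression; names resolve ONLY to embedder-provided
    hooks. No attribute access, no imports, no builtins."""
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Name):
            if node.id in hooks:
                return hooks[node.id]
            raise NameError(f"no hook named {node.id!r}")
        if isinstance(node, ast.Call):
            fn = ev(node.func)
            return fn(*[ev(a) for a in node.args])
        raise ValueError(f"forbidden syntax: {type(node).__name__}")
    return ev(ast.parse(src, mode="eval"))
```

Here `open`, `__import__`, and friends simply don't exist unless the host hands them over, which is the Lua property the comment is asking for.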
[+] [-] naasking|3 years ago|reply
Yes, because languages are still not capability-secure. Memory-safe languages are inherently secure up until you introduce mutable global state, and that's how they typically leak all kinds of authority. If you had no mutable global state, then you could eval() all the live-long day and the evaluated code wouldn't be able to escape the sandbox of the parameters passed in.
Examples of mutable global state:
* APIs: you can make any string you like, but somehow you can access any file or directory object using only File.Open(madeUpString). This is called a "rights amplification" pattern, where your runtime permits you to somehow amplify the permissions granted by one object, into a new object that gives you considerably more permissions.
* Mutable global variables: as you point out, eval() can access any mutable global state it likes, thus easily escaping any kind of attempt to sandbox it.
If these holes are closed then memory-safe languages are inherently sandboxed at the granularity of individual objects.
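A rough sketch of that pattern (hypothetical class name, not a real library API): the object is the permission, and sandboxed code that only receives this object cannot amplify a made-up string into broader access:

```python
from pathlib import Path

class ReadOnlyDir:
    """A capability: holding this object IS the permission to read
    inside one directory. Sandboxed code cannot mint one from a
    string; only the host constructs it."""
    def __init__(self, root: Path):
        self._root = root.resolve()

    def read_text(self, name: str) -> str:
        p = (self._root / name).resolve()
        if self._root not in p.parents and p != self._root:
            raise PermissionError("path escapes the granted directory")
        return p.read_text()
```

Compare with `File.Open(madeUpString)`: there, any string is a latent permission to the whole filesystem; here, authority only flows through objects the host chose to hand over.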
[+] [-] kaba0|3 years ago|reply
[+] [-] orangea|3 years ago|reply
[+] [-] rng_civ|3 years ago|reply
> Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module
And IIRC, the core instruction set is reasonably compact.
[+] [-] garethrowlands|3 years ago|reply
[+] [-] lolinder|3 years ago|reply
We already have major headaches switching between JS, SQL, and {insert backend language here}. Introducing tens of little languages into a codebase may marginally increase readability of each chunk of code in isolation, but the amount of context-switching and required background knowledge it introduces would more than make up the difference.
In an abstraction strategy that's based around libraries, every library agrees on the same basic syntax and semantics. You interact with an embedded DSL via a well-defined interface that all libraries respect. The syntax may not be perfectly ideal for any one part of the project, but it's consistent throughout every part of the project. That has real value.
I think it's also a red herring to argue about lines of code: stringing together a bunch of little languages is not likely to lead to fewer lines of code than pulling in an equivalent number of third-party libraries, and it will almost certainly increase the total amount of code in your distribution, because each little language must not only implement the functionality desired, it also needs its own parser and interpreter. Take McIlroy's shell script: if you add up the C code to implement each of those little languages, you're at about 10k lines of code to make that bit of code golf possible.
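For scale, the word-frequency task itself is a few lines in a general-purpose language with a decent standard library, which is the library-over-little-language point (a sketch of mine, not McIlroy's actual pipeline):

```python
import re
from collections import Counter

def top_words(text: str, k: int):
    """Most frequent words, roughly mirroring the tr|sort|uniq|sort|sed
    pipeline: lowercase, split on non-letters, count, take the top k."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(k)
```

The difference is that `re` and `Counter` ship with the host language; no extra parser or interpreter enters the distribution.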
I'm a huge fan of DSLs, and I like the analogy of modern programming as pyramid building. I just don't think independent, chained-together DSLs are the answer. I'd rather have a language like Kotlin that is designed around embedded DSLs that all respect the same rules and use the same runtime.
[+] [-] Kamq|3 years ago|reply
This is a fair point, but part of this has to be how battle tested the language in question is, right?
Bringing in a single language that's been run through its paces (the Bourne shell in this case) for text processing seems like a much lower risk than bringing in a dozen different languages and hoping the places where they interface don't blow up (hope someone tried that particular combination before).
[+] [-] yellowapple|3 years ago|reply
I used to agree wholeheartedly with this, but now I pretty strongly disagree; in my decade or so of cumulative code-monkeying experience, having to understand the nuances of some "convert $BACKEND_LANGUAGE to JS and SQL" layer has been far more headache-prone than just, you know, writing JS and SQL. All about using the right tool for the job - and I know of very few languages that are the right tool for all three of those jobs (let alone the myriad other jobs that might pop up as soon as you expand beyond a simple CRUD app).
[+] [-] LordHeini|3 years ago|reply
DSLs like SQL are the norm and you can see the problem of them in basically every project.
You either use ORMs or you end up hand rolling SQL rows into Structs or Classes.
The whole mapping usually looks like crap and contains a bunch of implicit corner cases, which eventually end up being a footgun for someone.
Usually the SQL server runs somewhere else, the ports are wrong, the language version is wrong, or a migration failed and a function is missing, yada yada...
The same is usually true for Regexp. There are a billion dialects and every single one of them is basically unreadable, incomplete or just weird.
The same is true for microservices with tons of config files for dev, staging, testing and production...
Everything has its own version, can be down, or mutates some random state somewhere while depending on other services.
It always breaks at the seams.
Increasing the amount of DSLs increases the amount of seams and thus makes software worse.
[+] [-] lmm|3 years ago|reply
[+] [-] BiteCode_dev|3 years ago|reply
Yeah, SQL is not a little language anymore.
It started as one, but a lot has been added to it, and the SQL flavors for Oracle or Postgres are anything but tiny: windowing, nesting, JSON handling...
I think the author is kinda proving with this that a successful little language does not stay little, and hence little languages are not the future.
And don't get me started on DSL in general. Just lookup my username and "DSL" on hackernews for endless rambling.
[+] [-] AlotOfReading|3 years ago|reply
The problem is that your DSL has to be understood by other people, including future you. Programming tasks are vast, combinatorially explosive state spaces full of weird potential interactions between features. Once you get above the complexity and universal familiarity of say, arithmetic, it's difficult for others to understand what's going on just by looking at 1-2 live examples. You have to heavily invest in proper docs and tooling (if your language doesn't provide it for free). By the time you've completed that your "little language" usually isn't such a little effort anymore.
If you don't, you've just made the next CMake. Congrats you monster.
[+] [-] BiteCode_dev|3 years ago|reply
That's why we have languages with functions now, because people didn't want to manually do a register dance in assembly.
That's why we have name spaces, because naming conventions only take you so far.
That's why we have map and filter (or equivalent) because that's what most loops are doing anyway.
Generation after generation, we discover that we all use common abstractions. We name them design patterns, then we integrate them in the language, and now they are primitives.
And the languages grow, grow, bigger and bigger. But they are better.
More productive. Less error prone. And code bases become standardized, simple problems have known solutions, jumping to a new project doesn't mean relearning the entire paradigm of the system in place.
Small languages either become big, or are replaced by things that are big, for the same reason most people prefer a car to a horse to go shopping.
Not that horse riding will totally disappear; it will stay in its optimal niche. But horses are not "the future".
[+] [-] lovecg|3 years ago|reply
[+] [-] intelVISA|3 years ago|reply
[+] [-] ozim|3 years ago|reply
That is like learning some SaaS application's ins and outs: you switch jobs, and that specific experience is not useful to you at all.
A general-purpose language, on the other hand, is useful even if you move from one country to another and take a job in a different business niche.
As a developer, there is no upside for me in spending my time diving into some DSL I won't use in my next job.
As a business person, there is no upside for me in spending my time learning a DSL, or some specific application's interface, inside and out when I won't use it in my next job or in a different position.
[+] [-] eternalban|3 years ago|reply
Sometimes a DSL is really the right solution.
[+] [-] scotty79|3 years ago|reply
Doing these things in vanilla syntax of general purpose programing languages is not exactly great.
[+] [-] dmitriid|3 years ago|reply
- can you even design it properly?
- is it tested?
- is it debuggable?
- how does it integrate with the rest of your program(s)? with the rest of your system(s)?
- what's the performance, and does it matter?
- is it documented?
- who is going to maintain it 1 year from now? 5 years from now?
[+] [-] maweki|3 years ago|reply
But little languages could be a nice interface for the non- or semi-programming tasks. Do you really want your domain experts to fiddle with the core of your application or do you want your programmers to do that? A little language could be a great interface to encode specific business rules and domain logic.
The author gives SQL as an example of a little language and we do indeed already provide SQL interfaces to analysts and let them do their thing.
[+] [-] williamcotton|3 years ago|reply
I also don’t think that anyone has ever suggested that making a custom language is a small endeavor.
Whatever the future of programming languages is, it will definitely not be popular at first, and negative criticisms will be the top-rated commentary. And when the new paradigm comes, I can almost guarantee that the majority of the HN crowd will be too old and set in its ways to make the transition. Why would the future be any different from the past with regard to paradigm shifts?
[+] [-] syntheweave|3 years ago|reply
You might object and say, "but variable assignment and addition, that's a big language thing." It isn't, though; it's just an infix expression. And infix didn't pop out of nowhere; it had to be invented as part of the gradual creep upwards from machine level "coding" into a more abstract semantics. Infix parsers are small, and while the complete language is larger, what it's presenting is infix-compatible. "Regex" is the same way: there is a general definition of regular expressions, and then there are some common variants of regex, the implemented semantics.
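The "infix parsers are small" claim is easy to back up; a minimal precedence-climbing evaluator (an illustrative sketch) fits comfortably in a few dozen lines:

```python
import re

def evaluate(src: str) -> float:
    """Evaluate +, -, *, / with precedence and parentheses
    by precedence climbing."""
    tokens = re.findall(r"\d+\.?\d*|[-+*/()]", src)
    pos = 0
    PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def next_tok():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = next_tok()
        if tok == "(":
            val = expr(0)
            next_tok()  # consume ")"
            return val
        return float(tok)

    def expr(min_prec):
        left = atom()
        while peek() in PREC and PREC[peek()] >= min_prec:
            op = next_tok()
            right = expr(PREC[op] + 1)  # +1 gives left associativity
            if op == "+": left += right
            elif op == "-": left -= right
            elif op == "*": left *= right
            else: left /= right
        return left

    return expr(0)
```

The complete little language, precedence table included, is smaller than many single functions in the host codebase.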
The boundary between "the language needs its own compiler and runtime support" and "the language exists as an API call you pass a string into, which compiles into data structures visible to the host language" is a fluid one. And the most reasonable way of making little languages involves seeing the pattern you're making in your host and "harvesting" it. In the previous eras, there were severe performance penalties to trying to bootstrap in this way, and so generating a binary was essential to success. But nowadays, it's another form of glue overhead. If you define syntactic boundaries on your glue, it actually becomes easier to deal with over time.
Documentation-wise, it's the same: if the language is sufficiently small, it feels like documentation for a library, not a language.
[+] [-] scotty79|3 years ago|reply
[+] [-] renox|3 years ago|reply
There was a cool "wordprocessor-like" (Franken?) demo created with a small number of lines; it should have been a huge success in the FOSS world, no? Well, no: nobody managed to make it work.
[+] [-] layer8|3 years ago|reply
And then you want to modularize your code because it becomes too big, and you want to create libraries for code reuse.
You end up wanting static typing for the usual reasons, which eventually leads to needing parametric types and recursively defined types, and the type system becoming turing-complete as well.
Or you keep working around the limitations of the little language, writing code generators and wrapping it in general-purpose language APIs.
[+] [-] benj111|3 years ago|reply
I suppose all this leads me to the suspicion that little languages fill in for shortcomings in big languages. Big languages can absorb the things that work, thus negating the need for small languages in that sphere.
Although how far can that go? Can we keep making ever bigger languages? Or at some point does it crumble under its own weight?
[+] [-] girvo|3 years ago|reply
[+] [-] CipherThrowaway|3 years ago|reply
[+] [-] adql|3 years ago|reply
Generating data files via templating languages was never a good idea
Using data languages as, essentially, code is a similarly bad idea.
Ansible does both at once.
[+] [-] ddevault|3 years ago|reply
[+] [-] auxfil|3 years ago|reply
“The idea is that as you start to find patterns in your application, you can encode them in a little language—this language would then allow you to express these patterns in a more compact manner than would be possible by other means of abstraction. Not only could this buck the trend of ever-growing applications, it would actually allow the code base to shrink during the course of development!”
Functions, frameworks, little languages. It's all abstractions on top of abstractions. You are substituting knowledge of the abstraction for the more fundamental knowledge underneath that does the actual work.
You end up just sweeping the codebase growth under some other layer’s rug and blissfully forget about the woes of future maintainers. The code is still there, abstracted and exposed by the “little language”. Hiding this behind a cute moniker doesn’t seduce.
This isn’t the future of programming. This is already programming.
[+] [-] ssivark|3 years ago|reply
I would imagine that most of the value comes from being able to "refactor" thought patterns to match the best way to cleave the domain into composable concepts -- and it seems like we do this all the time (and in all programming languages?).
[+] [-] brundolf|3 years ago|reply
Whether or not you agree this philosophy is a good one, and whether or not you like Lisp specifically, I think we can all agree that macros (in whichever language) are a much better way to do it than creating a bunch of tiny languages from scratch. I was surprised not to see the word "macro" appear in the article at all
[+] [-] helsinkiandrew|3 years ago|reply
Yes, but over time very specific problems become bigger/different problems which the little language isn't ideal for, the original developers move on leaving someone new to figure out the problem and language which is probably poorly documented and very brittle. Application developers probably aren't suited to writing and maintaining language code.
The only caveat is an external system provided with its own language, like RDB/SQL, which is proven and well maintained; but it's hard to call SQL a little language.
[+] [-] samsquire|3 years ago|reply
Unless they are using Kubernetes and even then, you shall find a very complicated bunch of languages:
- shell scripts
- Dockerfiles
- Kubernetes YAML
- Makefiles
- Bazel
- Ansible
- python scripts
- Jenkins XML
- Groovy scripts
- Ruby scripts
- CloudFormation
- Terraform
- Fabric or other deployment scripts
It's very hard to fit together and understand from a high level.
The last thing they were working on at my previous company was a YAML format to define a server, to go through the organisational structure of the company to manage computer systems.
Some people mentioned LISP in this comment thread. For me LISP is an intermediate language, I would never want to build a large system in LISP. It's not how I think about computation.
[+] [-] neilv|3 years ago|reply
A more recent treatment is Matthew Butterick's book: https://beautifulracket.com/
It doesn't have to be a big standalone DSL with a separate compiler or preprocessor. It can also be an embedded little language, like when you sprinkle HTML templates throughout your normal general-purpose language, and as only a syntax extension: https://docs.racket-lang.org/html-template/
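That embedded style can be approximated in most hosts; a hypothetical Python sketch in which nested lists play the role of the template syntax, in the spirit of html-template:

```python
from html import escape

def render(node) -> str:
    """Render ['tag', {attrs}, children...] trees to HTML: templates
    as plain host-language data rather than a separate language."""
    if isinstance(node, str):
        return escape(node)
    tag, rest = node[0], node[1:]
    attrs = ""
    if rest and isinstance(rest[0], dict):
        attrs = "".join(f' {k}="{escape(v)}"' for k, v in rest[0].items())
        rest = rest[1:]
    inner = "".join(render(child) for child in rest)
    return f"<{tag}{attrs}>{inner}</{tag}>"
```

Because the "template" is ordinary host data, the host's functions, loops, and variables compose with it for free.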
(Aside: I'm seeing tasteful Racket and Scheme influences in Rust, even though they're very-very different languages. I'm hoping to contribute a little more influences.)
[+] [-] diffxx|3 years ago|reply
What is great about this approach as an individual is that it requires you to tighten your ideas. When you have to implement all of the functionality in a DSL, you really start thinking about what you truly need. A big language nudges you towards using all of its features while a small language challenges you to consider what is truly essential.
Of course DSLs always run the risk of being write only and/or only comprehensible by the original author. Like any powerful tool, DSLs should be used judiciously and responsibly. Often that isn't the case, in part because I don't think the tooling for writing DSLs is generally very good. But I am betting that new tools that make DSL writing easy will have a profound effect on software development.
[+] [-] hnbad|3 years ago|reply
In fact, the entire article seems to boil down to "DSLs are the future", which I'm sure I've seen articles about back when Ruby on Rails was dominating web technologies, Cucumber (and its various ports) created the "BDD" testing fad, and DevOps started gaining traction on top of various "Ruby DSLs" used as configuration formats.
I don't think DSLs are going to go away any time soon. But there is a trade-off between domain-specific "little languages" and general purpose programming languages (or "DSLs" that are actually subsets of the latter). It can be fun to have to work with a little language, it's not so fun to work with dozens of them, each with different rules you have to memorize, instead of just being able to use the same language for all (and in truth, this was the source of the Ruby DSL craze because developers were already using Ruby on Rails).
[+] [-] mlatu|3 years ago|reply
Hm, I'm a little unhappy about the author comparing Knuth's solution to a handful of shell utilities.
For one, the author says Knuth's program written in WEB was 10 pages long, discounting the fact that those 10 pages are heavily annotated.
My other point is:
tr has 1917 LOC,
sort has 4856,
uniq has 663,
and sed is in its own package at around 10 MB,
all including comments and docs.
It's fine and good that you can use composition with shell utilities, but come on: write that example program in C99 and you'll be a not very happy coder at all. In general I find the comparison rather rude. Knuth was supposed to show his(?) literate programming system WEB, and as a "critique" McIlroy farts out a shell script like "lmao, first".
Indeed, you do not often need to count word frequencies.
But what was this article supposed to be really about? Software engineering 101, a.k.a. don't-reinvent-the-wheel/DRY?
Or perhaps literate programming?
[+] [-] cb321|3 years ago|reply
As you say, Knuth was asked to demonstrate his literate programming... In some ways this is a direct request for the non-pithy, articulated, first principles answer. I would more say Knuth was set up than that he was framed, but tomato-tomato. :)
[+] [-] bloppe|3 years ago|reply
[+] [-] Mikhail_Edoshin|3 years ago|reply