
Lingo: A Go micro language framework for building Domain Specific Languages

133 points| adityasaky | 1 year ago |about.gitlab.com | reply

50 comments

[+] wahern|1 year ago|reply
When I think of a DSL, I think of a language with specialized syntax, grammar, or constructs suited to the problem domain. Think SQL, AWK, or regular expressions. This is just a LISP variant with a typical host-side API for registering function names.

I'll never get how merely having function names that reflect the use case, plus a stripped down or absent standard library, qualifies as a DSL. I know some people have long used "DSL" in this way, especially among LISP fans, but... I just don't get it. If I want a DSL it's because I want something that gives me, e.g., novel control flow constructs a la AWK, or highly specialized semantics a la regular expressions, that directly suit the problem domain. If I'm not getting that kind of power, why tie myself to some esoteric dependency? Either way you're adopting a tremendous maintenance burden; it better be worth your while.

I'm a huge fan of Lua and have used it for many projects in different roles, but never once thought of any particular case as having created a DSL, even when stripping the environment to just a few, well-named, self-describing functions.

I don't mean to criticize this particular project. Good code is good code. It's just the particular conceptualization (one shared by many others, to be fair) of what a "DSL" means that bugs me.

[+] lispm|1 year ago|reply
> I know some people have long used "DSL" in this way, especially among LISP fans

Generally this would be called an "embedded domain specific language". Some languages make it relatively easy to change the syntax. For example, Common Lisp has reader macros to change the token syntax and macros to change the Lisp syntax. With those one can create all kinds of embedded languages, incl. domain specific languages (languages which are specific to a special domain). Examples would be embedded logic languages, query languages, rule based languages, languages to describe user interfaces, etc. The Common Lisp standard has a notorious example of that, a complex LOOP construct, which uses a very different syntax: https://www.lispworks.com/documentation/HyperSpec/Body/m_loo...

There are other real-world examples out there, for example an embedded domain specific language to describe 3d objects in the domain of parametric CAD, for description of technical things like turbines or other parts of an aircraft.

[+] LordHeini|1 year ago|reply
I would agree with your sentiment.

This is basically a lib with some extra syntax to parse CSV Files.

A 'proper' DSL would require a very specific domain to which it is applied. Like document creation, or solving a certain problem and only that but not much else. Turing completeness is usually not required either.

For example Matlab and LaTeX are domain specific, as is SQL. Those are used to do math, create documents, or mangle tables.

Imho just renaming forEach to Map to parse CSV files does not fit the bill so the linked example is not that great.

This project is basically a DSL builder thingy with a text processor demo.

As an aside:

I am missing the most important thing when it comes to CSV which is configuration of the input.

Might be because this is more of an example, but it is usually a sign of a, let's say, more academic project.

Working with CSV is usually a source of a lot of "good fun" where many hours can be spent.

Because your average CSV is often an SSV or TSV in some ISO encoding that is anything but UTF-8. It usually contains line breaks which have been turned into funny icons by some combo of tool and operating system. There are also weird escape characters in an order that is not consistent across rows. Sometimes you have titles, sometimes not. Dates make no sense, are language dependent, and are in a weird format the intern made up 10 years ago. And even numbers are weird too, like 12e^-25 or '0.00' or '10.000,0'. Then you get columns which really should be 2 or more, or there are lines which span multiple rows.

Imho it is way better for robust CSV parsing to take a really low-level approach where you rigorously check everything (and return to sender everything that does not fit).
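
A minimal sketch of that "check everything, bounce the rest" approach, in Python. The two-column label/amount layout, the `AMOUNT` pattern, and the name `parse_strict` are assumptions made up for illustration; real input would also need an explicit encoding and delimiter decision before it gets this far.

```python
import csv
import io
import re

# Hypothetical layout: a non-empty label and a plain decimal amount.
AMOUNT = re.compile(r"-?\d+(\.\d+)?")


def parse_strict(text, delimiter=","):
    """Split records into (accepted, rejected); never guess at malformed input."""
    accepted, rejected = [], []
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for recno, row in enumerate(reader, start=1):
        if len(row) != 2:
            rejected.append((recno, row, "expected 2 fields"))
            continue
        label, amount = row[0].strip(), row[1].strip()
        if not label:
            rejected.append((recno, row, "empty label"))
        elif not AMOUNT.fullmatch(amount):
            rejected.append((recno, row, "amount is not a plain decimal"))
        else:
            accepted.append((label, float(amount)))
    return accepted, rejected
```

Every rejected record keeps its record number and a reason, so it can be sent back to whoever produced the file instead of being silently coerced.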

[+] kitd|1 year ago|reply
DSLs don't need to be supported by a separate lexer/parser. Some (for better or worse) use standard formats like YAML or JSON. Some (i.e. embedded DSLs) are represented using terms defined in a programming language.

Any naming of types, functions or variables that you do while solving a problem in software is creating a "language" of terms that are specific to the domain.

A well-constructed fluent API can read a lot like what you would call a DSL. A configuration language in YAML is both YAML and a DSL for config.

[+] BlackFly|1 year ago|reply
You might be interested in this classic pdf: http://www.cs.virginia.edu/~evans/cs655/readings/steele.pdf (Growing a language)

Without those function names that you talk about, you wouldn't really recognize a language. Those standard library function names give the "batteries included" kind of feeling. The size of the community libraries furthers that accessibility and productivity feeling of the language. Furthermore, you can certainly create control flows via libraries--with callbacks specifically--and create any kind of novel branching structure your domain had need of.
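
As a rough illustration of control flow built from callbacks, here is a sketch in Python of an AWK-like pattern/action loop written as an ordinary library function. The names `run_rules` and `total_second_column` are invented for this example and come from no particular library.

```python
def run_rules(lines, rules):
    """For each line, run every action whose pattern callback matches."""
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        for pattern, action in rules:
            if pattern(lineno, fields):
                action(lineno, fields)


def total_second_column(lines):
    """Sum column 2 of every line after the header that has at least 2 fields."""
    state = {"total": 0.0}

    def is_data(lineno, fields):
        return lineno > 1 and len(fields) >= 2

    def accumulate(lineno, fields):
        state["total"] += float(fields[1])

    run_rules(lines, [(is_data, accumulate)])
    return state["total"]
```

The branching structure lives entirely in `run_rules`; callers only supply pattern and action callbacks, which is about as close as a library gets to AWK's `pattern { action }` without new syntax.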

A good API is a sort of DSL if it reflects well how you think about the domain and helps you express instructions within that domain. The language can be very different. We experience that in our own language when we hear people talking with heavy use of jargon we don't recognize: they might as well be speaking another language.

But overall I agree with you that I would probably save the use of the jargon term DSL for novel syntax targeted at a specific domain.

[+] creata|1 year ago|reply
> When I think of a DSL, I think of a language with specialized syntax, grammar, or constructs suited to the problem domain.

I think that's too strict. For example, JAX is (imo) an eDSL, but it doesn't have a specialized syntax, grammar or constructs - on the contrary, it's meant to feel just like NumPy. The thing that makes it special is the interpretation of those constructs.

[+] skydhash|1 year ago|reply
I don’t think there’s a need to have such strict requirements. No need to invent a whole new paradigm; you can go with the usual ones as long as it works well. More often than not the domain is not complex or flexible enough to require a language; it may only require a few algorithms (libraries), i.e. even if you do invent a language, few programs will be built with it. You may as well tweak an existing language for a nicer DX.
[+] Trufa|1 year ago|reply
This is a very good and interesting point, but what if the point itself is to reduce the power and increase things like legibility?

If I create this kind of mapped-function DSL, I can ensure that things will be done a certain way, vs. the borderline infinite possibilities of code.

[+] wbl|1 year ago|reply
Lisp macros can make their own control flow: it's like the relation of LINQ to F#.
[+] czyhandsome|1 year ago|reply
Well, could you give some good examples of how Lua is used in your projects in different roles? I'm curious.
[+] KPGv2|1 year ago|reply
Yeah. I mean, isn't CSV itself a DSL? It can't execute, but it's a domain-specific markup language for structured data.
[+] bitexploder|1 year ago|reply
Disclaimer: this is a neat project.

DSLs are such a trap for most projects that think they need them. Use Lua or something off the shelf for scripting.

CEL exists for Go and (safely) solves many of the problems you might also want a DSL for.

The case for DSLs is often hard to justify in a project that has to be maintained for years.

[+] zelphirkalt|1 year ago|reply
CEL seems very much in line with the Golang ideology. It looks like CEL doesn't really have any upsides at all, except for being non-turing-complete. It looks like it is more of a convenience food, for people, for whom any other syntax than what they know from mainstream programming languages is "too adventurous" or "too foreign". As if they cannot be trusted to be able to cope with another syntax. This might even be true, considering the character of Golang itself, which was created as a dumbed down lowest common denominator of several mainstream languages, so that everyone gets it. It was even missing generics for a long time, deeming them to be too complicated.

As a DSL CEL is kinda pointless, since it does not create any additional convenience beyond the usual mainstream programming language syntax. It therefore leaves potential on the table, and as a tradeoff appeals to familiarity of syntax. As a configuration language it is usable, probably with reduced risk, compared to using Golang itself (no turing-completeness!).

I don't think it actually appeals to anyone who is considering creating a DSL for a good reason.

[+] randomdata|1 year ago|reply
> CEL exists for Go and (safely) solves many of the problems you might also want a DSL for.

Seems like it would be difficult to use for what they are trying to achieve. Lua would be a better fit; however, it is noted that they tried it first, but ran into some kind of issue with it. So now Lingo is among the "off the shelf" options.

[+] Mikhail_Edoshin|1 year ago|reply
A generic structure is not a notation. XML or S-expressions are a generic structure. A good practical test that shows limitations of this approach would be to try to write something like a picture description similar to Pikchr (PIC). It is possible, but the result will be far more verbose and obtuse than what you get with a real grammar and parser in Pikchr.
[+] wejick|1 year ago|reply
Unfortunately the given example doesn't give a mere mortal like me an idea of when this makes sense to use instead of whatever scripting language is available, or just JSON.

Or probably grule? [1] https://github.com/hyperjumptech/grule-rule-engine

[+] randomdata|1 year ago|reply
> Instead of using whatever scripting language available

This is just another scripting language available. It only differs from something like Lua or Javascript in that it claims to have less complexity. Complexity which, while not fully elaborated on, apparently bogged down their efforts when they originally tried using Lua and an embeddable Go engine.

[+] Lyngbakr|1 year ago|reply
> Some popular DSLs most software developers use on a regular basis include Regular Expressions for pattern matching, AWK for text transformation or Standard Query Language for interacting with databases.

Isn't it Structured Query Language? Or are both variants used?
[+] shakna|1 year ago|reply
> In the early days of the system there was divided preference between Standard Query Language and Structured Query Language but it did not make a whole lot of difference since most people most of the time called it by the acronym SQL. Now the overwhelming but not complete preference is for Structured Query Language.

[0] https://www.sjsu.edu/faculty/watkins/sql.htm

[+] randomdata|1 year ago|reply
Strictly, it is just SQL. It doesn't stand for anything. The project was originally known as SEQUEL, a fun word play on QUEL (from the Ingres/Postgres camp), but trademark issues saw it later lose the vowels.

"Structured Query Language" is a so-called backronym. "Standard Query Language" fits the pattern as well. As does "Super Quick Lookup", if you want to have some fun with it.

[+] codetrotter|1 year ago|reply
Not to be confused with the Lingo programming language that was used in Macromedia Director to create Shockwave content ;)

https://en.wikipedia.org/wiki/Lingo_(programming_language)

https://en.wikipedia.org/wiki/Adobe_Shockwave

https://en.wikipedia.org/wiki/Adobe_Director

[+] giancarlostoro|1 year ago|reply
Which is still in use today, mainly by hobbyists. Habbo Hotel also released their old client, built on their old Lingo code, as "Origins"; it runs as an EXE that embeds the runtime. People are working on emulators that will run it natively on the web, and there's a really interesting decompiler for it as well.
[+] allknowingfrog|1 year ago|reply
This is a bit of a tangent, but this is how I would actually solve the problem in Ruby:

  #!/usr/bin/env ruby

  require "csv"

  puts CSV.foreach(ARGV[0], headers: true).sum { _1[1].to_f }.round(2)
[+] ofrzeta|1 year ago|reply
I don't know. Last update two years ago when the project was first posted. Is it worth spending time on it?
[+] randomdata|1 year ago|reply
What's missing that would necessitate an update?
[+] justinko|1 year ago|reply
The verbosity of “fed through an LLM” articles is becoming unbearable to digest.
[+] Alifatisk|1 year ago|reply
Regarding the Ruby example, why did they use Float("...") instead of #to_f?
[+] intelekshual|1 year ago|reply
Probably because `Float()` is stricter and will raise an error if the string isn't a valid number, whereas `#to_f` will silently return 0.0 or do a best-effort conversion (e.g. "1.2abc".to_f => 1.2)
[+] dlock17|1 year ago|reply
This seemed like nuclear overkill for most problems I can think of.

And where it should be used, I can't imagine you can't find a pre-existing language (Cuelang maybe) instead.

I was expecting a section at the end where they demonstrate which services need a new language written just for their configuration, but nope, just general examples.

Also, this should have a (2022) in the title.

[+] tony-allan|1 year ago|reply
I think that generating test data with a Domain Specific Language, without the scaffolding of a language like Python, is a great use case.

You have a specific DSL to generate test data and store code written using your DSL with your test cases.

You then have a security model that separates the responsibilities of the test team from that of other developers. The team can generate many test cases in a secure environment. You could then seek community input into your test processes without having to worry about rogue code.

[+] jerf|1 year ago|reply
It is nuclear overkill for most problems you can think of.

But when you hit a problem that you need something like this for... you need something like this. The attempts to get around it or avoid it or do some unbelievably hacky thing lead to piles and piles of terrible, terrible code.

In 2024, though, I do try very hard to embed my DSLs in an existing serialization. It doesn't always work out, but, the case they show of directly embedding an AST into YAML is a worst-case scenario. In real life I've done things like specify a particular field carries an expr[1] expression to do that sort of thing, and then the structure of the rest of the file just follows normal serialization format.

[1]: https://github.com/expr-lang/expr , but I'm sure many static languages have something like this. If you don't know one, it's a good tool to put in the belt in case you ever need it.
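
The "one field carries an expression, the rest is plain serialization" pattern can be sketched roughly as below. This is a Python stand-in built on the standard `ast` module, not expr and not its syntax; the `when` field, the `RULE` document, and the function names are all invented for illustration.

```python
import ast
import json
import operator

# Only these comparison operators are allowed inside an expression.
CMP = {
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def eval_expr(src, variables):
    """Evaluate a tiny boolean expression; anything off the whitelist raises."""
    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BoolOp):
            values = [ev(v) for v in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in CMP):
            return CMP[type(node.ops[0])](ev(node.left), ev(node.comparators[0]))
        raise ValueError(f"unsupported syntax: {type(node).__name__}")
    return ev(ast.parse(src, mode="eval"))


# Ordinary JSON; only the hypothetical "when" field is in the mini-language.
RULE = '{"name": "big-eu-order", "when": "amount > 100 and region == \'EU\'"}'


def rule_matches(rule_json, variables):
    rule = json.loads(rule_json)
    return bool(eval_expr(rule["when"], variables))
```

Function calls, attribute access, and arithmetic are all rejected here, so the embedded language stays much smaller than the host language while the surrounding file stays boring JSON.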

[+] 0xCMP|1 year ago|reply
I wouldn't think of it for generating configuration. The example given implies they use it for fuzzing program inputs. Its purpose in the article is simple: generate a CSV with two columns and floats.

However, a DSL could be made to do anything. Generate particular kinds of PDFs, Excel files, intentionally incorrect CSV/TSV files, zip bombs, etc.

All that is required is to wire it up in the Go side of things and then write a script to make use of it.
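
To make "wire it up on the host side, then write a script" concrete, here is a toy sketch in Python rather than Go. It is not Lingo's API; the `row` and `csv` host functions and every other name below are invented for illustration, and error handling for malformed scripts is left out.

```python
def tokenize(src):
    """Split an S-expression string into parens and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Turn a token list into nested Python lists, floats, and symbol strings."""
    tok = tokens.pop(0)
    if tok == "(":
        expr = []
        while tokens[0] != ")":
            expr.append(parse(tokens))
        tokens.pop(0)  # drop the closing paren
        return expr
    try:
        return float(tok)
    except ValueError:
        return tok  # anything that is not a number is a symbol


def evaluate(expr, env):
    """Numbers evaluate to themselves, symbols are looked up, lists are calls."""
    if isinstance(expr, float):
        return expr
    if isinstance(expr, str):
        return env[expr]
    fn = evaluate(expr[0], env)
    return fn(*[evaluate(arg, env) for arg in expr[1:]])


# The whole "language" is whatever the host chooses to register here.
HOST_FUNCS = {
    "row": lambda *xs: ",".join(f"{x:g}" for x in xs),
    "csv": lambda *rows: "\n".join(rows),
}


def run(src, env=HOST_FUNCS):
    return evaluate(parse(tokenize(src)), env)
```

With that in place, the script `(csv (row 1 2.5) (row 3 4))` evaluates to a two-line CSV string. Scripts can only call names the host registered, which is the main appeal for things like generating test inputs.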

[+] randomdata|1 year ago|reply
> Also, this should have a (2022) in the title.

Is there something about the current state of the world that has invalidated some or all of the content in the article?

[+] kubectl_h|1 year ago|reply
> I was expecting a section at the end where they demonstrate which services need a new language written just for it's configuration, but nope, just general examples.

Heh, that was my first instinct as to why they built this as well, but more for providing ergonomic ways to generate k8s objects for complex GitLab-specific CRDs.