
Stop writing CLI validation. Parse it right the first time

204 points| dahlia | 6 months ago |hackers.pub | reply

162 comments

[+] jmull|6 months ago|reply
> Think about it. When you get JSON from an API, you don't just parse it as any and then write a bunch of if-statements. You use something like Zod to parse it directly into the shape you want. Invalid data? The parser rejects it. Done.

Isn’t writing code and using zod the same thing? The difference being who wrote the code.

Of course, you hope zod is robust, tested, supported, extensible, and has docs so you can understand how to express your domain in terms it can help you with. And you hope you don’t have to spend too much time migrating as zod’s api changes.

[+] MrJohz|6 months ago|reply
I think the key part, although the author doesn't quite make it explicit, is that (a) the parsing happens all up front, rather than weaving validation and logic together, and (b) the parsing creates a new structure that encodes the invariants of the application, so that the rest of the application no longer needs to check anything.

Whether you do that with Zod or manually or whatever isn't important, the important thing is having a preprocessing step that transforms the data and doesn't just validate it.
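The distinction can be shown in a few lines of TypeScript (a generic sketch; the types and names are illustrative, not from the article):

```typescript
// Raw input: anything could be missing or malformed.
type RawConfig = { port?: string; host?: string };

// Parsed output: the invariants are encoded in the type itself,
// so downstream code never needs to re-check them.
type Config = { port: number; host: string };

function parseConfig(raw: RawConfig): Config {
  const port = Number(raw.port);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port: ${raw.port}`);
  }
  if (!raw.host) {
    throw new Error("host is required");
  }
  return { port, host: raw.host };
}

// Past this boundary, the rest of the program receives a Config
// and can treat port as a valid integer without further checks.
const config = parseConfig({ port: "8080", host: "localhost" });
console.log(config.port + 1); // 8081
```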

[+] bigstrat2003|6 months ago|reply
Yeah, the "parse, don't validate" advice seems vacuous to me because of this. Someone is doing that validation. I think the advice would perhaps be phrased better as "try to not reimplement popular libraries when you could just use them".
[+] akoboldfrying|6 months ago|reply
Yes, both are writing code. But nearly all the time, the constraints you want to express can be expressed with zod, and in that case using zod means you write less code, and the code you do write is more correct.

> Of course, you hope zod is robust, tested, supported, extensible, and has docs so you can understand how to express your domain in terms it can help you with. And you hope you don’t have to spend too much time migrating as zod’s api changes.

Yes, judgement is required to make depending on zod (or any library) worthwhile. This is not different in principle from trusting those same things hold for TypeScript, or Node, or V8, or the C++ compiler V8 was compiled with, or the x86_64 chip it's running on, or the laws of physics.

[+] bschwindHN|6 months ago|reply
Rust with Clap solved this forever ago.

Also - don't write CLI programs in languages that don't compile to native binaries. I don't want to have to drag around your runtime just to execute a command line tool.

[+] MathMonkeyMan|6 months ago|reply
Almost every command line tool has runtime dependencies that must be installed on your system.

    $ ldd /usr/bin/rg
    linux-vdso.so.1 (0x00007fff45dd7000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x000070764e7b1000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000070764e6ca000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x000070764de00000)
    /lib64/ld-linux-x86-64.so.2 (0x000070764e7e6000)
The worst is compiling a C program with a compiler that uses a more recent libc than is installed on the installation host.
[+] majorbugger|6 months ago|reply
I will keep writing my CLI programs in the languages I want, thanks. Has it crossed your mind that these programs might be for yourself or for internal consumption? When you know the runtime will be installed anyway?
[+] jampekka|6 months ago|reply
> Also - don't write CLI programs in languages that don't compile to native binaries. I don't want to have to drag around your runtime just to execute a command line tool.

And don't write programs with languages that depend on CMake and random tarballs to build and/or shared libraries to run.

I usually have far fewer issues dragging a runtime around than fighting with builds.

[+] perching_aix|6 months ago|reply
Like shell scripts? Cause I mean, I agree, I think this world would be a better place if starting tomorrow shell scripts were no longer a thing. Just probably not what you meant.
[+] ndsipa_pomu|6 months ago|reply
> don't write CLI programs in languages that don't compile to native binaries. I don't want to have to drag around your runtime just to execute a command line tool.

Well that's confused me. I write a lot of scripts in BASH specifically to make it easy to move them to different architectures etc. and not require a custom runtime. Interpreted scripts also have the advantage that they're human readable/editable.

[+] dcminter|6 months ago|reply
The declarative form of clap is not quite as well documented as the programmatic approach (but it's not too bad to figure out usually).

One of the things I love about clap is that you can configure it to automatically spit out --help info, and you can even get it to generate shell autocompletions for you!

I think there are some other libraries that are challenging it now (fewer dependencies or something?) but clap sets the standard to beat.

[+] LtWorf|6 months ago|reply
> Also - don't write CLI programs in languages that don't compile to native binaries. I don't want to have to drag around your runtime just to execute a command line tool.

Go programs compile to native executables, but they're still rather slow to start, especially if you just want to see --help

[+] geon|6 months ago|reply
This seems like a really weird stance. Who are you to dictate what language people should use? Why CLIs in particular?
[+] 12_throw_away|6 months ago|reply
I like this advice, and yeah, I always try to make illegal states unrepresentable, possibly even to a fault.

The problem I run into here is - how do you create good error messages when you do this? If the user has passed you input with multiple problems, how do you build a list of everything that's wrong with it if the parser crashes out halfway through?

[+] ffsm8|6 months ago|reply
I think you're looking at it too literally - what people usually mean by "making invalid state unrepresentable" applies to the main application, which holds your domain code - and that should be separate from your inputs

He even gives the example of zod, a validation library that he frames as a parser.

What he wants to say is: "I don't want to write my own validation in a CLI; give me a good API that first validates and then converts the inputs into my declared schema."

[+] mark38848|6 months ago|reply
Just use optparse-applicative in PureScript. Applicatives are great for this and the library gives it to you for free.
[+] akoboldfrying|6 months ago|reply
Agree. It should definitely be possible to get error messages on par with what TypeScript gives you when you try to assign an object literal to an incompatibly typed variable; whether that's currently the case, and how difficult it would be to get there if not, I don't know.
[+] ambicapter|6 months ago|reply
Most validation libraries worth their salt give you options to deal with this sort of thing? They'll hand you an aggregate error with an 'errors' array, or they'll let you write an error message "prettify-er" to make a particular validation error easier to read.
[+] amterp|6 months ago|reply
Very much agree with the article; this is one of the reasons why I wrote Rad [0], which people here might find interesting. The idea is that you write CLI scripts with a declarative approach to script arguments, declaring all the constraints on them, including relational ones. So you don't write your own CLI validation - you declare the shape that args should take, let Rad check user input for you, and focus your script on the interesting stuff. For example

  args:
      username str           # Required string
      password str?          # Optional string
      token str?             # Optional auth token
      age int                # Required integer
      status str             # Required string
  
      username requires password     // If username is provided, password must also be provided
      token excludes password        // Token and password cannot be used together
      age range [18, 99]             // Inclusive range from 18 to 99
      status enum ["active", "inactive", "pending"]
Rad will handle all the validation for you; you can just write the rest of your script assuming the constraints you declared are met.

[0]: https://github.com/amterp/rad

[+] SloopJon|6 months ago|reply
I don't see anything in the post or the linked tutorial that gives a flavor of the user experience when you supply an invalid option. I tried running the example, but I've forgotten too much about Node and TypeScript to make it work. (It can't resolve the @optique references.) What happens when you pass --foo, --target bar, or --port 3.14?
[+] macintux|6 months ago|reply
I had a similar question: to me, the output format “or” statement looks like it might deterministically pick one winner instead of alerting the user that they erred. A good parser is terrific, but it needs to give useful feedback.
[+] andrewguy9|6 months ago|reply
Docopt!

http://docopt.org/

Make the usage string be the specification!

A criminally underused library.

[+] fragmede|6 months ago|reply
My favorite. A bit too much magic for some, but it seems well specified to me.
[+] tomjakubowski|6 months ago|reply
A great example of "declaration follows use" outside of C syntax.
[+] esafak|6 months ago|reply
The "problem" is that some languages don't have rich enough type systems to encode all the constraints that people want to support with CLI options. And many programmers aren't that great at wielding the type systems at their disposal.
[+] globular-toast|6 months ago|reply
Not all of this validation belongs in the same layer. A lot of the problems people seem to have is due to people thinking it all has to be done in the I/O layer.

A CLI and an API should indeed occupy the same layer of a program architecture, namely they are entry points that live on the periphery. But really all you should be doing there is lifting the low byte stream you are getting from users to something higher level you can use to call your internals.

So "CLI validation" should be limited to just "I need an int here, one of these strings here, optionally" etc. Stuff like "is this port out of range" or "if you give me this I need this too" should be handled by your internals by e.g. throwing an exception. Your CLI can then display that as an error message in a nice way.
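A rough TypeScript sketch of that layering (all names are illustrative): the CLI layer only lifts strings to basic types, the internals enforce domain rules, and the entry point renders domain errors nicely.

```typescript
// Periphery: only structural concerns live here ("I need an int here").
function parseCli(argv: string[]): { port: number } {
  const i = argv.indexOf("--port");
  const raw = i >= 0 ? argv[i + 1] : undefined;
  const port = Number(raw);
  if (!Number.isInteger(port)) throw new Error(`--port must be an integer, got ${raw}`);
  return { port };
}

// Internals: domain rules belong here, not in the CLI layer.
function startServer(port: number): string {
  if (port < 1024) throw new Error("ports below 1024 require elevated privileges");
  return `listening on ${port}`;
}

// The entry point lifts argv, calls the internals, and displays
// any domain error as a friendly message.
try {
  const { port } = parseCli(["--port", "80"]);
  console.log(startServer(port));
} catch (e) {
  console.log(`error: ${(e as Error).message}`);
}
```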

[+] yakshaving_jgt|6 months ago|reply
I've noticed that many programmers believe that parsing is some niche thing that the average programmer likely won't need to contend with, and that it's only applicable in a few specific low-level cases, in which you'll need to reach for a parser combinator library, etc.

But this is wrong. Programmers should be writing parsers all the time!

[+] WJW|6 months ago|reply
Last week my primary task was writing a GitHub action that needed to log in to Heroku and push the current code on the main and development branches to the production and staging environments respectively. The week before that, I wrote some code to make sure the type of the object was included in the filters passed to an API call.

Don't get me wrong, I actually love writing parsers. It's just not required all that often in my day-to-day work. 99% of the time when I need to write a parser myself it's for an Advent of Code problem; usually I just import whatever JSON or YAML parser is provided for the platform and go from there.

[+] eska|6 months ago|reply
I think most security issues are just due to people not parsing input at all/properly. Then security consultants give each one a new name as if it was something new. :-)
[+] dkubb|6 months ago|reply
The three most common things I think about when coding are DAGs, State Machines and parsing. The latter two come up all the time in regexps which I probably write at least once a day, and I’m always thinking about state transitions and dependencies.
[+] nine_k|6 months ago|reply
I'd say that engineers should use the highest-level tools that are adequate for the task.

Sometimes it's going down to machine code, or rolling your own hash table, or writing your own recursive-descent parser from first principles. But most of the time you don't have to reach that low, and things like parsing are but a minor detail in the grand scheme. The engineer should not spend time on building them, but should be able to competently choose a ready-made part.

I mean, creating your own bolts and nuts may be fun, but most of the time, if you want to build something, you just pick a few from an appropriate box, and this is exactly right.

[+] SoftTalker|6 months ago|reply
I like just writing functions for each valid combination of flags and parameters. Anything that isn’t handled is default rejected. Languages like Erlang with pattern matching and guards make this a breeze.
[+] bsoles|6 months ago|reply
>> // This is a parser

>> const port = option("--port", integer());

I don't understand. Why is this a parser? Isn't it just a way of enforcing a type in a language that doesn't have types?

I was expecting something like a state machine that takes the command line text and parses it to validate the syntax and values.

[+] hansvm|6 months ago|reply
The heavy lifting happens in the definitions of `option` and `integer`. Those will take in whatever arguments they take in and output some sort of `Stream -> Result<Tuple<T, Stream>>` function.

That might sound messy but to the author's point about parser combinators not being complicated, they really don't take much time to get used to, and they're quite simple if you wanted to build such a library yourself. There's not much code (and certainly no magic) going on under the hood.
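A minimal sketch of that shape in TypeScript, loosely echoing the `option`/`integer` names quoted elsewhere in the thread (this is the general technique, not the article's library):

```typescript
// A parser consumes argv tokens and yields a value plus the remaining
// tokens, or null on failure -- a stripped-down version of the
// `Stream -> Result<Tuple<T, Stream>>` shape.
type Parser<T> = (argv: string[]) => { value: T; rest: string[] } | null;

// Match a named option anywhere in argv and hand its argument
// to an inner parser.
const option = <T>(name: string, inner: Parser<T>): Parser<T> => (argv) => {
  const i = argv.indexOf(name);
  if (i < 0) return null;
  const parsed = inner(argv.slice(i + 1));
  if (!parsed) return null;
  // Splice the consumed tokens out of the stream.
  return { value: parsed.value, rest: [...argv.slice(0, i), ...parsed.rest] };
};

const integer = (): Parser<number> => (argv) => {
  const n = Number(argv[0]);
  return Number.isInteger(n) ? { value: n, rest: argv.slice(1) } : null;
};

const str = (): Parser<string> => (argv) =>
  argv.length > 0 ? { value: argv[0], rest: argv.slice(1) } : null;

// Combine two parsers into one producing a pair.
const both = <A, B>(a: Parser<A>, b: Parser<B>): Parser<[A, B]> => (argv) => {
  const ra = a(argv);
  if (!ra) return null;
  const rb = b(ra.rest);
  return rb ? { value: [ra.value, rb.value], rest: rb.rest } : null;
};

const parser = both(option("--port", integer()), option("--host", str()));
const result = parser(["--host", "localhost", "--port", "8080"]);
console.log(result?.value); // yields [8080, "localhost"] regardless of option order
```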

The advantage of that parsing approach:

It's reasonably declarative. This seems like the author's core point. Parser-combinator code largely looks like just writing out the object you want as a parse result, using your favorite combinator library as the building blocks, and everything automagically works, with amazing type-checking if your language has such features.

The disadvantages:

1. Like any parsing approach, you have to actually consider all the nuances of what you really want parsed (e.g., conditional rules around whitespace handling). It looks a little to me (just from the blog post, not having examined the inner workings yet) like this project side-stepped that by working with the `Stream` type as just the `argv` list, allowing you to be able to say things like "parse the next blob as a string" without also having to encode whitespace and blob boundaries.

2. It's definitely slower (and more memory-intensive) than a hand-rolled parser, and usually also worse in that regard than other sorts of "auto-generated" parsing code.

For CLI arguments, especially if they picked argv as their base stream type, those disadvantages mostly don't exist. I could see it performing poorly for argv parsing for something like `cp` though (maybe not -- maybe something like `git cp`, which has more potential parse failures from delimiters like `--`?), which has both options and potentially ginormous lists of files; if you're not very careful in your argument specification then you might have exponential backtracking issues, and where that would be blatantly obvious in a hand-rolled parser it'll probably get swept under the rug with parser combinators.

[+] einpoklum|6 months ago|reply
Exactly the opposite of this. We should parse the command-line using _no_ strict types. Not even integers. Nothing beyond parsing its structure, e.g. which option names get which (string) values, and which flags are enabled. This can be done without knowing _anything_ about the application domain, and provide a generic options structure which is no longer a sequence of characters.

This approach IMNSHO is much cleaner than entangling cmdline parser libraries with application logic and application-domain types.

Then one can specify validation logic declaratively, and apply it generically.

This has the added benefit - for a compiled rather than an interpreted library - of not having to recompile the CLI parsing library for each different app and each different definition of options.

[+] MrJohz|6 months ago|reply
Can you give some examples of this working well? It certainly goes against all of my experience working with CLIs and with parsing inputs in general (e.g. web APIs etc). In general, I've found that the quicker I can convert strings into rich types, the easier that code is to work with and the less likely I am to have troubles with invalid data.
[+] bakkoting|6 months ago|reply
This is the approach taken by node's built-in argument parser util.parseArgs.
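For reference, a minimal use of that built-in (available in recent Node versions): everything comes back as strings, booleans, and positionals, with no domain types involved.

```typescript
import { parseArgs } from "node:util";

// Structure-only parsing: every value stays a string or boolean,
// and no application-domain types are involved.
const { values, positionals } = parseArgs({
  args: ["--port", "8080", "--verbose", "build"],
  options: {
    port: { type: "string" },
    verbose: { type: "boolean" },
  },
  allowPositionals: true,
});

console.log(values.port); // prints 8080 -- still a string, not a number
console.log(values.verbose); // true
console.log(positionals); // [ 'build' ]
```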
[+] m463|6 months ago|reply
This kind of stuff is what makes me appreciate python's argparse.

It's a genuine pleasure to use, and I use it often.

If you dig a little deeper into it, it does all the type and value validation, file validation, it does required and mutually exclusive args, it does subargs. And it lets you do special cases of just about anything.

And of course it does the "normal" stuff like short + long args, boolean args, args that are lists, default values, and help strings.

[+] MrJohz|6 months ago|reply
Actually, I think argparse falls into the same trap that the author is talking about. You can define lots of invariants in the parser, and say that these two arguments can't be passed together, or that this argument, if specified, requires these arguments to also be specified, etc. But the end result is a namespace with a bunch of key-value pairs on it, and argparse doesn't play well with typing systems like mypy or pyright. So the rest of the tool has to assume that the invariants were correctly specified up-front.

The result is that you often still see this kind of defensive programming, where argparse ensures that an invariant holds, but other functions still check the same invariant later on, because they might have been called a different way or just because the developer isn't sure whether everything was checked where they are in the program.

What I think the author is looking for is a combination of argparse and Pydantic, such that when you define a parser using argparse, it automatically creates the relevant Pydantic classes that define the type of the parsed arguments.

[+] geon|6 months ago|reply
I just recently implemented my own parser combinator lib in typescript too. It was surprisingly simple in the end.

This function parses a number in 6502 asm. So `255` in dec or `$ff` in hex: https://github.com/geon/dumbasm/blob/main/src/parsers/parseN...

I looked at several typescript libraries but they all felt off. Writing my own at least ensured I know how it works.

[+] AndrewDucker|6 months ago|reply
This is one of the things that makes me glad that PowerShell does all of this intrinsically. I define the parameters, it makes sure that the arguments make sense and match them (and their validation).
[+] AnimalMuppet|6 months ago|reply
Well, they're dictating that if you want them to use it, do it this way. Some people want others to use the programs they write; for such people, the GP actually has been given the right to have some valid say in the matter.

Why CLIs in particular? Because they usually are smaller tools. For a big, important tool, you might be willing to jump through more hoops (installing the right runtime), but for a smaller, less important tool, it's just not worth it.

[+] dvdkon|6 months ago|reply
I, for one, do think the world needs more CLI argument parsers :)

This project looks neat, I've never thought to use parser combinators for something other than left-to-right string/token stream parsing.

And I like how it uses TypeScript's metaprogramming to generate types from the parser code. I think that would be much harder (or impossible) in other languages, making the idiomatic design of a similar library very different.