Can YAML go away entirely and instead allow pipelines to be defined with an actual language? What benefits does the runner-interpreted yaml-defined pipeline paradigm actually achieve? Especially with runners that can't be executed and tested locally, working with them is a nightmare.
jayd16|5 months ago
I'm certainly willing to believe that yaml is not the ideal answer but unless we're comparing it to a concrete alternative, I feel like this is just a "grass is always greener" type take.
ok123456|5 months ago
You write a compiler that enforces stronger invariants, above and beyond "everything is an array/string/list/number/pointer."
Good general-purpose programming languages provide type systems that do just this. It is criminal that the industry simply ignores this and chooses to use blobs of YAML/JSON/XML with disastrous results, creating ad-hoc programming languages without a type system in their chosen poison.
VGHN7XDuOXPAzol|5 months ago
I am not sure you can do this whilst having the granular job reporting (i.e. either you need one YAML block per job or you have all your jobs in one single 'status' item?) Is it actually doable?
iLoveOncall|5 months ago
I don't think anybody serious has any argument in favor of CloudFormation templates.
tracker1|5 months ago
Note: mostly using Deno these days for this, though I will use .net/grate for db projects.
esafak|5 months ago
Some do just that: dagger.io. It is not all roses but debugging is certainly easier.
biimugan|5 months ago
When something is written in a real programming language (that doesn't just compile down to YAML or some other data format), this becomes much more challenging. What should you do in that case? Attempt to parse the configuration into an AST and operate over the AST? But in many programming languages, the AST can become arbitrarily complex. Behavior can be implemented in such a way as to make it difficult to discover or introspect.
Of course, YAML can become difficult to parse too. If the system consuming the YAML supports in-band signalling -- i.e. proprietary non-YAML directives -- then you would need to first normalize the YAML using that system to interpret and expand those signals. But in principle, that's still more tractable than trying to analyze an arbitrary program's AST.
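To illustrate why a plain data format is tractable to analyze: a minimal sketch, assuming the workflow has already been parsed into ordinary dicts and lists by some YAML parser. The function name `unpinned_actions` and the sample workflow are made up for illustration; the `jobs` / `steps` / `uses` layout follows the GitHub Actions workflow syntax.

```python
import re

# A full-length commit SHA: 40 lowercase hex characters.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow: dict) -> list[str]:
    """Return every `uses:` reference that is not pinned to a full commit SHA."""
    found = []
    for job in workflow.get("jobs", {}).values():
        for step in job.get("steps", []):
            ref = step.get("uses")
            if ref is None or ref.startswith("./"):
                continue  # `run:` steps and local actions have nothing to pin
            _, _, version = ref.partition("@")
            if not SHA_RE.match(version):
                found.append(ref)
    return found


# Hypothetical workflow, in the shape a YAML parser would hand back.
workflow = {
    "jobs": {
        "build": {
            "steps": [
                {"uses": "actions/checkout@v4"},
                {"uses": "actions/checkout@" + "a" * 40},
                {"run": "make test"},
            ]
        }
    }
}

print(unpinned_actions(workflow))
```

The whole check is a nested loop over data. Doing the same against a pipeline defined by arbitrary code would mean executing it or reasoning about its AST.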
catlifeonmars|5 months ago
cough CloudFormation cough
Charon77|5 months ago
There are multiple ways to safely run untrusted code.
I for one enjoy how build.rs in Rust does it: you have Rust code that controls the entire build system just by printing stuff to stdout.
There are other ways, of course.
0xbadcafebee|5 months ago
You already don't have to use YAML. Use whatever language you want to define the configuration, and then dump it as YAML. By using your own language and outputting YAML, you get to implement any solution you want, and GitHub gets to spend more cycles building features.
Simple example:
I don't know why nobody has made this yet, but it wouldn't be hard. Read GHA docs, write Python classes to match, output as YAML. If you want more than what GHA supports [via configuration], use the GHA API (https://docs.github.com/en/rest/actions) or the scripted workflows feature (https://github.com/actions/github-script).
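A minimal sketch of the "Python classes in, config out" idea, not an existing library: the `Step`, `Job` and `Workflow` classes are hypothetical, and only a handful of workflow keys are modelled. To stay within the standard library it emits JSON, on the assumption that the consumer is a YAML 1.2 parser (YAML 1.2 is designed as a superset of JSON); whether a given CI service accepts that output as-is should be verified, and a YAML library can be swapped in for the final dump.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    run: str = ""
    uses: str = ""

    def to_dict(self) -> dict:
        d = {"name": self.name}
        if self.uses:
            d["uses"] = self.uses
        if self.run:
            d["run"] = self.run
        return d


@dataclass
class Job:
    runs_on: str
    steps: list = field(default_factory=list)

    def to_dict(self) -> dict:
        return {"runs-on": self.runs_on, "steps": [s.to_dict() for s in self.steps]}


@dataclass
class Workflow:
    name: str
    on: list
    jobs: dict = field(default_factory=dict)

    def render(self) -> str:
        doc = {
            "name": self.name,
            "on": self.on,
            "jobs": {job_id: job.to_dict() for job_id, job in self.jobs.items()},
        }
        # JSON output; assumed to be readable by a YAML 1.2 parser.
        return json.dumps(doc, indent=2)


# Ordinary Python replaces copy-paste: one step list, reused per OS.
test_steps = [
    Step(name="Check out", uses="actions/checkout@v4"),
    Step(name="Test", run="make test"),
]
wf = Workflow(
    name="CI",
    on=["push", "pull_request"],
    jobs={
        os_name.split("-")[0]: Job(runs_on=os_name, steps=test_steps)
        for os_name in ("ubuntu-latest", "macos-latest")
    },
)
print(wf.render())
```

Loops, functions and type checks all happen in Python before anything is written out, so the generated file never needs anchors or templating.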
doublet00th|5 months ago
tarkaTheRotter|5 months ago
There are existing solutions around, but they leave out a bunch of things that are blatantly missing in the space:
- workflow visualisations (this is already working - you can see an example of workflow relationship and breakdowns on a non-trivial example at https://github.com/http4k/http4k/tree/master/.github/typeflo...);
- running workflows through an event simulator so you can tell cause and effect when it comes to what triggers what. Testing workflows anyone? :)
- security testing on workflows - to avoid the many footguns that there are in GHA around secrets etc;
- compliance tests around permitted Action versions;
- publishing of reusable repository files as binary dependencies that can be upgraded and compiled into your projects - including not just GHA actions and workflows but also things like version files, composable Copilot/Claude/Cursor instruction files;
- GitLab, CircleCI, Bitbucket, Azure DevOps support using the same approach and in multiple languages;
Early days yet, but I am planning to make it free for OSS and paid for commercial users. I'm also dogfooding it on one of my other open source projects to make sure that it can handle non-trivial cases. Lots to do - and hopefully it will be valuable enough for commercial companies to pay for!
Wish me luck!
https://typeflows.io/
rickette|5 months ago
nothrabannosir|5 months ago
freeplay|5 months ago
Here are some fun exercises to see why HCL sucks:
- Create an if/elseif/else statement
- Do anything remotely complex with a for loop (tip: you're probably going to have to use `flatten` a lot)
time4tea|5 months ago
It takes your programming language version and turns it into GitHub Actions YAML, so you don't need to do any of that sort of thing.
herpdyderp|5 months ago
Levitating|5 months ago
imiric|5 months ago
I really enjoyed working with the Earthfile format[1] used for Earthly CI, which unfortunately seems like a dead end now. It's a mix of Dockerfile and Makefile, which made it very familiar to read and write. Best of all, it allowed running the pipeline locally exactly as it would run remotely, which made development and troubleshooting so much easier. The fact GH Actions doesn't have something equivalent is awful UX[2].
Honestly, I wish the industry hadn't settled on GitHub and GH Actions. We need better tooling and better stewards of open source than a giant corporation who has historically been hostile to open source.
[1]: https://earthly.dev/earthfile
[2]: Yes, I'm aware of `act`, but I've had nothing but issues with it.
ivanjermakov|5 months ago
Pxtl|5 months ago
jbjbjbjb|5 months ago
delusional|5 months ago
That is the key function any serious CI platform needs to tackle to get me interested. FORCE me to write something that can run locally. I'll accept using containers, or maybe even VMs, but make sure that whatever I build for your server ALSO runs on my machine.
I absolutely detest working on GitHub Actions because all too often it ends up requiring that I create a new repo where I can commit to master (because for some reason everybody loves writing actions that only work on master). Which means I have to move all the fucking secrets too.
Solve that for me PLEASE. Don't give me more YAML features.
bigstrat2003|5 months ago
soraminazuki|5 months ago
I've seen YAML files a few thousand lines long with anchors riddled all over the place. They were impossible to deal with. Rewriting them in Jsonnet paid off immediately.
Another example is Nixpkgs. It's quite pleasant to deal with despite the size of its codebase.
ericHosick|5 months ago
rurban|5 months ago
Jokes aside, I like proper YAML anchors. Other CIs do support these, and it made writing YAML actions much easier, esp. complicated cross-building recipes with containers and qemu.
zft|5 months ago
oblio|5 months ago
I say this as someone that built entire Jenkins Groovy frameworks for automating large Jenkins setups (think hundreds of nodes, thousands of Jenkins jobs, stuff like that).
ZYbCRq22HbJ2y7|5 months ago
Although, I think it is generally accepted practice to prefer declarative configuration over imperative configuration? Maybe that is part of what the article is getting at?
baq|5 months ago
wiether|5 months ago
We write Bash or Python, and our tool will produce the YAML pipeline reflecting it.
So we don't need to maintain YAML with an over-complicated format.
The resulting YAML is not meant to be read by an actual human since it's absolute garbage, but the code we want to run is running when we want, without having to maintain the YAML.
And we can easily test it locally.
easterncalculus|5 months ago
Honestly, just having a linter should be enough. Ideally, anything complicated in your build should just be put into a script anyway - it minimizes the number of lines in that massive YAML file and the potential for merge conflicts when making small changes.
mhh__|5 months ago
AaronAPU|5 months ago
datadrivenangel|5 months ago
verdverm|5 months ago
I use CUE to read yamhell too
giancarlostoro|5 months ago
red_hare|5 months ago
baq|5 months ago
GitHub Actions have a lot of rules, logic and multiple sublanguages in lots of places (e.g. conditions, shell scripts, etc.) YAML is completely superficial, XML would be an improvement due to less whitespace sensitivity alone.
pacoWebConsult|5 months ago
shadowgovt|5 months ago
Plus it has exactly enough convenience-feature-related sharp edges to be risky to hand to a newbie, while wearing the dress of something that should be too bog-simple to have that problem. I, too, enjoy languages that arbitrarily decide the Norwegian TLD is actually a Boolean "false."
Pxtl|5 months ago
TheDong|5 months ago
Language implementations for yaml vary _wildly_.
What does the following parse as:
If I google "yaml online" and paste it in, one gives me:
{'some_map': {False: 'cap', 'key': 'value'}}
The other gives me:
{'some_map': {'false': 'cap', 'key': 'value'}}
... and neither gives what a human probably intended, huh?
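One way to see where this kind of disagreement can come from: a minimal sketch that compares how an unquoted scalar resolves under the YAML 1.1 bool type versus the YAML 1.2 core schema. This is not a YAML parser and does not reproduce the lost snippet above; the `resolve` helper is made up for illustration, and real implementations often support only a subset of either list.

```python
import re

# Plain scalars that resolve to a boolean under the YAML 1.1 bool type.
YAML_1_1_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
# The YAML 1.2 core schema only recognizes these spellings.
YAML_1_2_BOOL = re.compile(r"^(?:true|True|TRUE|false|False|FALSE)$")

TRUTHY = {"y", "yes", "true", "on"}


def resolve(scalar: str, pattern: re.Pattern):
    """Return a bool if the schema treats the scalar as one, else the string itself."""
    if pattern.match(scalar):
        return scalar.lower() in TRUTHY
    return scalar


# Compare the two schemas on a few unquoted scalars.
for scalar in ["no", "NO", "off", "on", "false", "key"]:
    v11 = resolve(scalar, YAML_1_1_BOOL)
    v12 = resolve(scalar, YAML_1_2_BOOL)
    print(f"{scalar!r:8} 1.1 -> {v11!r:6} 1.2 -> {v12!r}")
```

The same unquoted key or value can be a boolean for one parser and a string for another, which is why quoting anything that looks like `no`, `on`, `off` or `yes` is the usual defensive habit.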