I implemented something similar to the compositional regular expressions feature described here for JavaScript a while ago (independently, so semantics may not be the same), and it is one of the libraries I find myself most often bringing into other projects years later. It gets you a tiny bit closer to feeling like you have a first-class parser in the language. Here is an example of implementing media type parsing with regexes using it: https://runkit.com/tolmasky/media-type-parsing-with-template...
To be clear, programming languages should just have actual parsers and you shouldn't use regular expressions for parsers. But if you ARE going to use a regular expression, man is it nice to break it up into smaller pieces.
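For readers who have not seen the approach, here is a rough Python sketch of the same idea: build the pattern from small named fragments and interpolate them. It is not the API of the JavaScript library mentioned above; the names `TOKEN`, `QUOTED`, `PARAMETER`, and `parse_media_type` are made up for illustration, and the pattern is a simplified approximation of media type syntax.

```python
import re

# Small named fragments; each one is an ordinary regex source string.
TOKEN = r"[A-Za-z0-9!#$%&'*+.^_`|~-]+"
QUOTED = r'"(?:[^"\\]|\\.)*"'
PARAMETER = rf"\s*;\s*{TOKEN}=(?:{TOKEN}|{QUOTED})"

# Compose the fragments into the full pattern (hypothetical, simplified).
MEDIA_TYPE = re.compile(
    rf"^(?P<type>{TOKEN})/(?P<subtype>{TOKEN})(?P<params>(?:{PARAMETER})*)$"
)


def parse_media_type(text):
    """Return (type, subtype, raw parameter string), or None if no match."""
    match = MEDIA_TYPE.match(text)
    if match is None:
        return None
    return match.group("type"), match.group("subtype"), match.group("params")
```

Each fragment can be read and tested on its own, which is most of the benefit.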
"Actual parsers" aren't powerful enough to be used to parse Raku.
Raku regular expressions combined with grammars are far more powerful and, if written well, easier to understand than any "actual parser". To parse Raku with an "actual parser", the parser would have to let you add and remove things from it while it is parsing. Raku's "parser" does this by subclassing the current grammar, adding or removing rules in the subclass, and then reverting to the previous grammar at the end of the current lexical scope.
In Raku, a regular expression is another syntax for writing code. It just has a slightly different default syntax and behavior. It can have both parameters and variables. If the regular expression syntax isn't a good fit for what you are trying to do, you can embed regular Raku syntax to do whatever you need to do and return right back to regular expression syntax.
It also has a much better syntax for doing advanced things, as it was completely redesigned from first principles.
The following is an example of how to match at least one `A` followed by exactly that number of `B`s and exactly that number of `C`s.
(Note that bare square brackets [] are for grouping, not for character classes.)
    my $string = 'AAABBBCCC';
    say $string ~~ /
      ^
      # match at least one A
      # store the result in a named sub-entry
      $<A> = [ A+ ]
      {} # update result object
      # create a lexical var named $repetition
      :my $repetition = $<A>.chars(); # <- embedded Raku syntax
      # match B and then C exactly $repetition times
      $<B> = [ B ** {$repetition} ]
      $<C> = [ C ** {$repetition} ]
      $
    /;
Result:
    「AAABBBCCC」
     A => 「AAA」
     B => 「BBB」
     C => 「CCC」
The result is actually a very extensive object that has many ways to interrogate it. What you see above is just a built-in human readable view of it.
In most regular expression syntaxes, to match equal numbers of `A`s and `B`s you would need to recurse between the `A`s and the `B`s. That, of course, wouldn't let you do the same for `C`, and it wouldn't be anywhere near as easy to follow as the above. The above should also run fairly fast, because it never has to backtrack or recurse.
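In a language without embedded code in regexes, a rough equivalent is to do it in two steps: match the `A`s, then build the rest of the pattern from their count. A Python sketch follows; the function name `match_abc` is hypothetical, and this is an analogue, not a translation of how Raku executes the regex.

```python
import re


def match_abc(text):
    """Match one or more A, then the same number of B and of C."""
    head = re.match(r"A+", text)
    if head is None:
        return None
    n = len(head.group())
    # Build the full pattern from the number of As that were found.
    full = re.fullmatch(
        rf"(?P<A>A{{{n}}})(?P<B>B{{{n}}})(?P<C>C{{{n}}})", text
    )
    return full.groupdict() if full else None
```

Calling `match_abc('AAABBBCCC')` gives a dict with the three named groups, while an unbalanced string such as `'AABBBCC'` gives `None`.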
When you combine them into a grammar, you will get a full parse-tree. (Actually you can do that without a grammar, it is just easier with one.)
Frankly, from my perspective, much of the design of "actual parsers" is a byproduct of limited RAM on early computers. The reason for a separate tokenization stage was to reduce the amount of RAM used for the source code, so that further stages had enough RAM for semantic analysis and the eventual compiling of the code. It doesn't really do that much to simplify any of the further stages, in my view.
The JSON::Tiny module from above creates the native Raku data structure using an actions class, as the grammar is parsing. Meaning it is parsing and compiling as it goes.
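A small Python sketch of the parse-and-build-as-you-go idea, with no separate tokenization pass. It handles only integers and nested bracketed lists, an assumption made to keep it short; it is not how JSON::Tiny is implemented, and all function names are illustrative.

```python
import re

NUMBER = re.compile(r"-?\d+")
SPACE = re.compile(r"\s*")


def parse_value(text, pos=0):
    """Parse an integer or a bracketed list at pos; return (value, new_pos)."""
    pos = SPACE.match(text, pos).end()
    if pos < len(text) and text[pos] == "[":
        return parse_list(text, pos + 1)
    number = NUMBER.match(text, pos)
    if number is None:
        raise ValueError(f"expected a value at offset {pos}")
    # "Action": turn the matched text into a native int right away.
    return int(number.group()), number.end()


def parse_list(text, pos):
    """Parse list items after an opening '['; return (list, new_pos)."""
    items = []
    pos = SPACE.match(text, pos).end()
    if pos < len(text) and text[pos] == "]":
        return items, pos + 1
    while True:
        value, pos = parse_value(text, pos)
        items.append(value)  # the data structure grows while parsing
        pos = SPACE.match(text, pos).end()
        if pos < len(text) and text[pos] == ",":
            pos += 1
        elif pos < len(text) and text[pos] == "]":
            return items, pos + 1
        else:
            raise ValueError(f"expected ',' or ']' at offset {pos}")


def parse(text):
    """Parse a whole string and reject trailing input."""
    value, pos = parse_value(text)
    if SPACE.match(text, pos).end() != len(text):
        raise ValueError(f"unexpected trailing input at offset {pos}")
    return value
```

By the time `parse` returns, the native list already exists; there is no intermediate token stream or parse tree to walk afterwards.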
I use Raku in production. It's the best language for dealing with text because building parsers is so, so damn nice. I'm shocked this isn't the top language for creating an LLM text pipeline.
Very late to the thread, but I was wondering if you knew of a good example of Raku calling an API over https, polling the API until it returns a specific value?
That's a fair reaction to the post if you haven't looked at any normal Raku code.
If you look at any of the introductory Raku books, it seems a LOT like Python with a C-like syntax. By that I mean the syntax is more curly-brace oriented, but the ease of use, built-in data structures, and OO features are all very high-level stuff. I think if you know any other high-level scripting language, you would find Raku pretty easy to read for comparable scripts. I find it pretty unlikely that the majority of people would use the really unusual stuff in normal everyday code. Raku is more flexible (more than one way to do things), but it isn't arcane-looking for the normal stuff I've seen. I hope that helps.
Same as Perl, nobody wants to maintain it, but it's extremely fun to write. It has a lot of expression.
You can see that in Raku's ability to define keyword arguments with a shorthand (e.g. `:global(:$g)`), as well as assuming a value of `True`, so you can just call `match(/foo/, :g)` to get a global regex match. Perl has tons of this stuff too, all aimed at making the language quicker and more fun to write, but less readable for beginners.
It's strange that people are saying the same about maintaining code bases written using AI assistance.
I'm guessing it's going to be a generational thing now. A whole older generation of programmers will just find themselves out of place in what is a normal work setup for the current generation.
Some of these are halfway familiar. Hyper sounds like a more ad-hoc version of something from recursion-schemes, and * as presented is somewhat similar to Scala _ (which I love for lambdas and think every language should adopt something similar).
Speed is still a major issue with Raku. Parsing a log file with a regex is Perl's forte but the latest Raku still takes 6.5 times as long as Python 3.13 excluding startup time.
You'd need to qualify that with an example. In my experience some things are faster in Raku and some are slower, so declaring that "Raku takes 6.5 times as long as Python 3.13" is pretty meaningless without seeing what it's slower at.
The most important Raku features are Command Line Interface (CLI) and grammars.
CLI support is a _usual_ feature -- see "docopt" implementations (and adoption), for example. But CLI is built-in in Raku and nice to use.
As for the grammars -- it is _unusual_ for a programming language to have grammars as "first class citizens" and to give the ability to create (compose) grammars using Object-Oriented Programming.
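To give a feel for composing grammars with OOP, here is a loose Python analogue in which rules live on a class and a subclass overrides one of them. The class and rule names are hypothetical, and a dict of regexes is a far weaker thing than a Raku grammar.

```python
import re


class IntegerGrammar:
    """Rules are class attributes, so a subclass can replace or add them."""

    rules = {"number": r"\d+"}

    @classmethod
    def parse(cls, rule, text):
        """Return the matched text if the named rule matches all of text."""
        match = re.fullmatch(cls.rules[rule], text)
        return match.group() if match else None


class HexIntegerGrammar(IntegerGrammar):
    # Extend the inherited rule without touching the parent grammar.
    rules = {**IntegerGrammar.rules, "number": r"0x[0-9A-Fa-f]+|\d+"}
```

The parent grammar keeps rejecting `0xFF` while the subclass accepts it, which is the same shape as swapping in a derived grammar for a limited scope.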
I’ve followed this project for years, and while it’s interesting, I think it’s really a shame that Perl 6 seemed to have been so badly waylaid by this sojourn into the looking-glass.
Junctions introduce a degree of non-determinism to the language. Think Prolog variables. Junctions allow you to talk about a set of solutions without having to mind how they are kept together or how the operations are distributed between members of the Junction. It's especially convenient when you search for something and that something can be a complicated series of logical expressions: you can pack them all in a single Junction and treat as a first-class object. It's a little hard to explain without giving examples, but it really has a lot of uses :)
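Since examples help here, a very reduced Python sketch of the idea: an object that stands for several values at once and distributes comparisons over them. The classes `AnyJunction` and `AllJunction` are hypothetical and cover only three comparison operators, whereas Raku Junctions distribute over operations in general.

```python
class AnyJunction:
    """Several values at once; a comparison holds if it holds for any."""

    def __init__(self, *values):
        self.values = values

    def __eq__(self, other):
        return any(value == other for value in self.values)

    def __lt__(self, other):
        return any(value < other for value in self.values)

    def __gt__(self, other):
        return any(value > other for value in self.values)


class AllJunction:
    """Several values at once; a comparison holds only if it holds for all."""

    def __init__(self, *values):
        self.values = values

    def __eq__(self, other):
        return all(value == other for value in self.values)

    def __lt__(self, other):
        return all(value < other for value in self.values)

    def __gt__(self, other):
        return all(value > other for value in self.values)
```

With these, `AnyJunction(1, 2, 3) == 2` is true and `AllJunction(2, 4, 6) > 3` is false, and the caller never writes the loop over the members.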
PowerShell does something similar with their pipelines, see e.g. the answer [0] and the question it answers. Something similar happens in Bash: $x refers not to the string $x, but to the list of the strings that you get by splitting the original string by IFS.
And yes, this feature is annoying and arguably is a mis-feature: containers shall not explode when you touch them.
I dream of a day where one can post a Raku article on HNN and not encounter a comments section full of digressions into discussing Perl.
There is some sense to it by means of comparison, but the constant conflation of the two becomes tiresome.
But in that spirit, let's compare:
The =()= "operator" is really a combination of Perl syntax[^1] that achieves the goal of converting list context to scalar context. This isn't necessary to determine the length of an array (`my $elems = @array` or, in favor of being more explicit, `my $elems = 0+@array`). It is, however, useful in Perl for counting more complex list contexts on the RHS.
Let's use some examples from its documentation to compare to Raku.
Perl:
    my $n =()= "abababab" =~ /a/g;
    # $n == 4
Raku:
    my $n = +("abababab" ~~ m:g/'a'/);
    # $n == 4
    # Alternatively...
    my $n = ("abababab" ~~ m:g/'a'/).elems;
That's it. `+` / `.elems` are literally all you ever need to know for gathering a count of elements. The quotes around 'a' in the regex are optional but I always use them because I appreciate denoting which characters are literal in regexes (Note also that the regex uses the pair syntax mentioned in OP via `m:g`. Additional flags are provided as pairs, eg `m:g:i`).
Another example.
Perl:
    my $count =()= split /:/, "ab:ab:ab";
    # $count == 3
Raku:
    my $count = +"ab:ab:ab".split(':');
    # $count == 3
While precedence can at times be a conceptual hindrance, it's also nice to save some parentheses where it is possible and legible to do so. Opinions differ on these points, of course. Note also that `Str.split` can take string literals as well as regexes.
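For comparison with a third language, the same two counts in Python, where the length of a list is always taken with `len`. This is only an analogue of the snippets above, and the variable names simply mirror them.

```python
import re

# Count regex matches, like the Perl and Raku global-match examples.
n = len(re.findall(r"a", "abababab"))

# Count the pieces produced by splitting on a literal separator.
count = len("ab:ab:ab".split(":"))
```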
> It is absolutely mindbending to me that all of this language development has happened on top of Perl, of all things.
Why "of all things?" The Perl philosophy of TIMTOWTDI, and Wall's interest in human-language constructs and the ways that they could influence programming-language constructs, seem to make its successor an obvious home for experiments like this.
I'm sure that there have been more changes from Perl 5.8 to Perl 5.40 than there are between Python 2.0 and Python 3.x (whatever version it is up to at the moment).
What's more, every change from Python 2 to Python 3 that I've heard of resembles a change that Perl 5 has had to make over the years. Only Perl did it without breaking everything (and thus didn't need a major version bump).
    # Does weird things with nested lists too
    > [1, [2, 3], 4, 5] <<+>> [10, 20]
    [11 [22 23] 14 25]
This article makes me feel like I'm watching a Nat Geo/Animal Planet documentary. Beautiful and interesting to see these creatures in the wild? Absolutely. Do I want to keep my distance? As far away as possible.
I agree. This attempt to fuse higher-order functional programming with magic special behaviors from Perl comes off to me as quixotic. HOP works because you're gluing together extremely simple primitives—ordinary pure functions. You can build big things fearlessly because you perfectly understand the simple bricks they're made of. But here: magic functions—which behave differently on lists-of-scalars vs. lists-of-lists, by special default logic—that's not a good match for HOP. Now you have two major axes of complexity: a vertical one of functional abstraction, and a horizontal one of your "different kinds of function and function application".
You're making a mistake if you're thinking like that. Applying an operation that generally works on single values over a list of values automatically is an incredibly powerful technique. If you have ever used Numpy, you will appreciate not needing it in many cases where Raku's built-ins suffice.
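A rough Python sketch of that idea, assuming the behavior shown in the `<<+>>` example above: the shorter list is cycled and nested lists are descended into. The helper name `hyper` is made up, and the handling of empty lists is an assumption rather than documented Raku behavior.

```python
from operator import add


def hyper(op, left, right):
    """Apply a scalar operation element-wise over possibly nested lists."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return op(left, right)
    if left_is_list and not right_is_list:
        return [hyper(op, item, right) for item in left]
    if right_is_list and not left_is_list:
        return [hyper(op, left, item) for item in right]
    if not left or not right:
        return []  # assumption: an empty side yields an empty result
    # Both sides are lists: cycle the shorter one to the longer length.
    size = max(len(left), len(right))
    return [
        hyper(op, left[i % len(left)], right[i % len(right)])
        for i in range(size)
    ]
```

`hyper(add, [1, [2, 3], 4, 5], [10, 20])` reproduces the nested result from the thread, which makes the "weird" output easier to predict: each position is paired up first, and scalars are then spread into any sublist.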
tolmasky|1 year ago
"templated-regular-expression" on npm, GitHub: https://github.com/tolmasky/templated-regular-expression
b2gills|1 year ago
To see an actual parser I often recommend people look at JSON::TINY::Grammar https://github.com/moritz/json/blob/master/lib/JSON/Tiny/Gra...
mempko|1 year ago
bloopernova|1 year ago
antononcube|1 year ago
christophilus|1 year ago
agumonkey|1 year ago
7thaccount|1 year ago
rjh29|1 year ago
kamaal|1 year ago
zokier|1 year ago
lmm|1 year ago
klibertp|1 year ago
emmelaich|1 year ago
Should it be `returns (32, 54)` ? i.e. 4+50 for the 2nd term.
Maybe this is a consequence (head translation) of some countries saying e.g. vierenvijftig (four and fifty) instead of the English fifty-four.
agumonkey|1 year ago
jimberlage|1 year ago
riffraff|1 year ago
cutler|1 year ago
donaldihunter|1 year ago
jddj|1 year ago
quink|1 year ago
emmelaich|1 year ago
I guess (apart from the Whatever), the laziness is new since Perl6/Raku.
agumonkey|1 year ago
antononcube|1 year ago
binary132|1 year ago
childintime|1 year ago
quink|1 year ago
sbierwagen|1 year ago
yen223|1 year ago
trescenzi|1 year ago
agnishom|1 year ago
klibertp|1 year ago
kqr|1 year ago
Sure, we can do the same thing with the goto... but why would we want to use the more difficult/annoying alternative when the convenient one exists?
Joker_vD|1 year ago
[0] https://stackoverflow.com/a/56977142
postepowanieadm|1 year ago
ab5tract|1 year ago
[1]: See https://github.com/book/perlsecret/blob/master/lib/perlsecre...
Uptrenda|1 year ago
broodbucket|1 year ago
JadeNB|1 year ago
dragonwriter|1 year ago
ab5tract|1 year ago
Edit: Other than a configuration framework and a test harness.
James_K|1 year ago
b2gills|1 year ago
tail_exchange|1 year ago
perihelions|1 year ago
klibertp|1 year ago
fuzztester|1 year ago
https://en.m.wikipedia.org/wiki/National_Geographic
https://www.nationalgeographic.com/magazine
BoingBoomTschak|1 year ago
KelvinFineBoy69|1 year ago
[deleted]
the_arun|1 year ago
the_arun|1 year ago