Streem – a new programming language from Matz

[+] c3RlcGhlbnI_|11 years ago|reply

I am glad that someone is working on a new stream processing language; it is a very interesting paradigm. However, I hope that they provide some very robust tools for controlling input splitting. I have spent too much time fighting with awk and wishing it were more flexible (it is frustrating always having exactly two levels of splitting, with only matchers on the first and only inverse splitting on the second).

As it stands, you would have to put in filters to resplit input into lines, and that is very messy for something that you will need or want to do very often.

For example if you wanted to parse by character it would be wonderful to be able to do the following:

  STDIN | /./{|c|
    # stuff
  }

Even better would be if you took it a step further and offered something like regex pattern matching for the block input. e.g.

  STDIN | /\w+/{|word|
    /house/ {
      # when word is house
    }
    /car/ {
      # when word is car
    }
    {
      # default case
    }
  }
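
For comparison, the same split-then-dispatch idea can be approximated in Python today. This is only a sketch: `dispatch_words` and the handler strings are made-up names for illustration, not anything Streem provides.

```python
import re


def dispatch_words(text):
    """Split text into words, then route each word through pattern cases."""
    # Ordered (pattern, handler) pairs; the first pattern that matches wins.
    cases = [
        (re.compile(r"house"), lambda w: f"house: {w}"),
        (re.compile(r"car"), lambda w: f"car: {w}"),
    ]
    results = []
    for word in re.findall(r"\w+", text):  # first-level split: by word
        for pattern, handler in cases:
            if pattern.search(word):
                results.append(handler(word))
                break
        else:
            # No pattern matched: the default case.
            results.append(f"default: {word}")
    return results


print(dispatch_words("my house, your car"))
```

The point of the proposed syntax is that the splitting regex and the case patterns would live in the pipeline itself instead of in a hand-written loop like this one.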

[+] mackwic|11 years ago|reply

Awesome! Let's call it sed!

Reference: https://www.gnu.org/software/sed/manual/sed.html#Examples

Sorry, I couldn't miss that one. :) Really, check the examples. Sed is a very powerful tool and can do astonishing work.

[+] bjourne|11 years ago|reply

This is how you would write it in Factor:

    "~/yourfile.txt" utf8 file-contents 
    "\\w+" findall [ first second ] map [ 
        { 
            { "house" [ "house stuff" print ] } 
            { "car" [ "car" print ] } 
            [ "default stuff: %u\n" printf ] 
        } case 
    ] each

Tacit programming basically is the same thing as stream programming.

[+] nateabele|11 years ago|reply

> However I hope that they provide some very robust tools for controlling input splitting

Yes, splitting and combining. Looking at the code example, I feel like so much opportunity is just being squandered, where something like Rust's mapping constructs would feel so much better. Yours is definitely more in that vein; here's something a little closer:

  STDIN | /\w+/{|word|
    /house/   => "foo"
    /car/     => "bar"
    "literal" => "baz"
    _         => "dib"
    # (where _ is a magical symbol for the default case... could be anything)
  }

[+] ori_b|11 years ago|reply

Sounds like a job for structural regex: http://doc.cat-v.org/bell_labs/structural_regexps/se.pdf

[+] klibertp|11 years ago|reply

Take a look at TXR: http://www.nongnu.org/txr/

The link was posted on HN some time ago and was generally well received. I didn't have the time to use it much yet, but it looks very nice. Well, unless you hate lisp.

[+] bglazer|11 years ago|reply

I'd like to recommend a python command line utility called pyp. It allows you to do python string manipulation on text streams using the standard pipe operator.

https://code.google.com/p/pyp/

[+] mamcx|11 years ago|reply

" new stream processing language"... which are the olds?

I'm playing with the idea of building a language where the functions are Unix-like, with STDIN, OUT and ERR. So instead of raising an exception, a function puts data in ERR, which makes them easy to compose.

[+] marianov|11 years ago|reply

AWK http://en.m.wikipedia.org/wiki/AWK

[+] Argorak|11 years ago|reply

If you ever meet Matz, talk to him about programming languages. While he gets some flak for all the problems of his original hobby project (Ruby), he obviously loves programming languages and gives things more thought than people give him credit for. I had the chance to talk to him while I was still a student and full of ideas about how the language could be made "better", and he shot them all down. For good reasons, as I know nowadays. So I always love seeing him building languages.

I shared a small story about him and languages quite a while ago, I guess it fits here as well: https://news.ycombinator.com/item?id=6562979

[+] danso|11 years ago|reply

Does he really get a "lot of flak" for Ruby? While I know not everyone loves Ruby, it seems crazy to me that people would denigrate Matz on a personal level...to me (admittedly, a novice in designing languages), Ruby always seemed well-thought out...that is, the trade-offs do not seem out of line given the philosophical benefits, and not everyone can make claim to turning a personal project into a worldwide language.

Also, he seems like a nice guy, not the type to be drawn into the kind of flareups in which he would draw flak.

[+] emmanueloga_|11 years ago|reply

I'd be careful with people shutting down my ideas up front, in casual conversation.

Sure, you may end up discarding your ideas yourself, but if you instead sought to implement them you could learn something along the way, and maybe turn them into something that would actually make sense. That would not happen if you just followed advice from some authority and simply decided not to experiment with your ideas any further. So I'd not discard advice from knowledgeable people, but I would not take it as the final word either.

[+] pluma|11 years ago|reply

*Yukihiro Matsumoto, creator of Ruby

I'm not sure everyone is familiar enough with Ruby to know who Matz is.

[+] chc|11 years ago|reply

If they don't know who Matz is, are they likely to know who Yukihiro Matsumoto is?

EDIT: Interesting fact: I believe this is now my most downvoted comment in five years on HN, at effectively -7. Never would have guessed. I'm not exactly sure what it says, but I thought it was an interesting data point.

[+] vessenes|11 years ago|reply

Why do most implementations of FizzBuzz special case % 15? I haven't ever really understood this. Maybe it's just my math-y background, but it always seemed to me you should just check mod 3 and mod 5 without an else between them, concatenating Fizz and Buzz.

Can anyone else comment on this? Most canonical FizzBuzz programs special case 15, and I don't get it.

[+] dragonwriter|11 years ago|reply

Well, it's more direct in that, while it increases the number of branches, it minimizes the number of statements executed on any branch. It's also the solution that maps most directly to the problem statement, and, absent a strong technical reason to do otherwise, a direct mapping from requirements to code is a good thing.

[+] masukomi|11 years ago|reply

Just because the output for 15 HAPPENS to be a concatenation of the output for 3 and the output for 5 today doesn't mean it will be tomorrow. If I said "say Fizz for multiples of 3, Buzz for multiples of 5, and 'this is a silly coding problem' for multiples of 3 and 5", then you'd have to rewrite your code.

Some of us know that clients ALWAYS change their minds and that specs are rarely equivalent to the end result, so we code against future changes that are trivial to account for in advance.

[+] donmcc|11 years ago|reply

It's to neatly handle the line break.

[+] rudolf0|11 years ago|reply

Easier to understand for beginners, I suppose.

For the record, I have utterly no math background (hardly passed algebra), but I also agree that checking only 3 and 5 is the better solution and is how I've always written FizzBuzz.

[+] FnuGk|11 years ago|reply

Another thing I don't understand is why people hardcode 15. I would rather write (3 * 5) and let the compiler figure out that 3 * 5 = 15. This way, I think, it more clearly states where the number 15 comes from. Any reason to write 15 over (3 * 5)?

[+] wenderen|11 years ago|reply

I do that too. It seems more in the spirit of the problem. For an alternate problem, in which you are asked to print strings a, b and c (!= a concat b) if the number is divisible by 3, 5 and 15 respectively, it makes sense to special case 15.

[+] rzwitserloot|11 years ago|reply

Because of the requirement to print the number if it is divisible by neither. Here:

    if x isDivisibleBy 3: print "Fizz"
    if x isDivisibleBy 5: print "Buzz"

And... how do I now print x in the neither case? I can't 'else'. I could write a long "if not divisible by either" expression, but that's less easy to read than an if/else chain that starts out with 'if divisible by both, print fizzbuzz'.

If fizzbuzz was: Print FizzBuzz for multiples of 15, fizz for multiples of 3, and buzz for multiples of 5, and nothing otherwise, I bet you'd see the above pseudocode far more.
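
One common way to keep the two independent checks and still cover the neither case is to build up a string and fall back to the number when it stays empty. A minimal Python sketch (the function name is just illustrative):

```python
def fizzbuzz(n):
    """Return the FizzBuzz output for n without special-casing 15."""
    out = ""
    if n % 3 == 0:
        out += "Fizz"
    if n % 5 == 0:
        out += "Buzz"
    # An empty string is falsy, so numbers divisible by neither
    # fall through to their decimal representation.
    return out or str(n)


for i in range(1, 16):
    print(fizzbuzz(i))
```

This relies on the output for multiples of both being exactly the concatenation, which is the assumption the other replies in this thread argue about.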

[+] unknown|11 years ago|reply

[deleted]

[+] apoorvai|11 years ago|reply

I'm not really a programming language expert, but it seems to me that having an implementation being the spec wouldn't be a good idea. If the Streem implementation has a bug, then the bug becomes the authoritative behavior. Any platform specific quirks would also make it difficult to have defined behavior.

[+] ianlevesque|11 years ago|reply

Yep, welcome to Ruby!

To be fair, a spec with tests was reverse-engineered from the Ruby implementation[1], so things have improved a bit.

1. http://rubyspec.org/

[+] aidenn0|11 years ago|reply

Implementations are nearly always the spec when a language is young. You want to be able to experiment and make changes. Then as it matures, you typically get a spec.

[+] skrebbel|11 years ago|reply

Sure, but this is not finished at all. Starting with the spec instead of a prototype implementation sounds like a very very limiting and unrewarding design process.

[+] jfoutz|11 years ago|reply

There's an old argument about worse is better. The gist is, doing things the right way is hard and takes a long time. Sometimes it's better to just get something simple out there and deal with the problems later.

http://www.jwz.org/doc/worse-is-better.html

[+] Igglyboo|11 years ago|reply

It looks like a weekend's worth of work for Matz; I doubt he's even thinking of a spec at this point.

[+] dragonwriter|11 years ago|reply

Right now, the closest thing to a spec is the sample FizzBuzz code (as an implicit spec that "this code will solve FizzBuzz"); there is no implementation (just work-in-progress parser/lexer code.)

So, while I'll agree that there are issues that come from the implementation being the spec of a language in general, I would say we are well earlier than the point at which we can identify that as a problem with Streem.

[+] xrstf|11 years ago|reply

Welcome to PHP. The Zend Engine 2 is basically the spec, even though Facebook has recently (couple of weeks ago) started writing a spec to make sure their HHVM is compatible.

[+] weeksie|11 years ago|reply

At the very least, it's going to be amazing to watch a master language designer build a new language from the ground up.

That said, I'm incredibly optimistic about a new Matz language. If I was going to guess, the syntax will be much lighter and the semantics will make VM optimization much easier than in Ruby.

[+] samuell|11 years ago|reply

Would be cool if it incorporated some thinking and concepts from Flow-based programming [1], as that is AFAIK the most comprehensive architecture covering all the aspects of asynchronous concurrent processing that one might run into (multiple in/out-ports, channels with bounded buffers, sub-stream support, etc.).

[1] http://www.jpaulmorrison.com/fbp/

[2] http://en.wikipedia.org/wiki/Flow-based_programming

[+] skrebbel|11 years ago|reply

Reminds me a lot of Elixir's |> operator, which does the exact same thing. Nice! Curious how it'll turn out to compare with Elixir in other areas.

[+] lgleason|11 years ago|reply

I love Ruby and I love Matz. With that being said there are some things that Ruby struggles with. I know that there have been some conversations among the core on bringing in more functional concepts to Ruby....at least since April. To me this says that Matz is coming to the conclusion that we may need a new language to get functional right.

While I am sad to see that Ruby may be superseded by a new language, I'm really happy to see Matz leading the way with one of the solutions. In the Ruby community we have an expression: "Matz is nice and therefore we are nice". That has set the tone for the community in ways that have never been the same in some of the others.

As someone who has had the opportunity to talk with Matz on multiple occasions and work with the Ruby community it would be great to see this as a natural evolution of Ruby and the people who love it... As I have started to move on to working with more functional languages etc. I have started to move away from doing Ruby, but if the community can continue on and evolve with a new language that would be awesome!

[+] dwash|11 years ago|reply

[+] NARKOZ|11 years ago|reply

>the software is related to the magazine article of 2015 issue of the (Japanese) programming magazine

https://github.com/matz/streem/commit/1c8189f9e1df3289801b28...

[+] michaelmior|11 years ago|reply

Why? I think it's pretty normal to drop this in as boilerplate on new projects.

[+] mostafah|11 years ago|reply

And... here’s an implementation: https://github.com/mattn/streeem

Well, that was quick.

[+] rurounijones|11 years ago|reply

Heh, got to love the repo message "Sorry, Sorry"

[+] luckydude|11 years ago|reply

We did a similar thing to the awk source. Made awk scripts first class, you could pipe them to each other.

[+] mijoharas|11 years ago|reply

Can someone help explain what's going on here: \"([^\\\"]|\\.)*\" [seen here in context](https://github.com/matz/streem/blob/master/src/lex.l#L49).

It seems to be matching string literals ("strings", etc.), which would explain the literal double quotes on either side. Without those we get ([^\\\"]|\\.)*, so zero or more repetitions of [^\\\"]|\\.

What I don't understand is why the explicit "or \\." alternative is there, as it seems unnecessary. Am I missing something? Also, why does it seem that strings cannot contain either a literal \ or a literal "?
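
A quick way to see what the two alternatives do is to try the same pattern in Python's `re` module. This sketch assumes the group is repeated with `*` (the "zero or more" reading above); `STRING_LITERAL` is just an illustrative name, and lex and Python regex syntax differ slightly, so treat it as an approximation of the lexer rule.

```python
import re

# "   opening quote
# (   [^\\"]  any single character that is not a backslash or a quote
#   | \\.     or a backslash followed by any one character (an escape)
# )*  repeated zero or more times
# "   closing quote
STRING_LITERAL = re.compile(r'"([^\\"]|\\.)*"')

samples = [
    r'"plain"',         # no escapes
    r'"say \"hi\""',    # escaped quotes inside the string
    r'"back\\slash"',   # escaped backslash
    r'"dangling\"',     # the final quote is escaped, so the string never closes
]

for s in samples:
    print(s, bool(STRING_LITERAL.fullmatch(s)))
```

Read this way, the first alternative rules out bare backslashes and quotes, and the `\\.` alternative is what lets them back in as escape pairs, so a string can contain both, just not unescaped.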

[+] programminggeek|11 years ago|reply

I like dataflow and stream processing ideas. I would love it if you could make the connector pieces smarter so that you were enforcing a contract between the piping mechanisms. I believe you could build some very interesting systems with that approach.

[+] chmartin|11 years ago|reply

Awesome, I hacked something like this together a couple of years ago using Ruby and GNU Parallel:

https://github.com/charlesmartin14/gnu_parallel

It is badly needed.

[+] hdmoore|11 years ago|reply

This is an excellent example of using a parser as a language. Whether it has any legs depends on whether it beats existing tools on some front (sed, perl, ruby, etc). The concurrent angle is interesting, but I have found a multi-process approach to stream data to be more efficient than most concurrent single-process implementations. For example, with DAP (https://github.com/rapid7/dap), we found that GNU Parallel + Ruby MRI was more effective than a concurrent language such as Go.

[+] panic|11 years ago|reply

Tab (https://bitbucket.org/tkatchev/tab) is another interesting recent text processing language.

[+] zvrba|11 years ago|reply

Take a look at dss from AT&T AST tools:

http://www2.research.att.com/~astopen/dss/dss.html

http://www2.research.att.com/~astopen/publications/dss-2004....

[+] bnegreve|11 years ago|reply

It's definitely a good idea. Pipes are both very powerful and very simple to use and debug, yet they are not very common in general purpose programming languages (examples?). I'm not surprised that someone is trying to build a language around them. I'll follow that, but for now it's a bit too early to judge.

[+] eggie|11 years ago|reply

See node.js streams.

[+] golemotron|11 years ago|reply

Great idea, but I'm a bit disappointed in the syntax. Composed chains in Ruby are much nicer to look at.

189 comments