z_open | 6 months ago

> Raw or multiline strings are spelled like this:

    const still_raw =
        \\const raw =
        \\    \\Roses are red
        \\    \\  Violets are blue,
        \\    \\Sugar is sweet
        \\    \\  And so are you.
        \\    \\
        \\;
        \\
    ;

This syntax seems fairly insane to me.

IshKebab|6 months ago

Maybe if you've never tried formatting a traditional multiline string (e.g. in Python, C++ or Rust) before.

If it isn't obvious, the problem is that you can't indent them properly because the indentation becomes part of the string itself.

Some languages have magical "remove the indent" modes for strings (e.g. YAML) but they generally suck and just add confusion. This syntax is quite clear (at least with respect to indentation; not sure about the trailing newline - where does the string end exactly?).
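
To make the indentation problem concrete, here is a minimal Python sketch. The function names are made up for illustration. A triple-quoted literal indented to match the surrounding code carries that indentation into the value, and `textwrap.dedent` is one example of the after-the-fact "remove the indent" fix-ups mentioned above.

```python
import textwrap


def indented_literal():
    # The literal is indented to line up with the code, so each
    # content line carries its leading spaces into the string value.
    text = """
        one
          two
        """
    return text


def dedented_literal():
    # textwrap.dedent strips the leading whitespace shared by all lines.
    return textwrap.dedent(indented_literal())
```

The dedented version still starts with a newline, because the text begins on the line after the opening quotes. That is the kind of edge case that makes these modes confusing.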

qzzi|6 months ago

C and Python automatically concatenate string literals, and Rust has the concat! macro. There's no problem just writing it in a way that works correctly with any indentation. No need for weird-strings.

  " one\n"
  "  two\n"
  "   three\n"
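
The same idea as a runnable Python sketch, with a hypothetical function name. Adjacent string literals inside parentheses are joined at compile time, so the indentation of the source never ends up in the value.

```python
def joined():
    # Adjacent literals are concatenated by the compiler; only the
    # characters inside the quotes become part of the string.
    return (
        " one\n"
        "  two\n"
        "   three\n"
    )
```

The trade-off is that every line needs its own quotes and an explicit `\n`, and any quote or backslash inside the text has to be escaped.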

konart|6 months ago

I may be missing something, but Go has a simple:

    `A
       simple
          formatted
             string
    `

?

norir|6 months ago

Significant whitespace is not difficult to add to a language and, for me, is vastly superior to what zig does both for strings and the unnecessary semicolon that zig imposes by _not_ using significant whitespace.

I would so much rather read and write:

    let x = """
      a
      multiline string
      example
    """

than

    let x =
      //a
      //multiline string
      //example
    ;

In this particular example, zig doesn't look that bad, but for longer strings, I find that adding the // prefix is onerous and makes moving strings between contexts needlessly painful. Yes, I can automatically add them with vim commands, but I would just rather not have them at all. The trailing """ is also unnecessary in this case, but it is nice to have clear bookends. Zig, by contrast, lacks an opening bracket but requires a closing one, and the bracket it uses, `;`, is ambiguous in the language. If all I can see is the last line, I cannot tell that a string precedes it, whereas in my example, you can.

Here is a simple way to implement the former case: require tabs for indentation. Parse with recursive descent where the signature is

    (source: string, index: number, indent: number, env: comp_env) => ast

Multiline string parsing becomes a matter of bumping the indent parameter. Whenever the parser encounters a newline character, it checks the indentation and either skips it or, if it is less than the current indentation, requires a closing """ on the next line at an indentation reduced by one level.

This can be implemented in under 200 lines of pure lua with no standard library functions except string.byte and string.sub.
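
A rough Python sketch of that indentation check, not the Lua implementation described above. The function name `parse_multiline` is hypothetical, the `env` parameter is dropped, and the sketch assumes the closing `"""` sits alone on its line.

```python
def parse_multiline(source, index, indent):
    # Parse a tab-indented block that opens with three double quotes
    # at source[index]. Returns (value, index just past the closing
    # delimiter). Simplified sketch, not a full parser.
    quote = '"""'
    if not source.startswith(quote + "\n", index):
        raise SyntaxError("expected opening delimiter followed by newline")
    index += len(quote) + 1
    body_prefix = "\t" * (indent + 1)   # content is one level deeper
    close_line = "\t" * indent + quote  # closer is back at the outer level
    lines = []
    while True:
        end = source.find("\n", index)
        if end == -1:
            end = len(source)
        line = source[index:end]
        if line == close_line:
            return "\n".join(lines), end
        if line.startswith(body_prefix):
            # Skip the expected indentation; keep the rest verbatim.
            lines.append(line[len(body_prefix):])
        elif line == "":
            lines.append("")
        else:
            raise SyntaxError("expected closing delimiter")
        if end == len(source):
            raise SyntaxError("unterminated multiline string")
        index = end + 1
```

Called with `indent` set to 0 on a source such as `let x = """`, a tab-indented body, and a closing `"""` in column zero, it returns the body with one tab removed from each line. Extra spaces after the tab stay in the value.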

It is common to hear complaints about languages that have syntactically significant whitespace. I think a lot of the complaints are fair when the language does not have strict formatting rules: python and scala come to mind as examples that do badly with this. With scala, practically everyone ends up using scalafmt which slows down their build considerably because the language is way too permissive in what it allows. Yaml is another great example of significant whitespace done poorly because it is too permissive. When done strictly, I find that a language with significant whitespace will always be more compact and thus, in my opinion, more readable than one that does not use it.

I would never use zig directly because I do not like its syntax even if many people do. If I was mandated to use it, I would spend an afternoon writing a transpiler that would probably be 2-10x faster than the zig compiler for the same program, so the overhead of avoiding the decisions of theirs that I disagree with is negligible.

Of course from this perspective, zig offers me no value. There is nothing I can do with zig that I can't do with c so I'd prefer it as a target language. Most code does not need to be optimized, but for the small amount that does, transpiling to c gives me access to almost everything I need in llvm. If there is something I can't get from c out of llvm (which seems highly unlikely), I can transpile to llvm instead.

z_open|6 months ago

Even if we ignore solutions other languages have come up with, it's even worse that they landed on // for the syntax given that it's apparently used the same way for real comments.

n42|6 months ago

Zig does not really try to appeal to window shoppers. this is one of those controversial decisions that, once you become comfortable with the language by using it, you learn to appreciate.

spoken as someone who found the syntax offensive when I first learned it.

ivanjermakov|6 months ago

It is not an insane syntax, but quite an insane problem to solve.

Usually, representing multiline strings within another multiline string requires lots of non-trivial escaping. This is what this example is about: no escaping and no indent nursery needed in Zig.
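
A small Python sketch of why nesting needs no escaping, assuming the rule that each line of the literal starts with the two-backslash marker and runs to the end of the line. The helper name `to_zig_multiline` is made up for illustration.

```python
MARKER = "\\\\"  # two backslash characters


def to_zig_multiline(text, indent="    "):
    # Each line of the value gets the indent and the marker in front;
    # nothing inside the text itself is escaped or altered.
    return "\n".join(indent + MARKER + line for line in text.split("\n"))
```

Wrapping the output a second time simply stacks another marker in front of each line, which is the shape of the `still_raw` example at the top of the thread.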

whitehexagon|6 months ago

I think Kotlin solves it quite nicely with the trimIndent. I seem to recall Golang was my fav, and Java my least, although I think Java also finally added support for a clean text block.

Makes cut 'n' paste embedded shader code, assembly, javascript so much easier to add, and more readable imo. For something like regular expressions I really liked Golang's back tick 'raw string' syntax.

In Zig I find myself doing an @embedFile to avoid the '\\' pollution.

rybosome|6 months ago

Visually I dislike the \\, but I see this solves the problem of multiline literals and indentation in a handy, unambiguous way. I’m not actually aware of any other language which solves this problem without a function.

throw10920|6 months ago

It seems very reasonable and comes with several technical and cognitive advantages. I think you're just having a knee-jerk emotional reaction because it's different than what you're used to, not because it's actually bad.

conaclos|6 months ago

I like the idea of repeating the delimiter on every line. However `//` looks like a comment to me. I could simply choose double quote:

    const still_raw =
        "const raw =
        "    "Roses are red
        "    "  Violets are blue,
        "    "Sugar is sweet
        "    "  And so are you.
        "    "
        ";
        "
    ;

This cannot be confused with a string literal because a string literal cannot contain line feeds.

flexagoon|6 months ago

What if you have something like

    const raw =
        "He said "Hello"
        "to me
    ;

Wouldn't that be a mess to parse? How would you know that "He said " is not a string literal and that you have to continue parsing it as a multiline string? How would you distinguish an unclosed string literal from a multiline string?

hardwaregeek|6 months ago

My immediate thought was hmm, that's weird but pretty nice. The indentation problem indeed sucks and with a halfway decent syntax highlighter you can probably de-emphasize the `//` and make it less visually cluttered.

conorbergin|6 months ago

I think everyone has this reaction until they start using it, then it makes perfect sense, especially when using editors that have multiple cursors and can operate on selections.

seabombs|6 months ago

I think the syntax highlighting for this could make it more readable. Make the leading `\\` a different color to the string content.

steveklabnik|6 months ago

I had the exact opposite reaction.

zem|6 months ago

that was my favourite bit in the entire post - the one place where zig has unambiguously one-upped other languages. the problems it is solving are:

1. from the user's point of view, you can now have multiline string literals that are properly indented based on their surrounding source code, without the leading spaces being treated as part of the string

2. from an implementation point of view having them parsed as individual lines is very elegant, it makes newline characters in the code unambiguous and context independent. they always break up tokens in the code, regardless of whether they are in a string literal or not.
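
Point 2 can be sketched in a few lines of Python. This is an illustration with a hypothetical function name, not how the Zig tokenizer is written. It assumes the input starts at the first marker line and that a following marker line contributes a newline plus its own content.

```python
def decode_zig_multiline(source):
    # Recover the string value from consecutive marker-prefixed lines.
    marker = "\\\\"  # two backslash characters
    parts = []
    for raw_line in source.split("\n"):
        line = raw_line.lstrip(" ")  # indentation before the marker is ignored
        if not line.startswith(marker):
            break  # the first non-marker line ends the literal
        parts.append(line[len(marker):])
    # Lines are joined with newlines; the last line adds none of its own.
    return "\n".join(parts)
```

Each line is handled on its own, with no state carried across newlines. Under these assumptions a trailing newline in the value shows up as a final marker line with nothing after it.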

klas_segeljakt|6 months ago

When I first read it I thought it was line comments.

watersb|6 months ago

Upvoting because similar comments here suggest that you are not alone.

People are having trouble distinguishing between '//' and '\\'.

fcoury|6 months ago

I really like zig but that syntax is indeed insane.