top | item 34525817

(no title)

txbuck | 3 years ago

The fact that kebab-case support is a rarity constantly boggles my mind, never mind that it isn't the de facto default for any language created after *sh/lisp. Readability, ease of typing, parallels with the way hyphens are used in (Romance) natural languages. If I were writing a new language I intended to popularize, this would be one of the features I would emphasize.

Spicy semi-snarky aside: if your counterpoint is that kebab-case prevents crushing your arithmetic operators together, I strongly suggest you either reconsider or never write any code you think may be read by another human being (and possibly yourself).

kibwen|3 years ago

Make your language syntax require whitespace around arithmetic operators, and then we can finally live in a glorious paradise of kebab-case and `foo/bar` for namespaces.

txbuck|3 years ago

For reference, something like this (Python regex):

  ^[a-zA-Z_]+[a-zA-Z0-9_]*(-?[a-zA-Z0-9_]+)*$

  some-name
  some2-name
  some-2-name
  some-2name
  some-name2
  some-name-2
  some2other-name
  some-other2name
  some-othername2
  some-2othername
  some-0-other-name
  a-0-name
  a0-0a-0a-a0
  kebab-case-pls
  tbh-didnt-write-any-underscore-tests

  -negated
  -negated-variable
  -(negated)
  -(also-negated)
  -(-double-negated)
  binary - operation
  binary + operation
  double - binary - operation

  2-syntax-error-4-me
  2syntaxerrorforme
  1-2-3-4-syn-tax-err-or
  80086-syntax-error
  syntax+error
  syntaxerror+
  syntax-error+
  syntax-error-
  syntax-(error)
  (syntax)-error
  syntax--error
  syntax++error
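
A quick way to sanity-check the pattern against the samples above is to run it through Python's `re` module. This is only a minimal sketch: it answers whether a whole string is a single kebab-case identifier, so the expression samples (`-negated`, `binary - operation`, and so on) would be a tokenizer's job rather than this check's. The name `is_identifier` is made up for illustration.

```python
import re

# The pattern from the comment above, copied verbatim.
KEBAB_IDENTIFIER = re.compile(r"^[a-zA-Z_]+[a-zA-Z0-9_]*(-?[a-zA-Z0-9_]+)*$")


def is_identifier(text: str) -> bool:
    """Return True if the whole string is one kebab-case identifier."""
    return KEBAB_IDENTIFIER.match(text) is not None


# A few samples from the first group (expected to match).
valid = ["some-name", "some2-name", "some-2-name", "a0-0a-0a-a0", "kebab-case-pls"]
# A few samples from the last group (expected to be rejected).
invalid = ["2-syntax-error-4-me", "80086-syntax-error", "syntax+error",
           "syntax-error-", "syntax--error", "syntax-(error)"]

assert all(is_identifier(name) for name in valid)
assert not any(is_identifier(name) for name in invalid)
```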

int_19h|3 years ago

It is (was?) the default for XSLT and XQuery, if we're talking about something relatively mainstream. Beyond that I can think of Dylan and REBOL.

capableweb|3 years ago

CSS is probably the most mainstream language allowing dashes in names.

Gibbon1|3 years ago

My snarky suggestion is we switch to emojis for arithmetic operators. Semi-serious: incorporate syntax highlighting via markup into the language.

Joker_vD|3 years ago

I actually have a toy language that uses emojis for keywords, which allows me to reduce tokenizing to basically "split by whitespace, split into runs of characters of the same class (every character is its own class except for [A-Za-z0-9_-] which make up a single class), then post-process tokens for finer distinctions". Strings are somewhat more painful, but using \q instead of \" to escape the double quotes helps.
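
A rough sketch of the first two steps of that scheme in Python, assuming nothing about the actual toy language: `char_class` and `tokenize` are hypothetical names, and the post-processing step and string handling are left out entirely.

```python
from itertools import groupby

# [A-Za-z0-9_-] forms a single class, as described above.
WORD_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789_-"
)


def char_class(ch: str) -> str:
    """Word characters share one class; every other character is its own class."""
    return "word" if ch in WORD_CHARS else ch


def tokenize(source: str) -> list[str]:
    tokens = []
    for chunk in source.split():                       # step 1: split by whitespace
        for _, run in groupby(chunk, key=char_class):  # step 2: runs of one class
            tokens.append("".join(run))
    return tokens


print(tokenize("total-count == (a-b)"))
```

Because a run of one repeated character stays together, `==` comes out as a single token, but so would `((`, which is presumably where the "post-process for finer distinctions" step comes in.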

kaba0|3 years ago

I mean, not even a bad idea. I would honestly prefer operators to be more explicit like a-variableanother-varsomething-else. (EDIT: HN removed my emojis, but imagine a plus/minus emoji in between ids :( )

With a good font it would basically look similar to what a good IDE with syntax highlighting already does (different color for operators).

gnulinux|3 years ago

I think you're maybe just a little confused, or I'm missing something. Kebab case is in general not viable to implement unless you have a very special/quirky syntax, because there is no way to tell from syntax alone whether `a-b` means `Id("a-b")` or `Subtract(Id("a"), Id("b"))`. Now, there are of course options that introduce trade-offs.

First obvious option is to get rid of the infix "-" operator, which is what Lisp does. In lisp-like languages you don't write "a - b" instead you write "- a b", this way there is nothing to confuse "a-b" with.

Another option is to require spaces around operators. E.g. you are not allowed to write "a+b" to mean "add a to b". You have to write "a + b". This is used in the Agda programming language. This is very useful because then you can have identifiers like "a+b", or even identifiers like "a+[b+c-d]" etc. As long as a char doesn't have a special meaning (e.g. in Agda "(", ";", "," etc. have special meanings) you can use it in an identifier. The trade-off is that, well, now you're not allowed to condense arithmetic operations. This may or may not be a problem, depending on the programming language designer. When you said:

> I strongly suggest you either reconsider or never write any code you think may be read by another human being (and possibly yourself).

I'm guessing your opinion is that you're ok with this trade-off. Fact of the matter is that this is a very fringe syntax for any programming language to have. As an Agda programmer, I like it, and it is useful, but I'm not convinced something like this would find mass appeal.

The last option I'm aware of is semantic differentiation. When you find a statement like "c = a-b" you need to ask two things. One: are there identifiers "a", "b" and "a-b"? If "a-b" exists and "a" or "b" doesn't, you're all set. If all three exist, the second question is: are "a" and "b" subtractible? If the answer is yes, then the programming language designer can choose to prioritize "a - b" over the identifier "a-b". Alternatively, you can always choose to prioritize the identifier "a-b" as long as it exists. I'm personally not aware of any language that implements something like this; however, I have implemented toy languages that go through this, and it's pretty easy. It's a matter of making the decision to introduce this type of complexity into your language.
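
For illustration, here is a minimal Python sketch of that lookup, under some loud assumptions: the scope is just a set of names, only a single dash is handled, and the "are they subtractible?" type check is skipped. `resolve` and `prefer_identifier` are hypothetical names, not taken from any real language.

```python
def resolve(token: str, scope: set[str], prefer_identifier: bool = True):
    """Decide whether `a-b` is one identifier or a subtraction, using scope."""
    left, dash, right = token.partition("-")
    whole_exists = token in scope
    # Assumption: only one dash is considered, and no type check is done.
    parts_exist = bool(dash) and left in scope and right in scope

    if whole_exists and (prefer_identifier or not parts_exist):
        return ("Id", token)
    if parts_exist:
        return ("Subtract", ("Id", left), ("Id", right))
    raise NameError(f"cannot resolve {token!r}")


print(resolve("a-b", {"a-b"}))                                   # only "a-b" exists
print(resolve("a-b", {"a", "b"}))                                # only the parts exist
print(resolve("a-b", {"a", "b", "a-b"}, prefer_identifier=False))  # all three exist
```

The `prefer_identifier` flag stands in for the design decision described above: which reading wins when all three names are in scope.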

All in all, although I love kebab case, in order to have it in your language you need to make pretty significant trade-offs. Given this, I'm not surprised that no mainstream non-Lisp-like language has it.

txbuck|3 years ago

(Hot damn, that was a super thorough/thoughtful reply in a short amount of time.)

You're spot on about the quirky syntax, but I don't think it's as serious a trade-off or addition in complexity (or even a change), given that:

- (IIRC) many style-guides/formatters already enforce spaces between binary operators and their operands (but especially identifiers) and in my super-subjectively-opinionated opinion you should already be doing that even without a formatter

- I don't feel particularly strongly one way or another about any other special characters like "+", so really in this case I'm only considering the dash

- Requiring the dash be between alpha/alphanumerics makes it play nice with unary operators

- The language would be terrible for code-golfing, but that's a relatively niche application I'd definitely consider worth spurning

thaumasiotes|3 years ago

> First obvious option is to get rid of the infix "-" operator, which is what Lisp does. In lisp-like languages you don't write "a - b" instead you write "- a b", this way there is nothing to confuse "a-b" with.

This is a gross error. In your sense, Lisp does not even have operators, only identifiers. The reason there is no confusion between "(- a b)" and "(-ab)" is the spacing that separates the three identifiers in the first case.[1]

Your comment is especially weird because you go on to discuss Lisp's approach as being "an alternative option to what Lisp does".

[1] However, Lisp does have a potential problem with identifiers that begin with a hyphen, due to the need to support literal numeric values like -3. Thus the Common Lisp decrement function is named "1-" despite not returning the value (1 - operand).

kaba0|3 years ago

> As long as any char doesn't have a special meaning (e.g. in Agda "(", ";", "," etc have special meanings) you can use it in an identifier. The trade-off is that

I would say the trade off is variable names like “a+b” themselves. I fail to see any reason why I would want something like that. Even in math, where the grammar is very hand-wavy to accommodate human parsing, you would be insane to write that (though to be fair, math does have its own share of problems with identifiers; enumerating all the letters in different alphabets is not a sustainable solution).

jostylr|3 years ago

One could also not have subtraction but simply negation, so instead of a-b being subtraction, one could have a+-b. It probably already works in most languages. Doubt it would be embraced.

Arch-TK|3 years ago

It's a bit hard to parse it outside of languages like lisp because of infix.

thaumasiotes|3 years ago

Infix has nothing to do with it. We can already parse expressions of the form "x - 20". The change would be to stop interpreting "x-20" as being the same set of three tokens as "x - 20".
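
To make that concrete, here is a small tokenizer sketch in Python where "x-20" comes out as one identifier token and "x - 20" as three tokens. It is an illustration under stated assumptions, not a complete lexer: the identifier rule is a simplified form of the pattern posted earlier in the thread, only the dash gets special treatment, and rejecting things like a trailing dash is left to a parser that is not shown. `TOKEN` and `tokenize` are made-up names.

```python
import re

TOKEN = re.compile(r"""
    (?P<ident>[a-zA-Z_][a-zA-Z0-9_]*(?:-[a-zA-Z0-9_]+)*)  # kebab-case identifier
  | (?P<number>[0-9]+)
  | (?P<op>[-+*/()])
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into (kind, text) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at {pos}: {source[pos]!r}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x - 20"))    # three tokens: identifier, operator, number
print(tokenize("x-20"))      # one identifier token
print(tokenize("-negated"))  # unary minus followed by an identifier
```

A dash is only absorbed into an identifier when it sits directly between identifier characters, which is why the leading dash in `-negated` still comes out as an operator.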