Ask HN: What human language features could be useful for the writing of code?
Could operands such as variable names be usefully inflected (probably agglutinatively[2] for simplicity's sake) to indicate type or other contextual constraints?
How about more extensive stacking[3] of operators (an existing example would be += that combines addition and assignment) the way some languages do verbs?
Could we use tenses[4] somehow to make parallel and concurrent programming easier?
Could articles[5] be used to do things like pass variables or instantiate objects?
Anyway, there seem to be opportunities to make programming languages more expressive by borrowing human language features rather than using ever more complex typographical conventions. What do you think?
[0] https://en.wikipedia.org/wiki/Isolating_language
[1] https://en.wikipedia.org/wiki/Inflection
[2] https://en.wikipedia.org/wiki/Agglutinative_language
[3] https://en.wikipedia.org/wiki/Serial_verb_construction
[4] https://en.wikipedia.org/wiki/Grammatical_tense
[5] https://en.wikipedia.org/wiki/Article_(grammar)
[+] [-] lisper|8 years ago|reply
1. Python-style iterators. No matter what I want to iterate over, I just write:
or if I want to collect the results:
2. A universal binding construct, which I call binding-block or BB for short. It aggressively eliminates parens so that my code ends up looking like:
3. Generic functions, and in particular a set of standard generic functions that perform the most common operations on any data type. For example, REF dereferences anything. It replaces ELT, NTH, SLOT-VALUE and AREF, and even certain database queries.
4. Classes, with very light use of inheritance.
For an example of code written in this style, take a look at:
https://github.com/rongarret/tweetnacl/blob/master/ratchet.l...
You can transfer some of this to other languages, e.g.:
https://github.com/rongarret/ratchet-js/
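A rough Python analogue of the generic-function idea can be built on the stdlib's functools.singledispatch; the name `ref` here just mirrors the Lisp REF described above, and the dispatch targets are illustrative:

```python
from functools import singledispatch

@singledispatch
def ref(container, key):
    # Default: fall back to attribute access, roughly like SLOT-VALUE.
    return getattr(container, key)

@ref.register
def _(container: dict, key):
    # Dictionary lookup, roughly like gethash/ELT.
    return container[key]

@ref.register
def _(container: list, key):
    # Positional lookup, roughly like NTH/AREF.
    return container[key]

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

print(ref({"a": 1}, "a"))     # 1
print(ref([10, 20, 30], 1))   # 20
print(ref(Point(3, 4), "x"))  # 3
```

One generic entry point covers maps, sequences, and objects, which is the "most common operations on any data type" idea in miniature.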
[+] [-] streondj|8 years ago|reply
Of the things you mentioned, it is analytical rather than isolating, so there is a morphological separation between grammatical concepts and stem constructs -- much like Lojban's cmavo vs. gismu.
The biggest difference from Lojban and other unnatural programming languages is the use of grammatical cases for denoting parameters.
For instance: "hyikdoka tyutdoyu plostu", which glossed is "one _number _accusative_case, two _number _instrumental_case, plus _deontic_mood", or in colloquial English, "Increase the number one by the number two!"
(The result is "the number three," or "tyindoka li".)
It also has a rich type system based on noun classifiers, a language construct popular in Asia; an English example is "two dollars of corn", where "dollars of" is the noun classifier for "two".
In terms of tenses and concurrency, I already have a fairly elaborate asynchronous parallel execution model, but it could probably use a future tense for scheduling. And the computer could accurately describe its state by combining tense and grammatical aspect.
Articles are generally found in languages that have lost a nominative-accusative case distinction, and can generally be piled up with anaphoric references, i.e. referring to variables.
Here is a paper I wrote about it last year: http://liberit.ca:43110/1DYjc22BP5VkqNLJgr3nQfoGiNLbGFGffG For slightly more information: http://pyac.ca
If anyone is interested, I can put up a more recent paper as well.
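The case-marked call glossed above might be approximated in an ordinary language with keyword arguments standing in for the grammatical cases. A Python sketch (all names invented):

```python
# Grammatical cases as keyword arguments: the accusative marks what is
# acted on, the instrumental marks what it is acted on with.
def increase(accusative, instrumental):
    """'Increase the number one by the number two!'"""
    return accusative + instrumental

print(increase(accusative=1, instrumental=2))  # 3, "the number three"
```

Because the cases name the roles, argument order stops mattering, which is exactly the freedom case systems buy in natural languages.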
[+] [-] ozzmotik|8 years ago|reply
[+] [-] catach|8 years ago|reply
[0] http://world.std.com/~swmcd/steven/perl/linguistics.html
[+] [-] tootie|8 years ago|reply
[+] [-] torbjorn|8 years ago|reply
[+] [-] webmaven|8 years ago|reply
[0] https://en.wikipedia.org/wiki/Isolating_language
[1] https://en.wikipedia.org/wiki/Inflection
[2] https://en.wikipedia.org/wiki/Agglutinative_language
[3] https://en.wikipedia.org/wiki/Serial_verb_construction
[4] https://en.wikipedia.org/wiki/Grammatical_tense
[5] https://en.wikipedia.org/wiki/Article_(grammar)
[+] [-] wwwater|8 years ago|reply
But thank you so much for the links! I learned quite a lot from them, and now I at least know that the reason German feels more similar to Russian than to English is the distinction between analytic and inflected languages.
[+] [-] seanmcdirmid|8 years ago|reply
http://www.dourish.com/publications/2003/oopsla2003-naturali...
Abstract:
> Software understanding for documentation, maintenance or evolution is one of the longest-standing problems in Computer Science. The use of “high-level” programming paradigms and object-oriented languages helps, but fundamentally remains far from solving the problem. Most programming languages and systems have fallen prey to the assumption that they are supposed to capture idealized models of computation inspired by deceptively simple metaphors such as objects and mathematical functions. Aspect-oriented programming languages have made a significant breakthrough by noticing that, in many situations, humans think and describe in crosscutting terms. In this paper we suggest that the next breakthrough would require looking even closer to the way humans have been thinking and describing complex systems for thousand of years using natural languages. While natural languages themselves are not appropriate for programming, they contain a number of elements that make descriptions concise, effective and understandable. In particular, natural languages referentiality is a key factor in supporting powerful program organizations that can be easier understood by humans.
[+] [-] mcphage|8 years ago|reply
[+] [-] sowbug|8 years ago|reply
Don't named variables serve the same purpose as pronouns? What's an example of a pronoun-like usage in a hypothetical programming language?
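One way a pronoun could go beyond a named variable is by implicitly tracking an antecedent, the way Perl's $_ or Kotlin's lambda `it` do. A minimal Python sketch of such a hypothetical `it` (the `It` helper is invented for illustration):

```python
class It:
    """Tracks the most recently mentioned value, like a pronoun's antecedent."""
    def __init__(self):
        self._antecedent = None

    def mention(self, value):
        # "Mentioning" a value makes it the current antecedent.
        self._antecedent = value
        return value

    def __call__(self):
        return self._antecedent

it = It()
it.mention([3, 1, 2])
it.mention(sorted(it()))  # "sort it" -- refers back without a name
print(it())               # [1, 2, 3]
```

The difference from a named variable is that the referent updates automatically as the "conversation" moves on, so short utility steps never need a fresh name.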
[+] [-] c3534l|8 years ago|reply
You can draw comparisons between natural language and programming languages just fine. Maybe functions are like verbs and arguments are like the subject. And maybe partial application of a function is like having an intransitive verb. Maybe a decorator is like an adverb and a type constraint is like an adjective.
In general, though, I think the metaphor about programming languages is wrong: programming languages are instructions, not really languages. A programming language should be easy to read out of order more like a magazine than a novel, it should be possible to get a general grasp of what it's doing by glancing at the code, and the code should eliminate ambiguity and uncertainty. Natural languages don't work that way; they're repetitive, vague, and rely on context (social and otherwise).
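The verb/adverb/intransitive-verb mapping above can be made concrete in Python (illustrative names only; this is just the metaphor spelled out, not a proposal):

```python
from functools import partial

def greet(greeting, name):      # a "verb" taking two "arguments"
    return f"{greeting}, {name}!"

# Partial application as an "intransitive verb": one argument is absorbed.
greet_politely = partial(greet, "Good day")

def loudly(verb):               # a decorator as an "adverb"
    def wrapper(*args, **kwargs):
        return verb(*args, **kwargs).upper()
    return wrapper

shout = loudly(greet_politely)
print(shout("Ada"))   # GOOD DAY, ADA!
```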
[+] [-] OtterCoder|8 years ago|reply
Unsurprisingly, legal documents have a reputation for being at least as difficult to read as some code.
Natural language, designed for human communication, is, by its nature, unsuitable for precise definition, and a precise language is also unsuitable for human communication.
It really makes sense once you realize how much conversation and prose depend on puns, allegory, allusion, symbols, context, emotion, and so on. All of which are hostile to stating a thing as exactly as possible.
[+] [-] piinbinary|8 years ago|reply
Programming languages also change from version to version, but it is the creators of the language making the changes. In human languages, the changes are organically created by the users. Changes that don't end up being useful never take off.
I think it would be practically possible to build a pipeline for this to happen: allow users to extend/modify the language [0][1] and allow those extensions to be shared.
Some languages already have a process for taking popular 3rd party libraries and eventually integrating them into the standard library distributed with the compiler, the same could apply to language modifications.
[0] Modifying the language could involve _removing_ features.
[1] This may need to be more complicated than syntactic macros, which is currently the most common means of extending languages.
[+] [-] iguy|8 years ago|reply
[+] [-] irth|8 years ago|reply
[+] [-] unknown|8 years ago|reply
[deleted]
[+] [-] kazinator|8 years ago|reply
An expert speaker of one human language can require years, even decades, to become equally proficient in another.
Woe to a programming language which sucks that badly.
[+] [-] mpweiher|8 years ago|reply
I think we're slowly getting there. First we largely had verbs (procedures) that acted on data somehow. OO added a bit more on the noun side, so we could start to form simple sentences. IMHO much of the appeal of OO is due to this richer "sentence structure", with messages sent to objects.
Of course you could overdo it (see Yegge's Execution in the Kingdom of Nouns[1]), and the FP guys definitely think we overdid it and verbs were fine all along and who needs anything else. Verbs!
Higher Order Messaging[2] adds "adverbs", so we can say something about how those messages are to be delivered. It allows for some very expressive sentences, which you could also describe as crunching architectural patterns. See Sam Adams's great talk[3].
Polymorphic Identifiers[3][4][5] go back to the noun side, adding what you might describe as noun phrases, so you can have something more complex that again acts like a noun. Sort of.
Then there's Alistair Cockburn's work[6].
[1] https://steve-yegge.blogspot.de/2006/03/execution-in-kingdom...
[2] https://en.wikipedia.org/wiki/Higher_order_message
[3] https://www.hpi.uni-potsdam.de/hirschfeld/publications/media...
[4] https://link.springer.com/chapter/10.1007/978-1-4614-9299-3_...
[5] https://www.slideshare.net/MarcelWeiher/in-processrest
[6] http://alistair.cockburn.us/Using+natural+language+as+a+meta...
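The "adverb" idea from Higher Order Messaging can be sketched in Python, with a `collect` combinator deciding how the trailing method call is delivered to a group of receivers (names are illustrative, not any real library's API):

```python
class Collect:
    """Delivers the next 'message' to every receiver, collecting results."""
    def __init__(self, receivers):
        self._receivers = receivers

    def __getattr__(self, message):
        def deliver(*args, **kwargs):
            return [getattr(r, message)(*args, **kwargs)
                    for r in self._receivers]
        return deliver

def collect(receivers):
    return Collect(receivers)

words = ["foo", "Bar", "BAZ"]
print(collect(words).lower())   # ['foo', 'bar', 'baz']
```

Here `collect` modifies how `.lower()` is delivered, much as an adverb modifies a verb, without `lower` itself knowing anything about lists.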
[+] [-] eesmith|8 years ago|reply
[+] [-] LarryMade2|8 years ago|reply
[+] [-] carussell|8 years ago|reply
I wrote about something similar to this (called "type-named objects") in a post last year[1]. The idea is that for small utility methods, you shouldn't have to interrupt your main flow of thought to come up with names for your formal parameters; in fact, you shouldn't even have to name your locals. If you're working in a language with a (perhaps already verbose) static type system, you can just leverage that.
So, for example, instead of this:
... you can do:

This does introduce some precedence/binding gotchas that I go into more there. But that hints towards further improvements. From the aforementioned post, you could end up with a language that allows:

Or, if you add support for "passive voice" to your language:

Completely unrelated, but related to the use of "the": Carmack wrote a tweet[2] that I've got embedded in the comments of a project of mine.

> I have moved to naming global singletons with a The* prefix -- ThePacketServer, TheMasterServer, TheVoip, etc. Feels pretty good.
Since then, I stopped creating "proper" singletons altogether. Instead, it'd be a quasi-singleton, where I write a `MasterServer` class, and then bless an instance as `TheMasterServer`. Application code will always import, refer to, and otherwise operate on the latter, but tests are free to instantiate their own. This is especially useful if you have implemented some sort of developer switch in the UI that allows you to launch an embedded test harness within a live, running instance of the app. In that case, you don't exactly want your tests to be fiddling with your "real" singleton.
1. https://www.colbyrussell.com/2017/02/16/novel-ideas-for-prog...
2. https://twitter.com/ID_AA_Carmack/status/575788622554628096
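The blessed-instance pattern described above might look like this in Python (a sketch; `MasterServer`/`TheMasterServer` are the names from the comment, everything else is invented):

```python
class MasterServer:
    """An ordinary class -- deliberately not a hard singleton."""
    def __init__(self, host="localhost", port=9000):
        self.host = host
        self.port = port

    def address(self):
        return f"{self.host}:{self.port}"

# The blessed instance; application code imports and uses only this.
TheMasterServer = MasterServer()

# Tests (or an embedded test harness in a live app) can build isolated
# instances without fiddling with the "real" one.
scratch = MasterServer(host="test.invalid", port=1234)
print(TheMasterServer.address())  # localhost:9000
print(scratch.address())          # test.invalid:1234
```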
[+] [-] russellsprouts|8 years ago|reply
It doesn't use "the" -- it's clear from context whether something is a type name or reference, so you just use "Node". To handle multiple variables of the same type, you can use Node, Node', Node'', etc.
It also supports phrasal methods, where arguments can appear in the middle of the method name. For example, what would be divide(x, y) could be defined as divide(x)By(y) instead.
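Python lacks phrasal method names, but the divide(x)By(y) shape can be faked with a small class whose method supplies the second half of the phrase (a sketch, not that language's actual mechanism):

```python
class divide:
    """First half of the phrasal method divide(x)By(y)."""
    def __init__(self, numerator):
        self.numerator = numerator

    def By(self, denominator):
        # Second half of the phrase completes the call.
        return self.numerator / denominator

print(divide(10).By(4))   # 2.5
```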
[+] [-] bobthepanda|8 years ago|reply
So if you have an overloaded method, say
you could just do:
[+] [-] akvadrako|8 years ago|reply
[+] [-] onuralp|8 years ago|reply
[+] [-] Tycho|8 years ago|reply
"Usury is forbidden due to Sharia law."
Or the modern idiom,
"No universal health care, because reasons."
In programming you could do something like
"Foo is constrained because bar", where 'bar' is a set of limitations that are noted elsewhere and can be updated. Not sure if there's really anything new here though, plus it's really just a new keyword not a new linguistic graft.
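A minimal sketch of that "because" construct in Python, attaching a named, centrally maintained rationale to a constrained function (all names are invented):

```python
# Rationales live in one place and can be updated without touching call sites.
REASONS = {
    "regulation_x": "Foo is constrained because of regulation X.",
}

def because(reason_key):
    """Attach a named rationale to a function, looked up at decoration time."""
    def decorator(fn):
        fn.because = REASONS[reason_key]
        return fn
    return decorator

@because("regulation_x")
def foo():
    return "constrained result"

print(foo.because)  # Foo is constrained because of regulation X.
```

As the comment notes, this is really just a new keyword-like annotation rather than a deep linguistic graft, but it does keep the "why" updatable separately from the "what".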
[+] [-] vorg|8 years ago|reply
[+] [-] Walkman|8 years ago|reply
[+] [-] Waterluvian|8 years ago|reply
[+] [-] segmondy|8 years ago|reply
[+] [-] borplk|8 years ago|reply
It would be nice to avoid repetition in a controlled manner by having some context.
This would make it less tedious to pass the same arguments to a function over and over again. For example, in a web application, once you authenticate the user, in the code that follows you could somehow tell it "by 'user' we mean the current user; by 'query' we mean a query against the current database connection".
Effectively this would be similar to a transparent auto-wired dependency injection.
The biggest mismatch between human and computer languages is that we as humans are very context-aware.
Computers are at the other end of the spectrum. Between two different function calls they don't remember or assume anything, so we have to keep passing them things that are obvious to us. That in itself increases bugs, because it's a lot easier to make a mistake when you repeat something 50 times than when you declare it correctly once.
Other useful ones could be having some expressions that are not strictly binary or function calls.
And the ability to define these for custom types.
For example "user is administrator", "1001 (is|is not) palindrome", "100 is divisible by at least 2 of [1, 10, 50]".
Another one, accessing attributes/fields, "firstName of user", "factors of 100", "favoritePosts of user".
Tests about attributes/fields, "if user has twoFactorAuthentication".
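The non-binary expressions and attribute phrases in the examples above can be approximated with ordinary helpers while the richer syntax remains hypothetical (a Python sketch; all helper names are invented):

```python
def is_palindrome(n):
    # "1001 is palindrome"
    s = str(n)
    return s == s[::-1]

def factors(n):
    # "factors of 100"
    return [d for d in range(1, n + 1) if n % d == 0]

def divisible_by_at_least(n, k, candidates):
    # "100 is divisible by at least 2 of [1, 10, 50]"
    return sum(1 for c in candidates if c != 0 and n % c == 0) >= k

print(is_palindrome(1001))                          # True
print(factors(100))                                 # [1, 2, 4, 5, 10, 20, 25, 50, 100]
print(divisible_by_at_least(100, 2, [1, 10, 50]))   # True
```

The semantics are easy; what the comment is really asking for is surface syntax that reads like the English, which is where the parsing and composition problems discussed below come in.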
Another one: more "space-friendly" languages, so you could do "let the thing we are printing := 'test';".
Now let me say I'm fully aware most of what I have said is very difficult if not impossible.
The problem is language complexity and parsing, and also composition. The examples I have made don't really hold up in general cases or compose well, so it's hard to fit them into general-purpose languages.
If we were using projectional structure editors they would be much much easier to achieve. Because we wouldn't have to deal with language rules like that. You could render the program structure in a human friendly way without forcing the computer to parse and understand the same thing.
One of the biggest bottlenecks with programming languages today is that humans and computers share a text-based language with each other. As a result we have to compromise to strike a balance where it's easy enough for the computer to parse and understand, and easy enough for humans to read and write.
If we could separate the two, we could optimise the languages for ourselves and let the computers use something much more precise and unambiguous under the hood, like binary, that makes them happy.
All much easier said than done.
[+] [-] wolfgang42|8 years ago|reply
I think you're looking for dynamic scopes. These are annoying for most programming, since you can't just look around nearby the way you can with lexical scopes; you have to figure out what the call stack of the function will look like. When used well, however, they can simplify things considerably.
UNIX environment variables and stdin/stdout/stderr are dynamically scoped. Common Lisp calls these 'special variables' and has an "earmuff" convention of putting asterisks around their names to set them off.
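Python's stdlib contextvars module provides exactly this kind of dynamic binding; a minimal sketch of the authenticated-user example from upthread:

```python
from contextvars import ContextVar, copy_context

# A dynamically scoped variable with a default, like an earmuffed special.
current_user = ContextVar("current_user", default="anonymous")

def handle_request():
    # No parameter passing: the function reads the current dynamic binding.
    return f"hello, {current_user.get()}"

def with_user(name):
    # Rebind only within a copied context, like a dynamic-extent LET.
    ctx = copy_context()
    def run():
        current_user.set(name)
        return handle_request()
    return ctx.run(run)

print(handle_request())     # hello, anonymous
print(with_user("alice"))   # hello, alice
print(handle_request())     # hello, anonymous  (the binding did not leak)
```

This gives the "current user is implicit" convenience borplk asked for, while keeping the rebinding explicit enough to address OtterCoder's objection about untraceable context.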
[+] [-] OtterCoder|8 years ago|reply
Oh, please no. I've had to play fixer to too many codebases that used interesting tricks to implicitly import code, without pointing to a source. Trying to suss out the origin of a specific function named 'get' in a 100k line application, in a file that invokes, but never declares it, is an exercise in frustration. Less context, please, more explicitness.
[+] [-] mabynogy|8 years ago|reply
[+] [-] lmlsna|8 years ago|reply