
“What next?”

461 points | yomritoyj | 8 years ago | graydon2.dreamwidth.org | reply

149 comments

[+] fulafel|8 years ago|reply
Once again, my pet language/compiler technology issue goes unmentioned: data layout optimizations.

Control flow and computation optimizations have enabled the use of higher-level abstractions with little or no performance penalty, but at the same time it's almost unheard of for compilers to automatically perform (or even facilitate) the data structure transformations that are the daily bread and butter of programmers doing performance work. Things like AoS->SoA conversion, compressed object references, shrinking fields based on range analysis, flattening/denormalizing data that is used together, converting cold struct members to indirect lookups, compiling different versions of the code for different call sites based on input data, etc.
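
To make the first of those concrete, here is a minimal sketch of an AoS->SoA conversion in Rust (types and fields invented for illustration) -- exactly the kind of mechanical rewrite a compiler could do for us:

    // Array-of-Structs: each particle's fields are interleaved in memory.
    struct ParticleAoS {
        x: f32,
        y: f32,
        mass: f32,
    }

    // Struct-of-Arrays: each field is contiguous, so a loop touching only
    // `x` streams through memory without dragging `y` and `mass` into cache.
    struct ParticlesSoA {
        xs: Vec<f32>,
        ys: Vec<f32>,
        masses: Vec<f32>,
    }

    fn to_soa(aos: &[ParticleAoS]) -> ParticlesSoA {
        ParticlesSoA {
            xs: aos.iter().map(|p| p.x).collect(),
            ys: aos.iter().map(|p| p.y).collect(),
            masses: aos.iter().map(|p| p.mass).collect(),
        }
    }

    fn main() {
        let aos = vec![
            ParticleAoS { x: 1.0, y: 2.0, mass: 3.0 },
            ParticleAoS { x: 4.0, y: 5.0, mass: 6.0 },
        ];
        let soa = to_soa(&aos);
        assert_eq!(soa.xs, vec![1.0, 4.0]);
    }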

It's baffling, considering that everyone agrees memory access and cache footprint are the primary performance bottlenecks today, to the point that experts recommend treating on-die computation as free and counting only memory accesses in first-order performance approximations.

[+] ComNik|8 years ago|reply
Jonathan Blow's language "jai" allows for seamlessly switching between AoS and SoA and generally seems to value efficient data layout and "gradual" optimization over safety (in contrast to e.g. Rust).

Unfortunately, no public compiler seems to be available at this time.

[+] mpweiher|8 years ago|reply
More generally, optimization has to become first class, not something that the compiler may or may not do on a whim.
[+] jnordwick|8 years ago|reply
I've been thinking about this for the last few years. What would an APL-like language look like with structured data? Is that possible? Could you make a language where you specify if a value is SoA or AoS? Is it possible to automatically convert an AoS-based algorithm to SoA?

It really changes how you do basic things like sorting. In the standard AoS approach in C-like languages you swap entire structures around the array. In an SoA approach in APL-like languages you generate a list of indices that would put the data in sorted order and then apply it to each column. A number of times I've written code to do this in C++ for high-performance systems, and it works great, but it is definitely a different way of thinking about things.
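
A minimal sketch of that index-permutation sort, in Rust rather than C++ for brevity (names invented):

    // SoA columns: sort "rows" by key without swapping whole structs around.
    fn sort_columns(keys: &mut Vec<u64>, prices: &mut Vec<f64>) {
        // Build the permutation that sorts the key column...
        let mut idx: Vec<usize> = (0..keys.len()).collect();
        idx.sort_by_key(|&i| keys[i]);
        // ...then apply it to every column.
        let sorted_keys: Vec<u64> = idx.iter().map(|&i| keys[i]).collect();
        let sorted_prices: Vec<f64> = idx.iter().map(|&i| prices[i]).collect();
        *keys = sorted_keys;
        *prices = sorted_prices;
    }

    fn main() {
        let mut keys = vec![3_u64, 1, 2];
        let mut prices = vec![30.0_f64, 10.0, 20.0];
        sort_columns(&mut keys, &mut prices);
        assert_eq!(keys, vec![1, 2, 3]);
        assert_eq!(prices, vec![10.0, 20.0, 30.0]);
    }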

[+] sitkack|8 years ago|reply
Memory layout should be a library/protocol concern. Look at the gigawatt-hours that have been spent on serialization, and the work one has to put in to get a serialization-free format like Cap'n Proto. The heap should be like a well-formed database, and those accessors should be portable across languages and systems.

One should be able to specify a compile time macro that controls memory layout.
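
Rust's repr attributes are one existing, if partial, instance of that idea: a compile-time annotation that pins down a type's layout (a minimal sketch):

    // Layout is normally compiler-chosen; these attributes pin it down.
    #[repr(C)]            // fields laid out in declaration order, C ABI rules
    struct WireHeader {
        tag: u8,
        len: u32,
    }

    #[repr(C, packed)]    // no padding at all: 5 bytes instead of 8
    struct PackedHeader {
        tag: u8,
        len: u32,
    }

    fn main() {
        assert_eq!(std::mem::size_of::<WireHeader>(), 8);
        assert_eq!(std::mem::size_of::<PackedHeader>(), 5);
    }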

[+] jez|8 years ago|reply
Graydon's very first answer to "what's next" is "ML modules," a language feature probably few people have experienced first hand. We're talking about ML-style modules here, which are quite precisely defined alongside a language (as opposed to a "module" as it more commonly exists in a language, which is just a heap of somewhat related identifiers). ML modules can be found in the mainstream ML-family languages (Standard ML, OCaml) as well as some lesser-known languages (1ML, Manticore, RAML, and many more).

It's really hard to do justice explaining how amazing modules are. They capture the essence of abstraction incredibly well, giving you plenty of expressive power (alongside an equally powerful type system). Importantly, they compose; you can write functions from modules to modules!

(This is even more impressive than you think: modules have runtime (dynamic) AND compile time (static) components. You've certainly written functions on runtime values before, and you may have even written functions on static types before. But have you written one function that operates on both a static and a dynamic thing at the same time? And what kind of power does this give you? Basically, creating abstractions is effortless.)
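
ML modules don't map cleanly onto mainstream languages, but here is a very rough Rust analogue, with traits playing signatures, impls playing structures, and generic items playing functors. This captures only a slice of their power, and all names are invented:

    use std::cmp::Ordering;

    // ~ an ML signature: a type together with operations on it.
    trait OrderedSig {
        type T;
        fn cmp(a: &Self::T, b: &Self::T) -> Ordering;
    }

    // ~ a structure implementing that signature.
    struct IntOrd;
    impl OrderedSig for IntOrd {
        type T = i64;
        fn cmp(a: &i64, b: &i64) -> Ordering { a.cmp(b) }
    }

    // ~ a functor: code parameterized over any module matching the signature.
    fn max_of<M: OrderedSig>(xs: &[M::T]) -> Option<&M::T> {
        xs.iter().reduce(|a, b| if M::cmp(a, b) == Ordering::Lt { b } else { a })
    }

    fn main() {
        assert_eq!(max_of::<IntOrd>(&[3, 9, 4]), Some(&9));
    }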

To learn more, I recommend you read Danny Gratzer's "A Crash Course on ML Modules"[1]. It's a good jumping off point. From there, try your hand at learning SML or Ocaml and tinker. ML modules are great!

[1]: https://jozefg.bitbucket.io/posts/2015-01-08-modules.html

[+] Animats|8 years ago|reply
One big problem we're now backing into is having incompatible paradigms in the same language. Pure callbacks, as in Javascript, are fine. Pure threading with locks is fine. But having async/await and blocking locks in the same program gets painful fast and leads to deadlocks, especially if the two systems don't understand each other's locking. (Go tries to get this right, with unified locking; Python doesn't.)
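
A minimal sketch of that failure mode, assuming the tokio crate on a single-threaded runtime (the setup is illustrative, not from any particular codebase):

    // Cargo.toml assumption: tokio = { version = "1", features = ["full"] }
    use std::sync::{Arc, Mutex};
    use tokio::task;

    #[tokio::main(flavor = "current_thread")]
    async fn main() {
        task::LocalSet::new().run_until(async {
            let lock = Arc::new(Mutex::new(()));
            let (tx, rx) = tokio::sync::oneshot::channel::<()>();
            let l2 = lock.clone();
            task::spawn_local(async move {
                let _guard = l2.lock().unwrap(); // take the blocking lock...
                rx.await.unwrap();               // ...and hold it across an await
            });
            task::yield_now().await;   // let the spawned task grab the lock
            // This blocks the only thread. The spawned task can never be
            // polled again to receive the message and drop its guard, and
            // the send below is never reached: deadlock.
            let _g = lock.lock().unwrap();
            let _ = tx.send(());
        }).await;
    }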

The same is true of functional programming. Pure functional is fine. Pure imperative is fine. Both in the same language get complicated. (Rust may have overdone it here.)

More elaborate type systems may not be helpful. We've been there in other contexts, with SOAP-type RPC and XML schemas, superseded by the more casual JSON.

Mechanisms for attaching software unit A to software unit B usually involve one being the master defining the interface and the other being the slave written to the interface. If A calls B and A defines the interface, A is a "framework". If B defines the interface, B is a "library" or "API". We don't know how to do this symmetrically, other than by much manually written glue code.

Doing user-defined work at compile time is still not going well. Generics and templates keep growing in complexity. Making templates Turing-complete didn't help.

[+] mpweiher|8 years ago|reply
> incompatible paradigms

See Architectural Mismatch, or why it's hard to build systems out of existing parts[1].

Yes, we are not good at it, but we need to do it. Very often, the dominant paradigm is not appropriate for the application at hand, and often no single paradigm is appropriate for the entire application.

For example, UI programming does not really fit call/return well at all[2].

> attaching software unit A to software unit B .. master/slave

Case in point: this is largely due to the call/return architectural style being so incredibly dominant that we don't even see it as a distinct style, with alternatives. I am calling it 'The Gentle Tyranny of Call/Return'.

[1] http://www.cs.cmu.edu/afs/cs.cmu.edu/project/able/www/paper_...

[2] http://dl.ifip.org/db/conf/ehci/ehci2007/Chatty07.pdf

[+] mercer|8 years ago|reply
So this might just be me over-fitting my new obsession to everything in the world, or alternatively I might just be out of my depth here, but could it be argued that Elixir's (or rather Erlang's) OTP approach solves/sidesteps most if not all of the issues you mention?

Starting a separate 'Erlang process' for all async stuff, for example, seems so wonderfully simple to me compared to the async mess I find in JS, and applying various patterns(?) to that (Task, GenServer, Supervisor) still provides a lot of freedom without incompatibility.

Please correct me if I'm wrong though. I'm still in the research phase so I haven't even written much Elixir/Erlang yet...

[+] runeks|8 years ago|reply
> Pure functional is fine. Pure imperative is fine. Both in the same language get complicated.

Perhaps my programming language vocabulary is the limiting factor here, but I understand “pure functional” to refer to a non-sequential computation (no ordering, just a pure transformation) and “pure imperative” to be monadic computation in Haskell (a sequence of steps, executed one after the other). I don’t see why these two could be considered incompatible — indeed, monadic computations make little sense without pure functions to transform the values inside the monad.

Can you clarify?

[+] borplk|8 years ago|reply
I'd say the elephant in the room is graduating beyond plaintext (projectional editor, model-based editor).

If you think about it, so many of our problems are a direct result of representing software as a bunch of files and folders full of plaintext.

Our "fancy" editors and "intellisense" only go so far.

Language evolution is slowed down because syntax is fragile and parsing is hard.

A "software as data model" approach takes a lot of that away.

You can cut down so much boilerplate and noise because you can have certain behaviours and attributes of the software be hidden from immediate view or condensed down into a colour or an icon.

Plaintext forces you to have a visually distracting element in front of you for every little thing. So as a result you end up with obscure characters and generally noisy code.

If your software is always in a rich data model format your editor can show you different views of it depending on the context.

So how you view your software when you are in "debug mode" could be wildly different from how you view it in "documentation mode" or "development mode".

You can also pull things from arbitrary places into a single view at will.

Thinking of software as a "bunch of files stored in folders" comes with a lot of baggage and a lot of assumptions. It inherently biases how you organise things, and it forces you to do things that are not always in your interest. For example, you may be "forced" to break things into smaller pieces more than you would like because things get visually too distracting or the file gets too big.

All of that is an arbitrary side effect of this ancient view of software, and it will immediately go away as soon as you treat AND ALWAYS KEEP your software as a rich data model.

Hell, all of the problems with parsing text, ambiguity in syntax, and so on will also disappear.

[+] beagle3|8 years ago|reply
This claim is often repeated, but I haven't seen it substantiated even once. It's possible that no one has yet come up with the right answer that's "obviously there". But it is also possible that this claim is not true, and I tend to believe the latter more as time passes.

Every attempt that I've seen, e.g. Lamdu, Subtext, any "visual app builder", fails miserably at delivering ANY benefit except for extremely simple programs -- while at the same time taking away most of the useful tools we already have, like "grep", "diff", etc. Sure, they can be re-implemented in the "rich data model", perhaps even better than their textual ancestors -- but the thing is that they HAVE to be re-implemented, independently for each such "rich data model", or you can't have their functionality at all -- whereas a 1972 "diff" implementation is still useful for 2017 "Pony", a language with a textual representation.

Regarding your example: the "breaking things into smaller pieces" problem was solved long ago by folding editors (I used one on an IBM mainframe in 1990; I suspect Emacs already had it at the same time, and it did for sure by 1996).

The problems with "parsing and ambiguity" are self-inflicted, and independent of whether the representation is textual. Lisp has no ambiguity; Q (the K syntax sugar) has no ambiguity. Both languages eschew operator precedence, by the way, because THAT is the real issue underlying modern syntax ambiguities.

I've been waiting for that amazing "software as a data model" approach to show a benefit for almost 30 years now (There's been an attempt nearly every year I looked). Where it has (e.g. Lisp, Forth), it's completely orthogonal to the textual representation.

[+] geofft|8 years ago|reply
I agree with you for finished programs, but most of the time when I'm coding, I'm working on unfinished programs. I find it infuriating enough when I'm using some "smart" text editor that adds closing quotation marks or braces when I'm not ready for them yet. I know I need to type them, they're buffered somewhere in my wrist, they're going to get typed well before my eye realizes that the smart editor has typed them for me and gets around to telling my fingers.

What is the state of the art in editing software that's continuously stored in a syntax tree instead of in unparsed text? (I say "continuously" because I assume that letting me save a file that doesn't parse, even temporarily, breaks most of these benefits.) How do you, for instance, represent merge conflicts, and how do I resolve them?

This is completely new to me so maybe there are great answers here that I just haven't heard of.

[+] ryl00|8 years ago|reply
In my experience, this only works in DSLs where a more graphical representation fits a particular problem space better. One good example is Simulink in control systems work; this to me is the real 'killer' part of Matlab, the ability to lay out PID controllers in a way more natural for control engineers to design/analyze, and then with a push of a button autogenerate C/C++ code to use on hardware. But even then the lack of things like diff, merge, etc. are continual reminders that there are disadvantages as well...
[+] di4na|8 years ago|reply
That is totally true and it expands the programming possibilities...

But... it makes debugging harder without even more tooling. And debugging is where most of the time in software is spent... which probably explains the origin of the problem.

Especially when you need to reverse engineer code provided by a vendor.

[+] aaron_kent|8 years ago|reply
Would love to chat further about these ideas and get some feedback: aaron.kent <at> isomorf.io
[+] gavanwoolery|8 years ago|reply
I like to read about various problems in language design; as someone who is relatively naive to its deeper intricacies, it really helps broaden my view. That said, I have seen a trend towards adding various bells and whistles to languages without any sort of consideration as to whether they actually, in a measurable way, make the language better.

The downside to adding a feature is that you are much more likely to introduce leaky abstractions (even from things as minor as syntactic sugar). Your language has more "gotchas", a steeper learning curve, and a higher chance of getting things wrong or not understanding what is going on under the hood.

For this reason, I have always appreciated relatively simple homoiconic languages that are close-to-the-metal. That said, the universe of tools and build systems around these languages has been a growing pile of cruft and garbage for quite some time, for understandable reasons.

I envision the sweet spot as a super-simple systems language with a tightly-knit and extensible metaprogramming layer on top of it, and a consistent method of accessing common hardware and I/O. Instant recompilation ("scripting") seamlessly tied to highly optimized compilation would be ideal, while I am making a wishlist :)

[+] mcguire|8 years ago|reply
[Aside: Why do I have the Whiley (http://whiley.org/about/overview/) link marked seen?]

I was mildly curious why Graydon didn't mention my current, mildly passionate affair, Pony (https://www.ponylang.org/), and its use of capabilities (and actors, and per-actor garbage collection, etc.). Then, I saw,

"I had some extended notes here about "less-mainstream paradigms" and/or "things I wouldn't even recommend pursuing", but on reflection, I think it's kinda a bummer to draw too much attention to them. So I'll just leave it at a short list: actors, software transactional memory, lazy evaluation, backtracking, memoizing, "graphical" and/or two-dimensional languages, and user-extensible syntax."

Which is mildly upsetting, given that Graydon is one of my spirit animals for programming languages.

On the other hand, his bit on ESC/dependent typing/verification tech. covers all my bases: "If you want to play in this space, you ought to study at least Sage, Stardust, Whiley, Frama-C, SPARK-2014, Dafny, F∗, ATS, Xanadu, Idris, Zombie-Trellys, Dependent Haskell, and Liquid Haskell."

So I'm mostly as happy as a pig in a blanket. (Specifically, take a look at Dafny (https://github.com/Microsoft/dafny), probably the poster child for the verification approach, and Idris (https://www.idris-lang.org/), voted most likely to be generally usable of the dependently typed languages.)

[+] carussell|8 years ago|reply
All this and handling overflow still doesn't make the list. Had easy handling of overflow been baked into C back then, we probably wouldn't be dealing with hardware where handling overflow is even more difficult than it would have been on the PDP-11. (On the PDP-11, overflow would have trapped.) At the very least, it would be the norm for compilers to emulate it whether there was efficient machine-level support or not. However, that didn't happen, and because of that, even Rust finds it acceptable to punt on overflow for performance reasons.
[+] Animats|8 years ago|reply
> On the PDP-11, overflow would have trapped.

On the DEC VAX, overflow could be set to trap, based on a bit mask at the beginning of each function, but that was not the case on the PDP-11. Nobody used that feature. I once modified the C compiler for the VAX to make integer overflow trap, and rebuilt standard utilities. Most of them trapped on integer overflow.

If you want arithmetic to wrap, you should have to write something like

    n := (n + 1) mod (2^32);
Let the optimizer figure out there's a cheap way to do that.
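
For comparison, Rust's standard integer methods already force this kind of explicitness at each call site (a small sketch):

    fn main() {
        let n: u32 = u32::MAX;
        // Explicit wrap: compiles to a plain add; the "mod 2^32" is free.
        assert_eq!(n.wrapping_add(1), 0);
        // Explicit check: overflow becomes a recoverable None.
        assert_eq!(n.checked_add(1), None);
        // Explicit saturation.
        assert_eq!(n.saturating_add(1), u32::MAX);
        // Plain `n + 1` panics in debug builds and wraps in release builds,
        // which is the "punt for performance" default mentioned above.
    }
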
[+] naasking|8 years ago|reply
> All this and handling overflow still doesn't make the list.

Because it's either a solved problem if you just use infinite precision numbers, or it's encompassed by other things the article lists, like refinement types/extended static checking and errors.

[+] mcguire|8 years ago|reply
Check Ada/SPARK 2014 and Frama-C (with the WP and WP-RTE plugins[1]). Both are good at ensuring your code won't overflow (and Ada checks it, IIRC), although I don't know the story about allowing overflow in the cases where you want it.

[1] I've been told the Value plugin is better for these safety issues, but I haven't figured out how so.

[+] legulere|8 years ago|reply
As a clarification, I guess you are speaking about integer overflows. Your comment confused me a bit, as at first I thought you were speaking about buffer overflows.
[+] VHRanger|8 years ago|reply
Designing a "checkedInt" class is not trivial
[+] mcguire|8 years ago|reply
"Writing this makes me think it deserves a footnote / warning: if while reading these remarks, you feel that modules -- or anything else I'm going to mention here -- are a "simple thing" that's easy to get right, with obvious right answers, I'm going to suggest you're likely suffering some mixture of Stockholm syndrome induced by your current favourite language, Engineer syndrome, and/or Dunning–Kruger effect. Literally thousands of extremely skilled people have spent their lives banging their heads against these problems, and every shipping system has Serious Issues they simply don't deal with right."

Amen!

[+] statictype|8 years ago|reply
So Graydon works at Apple on Swift?

Wasn't he the original designer of Rust and employed at Mozilla?

Surprised that this move completely flew under my radar

[+] rtpg|8 years ago|reply
The blurring of types and values as part of the static checking very much speaks to me.

I've been using TypeScript a lot recently, with union types, guards, and other tools. It's clear to me that the type system is very complex and powerful! But sometimes I would like to make assertions that are hard to express in the limited syntax of types. Haskell has similar issues when trying to do type-level programming.

Having ways to generate types dynamically and hook into typechecking to check properties more deeply would be super useful for a lot of web tools like ORMs.

[+] charlieflowers|8 years ago|reply
I believe F# is rich in features like this. I am only an F# outsider who has read about such features, but check out F# type providers.
[+] bjz_|8 years ago|reply
I would love to see some advancements in distributed, statically typed languages that can run across a cluster and that would support type-safe rolling deployments. One would have to ensure that state could be migrated safely and that messaging can still happen between nodes running different versions. Along the same lines of thinking about this 'temporal' dimension of code, it would be cool to see us push versioning and library upgrades further, perhaps supporting automatic migrations.
[+] imtringued|8 years ago|reply
I would start a bit simpler. Are there any statically typed serialisation formats, like FlatBuffers or Cap'n Proto, where you can statically verify that version 1.0 and version 1.2 are compatible but version 1.2 and 2.0 are not?
[+] dom96|8 years ago|reply
Interesting to see the mention of effect systems. However, I am disappointed that the Nim programming language wasn't mentioned. Perhaps Eff and Koka have effect systems that are far more extensive, but as a language that doesn't make effect systems its primary feature I think Nim stands out.

Here is some more info about Nim's effect system: https://nim-lang.org/docs/manual.html#effect-system

[+] hderms|8 years ago|reply
Fantastic article. This is the kind of stuff I go to Hacker News to read. Had never even heard of half of these conceptual leaps.
[+] simonebrunozzi|8 years ago|reply
I would have preferred a more informative HN title, instead of a semi-clickbaity "What next?", e.g.

"The next big step for compiled languages?"

[+] tedunangst|8 years ago|reply
Not everyone writes exclusively for the HN audience.
[+] ehnto|8 years ago|reply
I know I am basically dangling meat into the lion's den with this question: how has PHP 7 done in regards to the Modules section, or the modularity he speaks of?

I am interested in genuine and objective replies of course.

(Yes, your joke is probably very funny, and I am sure it's a novel and exciting quip about the state of affairs in 2006 when WordPress was the flagship product.)

[+] TazeTSchnitzel|8 years ago|reply
PHP 5.3 (2009) added a (simple, static) namespace system. Composer (2012) has built a sophisticated package management infrastructure on top of this.

However, PHP doesn't have a module system, for better or worse. Namespaces merely deal with name collisions and adding prefixes for you. Encapsulation only exists within classes and functions, not within packages. PHP has no concept of namespaced variables, only classes, functions and constants. Two versions of the same package cannot be loaded simultaneously if they occupy the same namespace.

There has been some relatively recent discussion about having, for example, intra-namespace visibility modifiers (e.g. https://externals.io/message/91778#92148). PHP may yet grow modules. The thing is, though, all sorts of things are suggested all the time. Many PHP 7 features had been suggested many years before (https://ajf.me/talks/2015-10-03-better-late-than-never-scala...). The reason PHP 7 has them is people took the initiative to implement them.

[+] tel|8 years ago|reply
Haven't used PHP in more than a decade, so I looked up the documentation. The most relevant parts I could find were "Classes and Objects" and "Namespaces". After a rough review, I'd say it is possible that PHP inner classes provide one of the most rudimentary components of "type theoretic modules", in a partial and likely broken way. I don't mean this as a diss at PHP -- it's just hard to do it right.

Namespaces also provide something similar to "type theoretic modules", but they serve only the most prosaic, simple, and unimportant function: namely, namespacing.

[+] Sean1708|8 years ago|reply
I suspect that the closest thing to the kind of modules that Graydon is talking about would be ML's modules[0], but I can't find any information on PHP7's modules so I can't speak for how similar they are (though knowing PHP's dynamic nature I doubt they're particularly similar).

[0]: https://jozefg.bitbucket.io/posts/2015-01-08-modules.html

[+] thaumasiotes|8 years ago|reply
> the state of affairs in 2006 when wordpress was the flagship product

Does PHP have a different flagship product now?

[+] kbenson|8 years ago|reply
There's probably an interesting discussion to be had about the merits of shipping fast with a few killer features and layering in the rest over time, versus taking your time over well-thought-out initial features.

PHP is probably a great language now. I used it in the PHP 4 and 5 days. Coming from Perl, that was particularly painful.

[+] ilaksh|8 years ago|reply
I think at some point projectional editors will become mainstream for programming, and eventually things that we normally consider user activities will be recognized as programming when they involve Turing-complete configurability. This will be an offshoot of more projectional editing.

I also think that eventually we may see a truly common semantic definitional layer that programming languages and operating systems can be built on. It's like the kinds of metastructures used as the basis for many platforms today, but with the idea of creating a true uber-platform.

Another futuristic idea I had would be a VR projectional programming system where components would be plugged and configured in 3d.

Another idea might be to find a way to take the flexibility of advanced neural networks and make it a core feature of a programming language.

[+] lazyant|8 years ago|reply
What would be a good book or website for learning the concepts and nomenclature needed to understand advanced language discussions on HN like this one?
[+] leeoniya|8 years ago|reply
It's interesting that Rust isn't mentioned once in his post. I wonder if he's disheartened with the direction his baby went.
[+] jancsika|8 years ago|reply
I'm surprised build time wasn't on the list.

Curious and can't find anything: what's the most complex golang program out there, and how long does it take to compile?