top | item 36199499

On 'function coloring' (2018)

44 points| jeremylevy | 2 years ago |tedinski.com

21 comments

dang|2 years ago

What color is your function? (2015) - https://news.ycombinator.com/item?id=28657358 - Sept 2021 (58 comments)

What Color Is Your Function? (2015) - https://news.ycombinator.com/item?id=23218782 - May 2020 (85 comments)

What Color is Your Function? (2015) - https://news.ycombinator.com/item?id=16732948 - April 2018 (45 comments)

What Color Is Your Function? - https://news.ycombinator.com/item?id=8984648 - Feb 2015 (143 comments)

What Color Is Your Function? - https://news.ycombinator.com/item?id=8982494 - Feb 2015 (3 comments)

verdagon|2 years ago

I often hear that if a distinction exists, it should be represented in the type system; functions are either async or not, so we should track it at compile time.

The thing most people don't understand about language design is that if you follow this line of thinking to the extreme, your complexity goes through the roof as all of the infectious constraints spread upward through your call graph, sometimes conflicting with other constraints.

Rust is an interesting data point. It already has at least three infectious mechanisms, which spread constraints upwards (async vs synchronous, Sync vs !Sync ("data coloring"), &mut and borrowing in general). If you've ever been unable to propagate these infectious constraints upward because you couldn't change your signature (because it was a public stable API, or it was a trait override, or it was a `drop` method) then you've felt this complexity first hand. I believe this is the main reason that we see less (healthy) abstraction in Rust compared to other languages.

Instead, I think a language should use these kinds of "infectiousnesses" very sparingly. In this order:

* Find a solution that doesn't involve infectiousness. I think Loom did really well here.

* Add an escape hatch (not one that stalls the entire async runtime, preferably). For example, interfaces are a good escape hatch for static types.

* I very much like MrJohz's suggestion in [0]: invert a feature's infectiousness by changing the default color.

If I had one PL-related wish, it would be for a mechanism that's non-infectious like Loom, but didn't involve its stack copying and didn't need a runtime. I have a few ideas along those lines, but we'll see if they pan out.

[0] https://www.reddit.com/r/ProgrammingLanguages/comments/vofiy...

SebastianKra|2 years ago

I associate this aversion to infectiousness with the same mentality that brought us dependency injection and the architecture of tightly coupled singletons:

"I want to..." "...block anywhere" "...use IO anywhere" "...mutate anything anywhere" '...have access to all state everywhere"

But, expanding on the articles point, we _do_ have to deal with these details. If I use IO anywhere in the call stack, then I make it unavailable for unit testing. If I block anywhere, then I potentially freeze some UI code. If I fire and forget async operations, I will have a harder time handling errors and knowing when an operation is finished. If I don't keep my mutable state minimal/isolated, I will encounter internal inconsistencies.

In my experience, keeping these concepts at the top, helps to keep your code testable and understandable.

vore|2 years ago

Rust is getting keyword generics to be polymorphic over sync/async which, uh, I'm not sure if I'm a huge fan of: https://blog.rust-lang.org/inside-rust/2022/07/27/keyword-ge...

HWR_14|2 years ago

Infectious constraints propagating up the call graph is the entire goal, isn't it?

unscaled|2 years ago

I think the view of asynchronous functions as "upwardly infectious" comes from focusing too much on JavaScript.

Async/Await arguably[1] first debuted on C#, and in C# you could always call Task.Wait() or Task.Result to block on the Task returned from the code. Similarly, Python async coroutines can be run from blocking functions using event_loop.run_until_complete(coroutine). Kotlin similarly has runBlocking(). Running async code from blocking code is also possible on Rust, but it depends on your async runtime (tokio, async_std, etc.). JavaScript/TypeScript is the only mainstream language which completely disallows calling async from blocking code, and that is only due to its single-threaded (mostly) never-blocking design.

That doesn't mean you should always feel free to call asynchronous functions from synchronous ones. For instance, if you have a limited number of synchronous threads, this could be a bad idea. Likewise, if you call an async function that calls a synchronous function that then calls an asynchronous function again — you're likely in for trouble. So asynchronous functions are upwardly infectious in a way: You can call them from synchronous function, but in many (or even most) cases it is not a good idea to do so.

But wait! It's even worse: some synchronous functions can also be viewed as "upwardly infectious". Think for instance of a function that does synchronous network I/O: You can't call that function from an asynchronous function, without blocking (or even potentially deadlocking) one of your precious few scheduler threads. In fact, every function that does blocking IO is suspect, although some local blocking IO might be fast enough to ignore. FFI, Kernel thread synchronization and other interactions with the kernel are also danger territory.

So in practice, we never had just two colors to describe the synchronous properties of a function, but three:

Blue functions are neutral. They do not do any IO, or at least do not block on IO (they may do fire-and-forget IO or have other side-effects: we are not typing all IO effects here).

Yellow functions are blocking. They may perform IO, block on a kernel mutex, or even just send the kernel thread to sleep.

Red functions are called as if they are synchronous, but are transformed by the compiler to execute asynchronously and suspend. They may perform non-blocking IO or wait for asynchronous synchronization constructs.

All of these definitions are soft. Apart from JavaScript, colored languages are as strict very strict as they are imagined to be. You can directly call a yellow function or blue function from anywhere, and you can call a red functions from a blue or yellow function with very little extra ceremony. But instead of a strict set of axioms, we have a set of caveats for why you shouldn't do that:

The caveats are:

* In most language, blue and yellow functions are syntactically similar and there is no way to tell them apart, except for reading the function source or consulting its documentation.

* You can safely call blue functions from anywhere.

* You can safely call a red function from within a yellow function — but sometimes need to know the right scheduler event loop to run it on)

* You can safely call a yellow function from within a red function — but to avoid blocking and saturating your scheduler, you need to run the yellow function on a dedicated thread or a thread pool. Performance may suffer, so you want to avoid this kind of call as much as you can.

* You can call both yellow and red functions from within a blue function, but by doing that, but the function becomes yellow (but as mentioned before, this change is syntactically invisible). Blue functions become yellow even if you call a red function, since you'll be calling the red function from within a blocking helper (e.g. Kotlin's runBlocking() or Python's loop.run_until_complete()).

Looking at these caveats, even if we inverse take a "colored" language and inverse the default color from synchronous to asynchronous, we still end up with an even more nasty version of upward infection, since yellow functions are not even marked. This is not the same as the truly binary "pure/impure" distinction.

I think the real issues that prevent some languages from committing themselves to be fully colorless are:

1. The language must support a legacy blocking library functions or foreign interfaces (i.e. need to support "yellow functions").

2. The language wants to support pluggable or customizable scheduling.

The only languages I know that cab completely let go of these concerns are Go and (possibly) Erlang[2]. They have their own runtimes that covers every aspect of IO and synchronization, and strictly limit the ways in which you can do FFI[3] or customize their scheduler. Once you took care of that, you have no yellow functions, and you can ensure all your functions are the same color.

Languages like Rust, C++, C#, Kotlin and Swift need to deal with a legacy baggage of both synchronous blocking IO and previous attempts at asynchronous IO (like callbacks and future/promises). They cannot eliminate this baggage so easily, and the path that Go has chosen is just not open to them. So I believe having colored functions is an acceptable choice. In fact, I lament the choice to conflate blue and yellow functions to the same category, since calling blocking functions from within an async function is a very common mistake I see people doing.

What we really lack is a way to compose blue functions with red functions better by parametrizing blue functions so they can accept function arguments of any color.

[1] A pretty justifiable claim can be made that C# was inspired by F# asynchronous workflows - which also could use Async.RunSynchronously to run them.

[2] Loom does not completely do away with function colors in Java, unfortunately, and yellow functions still remain. For instance, any synchronized method or block is yellow and cannot be called from virtual threads. Since Java doesn't have red functions, the situation is worse in some regards — it's very hard to tell if the function you're writing will be running on a virtual thread and whether you can use yellow functions - which are themselves quite hard to discover. Hopefully, synchronized blocks will be slowly eliminated from the JVM and everything will be made compatible with virtual threads. But until then, Java is not a colorless language - it is just a language where function color is entirely implicit.

[3] https://twitter.com/pcwalton/status/1370132795557748740

assbuttbuttass|2 years ago

> Green threads aren’t easier than async functions. The function colors don’t go away

Don't they? Using green threads I can happily call a blocking function in a synchronous context in one place, and run it with a goroutine in another place.

With async (particularly Rust's tokio), you need to pass around the "runtime" object in order to call an async function in a synchronous context.

BreakfastB0b|2 years ago

Go functions are coloured as well. context.Context proliferates everywhere IO takes place to handle cancellation / short circuiting because goroutines can’t be sent signals to die like threads. They have to exit themselves.

eyelidlessness|2 years ago

> The less interesting version is that you do always have to option to just block. You can call a task runner on that future, and synchronously block until that specific future completes itself. This isn’t the greatest approach, because we probably don’t want to block, but it is an option.

My first thought, coming from a JS perspective, was “no, you actually can’t do that”, because blocking the event loop will prevent resolution of the Promise. My next thought was that this is particular to a single threaded, cooperative concurrency model, so of course “can’t” is situational.

But on further thought, it’s much closer to my first thought: you can do that if you can interleave asynchronous ticks with synchronous ones, but only if you can determine there are no data races (or accept them if there are). This is something I suspect Haskell (or Rust?) is probably well suited for, but for JS it’s a complete non-starter.

Another comment made the point that the async keyword (the explicit “color”) is at issue, and I nearly made this comment there because it raised the same concerns for me from a different angle: if you can “await” from an otherwise synchronous function, you’re either accepting data races, blocking permanently, or implicitly “coloring” all functions as async with the same implication that you can statically eliminate races.

At which point, writing this, I realized this would all be so much simpler if everything was a statically analyzable DAG with transactional semantics.

Georgelemental|2 years ago

In Rust, blocking the current thread on future completion is spelled `block_on(your_future)`. https://docs.rs/futures/latest/futures/executor/fn.block_on....

inimino|2 years ago

It would be nice to mention in the Haskell part the solution which is just use 'trace', which just prints the thing and also evaluates to it, so it's just a pair of parens and a keyword and away you go.

vore|2 years ago

I think it's all a matter of how you want to draw the line: do you want to expose the underlying runtime to the consumer, or do you want to make green threads look like real threads? I'm more sympathetic to the latter, because real threads with kernel scheduling don't have function coloring and are even less predictable in when they yield, so why can't my language runtime (e.g. in Python) just pretend they're kernel threads?

I give a pass to Rust for this one though, since it would need a more heavyweight runtime with green threads to deal with this.

pornel|2 years ago

This article conflates two things:

1. A real limitation of JS that prevents it from using results of async functions in sync functions.

2. What syntax author prefers for awaiting async calls (here: none)

This makes "colored/colorless" discisions confusing, because most languages have fixed #1 and don't have any serious deficiency there, but also most languages don't have hidden implicit await #2 — often on purpose, because await has performance implications, changes semantics of locks, and becomes a leaky abstraction around FFI.

randyrand|2 years ago

> You can only await for it from an async function.

get rid of this needless requirement and function coloring goes away.

If you await, you should be able to return the raw type directly, without needing an async keyword.

hardware2win|2 years ago

async keyword in C# is due to backward compatibility reasons iirc.