top | item 40768237

Three ways to think about Go channels

162 points | ingve | 1 year ago | dolthub.com | reply

137 comments

[+] hedora|1 year ago|reply
I’ve worked with a few large code bases that use channels. In all those code bases, they were about as maintainable as GOTO-based control flow, except that GOTO makes you have unique label names. All the channel-based code I’ve seen just has 100’s of call sites like “chan->send()” and “chan->recv()” sprinkled around, and doesn’t even have the discipline to put related senders and receivers in the same source file.

At least old-school syntax like “GOTO foobar_step_23” and “LABEL foobar_step_23” is greppable.

I greatly prefer programs that “color” functions sync/async, and that use small state machines to coordinate shared state only when necessary.

Go technically supports this, but it doesn’t seem like it is idiomatic (unlike rust, the go compiler won’t help with data races, and unlike C++, people don’t assume that the language is a giant foot-gun).

[+] klabb3|1 year ago|reply
> All the channel-based code I’ve seen just has 100’s of call sites like “chan->send()” and “chan->recv()” sprinkled around, and doesn’t even have the discipline to put related senders and receivers in the same source file.

Any language that supports true multithreaded parallelism provides foot-rail-guns, whether you’re locking, asyncing or channeling. For concurrent data structures, you can’t lazily rely on the compiler to catch your errors. Not even in e.g. Rust. You should never sprinkle concurrency primitives around just because.

For every channel created, you need to determine the invariants of both the channel itself and the contents that pass through it. This is, for the most part, an easier exercise than with locks, because channels provide a way to think about ownership, and crucially, transfer of ownership between threads (goroutines). It also decouples the response to an event from the production of it, whereas with locks you have to dispatch the “next action” from the code that produced the event. I’ve never had a language where it’s so easy to do things like “spin up 10 concurrent requests at a time with individual timeouts, cancel outstanding requests if any of them succeeds or the user cancels, and ensure all tasks are torn down before returning”. It’s near-trivial to get right with channels, contexts and waitgroups.

In either case, since concurrency is so difficult, I would say that the Go docs and resources are quite lightweight on how to use their primitives correctly. Right now, it’s a bit “now finish the rest of the fucking owl”. You need discipline to use them, but there are no parental figures around to tell you how to watch out for the dangers.

[+] bruce343434|1 year ago|reply
Kind of sick of seeing this misconception so I want to clear it up: async/await (as most commonly implemented) != pre-emptive and truly parallel multithreading.

All parallel execution is asynchronous, but not all asynchronous execution is parallel.

async/await interleaves execution in a cooperative fashion: a "fiber" runs until it yields by doing something blocking (such as blocking I/O, the most often cited use case). There are no data races and no need to use synchronization primitives since there is never a case where 2 fibers modify a piece of data at the same time: because execution is interleaved, the operations are always ordered one after the other. Whenever a fiber blocks, execution is returned to the scheduler, which resumes execution in one of the other fibers which were blocked.

fork(pthread_create)/join(pthread_join) parallelizes execution: a "thread" runs until it is done or is killed externally (the scheduler can also pause execution). A thread X can wait for another thread Y to be done by calling join(Y). When 2 threads share a variable, you need atomics or other synchronization primitives such as channels (or mutexes, or barriers, or semaphores) to make sure only 1 modifies it at a time, because although threads can be run in an interleaved fashion (and are sometimes interrupted, i.e. pre-empted), they usually run in parallel on different CPU cores/threads (if you have an SMT CPU, which most nowadays are: n CPU cores, each with 2 "CPU threads").
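That difference shows up directly in Go, whose goroutines are preemptively scheduled and truly parallel. A minimal sketch (the counter names are invented): two shared counters, one guarded by a mutex and one atomic; dropping either form of synchronization would be a data race that `go run -race` flags, even though the program often "works" anyway.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// count increments two shared counters from n goroutines running in parallel:
// one guarded by a mutex, one using an atomic add. Both come out to n.
func count(n int) (locked int64, atomicN int64) {
	var (
		wg sync.WaitGroup
		mu sync.Mutex
		a  atomic.Int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			locked++ // safe: the mutex orders the increments
			mu.Unlock()
			a.Add(1) // safe: atomic read-modify-write
		}()
	}
	wg.Wait()
	return locked, a.Load()
}

func main() {
	l, a := count(1000)
	fmt.Println(l, a) // 1000 1000
}
```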

P.S. A CPU thread != OS thread

[+] arp242|1 year ago|reply
> All the channel-based code I’ve seen just has 100’s of call sites like “chan->send()” and “chan->recv()” sprinkled around

This just sounds like badly written chaotic code. Unfortunately there are some people who seem to think "Go has channels, therefore, I need to use channels as much as possible", but that's almost always a mistake.

Channels shouldn't be used very frequently, but they can be very handy especially in combination with select and/or buffered channels. Usually there should be a clear and "obvious" API.

[+] bcrosby95|1 year ago|reply
You may as well say variables are bad because people are bad at naming variables. I mean, yeah? I don't call my channels "chan", I call them "outputChannel" or "inputChannel" or "foobarProducer" or "twiddleListeners" or whatever.
[+] packetlost|1 year ago|reply
This is why I think each instance of a channel should be a global singleton as soon as it escapes a local context. IMO you could solve this by having each instance of a channel be a unique type that must be annotated as that type everywhere it's used. Channels are generic, but you should not generally use them generically.

In Go you can use a defined type to make this work:

    type ExitChan <-chan struct{}

Unfortunately it doesn't really enforce anything at the compiler level, but at the very least you can grep for it.
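A fuller sketch of the idea (names are illustrative; as noted, the compiler still accepts a plain receive channel wherever ExitChan appears, so this buys greppability and documentation rather than enforcement):

```go
package main

import "fmt"

// ExitChan is a defined type for a receive-only exit signal. A bidirectional
// chan struct{} is still assignable to it, so this documents intent but does
// not enforce anything at the compiler level.
type ExitChan <-chan struct{}

// worker blocks until told to exit, then reports on done.
func worker(exit ExitChan, done chan<- string) {
	<-exit
	done <- "worker exiting"
}

func main() {
	exit := make(chan struct{})
	done := make(chan string)
	go worker(exit, done)
	close(exit) // closing broadcasts the exit signal to every receiver
	fmt.Println(<-done)
}
```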

[+] throwaway894345|1 year ago|reply
I'm confused about why we're comparing GOTO control flow with channels, since they're completely unrelated. I guess you can make a mess with each when you use them inappropriately? With GOTO, you should avoid it because there are better options (notably, conditional statements).

Unlike GOTO, there isn't really a better alternative for synchronizing parallel programs. Parallel programming (or even concurrent programming, for that matter) is just harder than sequential programming, and if you don't know what you're doing (regardless of whether you use channels or not) you're going to end up with a mess. Channels help to tame parallel programs, but they can't compensate for programmers who don't know how to write correct parallel programs.

And while Rust can protect against data races, data races are a tiny sliver of parallel programming bugs. Far more common are race conditions and deadlocks, against which Rust is powerless.

> I greatly prefer programs that “color” functions sync/async, and that use small state machines to coordinate shared state only when necessary.

Async/await similarly doesn't solve the problem of undisciplined programmers making a mess. If you give undisciplined programmers async/await and shared state, they'll make a mess as easily as they will with goroutines and channels. If you're hiring people who can't be trusted with shared memory parallelism, then you have to take away parallelism or mutability or you have to train them to write correct parallel programs.

[+] kccqzy|1 year ago|reply
Function coloring is orthogonal to async or not. I also greatly prefer function coloring, but I'm agnostic when it comes to either explicit async/await or sync code in green threads. For example, Haskell has function coloring, but it also has Go-style lightweight threads and sync code. It's the nicest experience IMO.
[+] emadda|1 year ago|reply
I think the reason channels become harder to understand is that you jump to having a network topology of senders and receivers to reason about.

With async/await, everything is still a function. Your IDE and debugger help you understand when and where a function is called. When debug-breaking on a function you have the stack trace.

Async/await gets you fast context switching for IO ops as it is a single OS process (no threads for the OS to constantly switch between).

But once you go beyond a single OS process, then you need something like messaging or shared memory on top of async/await to communicate between OS processes/threads.

I think Golang jumps straight to messaging which is better for multiple cores, but misses the first step of async/await which is easier to reason about when you have a single process on a single core.

[+] hot_gril|1 year ago|reply
Golang should have async/await and exceptions, both of which can be implemented as syntactic sugar on top of existing features.
[+] mlinhares|1 year ago|reply
Almost every single usage of channels in Go I've encountered was a mistake, buggy and hard to understand. Nowadays, when I see channels in use in code, I just see a code smell and think about how difficult it would be to remove them.

There are multiple ways to use channels wrong, the compiler will not help you and most of the docs and "tutorials" will not cover that. The best thing you can do in Go is stay as far away from channels as you can.

[+] cyberax|1 year ago|reply
I must admit, I dislike Go channels. They are hell to debug: they're anonymous, so you can easily end up with thousands of identical stacktraces that you can't correlate with logs.

Golang needs a way to manage channels better. Naming them and waiting on them would simplify a lot of crusty stuff. Naming is becoming possible; goroutines can already have pprof labels (that are even inherited between goroutines!), so just adding pprof labels to stacktraces would help a lot.

But unfortunately, Go creators are allergic to anything that brings thread-local variables closer.

[+] rdtsc|1 year ago|reply
> They are a hell to debug, as they are unnamed. And they're anonymous

That was one odd thing that stood out to me about Go. I am coming from Erlang, where a process having a process ID that we can keep track of, terminate, trace, etc. is fundamental to being able to reason about and operate a system. Advertising that they can handle millions of lightweight goroutines but then having no obvious way to identify them and monitor their lifetimes is kind of strange.

On the lower level, I sort of understand why they did it: they focused on typed channels, so they can have multiple channels of different types potentially talking to the same goroutine. Having both named goroutines and named channels would be more complicated, and they tried to keep things "simple". Erlang's processes, on the other hand, have implicit mailboxes; there is no "mailbox1" and "mailbox2", there is only one. But it's also easier there because there is no static typing. Things are simple because only processes have identities, not mailboxes.

[+] vineyardmike|1 year ago|reply
With the advent of generics, there’s no reason we can’t get libraries that wrap common objects.

As a thought experiment, I could see Optionals, Named Channels, Collections etc being implemented in some common but non-stdlib library. Of course now we’ve recreated all those languages people hate when they talk about go… but those features exist for a reason.

[+] jrockway|1 year ago|reply
Stacks printed by the receiver are generally incomplete/useless for systems that communicate. The error is often in the sender, but is reported by the receiver. This isn't unique to Go and channels.

Consider a Node app that calls a C++ web service. The C++ web service returns "503 Service Unavailable" because your request causes the server to segfault. The useful stack trace for fixing this problem is on the C++ side in a core dump, not whatever Node prints when the HTTP request fails.

What you want to do on the Go side is have your sender be able to relay an error to the receiver, and to wrap errors:

    func doWork(w work) (result, error) {
        return result{}, errors.New("too lazy to do work")
    }

    func doWorkQueue(q <-chan work, r chan<- result) {
        for w := range q {
           result, err := doWork(w)
           if err != nil {
              r <- errorResult(errors.Wrapf(err, "doWork(workID=%v)", w.ID))
              continue
           }
           r <- successResult(w, result)
        }
    }
    ...
    go doWorkQueue(...)
    submitWork(work)
    for r := range resultCh {
        if err := r.Err(); err != nil {
            log.Error("failed to do work: %v", errors.Wrap(err, "recv result"))
        }
    }
 
This now errors with something like:

    main.go:20: failed to do work: 
    main.go:20: recv result:
    main.go:9: doWork(workID=42): 
    main.go:2: too lazy to do work
        
Or if your `errors` library doesn't grab the caller func/line number, just "failed to do work: recv result: doWork(workID=42): too lazy to do work". This should be enough to debug the problem, regardless of what side the problem happens on.

(At work we use a hacked up copy of github.com/pkg/errors, which grabs the entire stack trace at each of the errors.Wrap/errors.New call sites. This results in an exceedingly verbose trace that takes up your entire screen, but is at the very least... thorough.)

The reason that error wrapping is essential is that stack traces don't capture critical information, like what iteration of the loop you're on, how many times your retry loop ran, which work id failed and returned "too lazy to do work", etc. The error wrapping is where you get to add this in. This is, again, the same as every language... the random exception you throw in your HTTP client when the server is down isn't aware of the work item id that caused it, so you have to catch and re-throw with that information or you have an undebuggable mess.

What's nice about Go is that it's really easy to add this contextual information, either with fmt.Errorf in the standard library, or with very small functions that capture some information automatically (errors.Wrap/errors.Wrapf). I will say that not doing this is my #1 complaint, and it's exceedingly common to just "return err". I have spent 4 years fixing the work codebase to wrap errors, and people still add new unwrapped errors (because our linter allows an escape hatch, errors.EnsureStack, which some people really like). It then results in some oncall engineer wasting a week debugging a simple problem. Sigh! But that's humans being humans; Go makes it very easy to do the right thing. You just have to tell your team to do the right thing and to make them want the right thing.

[+] tazu|1 year ago|reply
> But unfortunately, Go creators are allergic to anything that brings thread-local variables closer.

This is the most frustrating thing about Go for me. They use thread-local storage within the stdlib, but refuse to let us plebs have access to it.

[+] tombert|1 year ago|reply
This almost kind of feels like you're reinventing the Actor model (e.g. Erlang/Elixir).
[+] jgrahamc|1 year ago|reply
I'll admit that I like Go channels because Hoare was the professor when I was doing my doctorate and so CSP was what I used in my thesis[1], but it's worth understanding what unbuffered channels give you: it's message passing with synchronization. They are very simple to reason about and make writing concurrent code a breeze.

[1] https://blog.jgc.org/2024/03/the-formal-development-of-secur...
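The "message passing with synchronization" point is observable in a tiny sketch (the 50ms delay is arbitrary): an unbuffered send does not complete until a receiver takes the value, so the two goroutines rendezvous.

```go
package main

import (
	"fmt"
	"time"
)

// timedSend shows that an unbuffered send blocks until a receiver arrives.
// It returns true if the send took roughly as long as the receiver's delay.
func timedSend() bool {
	ch := make(chan string) // unbuffered: send and receive rendezvous
	go func() {
		time.Sleep(50 * time.Millisecond) // receiver is deliberately late
		<-ch
	}()
	start := time.Now()
	ch <- "hello" // blocks here until the goroutine receives
	return time.Since(start) >= 40*time.Millisecond
}

func main() {
	fmt.Println("send blocked:", timedSend())
}
```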

[+] bjoli|1 year ago|reply
I come from the complete opposite side: I have been writing some Concurrent ML code in Guile Scheme recently, and it has really made me understand why I always disliked Go's channels.

There are just so many things that are just slightly wrong to make it unpleasant. Better than many other things, but still kind of frustrating. I really think unbuffered channels should have been the default.

[+] bsaul|1 year ago|reply
I like the pattern, but message passing where the message is actually just a pointer to a mutable struct (as allowed in Go) kind of defeats the purpose, doesn't it?
[+] tapirl|1 year ago|reply
> One literal interpretation of Pike's quote is that message passing is different than sharing memory for pedantic reasons. Copying values between sender/receiver stacks is safer than sharing memory. But it is also possible to send values with pointers into channels, so that doesn't really prevent developers from abusing the model. I don't think avoiding shared memory is a top of mind consideration for developers deciding whether to use channels.

I think the point of Pike's quote is that, when a goroutine gets a pointer received from a channel, it gets the ownership of the values referenced by the pointer and other goroutines give up the ownership. This is a discipline Go programmers should hold but not a rule enforced by the language.
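That ownership-transfer discipline can be sketched in a toy example (names are invented; nothing here is enforced by the language, it's purely convention):

```go
package main

import "fmt"

// job is mutable state passed by pointer. The convention: once a goroutine
// sends the pointer, it stops touching the value; ownership travels with
// the message.
type job struct{ n int }

// double takes ownership of each job it receives, mutates it, and hands
// ownership back by sending the pointer on out.
func double(in <-chan *job, out chan<- *job) {
	for j := range in {
		j.n *= 2 // safe: this goroutine is the sole owner right now
		out <- j // ownership returns to the receiver
	}
}

// process sends a job through the doubling goroutine and waits for it back.
func process(n int) int {
	in := make(chan *job)
	out := make(chan *job)
	go double(in, out)
	j := &job{n: n}
	in <- j // after this send we must not touch j until we receive it back
	j = <-out
	close(in) // producer closes; the range loop in double ends
	return j.n
}

func main() {
	fmt.Println(process(21)) // 42
}
```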

[+] jerf|1 year ago|reply
As I like to say, Rust may be one of the only languages that has built ownership right into its type system, but the problems that "creates" with programming in Rust are actually problems revealed by programming in Rust, not created. The ownership problems are 100% there in other languages too, it just isn't a compiler error. It can help everyone, in any threaded language, to be thinking like Rust, even if the compiler and type system do not directly help you.

A simple and useful degenerate case of this is that whenever any message is sent, complete ownership of the entire transitively-reachable set of values contained in that message is by default transferred to the receiver. This is worthwhile even if you must explicitly construct some safe value to pass in order to maintain this promise. This keeps things generally easy to reason about without the full complication and power that Rust offers. At the very least, whenever I violate this, I have lots of comments about what and why on both sides of that transaction in my code base.

(I do sort of wish there was a variant of a "go" statement that I could use that made me statically promise that all communication in and out of a given goroutine must be solely in the form of copied values, so I could guarantee that goroutine was an "actor". This is, admittedly, just me wishing very pie-in-the-sky. I doubt it could be turned into a practical proposal.)

[+] doawoo|1 year ago|reply
Y’all should really give Erlang/Elixir a try… this stuff is so much more trivial to deal with in that ecosystem and it pains me that it doesn’t get as much attention as Go does.
[+] tptacek|1 year ago|reply
They're very different languages and have different deployment dynamics. Elixir is a higher-level language. Other languages are lower-level. People will always be frustrated to see a language "shoehorned" into a higher- or lower- level setting where their preferred language might fit. We're an Elixir/Go/Rust shop. You could not use Rust and Go interchangeably for what we use them for, and you couldn't use Elixir almost anywhere we use Go.
[+] tail_exchange|1 year ago|reply
I am curious about Elixir, but I really can't bring myself to spend time mastering a language with dynamic types. You are removing one class of errors from your application, but at the same time inviting a new one that has been solved decades ago.

I'm aware that this is a very controversial take, because lots of people love duck-typed languages, but after working in large codebases, they are a hard pass for me. There's a reason why TypeScript was created for JavaScript, Sorbet was created for Ruby, and type hints are so popular with Python. I think I've seen something about gradual typing being introduced in Elixir, but gradual typing is still a long way from enforcing type safety. Until then, I'll stick with Go.

If I am mistaken and there is a "TypeScript for Elixir", then I would love to know about it.

[+] innocentoldguy|1 year ago|reply
I agree. Having worked a lot with both, I think Elixir is by far the better language, especially for web programming.
[+] liampulles|1 year ago|reply
As an avid Go user, I think async/await is probably a nicer construct for most usecases. But Go channels work fine as long as you keep to the basics and documented patterns.

I can recommend making a utility function which accepts a set of anonymous functions and a concurrency factor. I've since extended this function with a version which accepts a rate limiter, jitter factor, and retry count. This handles most cases where I need concurrency (batches) in a simple and safe way.

[+] adastra22|1 year ago|reply
I guess the grass is always greener on the other side. As a Rust user, I’m constantly thinking “why didn’t they just give us go channels instead of this crazy async/await nonsense?”
[+] brianolson|1 year ago|reply
Rust async/await is less nice than Go goroutines. There are things you can't do, and weird rules around Rust async code. Every Go chan allows multiple readers and multiple writers, but Rust's stdlib and tokio default to single-reader queues.
[+] initplus|1 year ago|reply
Channels and async/await aren’t really equivalent features. Beyond the fact that they both deal with concurrency.

You can do channels (message passing) on top of async await.

[+] adeptima|1 year ago|reply
Well written. I'll recommend it to all newcomers from other languages.

As for uncovered topics for a part 2: long-running Go channels can be a nightmare.

You need to implement some kind of observability for them.

You must find a way to stop, restart, upgrade, and even version the payloads of your channels/goroutines.

A common problem with channel overuse is so-called goroutine leaks. They happen more often than most devs think, especially if library writers start goroutines in init() to maintain a cache or do some background cleanup job. It's good to scan all the packages you use for such surprises.

You might also find concepts like "durable execution" or "workflow" engines down the road.

[+] Intermernet|1 year ago|reply
One pattern for long-running channels is to have another polling / heartbeat channel on the goroutine. This is often redundant, as you can achieve the basic functionality with contexts, but it can be useful when you need to implement some reporting, or more advanced load balancing or back-off strategies.
[+] onionisafruit|1 year ago|reply
I didn’t realize why channels are part of the language vs a standard library feature until I read this. Now it makes sense that it’s about compiler optimizations with goroutines.
[+] parhamn|1 year ago|reply
Go channel behaviors are pretty annoying. For one I always forget the panic scenarios (e.g. writing to a closed channel), I feel like the type system could've done more here.

I recently wrote a simple function that maps out tasks concurrently and can be canceled by a context.WithCancel, or if a task fails. The things that cancel the task mapper need to coordinate very carefully on both sides of the channel so that channels are closed and publishers stop sending in the right sequence. The amount of switches/cancels/signals quickly explodes around the coordination if you get too cute about how to do it (e.g. reading from the error channel to stop the work).

Frankly I'm not sure I still got it right [1]. And this is probably the most unsettling part. Rereading the code I can't possibly remember the cancellation semantics and ordering of the short mess I created. Now I'm wondering if mutexes would've made for more understandable code.

[1] https://gist.github.com/pnegahdar/1783f0a4e03dc9a3da43478994...

[+] jerf|1 year ago|reply
Writing to a closed channel at all is generally a design smell. Generally this is consumers closing channels, which is not a good idea. Only producers should close channels, generally only a producer that is the sole owner of the channel, and then, being the sole owner, it should "know" that it closed the channel and by its structure never write to it again. You probably are overcomplicating matters and need just one top-level channel, which in modern Go is the one contained in a context.Context, to be the one and only stop signal in the system.

I think everyone goes through a bit of a complication phase with channels, I recognize your issues in code I've written myself, this is definitely not a "only a bad person would have this problem" post. But, yes, there probably is an organization that solves this problem. There's an art to using them properly. A good 50% of that art may well be that consumers should never close channels.

(Another one is the utility of sending a channel as part of the message in a different channel. It is intuitively easy to think that channels must be very expensive, but when they're not participating in a select, they're just a handful of machine words in RAM. It is perfectly sensible to send a message to a "server" process that contains the channel in it to send the reply to, because it only costs a few machine words in allocation and precisely the one sync operation it will ever participate in. Channels do not have to amortize their costs with lots of messages; a 1-message channel is practical. This also cleaned up some complicated code I had before, trying to prematurely optimize something that was already very cheap.)
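The reply-channel-in-the-message pattern looks roughly like this (a toy running-sum "server"; names are invented). Each request carries its own one-shot reply channel, and only the server goroutine ever touches the state:

```go
package main

import "fmt"

// request carries its own reply channel. A 1-message channel is just a few
// machine words, so making one per request is cheap.
type request struct {
	x     int
	reply chan int
}

// server owns its state outright; no locks, because only this goroutine
// ever touches sum.
func server(reqs <-chan request) {
	sum := 0
	for req := range reqs {
		sum += req.x
		req.reply <- sum
	}
}

// runningSums sends each input to the server and collects the replies.
func runningSums(inputs []int) []int {
	reqs := make(chan request)
	go server(reqs)
	var out []int
	for _, x := range inputs {
		reply := make(chan int, 1) // single-use reply channel
		reqs <- request{x: x, reply: reply}
		out = append(out, <-reply)
	}
	close(reqs) // producer closes; the server's range loop ends cleanly
	return out
}

func main() {
	fmt.Println(runningSums([]int{1, 2, 3})) // [1 3 6]
}
```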

The other thing is, if you haven't looked at https://pkg.go.dev/golang.org/x/sync/errgroup , you may want to. golang.org/x/ is the not-as-well-known-as-it-should-be extended standard library; things the Go team are not willing to put into the 1.0 backwards compatibility promise so they retain the ability to change things if necessary, but otherwise de facto as high quality as the standard library, modulo some reasonably well-labeled exceptions. Contra some claims that it is impossible to abstract in Go, many of these common concurrency patterns have been abstracted out and you can and should grab them off the shelf.

[+] dlisboa|1 year ago|reply
Not speaking on whether the language should make this easier without an external library, but wouldn't https://github.com/sourcegraph/conc help in that scenario? It has context-aware and error-aware goroutine pools, which seems like the exact fit for what you are trying to do. Although admittedly I didn't dive too deep into your code.
[+] DylanSp|1 year ago|reply
Channels can be a useful primitive, but without more structure they're tough to reason about. I really wish that the Go team would provide more implementations of common patterns and publicize them; things like x/sync/errgroup are fantastic, and I'd love to see more.
[+] JaggerFoo|1 year ago|reply
Timely article for me, since I'm thinking of using channels in a real-world application for the first time.