RyanZAG|7 years ago
The performance still left on the table is all in memory allocation and garbage collection, which you could optimize relatively easily if this were written in C, for example by using a memory pool so that you wouldn't need per-item allocations or garbage collection at all.
Of course if performance isn't a big issue for your task, then none of this is really important.
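For what it's worth, a rough version of the pooling idea can be sketched in Go itself. This is only an illustration under my own assumptions: `bufPool` and `formatMessage` are hypothetical names, not code from the article, and `sync.Pool` only reduces allocations and GC pressure rather than removing them the way a hand-rolled C pool could.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers so the hot path does not have to
// allocate a fresh buffer for every message. The name and the choice of
// sync.Pool are illustrative assumptions, not taken from the article.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// formatMessage (hypothetical helper) builds "id:body" in a pooled buffer.
func formatMessage(id int, body string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold old data
	defer bufPool.Put(buf) // hand the buffer back for reuse

	fmt.Fprintf(buf, "%d:%s", id, body)
	return buf.String() // copies the bytes out, so reuse of buf is safe
}

func main() {
	fmt.Println(formatMessage(1, "hello"))
}
```

The returned string is still a fresh allocation, so this trims garbage rather than eliminating it.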
_ph_|7 years ago
This article nicely shows how optimizing your program yields more speed than randomly throwing goroutines at it. In the end it does use goroutines to good effect, but only after proper consideration.
coldtea|7 years ago
The point of Golang is to use them intelligently, not merely to throw them at every problem as if all you've got is a hammer...
iainmerrick|7 years ago
jerf|7 years ago
One problem I see repeatedly when people benchmark concurrency is that they don't pick a problem that is CPU-intensive enough, so it ends up blocked on other parts of the machine. For a task like this, I'd expect optimized Go to easily keep up with a conventional hard drive and, with just a bit of work, to come within perhaps a factor of 2 or 3 of the memory bandwidth on a consumer machine (bearing in mind that since you're going to read a bit, then write some stuff, you won't get the full sequential read performance out of your RAM). That's not because Go is teh awesomez but because the problem isn't that hard. To get big concurrency wins, you need a problem where the CPU is chewing away at something without constantly hitting RAM, disk, or network, so that those systems don't become the bottleneck.
RyanZAG|7 years ago
val_deleplace|7 years ago
weberc2|7 years ago
emperorcezar|7 years ago
Ideally, in this case I would think one would want to check the number of cores and decide what route to take.
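Something along these lines, as a rough sketch. `runtime.NumCPU` is the real standard-library call; `chooseWorkers` and its `minPerWorker` threshold are names I made up for illustration, and a sensible threshold would have to come from measuring.

```go
package main

import (
	"fmt"
	"runtime"
)

// chooseWorkers (hypothetical helper) decides how many goroutines to use.
// It returns 1, meaning "stay sequential", when there is only one core or
// when there is too little work to give each goroutine at least
// minPerWorker items. Otherwise it returns at most one worker per core.
func chooseWorkers(cores, items, minPerWorker int) int {
	if minPerWorker < 1 {
		minPerWorker = 1
	}
	if cores < 2 || items < 2*minPerWorker {
		return 1
	}
	workers := items / minPerWorker
	if workers > cores {
		workers = cores
	}
	return workers
}

func main() {
	cores := runtime.NumCPU() // logical CPUs usable by this process
	// 1000 items per worker is an arbitrary placeholder, not a measured value.
	fmt.Println("workers:", chooseWorkers(cores, 100000, 1000))
}
```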
scott_s|7 years ago
At the point where the author removed parallelism and the sequential code was faster, I think this was the case: the computation was too fine-grained. The author then successfully took advantage of parallelism by applying it at a coarser granularity, so that each thread did more work. At that point the author also tunes the solution for the execution environment, using a fixed set of goroutines to process a bunch of messages rather than one goroutine per message.
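To make the coarser-granularity shape concrete, here is a minimal sketch of a fixed set of goroutines that each handle a contiguous chunk of the messages. It is not the author's code: `processAll` and the `work` callback are hypothetical, and the article's actual implementation may divide the work differently.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processAll (hypothetical helper) applies work to every message using at
// most `workers` goroutines. Each goroutine handles one contiguous chunk,
// so the per-goroutine overhead is paid once per chunk, not per message.
// Results are returned in input order.
func processAll(msgs []string, workers int, work func(string) int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(msgs))
	chunk := (len(msgs) + workers - 1) / workers // ceiling division

	var wg sync.WaitGroup
	for start := 0; start < len(msgs); start += chunk {
		end := start + chunk
		if end > len(msgs) {
			end = len(msgs)
		}
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			// Each goroutine writes only to its own index range,
			// so no locking is needed on results.
			for i := lo; i < hi; i++ {
				results[i] = work(msgs[i])
			}
		}(start, end)
	}
	wg.Wait()
	return results
}

func main() {
	msgs := []string{"a", "bb", "ccc"}
	lengths := processAll(msgs, runtime.NumCPU(), func(s string) int { return len(s) })
	fmt.Println(lengths)
}
```

Static chunks assume the messages cost roughly the same to process. If they vary a lot, feeding batches to the workers over a channel would balance the load better, at the price of some coordination overhead.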