(no title)
zachmu | 1 year ago
For me the biggest red flag is somebody using a channel as part of an exported library function signature, either as a param or a return value. Almost never the right call.
zachmu | 1 year ago
For me the biggest red flag is somebody using a channel as part of an exported library function signature, either as a param or a return value. Almost never the right call.
lanstin|1 year ago
The library is called "go-treewalk" :) The data of course never ends back in main, it's for doing things or maybe printing out data, not doing more calcualation across the tree.
lelanthran|1 year ago
Okay, I gotta ask - what exactly is wrong with this approach? Unless you're starting only a single goroutine[1], this seems to me like a reasonable approach.
Think about recursively finding all files in a directory that match a particular filter, and then performing some action on the matches. It's better to start a goroutine that sends each match to the caller via a channel so that as each file is found the caller can process them while the searcher is still finding more matches.
The alternatives are:
1. No async searching, the tree-walker simply collects all the results into a list, and returns the one big list when it is done, at which point the caller will start processing the list.
2. Depending on which language you are using, maybe have actual coroutines, so that the caller can re-call the callee continuously until it gets no result, while the callee can call `yield(result)` for each result.
Both of those seem like poor choices in Go.
[1] And even then, there are some cases where you'd actually want the tree-walking to be asynchronous, so starting a single goroutine so you can do other stuff while talking the tree is a reasonable approach.
fnord123|1 year ago
Before Go had iterators, you either had callbacks or channels to decompose work.
If you have a lot of files on a local ssd, and you're doing nothing interesting with the tree entries, it's a lot of work for no payoff. You're better off just passing a callback function.
If you're walking an NFS directory hierarchy and the computation on each entry is substantial then there's value in it because you can run computations while waiting on the potentially slow network to return results.
In the case of the callback, it is a janky interface because you would need to partially apply the function you want to do the work or pass a method on a custom struct that holds state you're trying to accumulate.
Now that iterators are becoming a part of the language ecosystem, one can use an iterator to decompose the walking and the computation without the jank of a partially applied callback and without the overhead of a channel.
kjksf|1 year ago
The caller would do:
Better than exposing channel.But my first question would be: is it really necessary? Are you really scanning such large directories to make async dir traversal beneficial?
I actually did that once but that that was for a program that scanned the whole drive. I wouldn't do it for scanning a local drive with 100 files.
Finally, you re-defined "traversing a tree" into "traversing a filesystem".
I assume that the post you're responding to was talking about traversing a tree structure in memory. In that context using goroutines is an overkill. Harder to implement, harder to use and slower.
Kamq|1 year ago
And it doesn't really color the function because you can trivially make it sync again.
beautron|1 year ago
Yes, but this goes both ways: You can trivially make the sync function async (assuming it's documented as safe for concurrent use).
So I would argue that the sync API design is simpler and more natural. Callers can easily set up their own goroutine and channels around the function call if they need or want that. But if they don't need or want that, everything is simpler and they don't even need to think about channels.
lelanthran|1 year ago
Or a coroutine (caller calls `yield(item)` for each match, and `return` when done).
Filligree|1 year ago
unknown|1 year ago
[deleted]