Is there a chance for an Everything alternative? That's the awesome tool that lets you ignore the depth/breadth question completely and find any file literally instantly (the power of instant feedback is unparalleled).
Do non-NTFS systems (even the modern ones) not offer this?
the 'locate' command is faster than Everything but arguably not as convenient (being a command-line tool). It seems to support non-NTFS Linux filesystems (meaning it doesn't support NTFS).
edit: catfish is a gui front-end available for locate.
But this tests perf for only finding one file. bfs is optimized for finding many files at once, as it defers closing the dirhandle, and uses io_uring for parallel IO callbacks.
`bfs` doesn't actually use io_uring yet, but it is planned. I'm not sure I'd say it's specifically optimized for finding multiple files at once either; I try to make it fast for many different use cases. There are two benchmarks in the blog post and a few more that I run regularly, e.g. https://github.com/tavianator/bfs/pull/107
I am performing a similar file system tree navigation asynchronously in Node.js, which exposes just a shallow API over the C Linux FS APIs.
I can see you are using the opendir and closedir functions. What is the benefit of using the opendir function[1] when readdir[2] can be called on a location directly? Is the benefit that opendir returns a file descriptor for use in opening a stream to gather directory object descriptors?
I'm considering changing my use of readdir to use the withFileTypes option so that it returns a list of directory entries (objects with artifact name and type) instead of a list of conditions to discern types like I am doing on lines 382-432.
Nice to see other alternatives to find. I personally use fd (https://github.com/sharkdp/fd) a lot, as I find the UX much better. There is one thing that I think could be better, around the difference between "wanting to list all files that follow a certain pattern" and "wanting to find one or a few specific files". Technically, those are the same, but an issue I'll often run into is wanting to search something in dotfiles (for example the Go tools), use the unrestricted mode, and it'll find the few files I'm looking for, alongside hundreds of files coming from some cache/backup directory somewhere. This happens even more with rg, as it'll look through the files contents.
I'm not sure if this is me not using the tool how I should, me not using Linux how I should, me using the wrong tool for this job, something missing from the tool or something else entirely. I wonder if other people have this similar "double usage issue", and I'm interested in ways to avoid it.
Would love to see how bfs compares to fdir[0] for directory traversal. Even though fdir is using Node.js underneath, the comparisons I have done with fd & find are pretty close. Of course, bfs would probably be quite a bit faster...but how much faster exactly?
>bfs operates breadth-first, which typically finds the file(s) you're looking for faster.
I find this statement very confusing.
Typically I'm looking for files in subdirs because I'm not aware of the target's path. That's depth-first (though of course there is a chance I'm going to find my target quite close to the starting point).
And if I'm looking in the same dir I'm currently in (or I know which dir to look at), then there is no need for a sophisticated search anyway, as typically you don't have too many files in a directory unless it is some kind of tmp or cache directory.
Breadth-first doesn't mean it's not looking into subdirectories. It is, but it looks into each subdirectory before descending into the subdirectories' subdirectories.
So it enters the first subdir, looks at the files there, then goes back up, enters the second subdirectory, looks at the files there, ... After checking each subdir it goes into the first subdir of the first subdir, looks at the files there, then goes to the second subdir of the first subdir, and so on.
There is a mistake in the README of the github repo:
In the 1st part of the "Features" section, the "bfs" and "find" columns' contents are swapped. I was a bit confused while reading it and it took me a while to notice that it was a mistake.
I keep coming across examples where io_uring cannot be used. It seems like functionality that needs to be scrapped and replaced by something more robust. It's the fastest thing that no one seems to be able to use.
Or maybe just iterated on until it gets more functionality? Linux has quite a large API surface, I'm not surprised it takes a while to replicate all of it.
austin-cheney | 2 years ago

[1] https://man7.org/linux/man-pages/man3/opendir.3.html
[2] https://man7.org/linux/man-pages/man3/readdir.3.html

Your project is probably more mature, but if you want an alternate approach to examine, here is how I have been doing it: https://github.com/prettydiff/share-file-systems/blob/master...
thecodrr | 2 years ago

Disclaimer: I am the developer of fdir.

[0] https://github.com/thecodrr/fdir
attentive | 2 years ago

I wonder how this compares in speed.
kristopolous | 2 years ago

However, as a disclosure, I was running this on my "bedside laptop", which is a Core2 from like 15 years ago.