Filtering out the spam results is only half the problem. In my experience, when a spam site clones a legitimate site's content, the clone shows up in Google search results and the legitimate site does not. The example that keeps hitting me is GitHub Issues.
Filtering out the spam only removes the clones; it doesn't get the good results back in.
Host a personal (potentially shared with friends) searx (for multi-engine) or whoogle (Google only) instance. Filter out some domains completely, rewrite others. The rewrite part is what allows you to substitute the real deal for spam clone sites. At least searx already dedupes results.
The time spent (including maintenance) will be paid back faster than you might expect.
Optionally rewrite some sites to altfronts like nitter/scribe/piped. If you care about spending time on privacy and decoupling searches from visits, you can set up arbitrary proxying rules.
One benefit among others over browser extensions is that it's a one-time setup for all your devices and clients. All you need to do on reinstall is to change the default search engine.
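To make the filter/rewrite/dedupe idea concrete, here is a minimal standalone sketch in Python. It is not searx's or whoogle's actual plugin API, just the logic such a rule set would apply to a list of result URLs. The domains in `BLOCKED_HOSTS` and `REWRITE_HOSTS` are made-up placeholders, `clean_results` is a hypothetical helper name, and the sketch assumes a clone mirrors the original's paths one-to-one, which will not hold for every clone site.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical rule tables; these domains are placeholders, not a vetted list.
BLOCKED_HOSTS = {"spam-clone.example"}
REWRITE_HOSTS = {
    "gh-mirror.example": "github.com",   # clone -> original (assumes same paths)
    "twitter.com": "nitter.example",     # original -> alternative front end
}


def clean_results(urls):
    """Drop blocked hosts, rewrite mapped hosts, and dedupe, preserving order."""
    seen = set()
    cleaned = []
    for url in urls:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host in BLOCKED_HOSTS:
            continue  # filtered out completely
        target = REWRITE_HOSTS.get(host)
        if target is not None:
            # Swap only the host; scheme, path, and query are kept as-is.
            url = urlunsplit(parts._replace(netloc=target))
        if url not in seen:  # a rewritten clone may collide with the original
            seen.add(url)
            cleaned.append(url)
    return cleaned


results = clean_results([
    "https://gh-mirror.example/org/repo/issues/1",
    "https://github.com/org/repo/issues/1",
    "https://spam-clone.example/page",
    "https://www.twitter.com/user",
])
print(results)
```

Deduping after the rewrite is the important part: once a clone's URL is mapped back to the original, the two entries collapse into one, so the real page ends up in the list even when only the clone was returned.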
It isn't even just 'clones', because so many sites will just summarize an article from somewhere else and give a link to it. Sometimes it is a game of telephone, with one site summarizing a second site, which is a summary of a third, and so on. I want a search engine to show me the original source, not the one with the best SEO.
3np|3 years ago
jccalhoun|3 years ago
krono|3 years ago
Either way, OP's ask was for a way to blacklist results, and I'm providing a method to accomplish exactly that. Edit: The rest is up to Google.