Dorking: the use of search engines to find very specific data

560 points| abarrettwilsdon | 5 years ago |alec.fyi

195 comments

chris_f|5 years ago

A few corrections:

The + (formerly used to force a term to be present in the result) and ~ (also find synonyms) operators have been deprecated.

Google now advises wrapping the word in quotes instead of using the +. Google will also automatically look for synonyms without the use of ~.

I have seen 'AROUND(n)' mentioned in many other places working as a proximity operator in Google, but I don't believe that is true and haven't found it to work in any logical way.

Also the use of parentheses to nest queries is not necessary in Google. It is actually required for Bing on complicated queries though.

GordonS|5 years ago

Worth mentioning that even if you put a term in double quotes, Google still tries to be too clever - you are not guaranteed to get results that contain your quoted search term :/

Razengan|5 years ago

Google has been gradually becoming useless for anything but the most basic/popular terms, over the last few years.

Just now I had to give up trying to look up "the term for fans that are paid actors" and variations.

Asking on Reddit or Stack Overflow would be faster than Google's search engine for some things now.

mehrdadn|5 years ago

The plus operator in the page appears to be binary rather than unary. I've never used it. Is that affected as well? (Though I'm confused why AND is necessary. Isn't it implied normally?)

abarrettwilsdon|5 years ago

Updated the article to reflect and credited you for the contribution!

EE84M3i|5 years ago

When you say "deprecated", you mean as in "discontinued" right? Not just like, discouraged?

TheSpiceIsLife|5 years ago

No longer have Google Chrome on any devices, switched over to Chromium Edge.

Same browser, different overlords.

Left the default search engine as Bing, but only because Duck Duck Go is useless for geographically local search.

sawaruna|5 years ago

Might be my librarian career bias but I'm always surprised at how few people know about query operators. Ironically as Google search seems to be ignoring vital parts of people's queries, they are becoming more needed now, whereas years ago I would have assumed a constantly improving Google search would get better at determining what I was looking for.

colordrops|5 years ago

The operators don't work as well as they used to, and even when using them lots of results are still left out or are not an exact match. The combination of the SEO arms race and Google's algorithms to filter "bad" information make it nearly impossible to find some things. Sometimes you are looking for that "bad" piece of info as a counter example rather than a source of truth, and don't need google's patronizing filtering, so would prefer exact string matches. But apparently they know better than you.

kebman|5 years ago

You don't even wanna know how many times specialized searches have saved my ass, after multiple years at uni, and working as a writer, journalist, programmer, and even a musician! You can safely say that my entire life revolves around being good at doing various forms of searches.

uniqueid|5 years ago

Last week I blocked every *.google.* domain on my network except "youtube-ui.l.google.com".

Google Search: (1) ask a natural language question (since actual search is hobbled) (2) get unrelated garbage and ads back (3) blame yourself for "not being technical enough" to understand why the results aren't actually garbage.

Google Search has deteriorated to the point that so far I haven't missed it at all.

joe-collins|5 years ago

I've been slowly degoogling myself this year. For ~80% of my search, DDG has been entirely adequate.

I do miss some of Big G's cards, and their Maps is vastly superior to DDG's Apple Maps integration, even despite GMap's advertising. DDG's solution is wild, really: they use Apple for static-image-only maps with no real contextual interface, only a sidebar for search results. If you want directions, you must search for your destination by text alone, then in the sidebar choose to get directions from one of four providers (defaulting to Bing).

But when I just want an engine to match the text I give it (i.e. most of the time), DDG performs at least as well as Google's increasingly-fuzzy matching.

darepublic|5 years ago

Google still good for coding related searches

MattGaiser|5 years ago

What is it you are searching for that the results are useless?

ip_addr|5 years ago

What search do you prefer and why?

neilduncan|5 years ago

I live two towns over from Dorking.

https://en.wikipedia.org/wiki/Dorking

aidos|5 years ago

Also weird for me to see the name here (I’m in the next village over), not one you see popping up often. I occasionally wonder how many other HNers there are scattered about in my local area (I suspect not many).

tomalpha|5 years ago

I grew up in Dorking, but this is the first time (that I can remember...) that I actually read its wikipedia article.

TIL: No one knows why 'Dorking' is called 'Dorking', but there's an English Place-Name Society which since the 1920s has researched the origins of town names in England, and is considered [0] to be "the established national body on the subject".

[0] https://epns.nottingham.ac.uk/

zeristor|5 years ago

Didn’t it feature in “War of the Worlds”?

My Dad worked for Mullard, which was renamed to Philips Electronics and relocated to Dorking.

kolektiv|5 years ago

Almost due south of Dorking, down the road in Horsham. The small town was the first thing I thought of on seeing this title. I'd imagine this area actually has a fair few HNers, as it's in the tech catchment area for London, Reading, etc.

chrisb|5 years ago

I live just a few miles to the North. Nice to see a few other ~Dorking locals here :)

mbrookes|5 years ago

Also in the immediate vicinity. Who knew there were so many of us! (And only those who have seen this on a Sunday & bothered to comment.)

Perhaps a mini meet-up is in order? :)

tutfbhuf|5 years ago

This is reddit humor, that I sometimes miss here. Thx neilduncan.

harha|5 years ago

I think it would be useful to be able to explicitly search around knowledge graph entities or site topics, e.g. a programming language, a city, a season, without having that single/specific term.

So a search would include all sites related to an entity, say Munich or Python, along with the terms the user is searching for, because a page might not specifically include the entity in its keywords or text, might be in a different language, or might use a synonym.

I’m sure search engines consider this somewhat, but explicitly activating such a feature would be a great improvement for the user.

Stackexchange has this feature with tags (using []), with user curated tags. Would be nice to have in DDG or google.

mitchdoogle|5 years ago

I would just like to create my own groups. As another user said, tagging would probably be gamed by SEO companies, but if people could use their own groupings, that problem wouldn't occur. There could even be curated lists out there of specific sites that fall within a general category. At the least, I'd like to be able to block sites from ever appearing in my results. I've used add-ons for that which work pretty well, but it should be built-in, in my opinion.

epanchin|5 years ago

Businesses don’t game stackexchange.

1vuio0pswjnm7|5 years ago

I have a question for anyone reading this thread:

Do you believe you can get consistent results with any search?

For example, if we pick some uncommon search terms will we get the same results on the first search, the second search, the third, etc. Or will the results change?

I did a search with some terms from one of the comments in this thread, in quotes. The first search returned only one result: this thread.

As I searched the same quoted terms repeatedly along with additional terms, more results were returned that contained the exact string of original terms. Surprised by this, I tried a search with only the original terms, in quotes, once again. This time the search returned more than just the one result.

abarrettwilsdon|5 years ago

If it's specific enough, the SERP should stay the same until someone else publishes the same thing

e.g. the search of another article "set up Google Sheets APIs (and treat Sheets like a database)"

turns up my site and a couple Twitter threads talking about it (plus a phishing site which has scraped and republished it). I presume that will stay the same b/c it's such a specific title phrase (but not because searches are necessarily deterministic)

minusSeven|5 years ago

Google removes a shitload of search results for anything related to torrents or porn, forcing me to go to other search engines that won't either censor or remove content for legal reasons.

Even that list of search engines is shrinking now.

yuvadam|5 years ago

Dorking is not that easy to do. Google is very quick to assume you are being malicious on certain queries; try one too many and you'll hit their dreaded captcha that is impossible to pass.

userbinator|5 years ago

That really angers me, and I've tripped it more times than I can count, usually by searching for very specific things. Coworkers have also run into it multiple times (before everyone started working from home, we would exclaim "Fuck you, Google!" and raise a middle finger to the screen, which was a cue to everyone else to help).

The fact that they think you're "not human" when you use a search engine for its intended purpose and show how much you know how to use it is both disturbing and saddening. I wonder if Google's own employees run into it and/or the continuing degradation of results, or if they're somehow given immunity and a much better set of results...

kace91|5 years ago

Back when I was a teenager, I had a book titled "hacking with Google" by Johnny Long that was basically all specific searching tips and terms (oriented to find open vulnerabilities and the like, but still very useful in general despite the tacky name).

I wonder how much of it is still valid after all this time.

mcswell|5 years ago

Back when I was a teenager, I had a slide rule. I can guarantee that a slide rule is still valid, so long as you're not interested in more than two or three significant digits, and you don't want to add or subtract.

Mandatum|5 years ago

I've always heard the search terms as "Google dorks", but never heard it called "dorking". Seems Google articles on the subject prove me wrong.

voldacar|5 years ago

Why doesn't google.com have a comprehensive list of these? I'm constantly seeing new ones that I didn't know about, but google never teaches you about them so you have to find them in obscure blog posts

lstamour|5 years ago

https://support.google.com/websearch/answer/2466433?hl=en but it's not complete. My favourite is actually the "range" operator. I don't need it often, but when combined with the exact match quotation marks, it's great. For example, here's a search for Sony bluetooth headphones available on Amazon.ca for between CA$100 and $150: https://www.google.com/search?rls=en&q=site%3Aamazon.ca+%22C...

The range operator also works great with years, dates, though the Tools menu with shortcuts for before: and after: operators can help there too.

One I haven't seen mentioned yet but used to be documented is that you can leave out words in a phrase by replacing them with an asterisk. I'm having trouble not italicizing text in this comment box, so pretend \* means a single asterisk: "Stocks rose today by \* percent" as a search matches the phrase "stocks rose today, led by a 4.4 percent". (Which until this post, had only one result on Google.)

Note that it's not 100% exact matching, because for actually exact matches you have to select "Verbatim" under Tools > All Results in the menu below the search box on the results page.

The only downside to using all these operators is that you'll get very familiar and frustrated with the Google reCAPTCHA prompts as your search is "too precise to be human". Even when signed in to Google, especially often in Safari on an iPhone. Sigh.
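The range search described above can be sketched as a query string. This is only an illustration: the quoted phrase below is a hypothetical stand-in (the original link is truncated, so its actual phrase is unknown), the helper names are made up for this example, and Google may or may not honour the operators on any given day.

```python
from urllib.parse import urlencode


def build_range_query(site: str, phrase: str, low: int, high: int,
                      currency: str = "$") -> str:
    """Combine site:, an exact-match phrase, and a numeric range (low..high)."""
    return f'site:{site} "{phrase}" {currency}{low}..{currency}{high}'


def search_url(query: str) -> str:
    # urlencode percent-escapes the colon, quotes and dollar signs,
    # and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical phrase; not the one from the truncated link above.
query = build_range_query("amazon.ca", "sony bluetooth headphones", 100, 150)
print(query)
print(search_url(query))
```

The two dots between the numbers are the range operator; everything else is the ordinary site: filter plus quotation marks.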

vezycash|5 years ago

Google randomly ignores "search term in quotes".

Related:examplesite.com used to work well. Now, it's better to use sites like alternativeto.net.

~phrase is unnecessary because Google searches for synonyms by default

phrase1 + phrase2 - Google randomly ignores it. I use it this way +compulsoryTerm

Although rare, there are things I simply can't find using Google. But Bing would. If Google keeps it up, other search engines would benefit.

beachy|5 years ago

Having a reliable search syntax would commoditise Google, as other search engines could offer the same options. Having just a search box, instead of lots of options, was how they moved ahead of e.g. AltaVista in the first place.

Google would rather people are trained to just type human speak into the search box.

beefield|5 years ago

> Why doesn't google.com have a comprehensive list of these?

It is quite obvious that google does not give a s&it whether I find what I think I want to find. Google is much more interested in 1) serving me ads they think are most profitable and 2) giving me results they think I want.

KorfmannArno|5 years ago

My guess would be because Google eventually wants users to find everything via natural language queries.

EE84M3i|5 years ago

One reason they might not have a comprehensive list is because some might be relatively expensive to execute, but they can't/won't disable them for legacy reasons.

mrnuclear|5 years ago

At least now we are somewhat more empowered to find obscure blog posts. Which raises the suggestion that hackers are advantaged towards finding information. Which raises the suggestion that we should take the independent initiative of using SEO to inform more people about how to become search super-users.

ricardo81|5 years ago

Worth pointing out if you do some of these crafted operator searches quite quickly, you'll end up getting blocked or having to complete a captcha. I haven't done so in a while so I'm not sure what their current behaviour is.

Main reason being there's plenty of data mining, e.g. looking for "powered by wordpress" and vulnerable versions, and generally all kinds of data mining that involve very specific requests for information, likely queries that aren't creating revenue, either.

w0mbat|5 years ago

The - prefix operator is very useful and still works.

Google should reinstate the + prefix operator. It was only taken out because it screwed up the search results for Google+, which is dead now.

kilroy123|5 years ago

I find myself having to use the "-" prefix a lot these days.

marcrosoft|5 years ago

I love the “inject JS into the page to find stuff” hack. The author mentions local “site you are on” but this can be applied with headless chrome to crawl many sites.

flywheel|5 years ago

That's web scraping 101

yourad_io|5 years ago

Fun fact: googling for -273.15 without double quotes produces no results.

You need to quote negative arithmetic values when searching, even if there are no other query parameters. It made me wonder if I was misremembering absolute zero.

yjftsjthsd-h|5 years ago

Oh, probably because it interprets it as a logical negation; not "negative X", but "remove X from results".
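If that guess is right, a tiny pre-processing step would avoid the problem. A minimal sketch, assuming the leading minus really is read as the exclusion operator (the function name is made up for illustration):

```python
def quote_leading_minus(query: str) -> str:
    """Wrap every term that starts with '-' in double quotes.

    Assumption: a search engine treats a leading '-' as "exclude this term",
    so quoting makes it literal text instead.
    """
    terms = []
    for term in query.split():
        if term.startswith("-") and len(term) > 1:
            terms.append(f'"{term}"')
        else:
            terms.append(term)
    return " ".join(terms)


print(quote_leading_minus("-273.15"))
print(quote_leading_minus("absolute zero -273.15 celsius"))
```

Note that this quotes every minus-prefixed term, so it would also defeat a deliberate exclusion like -pinterest; it only makes sense when the minus is part of the value.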

jrochkind1|5 years ago

Why is this called "dorking"? "Dorking" is a word that just means using search engines to find very specific data? This seems bizarre to me. Why does this need a special word?

Or it actually means using search operators beyond natural language entry? That's what this page seems to be about? I don't know why that would be called "dorking" either?

the_jeremy|5 years ago

All I want is the ability to search for symbols. Symbolhound.com is the only site I've heard of that will support that, but it leaves a lot to be desired.

Brakenshire|5 years ago

It’s strange to me that more domain-specific search engines haven’t been created. There must be value in a programmer-specific search engine for instance. Or why aren’t there search engines that specialise in news, social media, Q&A websites or events, to give a few examples.

mcswell|5 years ago

Wow, I hadn't heard of that. I need that kind of search a couple times a week. It may leave a lot to be desired, but it's like democracy: the worst possible thing of its class, except for all the rest.

aaron695|5 years ago

Learn to use time. It's a drop down.

The web is slowly atrophying. Going back in time for originals makes a big difference.

Reverse is also true.

After a blow-up, the mass media will repeat the same thing en masse and swamp results.

Often an article in the last hour might have what you want, like the database link they are all talking about.

huffmsa|5 years ago

Don't you just love it when your carefully crafted search finally displays the words or phrases you want in the snippet on the results page, but then when you actually open the link and CTRL+F for it, it's nowhere to be found? Not even in the raw HTML?

I sure do.

Tepix|5 years ago

There's a related thing you can do. If you have web pages somewhere, create a bunch of blank web pages with just one random word on them (something like "ristordshest") and then create an index page that links to them all.

Then link to that index page somewhere where no one except web crawlers will notice it. Then wait a few weeks.

Now when you

a) sell something on eBay where you are not allowed to link to the product support page or some other stupid restriction like that

b) want to promote something on Instagram where you can't link to it

Ask people to google for the search term. There will be only one result: Yours.
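The page-generation step could be sketched like this. It only builds the HTML in memory; the word length, file names and markup are arbitrary choices for illustration, and whether a crawler ever indexes the result is outside the script's control.

```python
import secrets
import string


def random_word(length: int = 12) -> str:
    """Return a random lowercase nonsense word, e.g. something like 'ristordshest'."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))


def build_pages(count: int) -> dict[str, str]:
    """Return {filename: html} for `count` one-word pages plus an index.html linking to them."""
    words: set[str] = set()
    while len(words) < count:  # the set guarantees the words are distinct
        words.add(random_word())

    pages = {}
    links = []
    for word in sorted(words):
        filename = f"{word}.html"
        pages[filename] = f"<!DOCTYPE html><html><body>{word}</body></html>"
        links.append(f'<li><a href="{filename}">{word}</a></li>')

    pages["index.html"] = (
        "<!DOCTYPE html><html><body><ul>" + "".join(links) + "</ul></body></html>"
    )
    return pages


pages = build_pages(5)
for name in pages:
    print(name)
```

Writing the dictionary to disk and uploading it is left out; each value is a complete page body keyed by its file name.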

bmay|5 years ago

the "link:" operator doesn't work for me--it just seems to include the URL's tokens in the search

snowwrestler|5 years ago

Pretty sure that one is deprecated. It was very useful for SEO research, which is probably why it doesn’t work anymore.

abarrettwilsdon|5 years ago

Hmm, I'm seeing the same now.

You can more or less replicate the functionality with intext:specific.url/subsite

Will update and credit you.

peter_d_sherman|5 years ago

A few thoughts:

1) Great information!

2) It seems like the world could use a book like Joe Celko's "SQL For Smarties", but for search engines. Yes, there are such books already, most notably O'Reilly's "Google Hacks" by Rael Dornfest, Paul Bausch, Tara Calishain -- but I think the world could still use a book covering more search engines and search techniques. The above web page would be a great starting point to an endeavor like that.

3) "Dorking" (love that term!) -- is going into my 2020 vocabulary lexicon! <g>

harimau777|5 years ago

Is there any way to search the actual page text? I find that often I remember some unique turn of phrase from the page that I'm looking for and it would be extremely helpful to be able to simply search for that.

abarrettwilsdon|5 years ago

`intext:phrase` and `allintext:multi part phrase`

generally "phrase" works well too
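As a rough sketch, picking between those forms could look like the following. The helper names are invented for this example, and how strictly Google honours either operator is not guaranteed.

```python
def page_text_query(phrase: str) -> str:
    """Use intext: for a single word and allintext: for several words."""
    words = phrase.split()
    if not words:
        raise ValueError("phrase must contain at least one word")
    if len(words) == 1:
        return f"intext:{words[0]}"
    return "allintext:" + " ".join(words)


def exact_phrase_query(phrase: str) -> str:
    """The plain double-quoted form."""
    return f'"{phrase}"'


print(page_text_query("serendipity"))
print(page_text_query("unique turn of phrase"))
print(exact_phrase_query("unique turn of phrase"))
```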

jhbadger|5 years ago

Does filetype: still work? I'm getting zero hits for example filetype:epub

FrankSansC|5 years ago

For some file extensions yes, for others no or not anymore (e.g. .js)

chc|5 years ago

I'm kind of surprised to see Google brought back the + operator. I remember they prominently changed its meaning when they made it the @ of Google+, and I never bothered to check again after it died.

buffin|5 years ago

As a teenager, I used to search for "Index Of <movie name>" for movies. 2/3 times, I was able to find and download the movie I wanted to watch.

zhacker|5 years ago

I think I should rename filechef.com to dorkchef now

iandanforth|5 years ago

The email specific queries don't appear to work. The "@" is ignored by google so you just get results for the domain string.

abarrettwilsdon|5 years ago

The first two appear to still work, but the third does not.

The permutation searches are tricky because you don't know if a lack of results means the email does not exist, or just hasn't been posted anywhere indexed

Will update and credit
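For reference, a permutation search boils down to something like the sketch below. The address patterns are common conventions assumed for illustration, not a list taken from the article, and the name and domain are placeholders.

```python
def email_permutations(first: str, last: str, domain: str) -> list[str]:
    """Return quoted search queries for a few common address patterns."""
    first, last = first.lower(), last.lower()
    local_parts = [
        first,                    # ada@
        f"{first}.{last}",        # ada.lovelace@
        f"{first}{last}",         # adalovelace@
        f"{first[0]}{last}",      # alovelace@
        f"{first}{last[0]}",      # adal@
    ]
    # Quote each address so it is searched as one exact string.
    return [f'"{local}@{domain}"' for local in local_parts]


for query in email_permutations("Ada", "Lovelace", "example.com"):
    print(query)
```

As noted above, an empty result for any of these says nothing definite: the address may not exist, or it may simply never have been posted anywhere indexed.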

j45|5 years ago

This reminds me of an article I once read about the neat tricks that used to exist in altavista.com search engine

Daub|5 years ago

Effective Google-fu is one of the first things I teach my first year students. Few greater life skills exist.

malwarebytess|5 years ago

NLP and, to a lesser extent, SEO have vastly diminished the value of this type of searching.

somerandomboi|5 years ago

It would be useful to use “Dorking”, even for non-programmers. Good article!

flywheel|5 years ago

Prediction: Using the methods of "dorking", this is the only page on the internet among 10 million+ results that is calling this "dorking".

montjoy|5 years ago

I hope it doesn’t catch on since it makes me die a little inside. It’s a very Reddit-type word though. I can easily imagine it being used by non-technical folk and tech journalists.