Show HN: Open-source script to get your site indexed on Google

131 points | goenning | 2 years ago | github.com

61 comments

[+] xnx|2 years ago|reply
This script abuses the Indexing API which is intended for job posting and other specific purposes. https://developers.google.com/search/apis/indexing-api/v3/qu...

Use at your own risk.
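For context, the Indexing API call in question is a simple JSON POST. A minimal sketch of the request body, per the public docs (authentication via a service-account token is elided here):

```python
import json

# Endpoint from the Indexing API docs; calls require an OAuth2 bearer token
# from a service account that is verified as a Search Console owner.
INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

def build_publish_request(url: str, deleted: bool = False) -> dict:
    """Build the JSON body for an Indexing API publish notification."""
    return {
        "url": url,
        "type": "URL_DELETED" if deleted else "URL_UPDATED",
    }

body = build_publish_request("https://example.com/jobs/123")
print(json.dumps(body))
```

The API simply acknowledges the notification; whether Google actually crawls (let alone indexes) the URL is up to it.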

[+] ldoughty|2 years ago|reply
Might simply not work for your website:

> Currently, the Indexing API can only be used to crawl pages with either JobPosting or BroadcastEvent embedded in a VideoObject.

I wanted to highlight (in addition to your statement) that JobPosting is a specific type of structured data.

If the target site doesn't have these elements, it may or may not work... or it may work for now, but not once they realize it's being used incorrectly.

JobPosting structured data: https://developers.google.com/search/docs/appearance/structu...
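For reference, JobPosting structured data is JSON-LD embedded in the page. A minimal sketch (field names are from schema.org; a real posting needs more fields, such as jobLocation and validThrough, to be eligible for rich results):

```python
import json

def job_posting_jsonld(title: str, company: str, date_posted: str, description: str) -> dict:
    # Minimal JobPosting structured data; intentionally incomplete,
    # just enough to show the shape of the markup.
    return {
        "@context": "https://schema.org/",
        "@type": "JobPosting",
        "title": title,
        "hiringOrganization": {"@type": "Organization", "name": company},
        "datePosted": date_posted,
        "description": description,
    }

snippet = job_posting_jsonld("Backend Engineer", "ExampleCo", "2024-01-15", "Build things.")
print(f'<script type="application/ld+json">{json.dumps(snippet)}</script>')
```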

[+] adaboese|2 years ago|reply
The annoying thing about this is that it will ruin this feature for everyone else. I, and many others, use this for requesting to index time sensitive content.
[+] chiefalchemist|2 years ago|reply
Yes and no. I mean, just because something gets indexed doesn't mean Google values it and is willing to expose its customers to it.

The consistent problem with SEO is that most SEOs don't understand Google's business model. They don't understand that Google is going to best serve its customers (i.e., those doing the search). SEOs (and their clients) need to understand that getting Google to index a turd isn't going to change the fact that the content, and the experience it's wrapped in, is still a turd. Google is not interested in pointing its customers to turds.

[+] AznHisoka|2 years ago|reply
Another easy way is to just tweet it, which works for me - they usually get indexed < 1 hour later. Google has access to tweets and the URLs in those tweets.
[+] 3abiton|2 years ago|reply
Paid API access?
[+] gasparto|2 years ago|reply
What happened to the good ol' sitemap.xml?

You'll probably find an npm package with lots of dependencies that'll generate that sitemap for you if that's what you need...
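No dependencies needed, actually; a sitemap is plain XML and the standard library covers it. A minimal sketch of generating one (the namespace URI comes from the sitemaps.org protocol):

```python
from xml.etree import ElementTree as ET

def build_sitemap(urls: list[str]) -> str:
    """Build a minimal sitemap.xml document from a list of page URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for loc in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")

xml = build_sitemap(["https://example.com/", "https://example.com/about"])
print(xml)
```

Optional per-URL elements like `<lastmod>` and `<priority>` would be added as further `SubElement` calls inside the loop.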

[+] sjwhevvvvvsj|2 years ago|reply
I’m failing to see how this isn’t just “hey look at my sitemap”!
[+] mortallywounded|2 years ago|reply
Is this any different from logging into the Google search console and submitting your sitemap/index request?
[+] rkuykendall-com|2 years ago|reply
I submitted 1,900 pages in September and it has yet to look at 600 of them. It did 4 this month.

I wish I had been more picky with my sitemap but I thought including all URLs was the goal. I at least properly weighted them but that doesn't seem to do much.

[+] tomschwiha|2 years ago|reply
Submitting manually takes sooooo long.
[+] goenning|2 years ago|reply
Same result from what I've seen, but not scalable for a larger number of pages.
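On the scalability point: the Indexing API has a per-project daily publish quota (200 requests per day by default, per the docs), so a large backlog has to be spread over days anyway. A minimal sketch of splitting a URL list into quota-sized batches:

```python
from itertools import islice
from typing import Iterable, Iterator

def chunk_urls(urls: Iterable[str], per_day: int = 200) -> Iterator[list[str]]:
    """Split a URL list into day-sized batches that fit under a daily quota."""
    it = iter(urls)
    while batch := list(islice(it, per_day)):
        yield batch

batches = list(chunk_urls([f"https://example.com/p/{i}" for i in range(450)]))
# 450 URLs at 200/day -> 3 batches
```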
[+] dakiol|2 years ago|reply
Is there something for the opposite? I don't want Google (or any other scraper) to index my website. Afaik, robots.txt is not authoritative.
[+] speedgoose|2 years ago|reply

    sudo iptables -A INPUT -p tcp --dport 80 -j DROP
    sudo iptables -A INPUT -p tcp --dport 443 -j DROP
    sudo ip6tables -A INPUT -p tcp --dport 80 -j DROP
    sudo ip6tables -A INPUT -p tcp --dport 443 -j DROP
That should do.
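Short of dropping packets, the advisory signals are robots.txt (controls crawling) and a noindex directive (controls indexing). A minimal sketch of both; note that Googlebot can only see a noindex header or meta tag on pages it is allowed to fetch, so combining it with a robots.txt Disallow for the same page is counterproductive:

```python
def noindex_response_headers() -> dict:
    # X-Robots-Tag asks compliant crawlers (Googlebot included) not to index
    # or follow links on the page; like robots.txt, it is purely advisory
    # and does nothing against scrapers that ignore the convention.
    return {"X-Robots-Tag": "noindex, nofollow"}

# robots.txt that asks all crawlers to stay away from the whole site.
ROBOTS_TXT = "User-agent: *\nDisallow: /\n"

print(noindex_response_headers())
print(ROBOTS_TXT)
```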
[+] ahmedfromtunis|2 years ago|reply
From Indexing API documentation:

> Currently, the Indexing API can only be used to crawl pages with either `JobPosting` or `BroadcastEvent` embedded in a `VideoObject`.

So this might come with the risk of seeing the site you want to boost rather penalized by Google.

[+] goenning|2 years ago|reply
This is not a boost; index != ranking.
[+] mvkel|2 years ago|reply
I recently launched a mini project and was shocked at how difficult it was, and how long it took, to get any of its pages properly indexed on Google.

It's almost as if Google is actively trying -not- to index anything as a way to reduce spam, by forcing the people who really care to jump through 100 hoops.

A great way to ensure the dark web remains dark.

[+] lobsterthief|2 years ago|reply
It just takes time. Getting people to link to it by sharing it in other channels will help to shorten the timeframe.
[+] callalex|2 years ago|reply
How long is long in your case?
[+] leros|2 years ago|reply
I just submit a sitemap URL to Google Search Console Tools. Is this any different?
[+] ninefoxgambit|2 years ago|reply
I’ve seen a lot of indie startups lately that are basically selling faster Google indexing than you can get for free using Google Search Console. I guess they are probably using this feature under the hood.
[+] dewey|2 years ago|reply
I've seen some people even wrapping and re-selling this as SaaS.
[+] nhggfu|2 years ago|reply
"to get your site indexed" => a nonsense claim

+ this technique might make engines aware of your content, but doesn't guarantee indexation whatsoever.

[+] ChrisArchitect|2 years ago|reply
? "what I've noticed"...Google only indexing if a site has backlinks or is submitted by owner. Uh..yeah, how else would google know about a new URL? C'mon. This just seems like the usual SEO obsession/grift with some 'secret' way to get things done. It's straightfwd these days. Are you saying none of the pages you're queuing up are linked to each other? Most cases they would be in some way right? So the spider will start indexing them all based on a top url submission or a few key urls. Do event/job board sites really need all of their pages to be indexed immediately?
[+] navigate8310|2 years ago|reply
So, Google stopped automating indexation because of spam; humanity finds a new way to resume automation to again propagate spam. It seems Google is trapped in its toxic game of search engine optimization.
[+] beeboobaa|2 years ago|reply
Google no longer finds new sites automatically? That might explain why it's been so trash the past few years.

I remember running a few websites back in the day, and with zero interaction with google all of the pages showed up in the search index a day or two after publishing at most.

[+] goenning|2 years ago|reply
The only possible outcome is for them to shut down this API or make it work as documented. There are already at least 10 SaaS products offering this as a service.
[+] RobotToaster|2 years ago|reply
Eh, you can get Google to index your site by just submitting a sitemap; it just takes a little longer.
[+] niemal_dev|2 years ago|reply

[deleted]

[+] federalauth|2 years ago|reply
Thank you for sharing. Can you quickly explain why keeping an MD5 avoids the abuse issue?