Just tested it by searching for “spline”. This is great! Can someone elaborate on how it was made? Specifically, how does it integrate with Google Scholar? Is that done client side or server side? If server side, how come it hasn’t been blocked by Google? Google don’t seem to like robots using their regular search function, so my guess is they wouldn’t like it for Scholar either. Perhaps it is proxying requests and would also pass on any CAPTCHAs presented? Still, in that case I would expect all requests to get hit with a CAPTCHA. Perhaps it just hasn’t had enough traffic yet?
Likewise, traditional publishers often respond to funders' demands to make research available by e.g. allowing researchers to share their work elsewhere, and often only a year or so after publication [0]. This raises the barrier to sharing and makes the research less findable. It's not odd to expect that when initiatives like Unpaywall [1] make that research more discoverable, things like embargo periods will get worse.
> If sci-hub is going to scrape publishers, they could put in a bit more effort.
+1, especially for the Onion site. An onion service is supposed to be a primary means of hosting uncensored websites, instead of having to look for the latest domain name every day. Unfortunately, it seems nobody cares about it. Most of the time when I access it from my browser, the front-end proxy is malfunctioning or the back-end Tor daemon has died... The Tor network itself does have capacity problems, but they could do much better than a broken front-end proxy, e.g. with OnionBalance.
At a glance it looks like it's really just a proxy that is limited to scholar.google.com and mutates the page slightly (adds a header and sci-hub links).
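For anyone wondering what "mutates the page slightly" could mean in practice, here is a minimal sketch. It is not Sci-Bay's actual code (nobody here has seen it); the function name `rewrite_page`, the banner markup and the `https://mirror.example/` base URL are all made up for illustration. It inserts a banner after `<body>` and appends one extra link after every absolute link in the fetched HTML.

```python
import re
from urllib.parse import quote

# Hypothetical banner markup; the real site's header is unknown.
BANNER = '<div id="proxy-banner">Served via a hypothetical proxy</div>'


def rewrite_page(html: str, link_base: str = "https://mirror.example/") -> str:
    """Add a banner after <body> and an extra link after each absolute link."""
    # Insert the banner right after the first opening <body> tag.
    html = re.sub(
        r"(<body[^>]*>)",
        lambda m: m.group(1) + BANNER,
        html,
        count=1,
        flags=re.IGNORECASE,
    )

    def add_link(match: re.Match) -> str:
        # Percent-encode the original URL so it can be passed along safely.
        encoded = quote(match.group(1), safe="")
        extra = ' <a class="extra" href="' + link_base + encoded + '">[alt]</a>'
        return match.group(0) + extra

    # Append the extra link after every anchor whose href is an http(s) URL.
    return re.sub(
        r'<a href="(https?://[^"]+)"[^>]*>.*?</a>',
        add_link,
        html,
        flags=re.DOTALL,
    )
```

A regex is good enough for a sketch like this; anything meant to survive real-world markup would want a proper HTML parser.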
The page's HTML is the API. It's pretty easy to download a web page, parse the HTML and then extract specific bits of information from it. The browser does the same thing on the user's behalf, which is why it is called the user agent.
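To make that concrete, here is a minimal sketch using only Python's standard library. The `LinkExtractor` class and `extract_links` helper are names I made up, and the markup in the usage note is invented rather than taken from any real results page. It parses an HTML string and pulls out each link's target and text.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently being read, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling `extract_links('<h3><a href="/paper1">Spline <b>methods</b></a></h3>')` gives back a list with one `(href, text)` pair. Downloading the page first is just an HTTP GET, which is where rate limits and CAPTCHAs come in.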
I believe people in academia are paid to write these books anyways, so they might as well not receive the royalties. As a prospective academic myself, I find it unethical.
I don't know how they are doing it, but Google Scholar does not have an API, and scraping is against their TOS.
bringtheaction|8 years ago
samat|8 years ago
Rights holders do not fear torrents as long as they are unusable for the general population.
The second they see something usable, they go berserk.
Gonna need some popcorn to watch this one.
Vinnl|8 years ago
[0] https://medium.com/flockademic/how-open-can-open-access-be-c...
[1] https://unpaywall.org/
lsh|8 years ago
For example this (which sucks): https://sci-bay.org/article?link=https://www.ncbi.nlm.nih.go...
Versus the actual article: https://elifesciences.org/articles/24234
n4r9|8 years ago
agumonkey|8 years ago
Vinnl|8 years ago
I should add that I am also working on a project to incentivise authors to make their work freely available: https://flockademic.com/
(More info here: https://medium.com/p/the-holy-grail-in-open-access-sharing-t... )
bcaa7f3a8bbc|8 years ago
jrochkind1|8 years ago
I don't see this lasting long...
gpm|8 years ago
Does Google generally block proxy servers?
jrochkind1|8 years ago
> See you later
> Too much attention is a bad thing, Sci-Bay decides to stop service for a while. Sorry.
Apparently I was not wrong.
This could be developed as a browser plugin, which would be much harder or almost impossible for Google to prevent. Well, a Firefox plugin at least; a Chrome plugin they presumably wouldn't allow.
matheusmoreira|8 years ago
danielecook|8 years ago
mchannon|8 years ago
moomin|8 years ago
PokemonNoGo|8 years ago
>https://sci-bay.org/scholar?hl=en&as_sdt=0%2C5&q=entropy+sha...
-> Please show you're not a robot
xstartup|8 years ago
https://sci-bay.org/article?link=https://pdfs.semanticschola...
gpm|8 years ago
Myrmornis|8 years ago
However, note that they are very anarchic when it comes to commercial books, not just journal articles!
E.g. from the Sci-Bay search results, this is $131 on amazon.com, and quite possibly the authors do want the royalties.
[BOOK] Intelligent optimisation techniques: genetic algorithms, tabu search, simulated annealing and neural networks D Pham, D Karaboga - 2012 - books.google.com ... Cited by 916 Related articles All 3 versions [Download Book]
gkya|8 years ago
abhishekjha|8 years ago
petra|8 years ago
And as for the future, how do you see Google responding?
shakna|8 years ago
> Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.
Despite this, there is scholar.py [0], which can extract files from Google Scholar, though it explicitly doesn't work around the rate limits.
[0] https://github.com/ckreibich/scholar.py
s2th4d|8 years ago
"See you later Too much attention is a bad thing, Sci-Bay decides to stop service for a while. Sorry. Anyone who knows how Sci-Bay works and wishes this tool benefits more academics, please contact: [email protected]"
aysus|8 years ago
jmnicholson|8 years ago
rkskejfj|8 years ago
lihan|8 years ago
irundebian|8 years ago
tomrod|8 years ago
eruci|8 years ago