Q_is_4_Quantum | 1 year ago

Random question from a very-much-not-computer-savvy person, on the off chance someone cares to answer: if a tiny charge were levied every time a webserver delivered a page to me, would it cure this kind of problem? I'm imagining that, e.g., my browser has to send some crypto of some variety (or guarantee that it will in the future, so as not to slow things down).
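To make the idea concrete, here is a minimal sketch of a pay-per-page gate. Everything in it is an assumption for illustration: the `serve_page` function, the prepaid `balances` ledger, and the price are hypothetical, and no real payment protocol is implied. The only real piece is the HTTP status code 402 (Payment Required), which Python exposes as `HTTPStatus.PAYMENT_REQUIRED`.

```python
from http import HTTPStatus

# Hypothetical price per page view, in arbitrary tiny units.
PRICE = 100


def serve_page(balances: dict, client_id: str) -> tuple:
    """Deduct one page view from a prepaid balance, or refuse with 402."""
    balance = balances.get(client_id, 0)
    if balance < PRICE:
        # Not enough credit left: no page is served.
        return HTTPStatus.PAYMENT_REQUIRED, "top up to continue"
    balances[client_id] = balance - PRICE
    return HTTPStatus.OK, "<html>page body</html>"


# A client with 250 units can afford two pages; the third is refused.
balances = {"alice": 250}
statuses = [serve_page(balances, "alice")[0] for _ in range(3)]
print(statuses, balances["alice"])
```

Note that the gate only checks who the ledger says is paying. Anyone who can send requests as "alice" spends her balance.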

thephyber|1 year ago

I work in cybersecurity and we need to be able to get info about CVEs extremely quickly from all major software vendors. This means we need to either subscribe to websites for webhooks (async notifications sent to us when an event happens) or poll a website. Polling is extremely inefficient, but it is how this works most of the time, because most of the websites we watch don’t support webhooks/push-style notifications.

We create bot traffic, but we don’t want to. The problem is that the data we want isn’t available when we want it (we can’t wait days/weeks for the central CVE db to unembargo CVE records which have high impact) and isn’t delivered to us. Instead, we have to go through lots of effort to go get it. So we create a resilient crawler. And other similar companies / entities do too. Now we are all competing to get the same info in a short time, so we poll the sites too often. This all becomes a stress on the websites we hit.

All because the info should be open, but the companies with the info don’t want to build the most efficient system to distribute it. And there is probably legal liability for a middleman company to just crawl those websites and build a shim webhook system to push data as soon as it is found to webhook subscribers.
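A rough back-of-the-envelope sketch of why polling is so much heavier than push. The numbers (50 watchers, a 60-second polling interval, 3 new advisories per day) are made-up assumptions, not measurements, and the function names are hypothetical.

```python
def polling_requests(clients: int, interval_s: int, window_s: int) -> int:
    """Requests the site receives when every client polls on a fixed interval."""
    return clients * (window_s // interval_s)


def webhook_deliveries(clients: int, events: int) -> int:
    """Notifications the site sends when it pushes each event to each subscriber."""
    return clients * events


DAY = 24 * 60 * 60  # seconds in one day

# Assumed scenario: 50 watchers, polling once a minute, 3 new advisories a day.
polled = polling_requests(clients=50, interval_s=60, window_s=DAY)
pushed = webhook_deliveries(clients=50, events=3)
print(polled, pushed)  # 72000 150
```

Under these assumptions the site handles 72,000 polling requests a day to deliver what 150 pushed notifications would cover, and the polling load grows as watchers shorten their intervals to beat each other.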

mike_hearn|1 year ago

No, because the bots are running on compromised machines, so they'd just steal the real owner's micropayments.

It'd also entrench search monopolies even harder, because everyone would exempt Google/Bing because they want to get indexed, but they wouldn't exempt other bots like the one you need for your new engine.

malcolmgreaves|1 year ago

This was actually an idea that came along with the original internet. Called “micropayments,” it was meant to let content consumers pay content producers. The gist of it was a little bit (fractions of a cent) times a lot of people.

oneshtein|1 year ago

A subscription model is a better business than a tiny charge, because expenses will be the same, but ROI will be much larger. Even with subscription, many web services are not profitable.

thephyber|1 year ago

Micropayments never took off. Advertising / freemium did. Why?

If your server doesn’t serve responses unless someone pays, then there is the problem of uncertainty for the client — how do I know the content behind the paywall is worth it?

Nearly all of the services we use that index the web are free or cheap, and they depend on being able to crawl the web without logging into services: search engines like Google, Bing, Yandex, and Baidu, for example. LLMs like ChatGPT piggyback on CommonCrawl, in addition to paying for large, expensive data contracts with companies like Reddit.

We have a word for the part of the internet that is walled off from open crawling — the Deep Web.