tolmasky | 6 months ago

> I specifically want a search engine that does not run JavaScript, so that it only finds documents that do not require JavaScripts to display the text being searched. (This is not the same as excluding everything that has JavaScripts; some web pages use JavaScripts but can still display the text even without it.)

Sure... but that's off-topic, right? AI companies are desperate for high-quality data and, unlike search scrapers, are not especially time-sensitive. That is to say, they don't benefit from picking up changes seconds after they are published. They essentially take a "snapshot" and then do a training run. There is no "real-time updating" of an AI model. So they have all the time in the world to wait for a page to reach an ideal state, and all the incentive in the world to do so. Since the data effectively gets "baked into the model" and then stays static for the model's entire lifetime, you over-index on getting the data, not on getting it fast, or cheap, or whatever.
