netnichols | 10 years ago
My first guess would be that they snapshot the DOM in the JS tick immediately after window.onload completes. Maybe they have a short pause to let any fast timeouts or callbacks complete, but there's got to be a cutoff at some point (e.g. to stop an infinite wait for pages that continuously update a relative date). Of course, with their own JS engine, I bet they can get really fancy with the heuristics to determine when to take that snapshot.
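A rough sketch of that guess, as a toy simulation rather than anything a real crawler is known to do: timers scheduled by the page run only if they fall due within a short grace period after onload, then the DOM is snapshotted. `FakePage`, `snapshot_after_onload`, and the 500 ms grace period are all hypothetical names and values chosen for illustration.

```python
import heapq
from itertools import count


class FakePage:
    """Toy stand-in for a loaded page: a DOM string plus pending timers."""

    def __init__(self):
        self.dom = "<p>loading</p>"
        self.ticks = 0
        self._timers = []    # heap of (due_ms, sequence, callback)
        self._seq = count()  # tie-breaker so callbacks are never compared

    def set_timeout(self, delay_ms, callback, now_ms=0):
        heapq.heappush(self._timers, (now_ms + delay_ms, next(self._seq), callback))


def snapshot_after_onload(page, grace_ms):
    """Run timers that fall due within grace_ms of onload, then return the DOM."""
    while page._timers and page._timers[0][0] <= grace_ms:
        due_ms, _, callback = heapq.heappop(page._timers)
        callback(page, due_ms)
    return page.dom


def render(page, now_ms):
    # A fast callback that fills in the real content.
    page.dom = "<p>content</p>"


def tick(page, now_ms):
    # A timer that reschedules itself forever, like a relative-date updater.
    page.ticks += 1
    page.set_timeout(100, tick, now_ms)


page = FakePage()
page.set_timeout(50, render)
page.set_timeout(100, tick)
snapshot = snapshot_after_onload(page, grace_ms=500)
```

The self-rescheduling `tick` timer never empties the queue, so without the cutoff the loop would not end; with it, the snapshot is taken once the next timer falls due after the grace period.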
KMag | 10 years ago
If they're smart, they make the exact timeout a function of an HMAC of the loaded source, so that it's very difficult to experiment, find the exact limit, and fool the indexing system. Back in 2010, it was still a fixed time limit.
Source: executing JavaScript in Google's indexing pipeline was my job from 2006 to 2010.
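To illustrate the HMAC idea, here is a minimal sketch, not a description of any real pipeline: the timeout is derived from a keyed hash of the page source, so it is stable for a given page but unpredictable without the key. The function name `render_timeout_ms`, the key, and the 2000 to 4000 ms range are all assumptions made up for the example.

```python
import hashlib
import hmac


def render_timeout_ms(page_source: bytes, secret_key: bytes,
                      min_ms: int = 2000, max_ms: int = 4000) -> int:
    """Map a page's source to a timeout in [min_ms, max_ms] via a keyed hash."""
    digest = hmac.new(secret_key, page_source, hashlib.sha256).digest()
    # Use the first 8 bytes of the digest as an integer and fold it into the range.
    span = max_ms - min_ms + 1
    return min_ms + int.from_bytes(digest[:8], "big") % span


key = b"hypothetical-secret-key"  # placeholder value for the example
source = b"<html><script>setTimeout(render, 2500)</script></html>"
timeout = render_timeout_ms(source, key)
```

Because the timeout depends on the source itself, any change made to a page in order to probe the limit generally moves the limit too, which is what makes it hard to map out by experiment.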
blumkvist | 10 years ago