Tweets count. HN posts also count (actually as high-quality text :). IoT devices reporting status from the same templates should not qualify as unique pages (count the number of templates if you want). And if I search for some news story, many almost-verbatim copies show up. Those should count only as one, since we are looking for unique texts!
fspeech|1 year ago
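One way to make "almost verbatim copies count as one" concrete is near-duplicate detection. The sketch below is only an illustration, not something the comment proposes: it compares texts by Jaccard similarity over word 3-grams and counts a text as new only if it is not close to one already seen. The function names, the shingle size of 3, and the 0.8 threshold are all assumptions chosen for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, punctuation ignored)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Short texts become a single shingle made of all their words.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def count_unique_texts(texts, threshold=0.8):
    """Count texts, treating any text close to an earlier one as a copy.

    Hypothetical helper: `threshold` is an assumed cutoff, not a standard value.
    This is a quadratic scan, fine for a demo but not for web-scale corpora.
    """
    representatives = []
    for text in texts:
        s = shingles(text)
        if not any(jaccard(s, r) >= threshold for r in representatives):
            representatives.append(s)
    return len(representatives)


articles = [
    "The city council approved the new budget on Tuesday after a long debate.",
    "The city council approved the new budget on Tuesday after a long debate!",
    "A completely different story about weather patterns in the northern hemisphere.",
]
unique_count = count_unique_texts(articles)
```

Note that templated device reports which differ only in a few field values may share few word shingles, so they would likely need a separate step (for example masking numbers before comparing) to be collapsed into one template.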