top | item 42825249


benwills | 1 year ago

For people interested in this sort of thing, I recently published a blog post that counts HTML tags and their attribute values across a 2.9B-page Common Crawl dataset. [1]

There's also a SQLite DB of the top 1k tag+attr+value combinations available to download. [2]

[1] https://webparsing.io/blog/hidden-in-html-parsing-page-layou...
[2] https://webparsing.io/data/commoncrawl-2024-11-html-tags-att...
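
For anyone curious what counting tag+attr+value combinations looks like in practice, here is a minimal sketch using only the Python standard library. It is not the method from the blog post, and the names `TagAttrCounter` and `count_combinations` are made up for illustration; a real Common Crawl run would need a far more robust and faster pipeline.

```python
from collections import Counter
from html.parser import HTMLParser


class TagAttrCounter(HTMLParser):
    """Hypothetical helper: counts (tag, attribute, value) triples in start tags."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # bare attributes such as <input disabled>.
        for name, value in attrs:
            self.counts[(tag, name, value)] += 1


def count_combinations(pages):
    """Return a Counter of (tag, attr, value) triples across an iterable of HTML strings."""
    total = Counter()
    for html in pages:
        # Use a fresh parser per page so leftover state from a malformed page
        # cannot leak into the next one.
        parser = TagAttrCounter()
        parser.feed(html)
        parser.close()
        total.update(parser.counts)
    return total
```

Calling `count_combinations(pages).most_common(1000)` on a list of HTML strings would give a top-1k list in the same spirit as the downloadable DB, although the DB's actual schema and normalization rules are not described here.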


jamesfinlayson|1 year ago

I think someone who works on Chrome did something similar a few years ago, though I can't remember exactly what they were trying to figure out.