domnodom|1 month ago
Early browsers without DOMs (with initial release date): WorldWideWeb (Nexus) (Dec 1990), Erwise (Apr 1992), ViolaWWW (May 1992), Lynx (1992), NCSA Mosaic 1.0 (Apr 1993), Netscape 1.0 (Dec 1994), and IE 1.0 (Aug 1995).
Note: Lynx remains a non-DOM browser by design.
AOL 1.0–2.0 (1994–1995) used the AOLPress engine, which was static with no programmable objects.
The ability to interact with the DOM began with "Legacy DOM" (Level 0) in Netscape 2.0 (Sept 1995), IE 3.0 (Aug 1996), AOL 3.0 (1996, via integrated IE engine), and Opera 3.0 (1997). Then there was an intermediate phase in 1997 where Netscape 4.0 (document.layers) and IE 4.0 (document.all) each used their own model.
The first universal standard was the W3C DOM Level 1 Recommendation (Oct 1998). Major browsers adopted this slowly: IE 5.0 (Mar 1999) offered partial support, while Konqueror 2.0 (Oct 2000) and Netscape 6.0 (Nov 2000) were the first W3C-compliant engines (KHTML and Gecko).
Safari 1.0 (2003), Firefox 1.0 (2004), and Chrome 1.0 (2008) launched with native standard DOM support from version 1.0.
Currently, most major browser engines follow the WHATWG DOM Living Standard, which supports real-time feature implementation.
userbinator|1 month ago
The last time I checked, Dillo also has no DOM in any reasonable definition of the term; instead it directly interprets the textual HTML when rendering, which explains why it uses an extremely small amount of RAM.
krasun|1 month ago
logicallee|1 month ago
This is pretty relevant to a project I'm working on - a new web browser not based on Chromium or Firefox.
Web browsers are extremely complex, requiring millions of lines of code in order to deal with a huge variety of Internet standards (and not just the basic ones such as HTML, JavaScript and CSS).
A while ago I wanted to see how much of this AI could get done autonomously (or with a human in the loop). You can see a ten-minute demo I posted a couple of days ago:
https://www.youtube.com/watch?v=4xdIMmrLMLo&t=42s
The source code for this is available here right now:
http://taonexus.com/publicfiles/jan2026/160toy-browser.py.tx...
It's only around 2,000 LOC so it doesn't have a lot of functionality, but it is able to make POST requests and can read some Wikipedia articles, for example. Try it out. It's very slow, unfortunately.
Let me know if you have anything you'd like to improve about it. There's also a feature requests page here: https://pollunit.com/en/polls/ahysed74t8gaktvqno100g
CableNinja|1 month ago
Took a quick glance through the code, it's a pretty decent basic go at it.
I can see a few reasons for slowness - you aren't using multiprocessing or threading. You might have to rework your rendering for it, though: you'll need the renderer running in a loop, re-rendering when the stack changes, with the multiprocessing/thread loop adjusting the stack as its requests finish.
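A minimal sketch of that structure, with entirely hypothetical names (none of this is taken from the actual toy-browser code): worker threads fetch resources and push results onto a queue, and the main loop re-renders whenever a result lands.

    import queue
    import threading
    import urllib.request

    results = queue.Queue()

    def fetch(url: str) -> None:
        # Worker: download one resource, then hand it to the main loop.
        with urllib.request.urlopen(url) as resp:
            results.put((url, resp.read()))

    def render(page: dict) -> None:
        # Placeholder renderer; a real one would redraw the layout here.
        print(f"rendered {len(page)} resource(s)")

    def main(urls: list[str]) -> None:
        for url in urls:
            threading.Thread(target=fetch, args=(url,), daemon=True).start()
        page = {}                        # url -> body: the shared "stack"
        for _ in urls:
            url, body = results.get()    # blocks until a fetch finishes
            page[url] = body
            render(page)                 # re-render on every change

    if __name__ == "__main__":
        main(["https://example.com/"])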
Second, I'd recommend taking a look at existing Python DOM-processing modules. This will let you reuse existing code and extend it to fit your browser, and you won't have to deal with finding all the ridiculous parsing edge cases yourself. It may also speed things up a bit.
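For example, a sketch assuming beautifulsoup4 and html5lib are installed (pip install beautifulsoup4 html5lib); html5lib implements the WHATWG parsing algorithm, so it repairs malformed markup the same way real browsers do:

    from bs4 import BeautifulSoup

    # html5lib follows the WHATWG spec, so mis-nested and unclosed tags
    # are recovered the same way a mainstream browser recovers them.
    html = "<p>unclosed paragraph<table><tr><td>mis-nested"
    soup = BeautifulSoup(html, "html5lib")
    print(soup.prettify())   # full tree: html/head/body are filled in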
I'd also recommend trying to render broken sites (save a copy, break it, see what your browser does), for the sake of completeness.
GaryBluto|1 month ago
https://grail.sourceforge.net/
unknown|1 month ago
[deleted]
chrisweekly|1 month ago
Cool project, thanks for sharing. HN readers should also check out https://hpbn.co (High-Performance Browser Networking) and https://every-layout.dev (amazing CSS resource; the paid content is worth it, but the free parts are excellent on their own).
konaraddi|1 month ago
HPBN is really well written; chapter 4 helped me understand TLS enough to debug a high-latency issue at a previous job. There was an issue where a partially received TLS frame, with no subsequent bits arriving for it, led to a server waiting 30 min for the rest of the bits. HPBN was a huge help. I haven't finished reading it, but I remember there's a part that goes over the trade-offs of increasing vs. decreasing TLS frame sizes, which is a low-level knob I now know exists because of HPBN. Not sure if I'll ever use it, but it's fascinating.
KomoD|1 month ago
utopiah|1 month ago
I'm wondering if examples with Browser/Server could benefit from a small visual, e.g. a desktop/laptop icon on one side and a server on the other.
krasun|1 month ago
Thank you! It is a good suggestion. Let me think about it.
arendtio|1 month ago
The step I am missing is how other resources (images, style sheets, scripts) are being loaded based on the HTML/DOM. I find that crucial for understanding why images sometimes go missing or why pages sometimes appear without styling.
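For what it's worth, the usual mechanics (a rough stdlib-only sketch, not taken from the article): while parsing, the browser collects URLs from tags like img, script src, and link rel="stylesheet", resolves each one against the document's base URL, and fetches them separately - a broken relative path or a single failed fetch is exactly how images go missing or styling disappears.

    from html.parser import HTMLParser
    from urllib.parse import urljoin

    class SubresourceFinder(HTMLParser):
        # Hypothetical helper: collect subresource URLs during parsing.
        def __init__(self, base_url: str):
            super().__init__()
            self.base_url = base_url
            self.resources = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag in ("img", "script") and "src" in attrs:
                self.resources.append(urljoin(self.base_url, attrs["src"]))
            elif tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
                self.resources.append(urljoin(self.base_url, attrs["href"]))

    finder = SubresourceFinder("https://example.com/articles/page.html")
    finder.feed('<img src="../logo.png"><link rel=stylesheet href="/style.css">')
    print(finder.resources)
    # ['https://example.com/logo.png', 'https://example.com/style.css']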
krasun|1 month ago
Thank you!
philk10|1 month ago
Bit unfortunate that more than half of the page is dedicated to network requests, when almost all of the work and complexity of a browser is in the parsing and rendering pipeline.
krasun|1 month ago
Will cover the rendering engine in more detail. I didn't know which sections to go deeper on, so I just stopped and published it to gather more feedback.
edwinjm|1 month ago
I love the "mental model" approach here. Most guides I've seen either get bogged down in the minute details of TLS/handshakes immediately or are way too high-level. The interactive packet visualization is a really nice touch to bridge that gap. Thanks for sharing!
krasun|1 month ago
Thank you!
LoganDark|1 month ago
Claims that browsers transform "d.csdfdsaf" -> https://d.csdfdsaf, but they don't. They only transform domains with valid TLDs, unless you manually add the URL scheme.
A1aM0|1 month ago
Who or what gets to say what a valid TLD is? Especially when people take advantage of their own local resolvers, they could create anything at any time.
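One common answer is a shipped snapshot of the registry, along the lines of this sketch (hypothetical; assumes a local copy of IANA's list from https://data.iana.org/TLD/tlds-alpha-by-domain.txt), which is also why names that only exist in a private resolver don't get the URL treatment:

    def load_tlds(path: str = "tlds-alpha-by-domain.txt") -> set[str]:
        # IANA publishes one TLD per line, uppercase, '#' for comments.
        with open(path) as f:
            return {line.strip().lower() for line in f
                    if line.strip() and not line.startswith("#")}

    def treat_as_url(text: str, tlds: set[str]) -> bool:
        if "://" in text:          # explicit scheme always wins
            return True
        host = text.split("/", 1)[0].split(":", 1)[0]
        return host.rsplit(".", 1)[-1].lower() in tlds

    # treat_as_url("d.csdfdsaf", tlds) -> False (falls through to search)
    # treat_as_url("d.com", tlds)      -> True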
LoganDark|1 month ago
ranger_danger|1 month ago
krasun|1 month ago
schoen|1 month ago
I'd also like to suggest a little more work on the URL parsing (even though most users probably won't enter anything that will be misinterpreted). For example, if a protocol scheme other than https:// or http:// is used, the browser will probably still treat it specially somehow (even though browsers typically seem to support fewer of these than they used to!). It might be good to catch these other cases.
https://en.wikipedia.org/wiki/List_of_URI_schemes
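A sketch of the kind of dispatch being suggested, stdlib only (the routing buckets are hypothetical, not from any real browser):

    from urllib.parse import urlparse

    def route(text: str) -> str:
        scheme = urlparse(text).scheme
        if scheme in ("http", "https"):
            return "fetch over the network"
        if scheme in ("file", "about", "mailto"):
            return "handle specially"
        if scheme:
            return f"unsupported scheme: {scheme}"  # don't silently mis-parse
        return "no scheme: guess a URL or fall back to search"

    print(route("gopher://example"))    # unsupported scheme: gopher
    print(route("mailto:a@b.example"))  # handle specially
    # wrinkle: urlparse("localhost:8080") yields scheme="localhost" on
    # modern Pythons, one of the cases worth catching explicitly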
jeffbee|1 month ago
Perhaps worth editing the DNS section in light of RFC 9460 ... depending on the presence and contents of the HTTPS RR, a browser might not even use TCP. Here's a good blog post surveying the contents of the HTTPS RR from a few years ago: https://www.netmeister.org/blog/https-rrs.html
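For anyone who wants to poke at these records, a quick sketch (assumes dnspython >= 2.1, which added SVCB/HTTPS record support; the contents vary by domain):

    import dns.resolver  # pip install dnspython

    for rr in dns.resolver.resolve("cloudflare.com", "HTTPS"):
        print(rr)  # e.g. '1 . alpn="h3,h2" ipv4hint=... ipv6hint=...'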
englishcat|1 month ago
What are your next steps? Do you plan to add more details of the reflow process?
amelius|1 month ago
Posts like this are the modern version of that.
vivzkestrel|1 month ago
webdevver|1 month ago
i have an even stupider question, which is what if we scrapped ip addresses and just used ethernet addresses to route everything? just make the entire internet network be one big switch.
i think the guy who created tailscale wrote about something like this...
GaryBluto|1 month ago
raghavankl|1 month ago
henrygallen|1 month ago