
Ask HN: If I wanted to make a web browser, where would I start?

45 points| avindroth | 9 years ago | reply

52 comments

[+] jbb555|9 years ago|reply
I'd think about it like this :-

1) Write socket code to retrieve the page from a server

2) Get it to draw the text in a window.

3) Write code to look for tags and start dividing up the retrieved file into a proper DOM tree. Perhaps start by creating an object for each paragraph. Then maybe look for <b> tags, then <div> tags

4) Change the rendering code to draw the screen for the data structure you created above.

5) Iterate a lot. Add support for parsing more and more tags into the data model. Start adding annotations for various attributes.

6) Start improving the layout code. Draw things in <b> tags in bold. Draw things in <div> tags under each other and <span> tags separately.

7) Add limited CSS support. Allow just width, height and border properties. Make the code read the CSS file and attach the attributes to the correct sections of the DOM document you are creating.

8) Improve the layout code to look at the width and height attributes on each element, and where tags are nested, to propagate the width and height information up and down the tree as needed. Draw borders if the CSS tells you to. Look at font weight etc.

9) Iterate. Add one feature at a time. Repeat for a year until you have a browser that can render basic pages.

10) Iterate some more. Read specs. Rewrite your document-to-DOM-tree parser a number of times, perhaps using a more formal grammar. You'll probably be on about year 8 by now.

11) Add javascript support... :P

Perhaps my point, though, is that this seems like something where it's very easy to start small: just render the text of a page, then incrementally add features and improve the layout. It will take a long time though, as there are so many features to add.
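
As a rough illustration of step 3, here is a minimal Python sketch of "look for tags and build a tree". It is only one possible starting point, not a spec-compliant parser: it ignores attributes, comments containing ">", and most malformed nesting. The names `Node`, `parse`, `dump` and the short `VOID_TAGS` list are made up for this example.

```python
import re
from dataclasses import dataclass, field

# Assumption: a short, incomplete list of tags that never take a closing tag.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}

# Splits markup into tags ("<...>") and the runs of text between them.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")


@dataclass
class Node:
    tag: str                                   # "#text" for text nodes
    text: str = ""
    children: list = field(default_factory=list)


def parse(html):
    """Build a naive tree from an HTML string."""
    root = Node("#document")
    stack = [root]                             # currently open elements
    for token in TOKEN_RE.findall(html):
        if token.startswith("</"):
            name = token[2:-1].strip().lower()
            # Close back to the nearest matching open tag, if there is one.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].tag == name:
                    del stack[i:]
                    break
        elif token.startswith("<"):
            if token.startswith("<!"):
                continue                       # skip doctype and comments
            parts = token[1:-1].strip("/ ").split()
            if not parts:
                continue
            node = Node(parts[0].lower())
            stack[-1].children.append(node)
            if node.tag not in VOID_TAGS and not token.endswith("/>"):
                stack.append(node)
        elif token.strip():
            stack[-1].children.append(Node("#text", token.strip()))
    return root


def dump(node, depth=0):
    """Return an indented outline of the tree as a list of lines."""
    label = node.text if node.tag == "#text" else f"<{node.tag}>"
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines
```

Something like `print("\n".join(dump(parse(page_source))))` then shows the nesting, which is handy for checking the tree before any rendering code exists.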

[+] d33|9 years ago|reply
Is it still even doable for a single person to write a new rendering engine from scratch? I'm getting the impression that right now it's all about WebKit, Gecko and IE, and given how complex things got, I imagine that it would cost an enormous sum of money to write something that, say, can display the top 10 Alexa web pages. Things just got too complex.

Perhaps it would be better to have some effort in documenting Webkit code so that more hackers could actually read it and hack on it?

[+] kens|9 years ago|reply
The above list is a good start. I'd add: 2a) make clicking on links work, and 3a) display images. Those two features will make it much more fun.

"How to make a browser" is something I've thought about for a while, ever since someone suggested it as an interview question. (I think it would be an awful question.) It really depends on what you're trying to do. If you want to understand how browsers work and build a toy browser, follow jbb555's list. If you want to display web pages, use WebKit. If you want to build a real, commercial-scale browser, I recommend a large team, since there are a lot of components, each of which is insanely complex. (Everything from CSS support to security to plugins to JavaScript to bookmarks to a debugger to all the new network protocols.)

One more thing to add to jbb555's list: 0) "telnet news.ycombinator.com 80, GET /" - playing around from the command line can give you a good introduction to what really happens when you access a web page. (Edit: HN returns an error page for a plain GET. google.com returns insane code. apple.com looks like a better place to start.)
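
The telnet experiment can also be scripted. Below is a minimal Python sketch, assuming a plain HTTP/1.0 request on port 80 with no TLS, redirects, or chunked encoding. The function names `build_request`, `split_response` and `fetch` are made up for this example.

```python
import socket


def build_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "Connection: close", "", ""]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split a raw response into (status line, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(host, path="/", port=80, timeout=10):
    """Send the request over a TCP socket and read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return split_response(b"".join(chunks))
```

Calling `fetch("example.com")` (a placeholder host) returns the status line, the headers and the raw body. Note that many sites answer a plain port-80 request with nothing more than a redirect to HTTPS.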

[+] asimuvPR|9 years ago|reply
Pretty much this.

- Write the networking code to retrieve data from servers. I'd start with GET requests to keep it simple. This part is not that bad and it's actually pretty fun. Doing it in Python is very simple and straightforward.

- Write the browser GUI, because it's a native application. This depends on what language you use for the task. If Python, you can get away with tkinter (flame suit on) for a very simple prototype. I believe tkinter is single-threaded and that may turn out to be an issue (don't remember).

- Write a simple rendering engine to display the text. No need to do this inside of the GUI code because the output can be read through the console. You are pretty much reading the raw html and extracting the bits you want.

- Plug the rendering engine into a GUI widget. Since this is (presumably) a read-only browser, you can get away with a simple textbox of sorts. Anything else will require a custom widget.

At this point you have a very minimal browser. Adding CSS, JavaScript, etc. requires that you include a pipeline to route data through the proper channels before it reaches the GUI. This is where the bulk of the work goes. Bring beer. :)
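
The console-only "rendering engine" step could look roughly like the Python sketch below. It assumes the standard library's `html.parser` is good enough for a prototype. The class name `TextExtractor`, the helper `html_to_text`, and the short list of block-level tags are made up for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, dropping the contents of <script> and <style>."""

    SKIP = {"script", "style"}
    # Assumption: a small, incomplete set of tags that start a new line.
    BLOCK = {"p", "div", "br", "li", "h1", "h2", "h3", "tr"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the page text, one non-empty line per block element."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    raw_lines = "".join(parser.parts).splitlines()
    lines = (" ".join(line.split()) for line in raw_lines)  # collapse whitespace
    return "\n".join(line for line in lines if line)
```

The returned string can be printed to the console first and later handed to whatever text widget the GUI toolkit provides.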

[+] wslh|9 years ago|reply
You have two main options: do everything from scratch or reuse. Starting from scratch is almost impossible to do alone; even Chrome was developed by different teams with different expertises, some of them acquired from other companies.

The second option is reusing a web component or browser and modifying it for your purposes. Building on top of Chrome or Firefox (or their variants) is a good choice. I think Chrome is better engineered than Firefox and the code is clear. Obviously this is just an opinion.

Now the question is: what do you want to add to such a browser? Is this something you can add via an extension, or is it something that requires a new direction?

[+] sglane|9 years ago|reply
>Chrome was developed by different teams with different expertises, some of them acquired from other companies.

To expand on this, Chrome and Safari use webkit as their rendering/layout engine which is a fork of KDE's KHTML. IE came from NCSA Mosaic-> Spyglass -> Microsoft IE.

To be clear on my suggestions below, let's clarify what a browser actually is. The browser manages things like the navigation bar, bookmarks, download/MIME handling, etc. This is known as the "chrome" or GUI. The layout engine handles HTML parsing and rendering, and is possibly coupled with a JavaScript engine like V8.

If you are trying to write a browser, the answer is to use webkit2 in whatever language you like. It's fast, light, and portable. If you are trying to write a layout engine (rendering engine), do as another poster suggested and write one to target HTML 3.2.

[+] basch|9 years ago|reply
chrome/blink was built on webkit which was a fork of khtml. none of it was from scratch within the last decade.
[+] pmontra|9 years ago|reply
As somebody already wrote "Starting from scratch is almost impossible to do alone". However you could set a first goal of doing some basic rendering of plain HTML, even text based. That should make you familiar with the hurdles of a rendering engine. Then apply some CSS. Even writing a browser for HTML 3.2 from scratch and no JavaScript should be an exercise to be proud of https://www.w3.org/TR/REC-html32 (1997). Then start looking at modern browsers using the links others provided. You should have the experience to understand all the moving parts by then.

This is similar to writing the very first web application without using any framework, maybe even interfacing directly with the web server. I remember decoding CGI arguments in C in 1994 before discovering Perl's CGI.pm. Then you understand the magic inside frameworks and don't get burned by surprises.

[+] retro64|9 years ago|reply
I did this once back in 2003. It rendered text only, with mouse support so you could click on the links instead of tabbing around like Links.

I built it off of the IE object model (Internet Explorer Browser Extensions? Can’t recall). I think I just used the object model for connectivity settings. If you could get to the net using IE, my browser would work as well.

I think I coded to HTML 1.1 specs, and had it working after 2 months (with a few minor rendering issues). It did not support scripting, CSS or really much of anything (although it could POST), but it was surprising how much of the web was accessible. And it was really fast. Not because of my skillz, but when you cut out the garbage, it’s amazing how fast a page will load.

Anyway, I would not recommend this approach, but I would offer you encouragement. HTML parsing isn’t really that big of a deal, and you could get something very minimal (text rendering) working in short order. It would at least give you a feeling of what you are up against.

[+] knabacks|9 years ago|reply
My 2 Cents:

Start out with a solid base: (for example)

- https://cefbuilds.com/

- https://crosswalk-project.org/

- http://electron.atom.io/

- http://nwjs.io/

or you try to help with existing projects: (for example)

- https://github.com/breach/thrust (started out with https://github.com/breach/breach_core a "javascript"-Browser (core is not js))

- https://github.com/minbrowser/min

[+] michaelmior|9 years ago|reply
Breach seems like a poor choice since the project has been inactive for ~2 years.
[+] stellar678|9 years ago|reply
"If you wish to make a web browser from scratch, you must first invent the universe."
[+] p333347|9 years ago|reply
If you are seeking tutorials, or UI libraries etc., sorry, I can't help you with that as I am unaware of those things. However, if you are asking from which aspect to start building one from scratch, I would say start with layout. Try implementing a tiny subset of div-related things like position and display and see if this is what you want to spend your time working on. Even if you can crack this thing easily, remember, this is just the tip of the iceberg. I am not discouraging you, but know that you would have to digest the entire DOM and HTML spec, not to mention have a JavaScript engine and implement all the security-related intricacies, at the least, for your software to become a web browser.
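
To give a feel for that tiny subset, here is a Python sketch of the simplest case only: block boxes stacked vertically, plus `display: none`. It assumes pixel units and ignores margins, padding, inline content and positioning. `Box` and `layout` are made-up names for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Box:
    display: str = "block"           # only "block" and "none" are handled here
    width: Optional[int] = None      # explicit CSS width in px, if any
    height: Optional[int] = None     # explicit CSS height in px, if any
    children: list = field(default_factory=list)
    # Filled in by layout():
    x: int = 0
    y: int = 0
    w: int = 0
    h: int = 0


def layout(box, x, y, available_width):
    """Position `box` at (x, y) and return its computed height."""
    box.x, box.y = x, y
    # A block fills the available width unless it has an explicit width.
    box.w = box.width if box.width is not None else available_width
    cursor = y
    for child in box.children:
        if child.display == "none":
            continue                 # takes no space at all
        cursor += layout(child, x, cursor, box.w)
    # Without an explicit height, a block is as tall as its stacked children.
    box.h = box.height if box.height is not None else cursor - y
    return box.h
```

Width flows down the tree and height flows back up, which is the same up-and-down propagation that real engines have to handle in far more elaborate forms.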
[+] anatoly|9 years ago|reply
I'm curious: is there a ready-to-use set of tests for a layout engine that checks conformance with the specs? Something that you could, without much effort, run against a toy layout engine you'd write?

I know Webkit/Blink have humongous test suites, but these probably aren't easy to adapt to a toy layout engine.

[+] chvid|9 years ago|reply
Back when I did CS 101 this was our final project; the teacher made the HTML-parser, and we students did the renderer and UI.

The HTML-parser is probably the part it makes sense to do first.

What language and platform do you plan to use? I think a modern OO-language will be helpful.

[+] _RPM|9 years ago|reply
Just a note: The browser as we use it today is a result of 20+ years of human work and innovation. It's what you would call a killer application.
[+] iamben|9 years ago|reply
Equally - the car we drive today is a result of 130 years of human work and innovation. But after all that time we're seeing a real shake-up with the electric car. Things change when people try. I say good luck to the girl/guy - perhaps they'll create something incredible :-)
[+] txutxu|9 years ago|reply
Well, this is what I could do in that (or almost any) case:

1) Setting features, key points, points to avoid, and goals of the project.

2) Choosing the toolchain to use, for resolving the points at 1)

3) Coding the features with that toolchain

Step 2 may involve reading a lot (docs, RFCs, specifications, search engines with keywords, searching for "problems", etc.), looking at other people's source code for similar programs, trying other people's tools and libraries and evaluating and comparing them, writing integration tests, and building proof-of-concept prototypes.

The toolchain (programming language, user GUI, etc.) comes with an ecosystem (I hope you don't aim at also rewriting your $lang libraries for validating XML or interacting with devices), so you have to choose by looking at both.

To separate steps 2 and 3, use iterations, while you have resources.

If you find an open-source product that fulfills all but a few of your definitions from point 1), you can consider starting with a patch and a pull request to them, or a fork :P

[+] WayneBro|9 years ago|reply
Do you want to build a web browser or a web browser engine?

If you just want to build a web browser, here's my request for a new browser for iOS since web browsing on iOS is a second or third rate experience:

- Built-in ad-blocking.

- A way to disable JavaScript immediately for any website.

- A desktop mode that actually works, so that YouTube, Reddit, Imgur and other sites don't redirect me to their crappy mobile site without asking.

- Launches a completely separate app for private browsing.

Right now I use Dolphin browser on iOS and despite being in desktop mode, I still have to refresh every YouTube page that I land on in order to get the desktop style native HTML5 video player that allows me to go full-screen. On Imgur, gifv files don't play - I have to change the extension to gif and then wait 60 seconds as it downloads. Lots of other sites have similar types of issues. Meanwhile, gifv files work fine in Brendan Eich's "Brave" browser. It could be as easy to fix as just changing the user-agent.

Dolphin has good ad-blocking, but there is no way to quickly turn off JavaScript for a given page or domain (ala Quick JavaScript Switcher for Chrome).

The thing that really annoys me though about Dolphin is that Private Browsing does not work at all. That's probably because they want you to buy Dolphin Zero for $2.99.

[+] speg|9 years ago|reply
I checked out Brave yesterday for the first time and I think it does most of those things.
[+] softawre|9 years ago|reply
3 dollars? It really annoys you and you can't be bothered to spend 3 dollars?
[+] hugozap|9 years ago|reply
I'd say, if you have the opportunity and want to go wild, skip the DOM and create an equivalent model using OpenGL. All the other Web platform features like the distribution model, resource streaming etc. can still be used. It's just that the DOM is the great bottleneck and most of the hours devoted to improving the Web are being wasted on sorting out quirks related to the DOM. Maybe yours will be the starting point towards a DOM-free Web! :)
[+] IncRnd|9 years ago|reply
You can accept entered URLs and fetch & display web pages quite easily nowadays. You should, depending on the vertical, be very concerned about security.

What are your goals (to a greater degree than "web browser")?

[+] bigato|9 years ago|reply
1. Write a webassembly interpreter/vm; (main step)

2. Using webassembly, write an html interpreter/renderer;

3. Using webassembly, write a css, js ... interpreter/renderer;

4. Keep doing this for all w3c standards for the rest of your life.

Chances are that by the time you are somewhere in the middle of step 2, people will have written the other interpreters in webassembly and the result of step 1 will be a functional browser already.