ttgurney|3 years ago
I do really appreciate that this one uses libcurl on the backend. Surprisingly few browsers do this--Lynx, Links, and w3m all have their own networking code. They have bespoke HTML parsing and rendering as well. Lately I've been thinking I want to see a text-mode browser that just glues together libcurl, curses, simple HTML rendering, and maybe an existing HTML parsing library. As far as I'm aware, no text-based HTML rendering library exists.
Also these classic text browsers have their own implementations of FTP, NNTP, and some other legacy cruft. I'm thinking most of this could easily be provided by libcurl (if at all).
shiomiru|3 years ago
I had a similar idea a while ago, except mine was to glue together components from the nim stdlib.
So I wrote something like that, then I thought "hey, why not implement some CSS too?" and that sent me down the rabbit hole of writing an actual CSS-based layout engine... I eventually also realized that the stdlib html parser is woefully inadequate for my purposes.
In the end, I wrote my own mini browser engine with an HTML5 parser and whatnot. Right now I'm trying to bring it to a presentable state (i.e. integrate libcurl instead of using the curl binary, etc.) so I can publish it.
Anyways, if there's a moral to this story it's that writing a browser engine is surprisingly fun, so go for it :)
ttgurney|3 years ago
> Anyways, if there's a moral to this story it's that writing a browser engine is surprisingly fun, so go for it :)
Good to know. I'd been fairly intimidated by the idea.
augusto-moura|3 years ago
ttgurney|3 years ago
Actually I would not be surprised if the JavaScript engine can be omitted with just a little bit of patching work... assuming there's not actually a build configuration that leaves it out. I've found that with some software projects and their dependencies, "required" does not always mean required.
smaudet|3 years ago
Makes more sense, that's what this guy does anyways with the js engine?
> Surprisingly few browsers do this--Lynx, Links, and w3m all have their own networking code
I think people are suspicious of curl because it is a common utility, and they think it can't possibly have got it right - plus there's something mildly fun about figuring out how to monitor a socket and send/receive IP packets for the first time.
I have played around with the curl code a bit. I also suspect other programs do it in part to get "closer", i.e. to be able to manage/dispatch events from a thread directly instead of waiting on some signal from a curl thread; probably something about security and thread safety too...
shiomiru|3 years ago
w3m even uses its own regex engine for search, because there was no free regex engine with Japanese support the author could've used back then.
1vuio0pswjnm7|3 years ago
https://github.com/curl/curl/commit/68ffe6c17d6e44b459d60805...
https://www.cvedetails.com/product/25084/Haxx-Curl.html?vend...
Instead of only "thinking a lot about text-based browsers", I have been actively using them on a daily basis for the past 26 years.
Links already uses ncurses. I am glad that it does not use libcurl and that it has its own "bespoke" HTML rendering. In over 25 years, I have yet to see any other program produce better rendering of HTML tables as text. I have had few if any problems with Links versions over the years. I am quite good at "breaking" software, and for me Links has been quite robust. The source code is readable for me and I have been able to change or "fix" things I do not like, then quickly recompile. I can remove features. Recently I fixed a version of the program so that a certain semantic link would not be shown in Wikipedia pages. No "browser extension" required.
Links' rendering has managed to keep up with the evolution of HTML and web design sufficiently for me. Despite the enormous variation in HTML across the www, there are very few cases where the rendering is unsatisfactory.^1 I cannot say the same for other attempts at text-only clients. W3C's libwww-based line-mode browser still compiles and works,^2 although I would not be satisfied with its rendering. Nor would I be satisfied with edbrowse, or something simpler such as mynx.^3
I use Links primarily for reading and printing HTML. I use a variety of TCP clients for making HTTP requests, including djb's tcpclient, which I am quite sure beats libcurl any day of the week in terms of quality, e.g., the programming skill level of the author and the care with which it was written. This non-libcurl networking code is relatively small and does not need oss-fuzz. I do not intentionally use libcurl. It is too large and complex for my tastes. For TLS, I mainly use stunnel and haproxy.
1. One rare example I can recall is https://archive.is
2. https://github.com/w3c/libwww
3. https://github.com/SirWumpus/ioccc-mynx
ttgurney|3 years ago
I agree that curl is pretty big and bloated. I would not call it a deficiency that Links et al. don't depend on it.
I was mostly just thinking that since I already have curl on my system, it'd be nice to have a browser that reuses that code. Especially since curl has upstream support for the much smaller BearSSL rather than depending on OpenSSL/LibreSSL.
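For reference, recent curl releases expose this as a configure switch; a build sketch (the install prefix path is illustrative, and `--with-bearssl` also works without a path):

```shell
# Configure curl to use BearSSL as its TLS backend instead of
# OpenSSL/LibreSSL. Omit the other --with-*ssl flags so BearSSL
# is the only backend selected.
./configure --with-bearssl=/usr/local
make && sudo make install

# The first line of `curl --version` names the TLS library the
# binary was actually linked against.
curl --version | head -1
```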
marttt|3 years ago