
I recommend CGI instead of web frameworks

77 points | _fnqu | 4 years ago | halestrom.net

86 comments

[+] simonw|4 years ago|reply
CGI really is a beautiful abstraction.

The reason we stopped using it 15+ years ago was performance: forking a new process for every incoming web request just didn't make sense on ~2000 era hardware.

I wonder how true that is today, given that our machines have vastly more RAM and CPU?

If you squint at them the right way AWS Lambda functions are pretty similar to the CGI model.
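
For anyone who has not written one, the abstraction fits in a few lines. Below is a minimal sketch in Python, assuming a web server already configured to execute the file as a CGI program. `render_response` and the `name` query parameter are made up for illustration; `QUERY_STRING` and `REQUEST_METHOD` are standard CGI variables.

```python
#!/usr/bin/env python3
"""Minimal CGI program: the web server passes request data in environment
variables and reads the HTTP response from this process's stdout."""
import html
import os
import sys
from urllib.parse import parse_qs


def render_response(environ):
    """Build the full CGI response (headers, blank line, body) as a string."""
    # QUERY_STRING and REQUEST_METHOD are standard CGI variables (RFC 3875).
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]  # "name" is a hypothetical parameter
    method = environ.get("REQUEST_METHOD", "GET")
    body = f"<p>Hello, {html.escape(name)}! (method: {html.escape(method)})</p>\n"
    # Headers end at the first blank line; the server adds the status line.
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # One process per request: read the environment, print, exit.
    sys.stdout.write(render_response(os.environ))
```

Every request starts this file as a fresh process, which is exactly the cost being discussed in this thread.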

[+] user5994461|4 years ago|reply
It's unusable today, more so than it was a decade ago:

- modern frameworks have a much higher startup time (see Python imports, the Java VM and others). CGI was fine for running a Perl script with no dependencies.

- it prevents any form of caching. Caching is very important for many use cases.

- it requires opening a fresh connection with every request, to the database and elsewhere (too bad if you thought you could use Redis for caching).

- SSL is everywhere and it has a notable overhead on initialization, meaning you really wish you could reuse connections (not just to databases but also to other API services).

Speaking from experience: I inherited the CGI platform at JP Morgan (the largest US bank), which went back to 2006 and approached a thousand running applications at some point. It works, and it was still the easiest way to deploy any application in 5 minutes 2 decades later, but the drawbacks are real. It's only for short scripts that can tolerate a 5 second startup time and zero caching.

https://thehftguy.com/2020/07/09/the-most-remarkable-legacy-...

[+] pjc50|4 years ago|reply
The problem has got worse rather than better; the machines are approximately 1 million times more capable in terms of raw MIPS, but the runtime startup cost of the languages used has increased, so the wall clock time to start a request has gone up.

2000 era CGI tended to be in Perl or PHP. Even then Java made an appearance, with a long-lived host process, e.g. Websphere.

One advantage which the author gets right is that CGI consumes no resources while not serving requests, so you can have a lot of different CGI programs on the same machine. It also plays nicely with multi-user systems, so an ISP could host a lot of different customers on one box. We could serve CGIs from several hundred users off a 128MB Pentium system, for significantly less resources than one Slack instance.

[+] yrds96|4 years ago|reply
FastCGI maybe? It's a little more complex, but it doesn't spawn a process per request like CGI does.
[+] zozbot234|4 years ago|reply
Forking a new process for every incoming web request might just make a lot of sense on ~2020 era, Spectre- and Meltdown-prone hardware. And so the Wheel of Samsara turns.
[+] Tomte|4 years ago|reply
What are typical process creation times nowadays on Linux?

I see that starting up a Python interpreter and running a Python program (or Ruby or whatever…) might be slow, but how far can I get with a golang, C or ocaml binary?

I don't know, but I'd expect the Linux people to have optimized that to death.
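
One way to get a rough answer on your own machine is to time it. This is a minimal sketch, not a rigorous benchmark: `mean_spawn_seconds` is a made-up helper, the `true` utility stands in for a tiny compiled binary (assuming it is on the PATH), and the numbers include the cost of Python's own `subprocess` machinery.

```python
import shutil
import subprocess
import sys
import time


def mean_spawn_seconds(argv, runs=20):
    """Average wall-clock time to start argv and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return (time.perf_counter() - start) / runs


if __name__ == "__main__":
    candidates = [("python -c pass", [sys.executable, "-c", "pass"])]
    # "true" approximates a minimal compiled binary; skipped if not installed.
    true_path = shutil.which("true")
    if true_path:
        candidates.insert(0, (true_path, [true_path]))
    for label, argv in candidates:
        print(f"{label}: {mean_spawn_seconds(argv) * 1000:.2f} ms per process")
```

Swapping in a Go, C or OCaml binary of your own as `argv` gives the comparison asked about above.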

[+] midasuni|4 years ago|reply
I use CGI all the time. But I’m not trying to serve 2000 users a minute.
[+] throwawayboise|4 years ago|reply
Like writing C in a simple console-based editor and compiling it with cc on the command line, writing raw CGI scripts like this is a great way to start and learn underlying principles. You can build some simple, even useful things and understand every bit of what is happening.

It doesn't scale to what we do on the web today, of course.

It's a good place to start. Not a good place to stay.

[+] midasuni|4 years ago|reply
Who’s we? Not everyone is writing services to scale to a billion users.
[+] littlecranky67|4 years ago|reply
I love the simplicity of CGI. It was ~2002 when I read about it in a book on Linux and had my first server-side generated HTML output a couple of minutes later. But it has no place in today's world except for educational use. In particular, the fact that every HTTP request spawns a whole new process makes it infeasible for any serious webapp.
[+] bachmeier|4 years ago|reply
> It has no place in today's world except for educational use.

That's overly strong. I will create a CGI app every now and then. I can use basically any language I want. Not a lot of thought has to go into it. As the article says, it's a simple approach that makes sense for those of us that aren't so familiar with web development (we're usually making those apps for ourselves). In particular, the "you can never have too many dependencies" philosophy of modern web developers is strange to me.

[+] wazoox|4 years ago|reply
Hum, I beg to differ. I write small utilities with web frontends that run on my PC's local Apache using CGI. A couple of devs from my team and I are the only people likely to run them, and it's plenty fast enough.

Even for a web-facing tool, as long as it doesn't receive more than a couple of requests per second, CGI is good enough, and that probably describes most web apps in existence. The YAGNI principle applies, too. Most people don't work at Google or Facebook or hyper-scaling startups. A fantastic amount of real-world development consists of building boring web front-ends for in-house use at random companies, and 99% of accounting forms or support tickets won't require huge performance, even running with CGI.

[+] mxb|4 years ago|reply
I'll fully admit I'm not as well versed in web programming as it's not my specialty, but aren't serverless functions (e.g. AWS Lambda) essentially the same concept?

I understand that they can solve the problem of horizontal scale as they are spawning a container rather than just a process, but surely if you started with CGI scripts it would be easier to move if you needed to at a later date.

[+] teddyh|4 years ago|reply
How about FCGI?
[+] tored|4 years ago|reply
This is basically what PHP is. PHP can be deployed in several ways (CGI, FCGI, or as a threaded module in Apache), but that rarely matters from the programmer's perspective: the code works the same regardless of what you put in front of it and still keeps CGI-like execution. This is why PHP is a perfect match for the web.
[+] jlundberg|4 years ago|reply
Plain PHP without frameworks is indeed a good choice in the same ballpark as plain CGI.

And on the plus side it scales to bigger things thanks to FastCGI with for instance nginx.

At work, every new team member gets the task of adding their profile to our team roster. It works great: the barrier of changing a single PHP file is low, and so is the risk of breaking the whole web page, which is especially good for juniors.

[+] Gaelan|4 years ago|reply
There's an important point I haven't seen discussed yet: I don't believe it's possible to write secure web apps without a templating engine with safe-by-default XSS handling (i.e. interpolated text is sanitized, unless explicitly marked as trusted HTML somehow), which implies some amount of a web framework or at least a web-specific library.
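
To make "safe by default" concrete, here is a minimal sketch in Python using only the standard library. `TrustedHTML` and `render` are hypothetical names, not part of any real templating engine, and the sketch only covers interpolation into HTML text and quoted attributes, not JavaScript or URL contexts.

```python
import html


class TrustedHTML(str):
    """Marker type (hypothetical name) for markup that must not be escaped."""


def render(template, **values):
    """Fill {placeholders}, escaping every value unless it is TrustedHTML."""
    safe = {
        key: value if isinstance(value, TrustedHTML) else html.escape(str(value))
        for key, value in values.items()
    }
    # The template itself is assumed to be written by the developer, not users.
    return template.format(**safe)
```

With this shape, forgetting to think about escaping yields escaped text, and unescaped output requires an explicit `TrustedHTML(...)` at the call site.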
[+] ozim|4 years ago|reply
I think the important point is the distinction between:

a) static documents - basically HTML+CSS and no scripts on the back-end

There is not much to discuss here. A lot of stuff could be just that, but people don't want to write blog posts directly in HTML :) We have static site generators that are doing great, so it seems to be working well.

b) dynamic documents - you get data from a DB based on a query, e.g. you have a list of phone numbers in Texas and want to find a specific city

Static page generators would not be that useful if one wants to reflect changes from the DB. Queries are also nicer in a DB than having an insanely long list on a page and using CTRL-F. I would say a CGI-only approach would work great for such a use case. You probably want to think about SQL injection, but as it is browse-only it might not be an issue.

c) web applications - here people want all the bells and whistles

Security is important as you probably need authentication, and preventing XSS is quite important here. I would never build a web application with only CGI. Security headers are not that hard to add, but authentication and authorization plus XSS prevention are really hard. Then you have lots of requests that send/filter data. You can have problems with SQL injection, as you have to store users, their passwords and their data; a framework plus ORM helps prevent a lot of trouble. One probably should not use a framework for making a blog or static page. Unfortunately, nowadays most people build web apps.

This is what rubs me the wrong way about posts saying "you don't need X framework, it all should be static documents". Well yes, you don't need a big framework if you build a personal website. You probably need one if you build a web app. The downside is that HTML+CSS, our interface, was designed as a document framework and not as a framework for building application interfaces. That is why we need back-end and front-end frameworks.
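
On the SQL injection point in (b) and (c): even without a framework or ORM, bound parameters cover the basic case. A minimal sketch using Python's built-in `sqlite3` module; the `phones` table, its columns and the sample rows are invented for illustration.

```python
import sqlite3


def find_numbers(conn, state, city):
    """Look up phone numbers with bound parameters instead of string pasting."""
    # The "?" placeholders keep user input as data; it is never parsed as SQL.
    rows = conn.execute(
        "SELECT name, number FROM phones WHERE state = ? AND city = ? ORDER BY name",
        (state, city),
    )
    return rows.fetchall()


def demo_connection():
    """In-memory database with a made-up schema and rows, for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE phones (name TEXT, number TEXT, state TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO phones VALUES (?, ?, ?, ?)",
        [
            ("Ada", "555-0100", "TX", "Austin"),
            ("Bob", "555-0101", "TX", "Dallas"),
        ],
    )
    return conn
```

A classic injection attempt such as `Austin' OR '1'='1` is then just a city name that matches no rows.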

[+] bawolff|4 years ago|reply
If we're talking about something written in C, it's at least as easy as not having memory safety vulnerabilities ;)
[+] wazoox|4 years ago|reply
For these there's good old "perl -T" :)
[+] speedgoose|4 years ago|reply
"premature optimisation is the root of all evil" for sure, but getting a denial of service by the Google bot scanning your website, someone running Apache Bench from a smartphone on a GPRS network, or an unfortunate infinite loop client side (it happens) is not great.

I think it's fine to play with CGI to learn, but I wouldn't push that to production.

[+] RedShift1|4 years ago|reply
All those things can happen without CGI?
[+] wsostt|4 years ago|reply
I'll always have a soft spot for Classic ASP[0]. Makes me dream of a simpler time developing CRUD apps.

[0] https://en.wikipedia.org/wiki/Active_Server_Pages

[+] jon-wood|4 years ago|reply
I really wish I’d had a chance to use ASP in an environment doing it properly, rather than the cowboy atmosphere of a tiny web agency. If I understand correctly the idea was that you’d use a proper language for your core business logic which then gets compiled into DLLs loaded by your ASP application, which could then use VBScript for the simple template logic.

Sadly I was at an agency which (I suspect like many other places) just threw stuff like establishing database connections straight into the templates, and had ludicrously complex stuff being done in VBScript, a language designed for light automation.

[+] beckingz|4 years ago|reply
Just use long variable names please. Had to debug an ASP site written in 2019 that used classic three letter variable names for everything and it was rough.
[+] laurencerowe|4 years ago|reply
The reason most Python folks moved away from CGI more than two decades ago now is that the performance is terrible. The interpreter startup (plus any time spent executing module imports) has to run on every request. This is much less of an issue for shell scripts, which start up very quickly.

The programming model does have its advantages though. Persistent servers risk leaking information across requests (seems to be a particular risk in async programming systems where requests execute concurrently in the same thread) so there's a definite advantage to isolation between requests. I'd love to see a server that used per-request v8 isolates (with snapshots for fast startup time.)

[+] JulianMorrison|4 years ago|reply
You'll find that the isolation goes out of the window when you need sessions; they need to be stored somewhere that persists across requests.
[+] alisonkisk|4 years ago|reply
Fastcgi solved that specific problem about minutes after CGI was invented.
[+] Matthias1|4 years ago|reply
I’m all for simplicity, but I don’t understand why the author treats CGI as a beacon of simplicity. In the example of teaching students, I wouldn’t start with dynamic websites at all. You can introduce requests, pages, urls, etc., in the context of static sites or front end development. Then it should be fairly easy to see the patterns that flask is abstracting away.

Most websites are either primarily static, with no server side code, or they’re fully dynamic, where every page is generated. CGI shines when sites are a mix.

One other alternative to CGI that the author doesn’t mention is reverse-proxying a single page. I find that to have more practical use than CGI.

[+] smaudet|4 years ago|reply
After it's configured, you can focus on a couple of tags and on interpolating some variables.

There is nothing else.

Sure, if you want to learn html just write a page and reload in a browser. But that doesn't teach the simple bits of talking with a server from a browser.

I def agree that there is no modern framework that has an easy mode like this - they are all abstractions which insist you learn a bunch of local, mostly throwaway, tribal knowledge.

Do you want to build a multi-user scheduling system to serve tens of thousands of users or robots? Probably not. But it's perfect for hooking up the output of a couple of variables to a webpage.

[+] vanilla-almond|4 years ago|reply
A somewhat related naive question: is anyone using FastCGI (or just CGI) to interface directly with a web server like Apache, NGINX, or Caddy for their web apps? What reasons would you have for not interfacing with these web servers directly?

These web servers are battle-tested and feature-rich. If you add another layer like an application server (Puma, Gunicorn etc) that sits between the server and your programming language does the application server end up duplicating some (most?) of the functionality of the web server?

[+] loup-vaillant|4 years ago|reply
> Different units: traffic of visitors vs people running sites. I think we confuse them.

Yes we do. All. The. Freaking. Time.

It’s availability bias: most of the web sites we visit are popular and big and complex and have had to solve serious scalability issues. Most of the web sites we make have few visitors and are small and simple and do not have any scalability problem beyond the occasional aggregator hug.

Likewise, most of the software we use is big and complex and used by many. Most of the software we make has fewer than 10 users. Google, Facebook, Microsoft, Amazon… are everywhere, but a stupidly small proportion of companies in the world are as big as they are.

Mike Acton urges us all to "understand the data". That includes how much data we’ll be processing. How many requests per day are we expecting? Are they evenly spaced, or will there be spikes? What’s considered acceptable latency? Stuff like that. Remember, the Pirate Bay at its most powerful only needed 4 rack servers, on top of each other. Very few of us will exceed the capacity of even a single server.

It’s not always easy to see, because the front page of HN (and the front page of pretty much anything for that matter) doesn’t feature the ordinary. So we only see the extraordinary, and get the impression that we have to measure up to that.

[+] onion2k|4 years ago|reply
The only downside of CGI that I know about is the fact it starts a new process to handle each user request.

When you're saying it's preferable to serverside web frameworks you probably ought to point out that it does far less for you. I started out in web dev working on Perl CGI scripts and they were great (once you got your FTP app to use the right CRLF encoding), but you really had to do everything yourself. Some devs might see that as a benefit but I don't.

[+] ufmace|4 years ago|reply
I can agree in that CGI is a cool and simple technology that might make sense for some applications, and can be a good method to teach new developers the basics. I don't think I'd choose it for much now though, unless I knew it was very simple and would never grow more complex.

Most of the complaints here are about performance. I don't think the decreased performance would be much of an issue until you get to pretty large scale. I do think the real issue is the huge complexity in doing things correctly and securely in such a manual environment. Oh, you're going to do cookies by just printing the Set-Cookie header manually? Well now you have to handle everything about your cookies manually and do it correctly. What are the odds you're doing that? Just let Rails etc handle it the right way for you. Going to do CSRF protection manually too, and do anti-XSS escaping correctly everywhere, and mitigate a ton of obscure security issues that most people have never heard of? No way, unless you're a world-class expert. Just use one of the major proven frameworks that already take care of all of that stuff for you.
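
To illustrate how much hides behind one header: a minimal sketch of building a session cookie for a CGI response with Python's standard `http.cookies` module (the `samesite` attribute needs Python 3.8 or later). `session_cookie_header` is a made-up helper, and the attribute choices are one reasonable set of defaults, not a complete security policy.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id):
    """Return a Set-Cookie header line for a CGI response."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    # Attributes that are easy to forget when printing the header by hand.
    cookie["session"]["httponly"] = True   # not readable from page scripts
    cookie["session"]["secure"] = True     # only sent over HTTPS
    cookie["session"]["samesite"] = "Lax"  # limits cross-site sending
    cookie["session"]["path"] = "/"
    return cookie.output()  # a string starting with "Set-Cookie: "
```

This handles only the syntax and flags. Generating unguessable session IDs, expiring them, and the CSRF and XSS concerns above are still entirely on the author.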

[+] oblib|4 years ago|reply
I still use Perl CGI scripts for handling chores on the server side.

The newest version of my app uses service workers to store almost all the app code in the user's browser. Almost everything the user does with the app is done on the client side so for the most part they only hit the server to get or put data in the server side database (CouchDB) and for the most part those are very small gets and puts.

Compared to earlier versions going back to 2002, my server barely has any load on it. If your goal is to track every click a user makes, then CGI isn't a great server-side option. But if your goal is to make a fast and reliable app, then a side benefit of going offline-first/local-first is that you don't need a huge box or tons of bandwidth to run it, and CGI scripts are a fine way to handle small server-side chores, which is pretty much all that's left.

[+] habibur|4 years ago|reply
Not really CGI, but use FCGI instead.

Ruby, Python, PHP: every server-side tech uses FCGI at the backend, the only exception being stacks that have a built-in web server [Java].

FCGI works mostly like CGI, but the program doesn't terminate after execution. Instead it keeps listening for the next request and serves it. It runs like a daemon.

[+] paxys|4 years ago|reply
CGI is an interface for web servers to communicate with an application.

Web frameworks are libraries/helpers on the application side to help with business logic for serving requests.

You can use CGI with web frameworks (look at the ton of useful PHP/Perl/Ruby frameworks out there).

You can also build a fully competent website without CGI OR web frameworks. Modern languages now all have built-in web servers which perform a lot better so Apache/nginx etc. need to function at most as reverse proxies.

In fact even if teaching is the goal I'd argue that Apache/CGI introduce more opaque abstractions, not less. You can create a web server and request loop in any language of choice in like 10 lines and take it from there.
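
As a rough illustration of the "10 lines" claim, here is a minimal sketch using Python's standard `http.server` module. `HelloHandler` and `make_server` are made-up names, and this is a teaching toy, not something to expose to the internet.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is the requested path, e.g. "/hi"
        body = f"You asked for {self.path}\n".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Bind to localhost only; call .serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), HelloHandler)
```

Running `make_server().serve_forever()` and opening `http://127.0.0.1:8000/hi` in a browser shows the whole request loop with no Apache or CGI configuration in between.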

[+] _joel|4 years ago|reply
This is used a lot on routers nowadays, for certain things it's great. I'm not sure running sh in cgi-bin would work for c10m (unless you've got a lot of servers!) or sanitization etc.
[+] clamiax|4 years ago|reply
I agree with the author; in fact, I host my web site using CGI with zero issues. From people who discredit CGI I would like to see, in addition to personal opinions, clear proof that it is not suitable for production on professional web sites, especially low-traffic ones.