
Writing my PhD using groff

236 points| yockyrr | 3 years ago |jstutter.netlify.app

130 comments

[+] seanhunter|3 years ago|reply
Groff is great. I went through a phase of doing reports for work using groff and one of the cool things is that on W Richard Stevens' website there were all his groff macros he used to produce all the beautiful diagrams in TCP/IP Illustrated etc. So I used to have lovely diagrams with spline curves etc thanks to W Richard Stevens.

The great thing about groff (compared in my experience with latex) is that you spend basically zero time on formatting/messing about once you have a set of macros you like, and the document production cycle is really fast so you edit with zero distractions using basically plain text (a lot like markdown) and then any time you want to see the finished product it's very quick to see it.

[+] reknih|3 years ago|reply
The LaTeX criticisms of the article really resonated with me. Long compile times and a narrow "happy path" are the things where I feel LaTeX makes me less productive.

This is a pity because, otherwise, it is a great tool with its focus on document structure and output quality. I'm currently working on a LaTeX successor which seeks to address these issues, but it is really hard to make the right design compromises here -- what can be programmed? What is accessible through dedicated syntax? How does the structure drive presentation?

Computer typesetting is a rabbit hole, but a fascinating one. And I'm sure the last word on it has not been spoken yet :)

[+] svat|3 years ago|reply
• As a rough rule of thumb, TeX can do about 1000–3000 pages a second on today's computers.[1] This is for a (presumably typical) book that was written in plain TeX.

• So if your LaTeX document is taking orders of magnitude more than about a millisecond a page, then clearly the slowdown must be from additional (macro) code you've actually inserted into your document.

• TeX is already heavily optimized, so the best way to make the compilation faster is to not run code you don't need.

• Helping users do that would be best served IMO not by writing a new typesetting engine, but by improving the debugging and profiling so that users understand what is actually going on: what's making it slow, and what they actually need to happen on every compile.

To put it another way: users include macros and packages because they really want the corresponding functionality (and everyone wants a different 10% of what's available in the (La)TeX ecosystem). It's easy to make a system that runs fast by not doing most of the things that users actually want[2], but if you want a system that gives users what they'd get from their humongous LaTeX macro packages and yet is fast, it would be most useful to help them cut down the fluff from their document-compilation IMO.

---

[1] Details: Try it out yourself: Take the file gentle.tex, the source code to the book "A Gentle Introduction to TeX" (https://ctan.org/pkg/gentle), and time how long it takes to typeset 8 copies of the file (with the `\bye` only at the end): on my laptop, the resulting 776 pages are typeset in: 0.3s by `tex`, 0.6s by `pdftex` and `xetex`, and 0.8s by `luatex`.

[2] For that matter, plain TeX is already such a system; Knuth knew a thing or two about programming and optimization!
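The experiment in footnote [1] can be sketched as a small script. This is only a rough sketch under stated assumptions: the helper names (`repeat_tex_source`, `time_engine`, `pages_per_second`) are made up for illustration, the last `\bye` in the file is assumed to be the one that ends the document, and `-interaction=batchmode` is assumed to be accepted by the engine (it is in TeX Live builds). The loop at the end only redoes the arithmetic on the timings quoted above; it does not run TeX.

```python
import subprocess
import time
from pathlib import Path


def repeat_tex_source(source: str, copies: int) -> str:
    """Concatenate `copies` copies of a plain TeX source with one final \\bye.

    Assumption: the last \\bye in the source is the one that ends the document.
    """
    head, sep, _tail = source.rpartition("\\bye")
    body = head if sep else source
    return body * copies + "\\bye\n"


def time_engine(engine: str, tex_file: Path) -> float:
    """Run one engine (e.g. "tex", "pdftex") and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        [engine, "-interaction=batchmode", tex_file.name],
        cwd=tex_file.parent,
        check=False,
        stdout=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def pages_per_second(pages: int, seconds: float) -> float:
    """Throughput for a run that produced `pages` pages in `seconds`."""
    return pages / seconds


# Timings quoted in the footnote for the 776-page test document.
QUOTED_SECONDS = {"tex": 0.3, "pdftex": 0.6, "xetex": 0.6, "luatex": 0.8}
for engine, seconds in QUOTED_SECONDS.items():
    print(f"{engine}: {pages_per_second(776, seconds):.0f} pages/s")
```

With the quoted timings this works out to roughly 2,600 pages a second for `tex`, 1,300 for `pdftex` and `xetex`, and 970 for `luatex`, which is where the 1000–3000 rule of thumb comes from.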

[+] jonathanstrange|3 years ago|reply
I became productive in LaTeX once I stopped doing any typesetting in it until there was a real need for it due to publisher requirements. LaTeX looks great out of the box: I just finished a book that I had to deliver camera-ready, and the publisher (not a LaTeX shop) was very impressed with the quality. It was the standard Memoir book template with almost no changes. Ironically, the documentation for many special typesetting packages in LaTeX looks very bad. Generally, the less you change, the better.

LaTeX really fails at "register-true" typesetting, though. You have to allow it to extend pages here or there by a line or be willing to fix many orphans and widows by hand. AFAIK, this has to do with the text flow algorithms which are paragraph-based and cannot do some global optimizations. (Correct me if I'm wrong, I'm not an expert.)

Btw, I cannot confirm the compile-time criticisms. A whole book takes just a few seconds on my machine for one run. I wonder what people are doing who get slow compile times.

[+] reknih|3 years ago|reply
I figure I should also mention that my LaTeX alternative is called Typst. We do not have much public detail yet but there is a landing page [1] to sign up for more info and beta access as soon as it becomes available.

[1]: https://typst.app/

[+] GiovanniP|3 years ago|reply
If you are working on a LaTeX successor you could be interested in TeXmacs, which is a LaTeX successor which works very nicely in many ways, except apparently selling itself well :-) You could see there how it was designed and how the author answered the questions you are asking.
[+] atoav|3 years ago|reply
I wrote my MA thesis using markdown (with extended syntax). Structuring the document is easy: just use one hash for top level sections and more for subsections. Footnotes are easy, you just add a footnote[^reference] somewhere and then add the footnote text on a separate line somewhere:

  [^reference]: some text
Inline math works by adding $x= \frac{y}{z}$ or in a separate math block by adding two $ signs before and after.

The syntax of markdown is easier there, but LaTeX is arguably much more powerful, e.g. you can load tables from csv data, generate graphs, make it deal with your bibliography, draw circuit diagrams etc. And the layouts tend to look just good.

I ended up converting markdown to Indesign IDML and using this as a source in an Adobe Indesign layout where I could do all the basic typographic settings and styling once and update it on changes.

[+] andrepd|3 years ago|reply
Splitting a long document in chunks and using the `draft` option while writing speed up compilation times considerably. Otherwise you're producing a finalised typeset document of 100+ pages every time you hit F5, no wonder it takes ~10s to finish ;)
[+] pingiun|3 years ago|reply
If you want something LaTeX-like, but with a wider happy path, you should try SILE.
[+] AB1908|3 years ago|reply
Does pandoc help at all?
[+] passerine|3 years ago|reply
I can relate quite well to the author's pursuit in tinkering with their typesetting workflow. When I wrote my bachelor's thesis, I also spent a great deal of time coming up with a custom LaTeX template and workflow. Like the author, one of the pain-points was the relatively slow edit-compile-review cycle of modern LaTeX engines like LuaLaTeX.

In my case, I was mainly concerned with making the resulting thesis.pdf PDF/A compliant. PDF/A is an archival compliance standard that's dedicated to the long term digital preservation of PDF files.

Predictably, I got way too carried away, and ended up trying to create fully-reproducible LaTeX PDFs as well. It was probably overkill for my use-case, but it did result in a fun blog post where I documented the process [1].

[1] https://shen.hong.io/reproducible-pdfa-compliant-latex/

[+] nicce|3 years ago|reply
> Like the author, one of the pain-points was the relatively slow edit-compile-review cycle of modern LaTeX engines like LuaLaTeX.

This depends a lot. In most cases the delay is only about 1 second on modern PCs. A bit more when you cite and build the document twice.

You can use LaTeX in many different ways. There are built-in editors and web services such as Overleaf. In the end, they all use the same workflow or dependencies for building the document, but might add an additional delay.

I too have ended up tweaking my environment a lot. I ended up testing almost every LaTeX workflow.

I finally ended up just using vim and zathura. An optimised docker image with LuaLaTeX builds the document. Second favorite would be the LaTeX plugin for JetBrains products. Overleaf is only good for collaborating.

On my desktop PC, which has 16 CPU cores, there is only very little latency when compiling. But for text editing, it is a bit rare that you need such a PC…

[+] yockyrr|3 years ago|reply
Surely PDF/A can be created just by passing any PDF through Ghostscript with the right flags such as -dPDFA and -dPDFACompatibilityPolicy=1 ?
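For readers who want to try the Ghostscript route, here is a minimal sketch that only assembles the command line. The function name `pdfa_command` and the file names are made up for illustration. The flags are standard `pdfwrite` options, with the compatibility policy written as `-dPDFACompatibilityPolicy=1`, the spelling used in the Ghostscript documentation. One caveat: a fully compliant file normally also needs a PDF/A definition file (Ghostscript ships a `PDFA_def.ps` template) pointing at an ICC profile, which this sketch leaves out, so the result should still be checked with a validator.

```python
import shlex
from pathlib import Path


def pdfa_command(src: Path, dst: Path, level: int = 1) -> list[str]:
    """Build (but do not run) a Ghostscript command for PDF/A conversion.

    Assumption: `gs` is on PATH. The PDF/A definition file and ICC profile
    that strict compliance usually requires are deliberately omitted here.
    """
    return [
        "gs",
        f"-dPDFA={level}",                 # target PDF/A conformance level
        "-dBATCH",
        "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sColorConversionStrategy=RGB",
        "-dPDFACompatibilityPolicy=1",     # drop features PDF/A does not allow
        f"-sOutputFile={dst}",
        str(src),
    ]


cmd = pdfa_command(Path("thesis.pdf"), Path("thesis-pdfa.pdf"))
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the conversion. Whether the output really validates depends on the fonts and colour spaces in the input, which is presumably why the blog post above needed more than two flags.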
[+] vlovich123|3 years ago|reply
I found that those who can’t get consistent styling and have laggy behavior on large documents don’t know how to configure it. I regularly wrote hundred-page reports in Word with embedded Excel, images, and math, and got pretty proficient. There are basically several things you need to do:

* Actually set up a named style for every type of content you have. Creating shortcuts for the common ones doesn’t hurt.
* Use the paid version of whatever powers the free equation editor. It was miles better about 10 years ago.
* Use a master document / sub-document approach for categorizing things. You wouldn’t have a single text file that’s 100 pages long. Split up Word that way too.

I’m pretty sure I got to a state where I was using the tooling as intended because I wasn’t actually fighting the WYSIWYG. Now I did switch to LaTeX at the end because I was tired of not having easy version control. Word has it if you enable change tracking but it can’t beat normal tooling. Also I wanted to learn LaTeX because it felt like a worthwhile investment (it was - writing formulas in LaTeX is wayyy faster and easier to maintain).

So I liked LaTeX just fine. Prefer Markdown / wiki these days because I don’t work with math formulas.

Disclaimer: I have zero experience with the web version and have no idea how it scales. I imagine it still does quite well on large documents but maybe browser rendering is not so good.

[+] amelius|3 years ago|reply
I think the point with LaTeX is that you can automate the document generation process to a great extent. For example, if you have some data, some python scripts that process the data, and some other scripts that generate figures, you can put all of that in a pipeline and build a new version of your document automatically after the data changes.
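The kind of pipeline described here boils down to a make-style staleness check: rebuild a target when it is missing, older than one of its inputs, or downstream of something that is about to be rebuilt. Below is a minimal sketch of that idea. All file names and the two script names (`plot.py`, `results.csv`, and so on) are hypothetical, and in practice `make` or `latexmk` would normally do this job.

```python
from pathlib import Path

# One pipeline step: (target, sources, command that regenerates the target).
Step = tuple[Path, list[Path], list[str]]


def is_stale(target: Path, sources: list[Path]) -> bool:
    """True if the target is missing or older than any of its sources."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(src.stat().st_mtime > target_mtime for src in sources)


def plan(steps: list[Step]) -> list[list[str]]:
    """Return the commands that need to run, in order.

    A step also runs when one of its sources is the target of an earlier
    step that is itself scheduled to run.
    """
    rebuilt: set[Path] = set()
    commands: list[list[str]] = []
    for target, sources, command in steps:
        if any(src in rebuilt for src in sources) or is_stale(target, sources):
            commands.append(command)
            rebuilt.add(target)
    return commands


if __name__ == "__main__":
    # Hypothetical thesis pipeline: data -> figure -> PDF.
    steps: list[Step] = [
        (Path("figure.pdf"), [Path("results.csv"), Path("plot.py")],
         ["python", "plot.py"]),
        (Path("thesis.pdf"), [Path("thesis.tex"), Path("figure.pdf")],
         ["latexmk", "-pdf", "thesis.tex"]),
    ]
    for command in plan(steps):
        print(" ".join(command))
```

Nothing is executed here; the sketch only decides what would need to run after the data changes.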
[+] ramraj07|3 years ago|reply
I wrote multiple papers during my PhD. The theoretical one with lots of equations I wrote with latex. It’d be stupid not to. Overleaf helped though I wrote the paper over 6 years so it only helped in the end.

Then I wrote two bio heavy papers. Using word. My thesis was in word too. If you have a ton of figures and not a ton of equations it’s not the best choice to use latex.

[+] hyperdimension|3 years ago|reply
> writing formulas in LaTeX is wayyy faster and easier to maintain

I was told by a friend that the Equation Editor in Word would silently accept LaTeX math-mode equation syntax and convert it automatically. Besides trying it out briefly, I never used it extensively, so I'm not sure how complete it is. Still, it's there.

[+] noisy_boy|3 years ago|reply
After years of using hacks in MS Word trying to make my CV look the way I wanted, one day I bit the bullet and wrote it in LaTeX. The 3+ hours spent learning LaTeX basics and doing the rewrite were disproportionately small compared to the huge jump in the quality of the output. Having used troff for writing man pages eons ago, this blog makes me interested in learning groff to rewrite my CV in it and compare the experience with that of LaTeX.
[+] scrlk|3 years ago|reply
As a counterpoint, I had to ditch my LaTeX CV when I realised that applicant tracking systems were struggling to properly parse the PDF.

Switching back to a simple Word template (no use of tables; just heading styles and bullet points) and submitting the .docx resolved these issues.

[+] toastal|3 years ago|reply
Speaking of typesetting…

This article is incorrectly scaled for mobile. There's no padding around the text so it butts up against the edge of my display. The line widths are way too long for comfortable reading. The blog entry also starts off with an unsemantic blockquote element that quotes nothing from a source.

But yes, Pandoc is a cool piece of software.

[+] yockyrr|3 years ago|reply
OP here: thanks for the feedback, added padding and correct scale. Should look better now!
[+] mrweasel|3 years ago|reply
Also, weirdly enough, browsers can't seem to go into reader mode to compensate. I've seen this before, but in this case it seems a little weird that reader mode wouldn't work.
[+] zichy|3 years ago|reply
This together with some padding could help:

  <meta name="viewport" content="width=device-width, initial-scale=1">
[+] nonrandomstring|3 years ago|reply
Nice tour of student typesetting today. Not surprising to find roff still in service too. My thesis in the late 80s was set using nroff, fig and eqn, all of which I have fond memories of.

Surely WYSIWYG and "office" suites were a disaster for writing. Students seem to spend lost weeks and months fiddling with MS-Word only to create mediocre looking output.

Personally I'd say it's hard to beat Org-mode, separate plain text files, then adding the desired exporter and style files at the last minute.

[+] GiovanniP|3 years ago|reply
> Students seem to spend lost weeks and months fiddling with MS-Word only to create mediocre looking output.

I am surprised, and keep being surprised, that people haven't yet figured out that there is an excellent tool, TeXmacs, that manages to make WYSIWYG the best way to write structured documents while having complete control over the output and never having to fiddle with details.

[+] fegu|3 years ago|reply
Will definitely try this. I sometimes used latex at work for things like contracts and other documents that should look formal. But occasionally you need to share it with someone to get their input before it is final. Lots of people are unfamiliar with latex. So I switched to markdown. Markdown does not get in your way, so even those unfamiliar with it get the hang of it.
[+] dmd|3 years ago|reply
I wrote my Masters thesis in LaTeX, which is why I wrote my PhD thesis in Word.
[+] otherme123|3 years ago|reply
I've seen people crying over Word because they couldn't work with proper styles or deal correctly with cross-references, bibliography included, all of which is relatively easy in LaTeX. Bibliographies in Word are almost impossible without a third-party plugin like Zotero, and less able people don't even know such plugins exist.

There's a well-functioning line of business at my uni that consists of properly final-formatting theses in Word.

Luckily for Microsoft's "easy" products, there is a legion of people who work for free as technical support.

[+] ModernMech|3 years ago|reply
I wrote my dissertation in Word, and I found it more than sufficient. WYSIWYG is still the best way to edit documents, but it’s not great for version control. Word’s equation editor is great though, and I enjoyed the ability to precisely place figures. Resolving references can take a while when there are hundreds of them, though; I think they could stand to improve that.
[+] kepler1|3 years ago|reply
Yeah there was a time when I thought my speed of typing was the thing slowing down my thesis writing. So I spent a week training Dragon Naturally Speaking to be able to transcribe my voice.

Turns out that really wasn't the bottleneck, and I had just spent another week distracting myself with technology to avoid writing.

[+] PopAlongKid|3 years ago|reply
I wrote my masters thesis using troff in the early 1980s. Later that decade, I used a version of nroff on PC-DOS for my job. It seems, viewed from a sufficient distance, that this wheel has been re-invented a number of times since then.
[+] ncphil|3 years ago|reply
In the mid-70s, I typed my senior thesis on a reconditioned manual (Underwood) and a borrowed electric typewriter. By the time I did my masters in the late 80s, all my papers were composed in vde on CP/M and formatted with TeX.
[+] zvr|3 years ago|reply
Having used many *roff variants (e.g., troff, nroff, ditroff, groff) over decades and also having rather extensive experience with LaTeX, I'd now definitely choose the latter for any serious typesetting task.

Pain points include many customization points: re-creating exact document specifications provided externally, using specific typefaces, creating your own macros... Oh, and leaving ASCII (or ISO-8859-1) for multi-script characters.

Today's groff is a very fine piece of software, if you are satisfied with its default settings and your task is in the domain it handles.

[+] balddenimhero|3 years ago|reply
Similar to the experiences of other commenters, I find the LaTeX edit-compile-review cycle to only grow unreasonably slow when none of the incremental compilation features are used. For larger documents I recommend (i) splitting the document to leverage the \include and \includeonly commands, and (ii) using the TikZ library "external" to avoid the unnecessary recompilation of unchanged graphics. PGF/TikZ is often a bottleneck.

I agree though that it would be nice if the compilation (esp. from scratch) were generally faster.
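To make recommendations (i) and (ii) concrete, here is a small sketch that generates a main file from a list of chapter names. The generator function `main_tex` and the chapter names are hypothetical. The LaTeX commands it emits (`\include`, `\includeonly`, `\usetikzlibrary{external}`, `\tikzexternalize`) are the standard ones, and externalization assumes the engine is run with `-shell-escape`.

```python
from typing import Optional, Sequence


def main_tex(chapters: Sequence[str], only: Optional[Sequence[str]] = None) -> str:
    """Return the source of a main .tex file that \\include's each chapter.

    If `only` is given, an \\includeonly line restricts typesetting to those
    chapters, while page numbers and cross-references for the others are
    taken from their existing .aux files.
    """
    lines = [
        r"\documentclass{book}",
        r"\usepackage{tikz}",
        r"\usetikzlibrary{external}",
        r"\tikzexternalize % cache each picture as its own PDF (needs -shell-escape)",
    ]
    if only is not None:
        unknown = sorted(set(only) - set(chapters))
        if unknown:
            raise ValueError(f"not in chapter list: {unknown}")
        # \includeonly takes a comma-separated list without spaces.
        lines.append(r"\includeonly{" + ",".join(only) + "}")
    lines.append(r"\begin{document}")
    lines.extend(r"\include{" + name + "}" for name in chapters)
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


print(main_tex(["intro", "methods", "results"], only=["methods"]))
```

While working on one chapter you would regenerate the main file with `only=[...]` set to that chapter, then drop the argument for the final full build.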

[+] gnatolf|3 years ago|reply
Still even using all of that, my thesis with heavy inline tikz took about 5 minutes per run (about 120 pages). And a full rerun with all tikz graphs redone (about 20), it took just shy of 20 minutes if the indexes existed already. That was all on a surface 4 pro from ~2015.
[+] 41b696ef1113|3 years ago|reply
The number one reason to lean heavily on sectioning via \include is for debugging. Debugging LaTeX is a disaster, and it is only by compartmentalizing code into smaller sections that you have a hope of isolating the problem.
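Isolating a problem this way is essentially a bisection over chapters. A minimal sketch, assuming exactly one chapter is broken and that it fails regardless of which other chapters are included. The function `find_failing_chapter` and the `compiles` callback are hypothetical; a real callback might write an `\includeonly{...}` line for the given subset and check the engine's exit status.

```python
from typing import Callable, Sequence


def find_failing_chapter(
    chapters: Sequence[str],
    compiles: Callable[[Sequence[str]], bool],
) -> str:
    """Binary-search for the one chapter whose inclusion breaks the build.

    `compiles(subset)` must return True when a document containing only
    `subset` builds cleanly. Assumes a single, independent culprit.
    """
    candidates = list(chapters)
    if compiles(candidates):
        raise ValueError("document compiles; nothing to find")
    while len(candidates) > 1:
        half = len(candidates) // 2
        first = candidates[:half]
        # If the first half fails on its own, the culprit is in it;
        # otherwise it must be in the second half.
        candidates = first if not compiles(first) else candidates[half:]
    return candidates[0]


# Stand-in for a real build: pretend the "results" chapter is broken.
def fake_compiles(subset: Sequence[str]) -> bool:
    return "results" not in subset


chapters = ["intro", "methods", "results", "discussion", "conclusion"]
print(find_failing_chapter(chapters, fake_compiles))
```

With n chapters this needs about log2(n) test builds instead of n, which matters when each build takes minutes.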
[+] jszymborski|3 years ago|reply
I first tried writing my MSc thesis as a set of AsciiDoc(tor) files. I really enjoyed how much more flexibility AsciiDoc gave me over MD so I was pretty set on it. I _really_ hated the equations it generated and AsciiDoc isn't a Pandoc source, sadly. Even worse, the tooling was monstrous. I had entire build scripts that were getting more and more convoluted.

I relented and went to LaTeX, and while the limitations mentioned here resonate with me, I've found it totally doable.

[+] ant6n|3 years ago|reply
The article is making me want to try it, but it’s a bit light on technical details and I’m concerned about having to go down a rabbit hole of learning a bunch of new tech.

Perhaps posting a git repository of a sample phd thesis (with a couple of empty chapters, sample figures/images, tables) could be something that others would really benefit from.