Show HN: Htmldocs – Typeset and generate pdfs with HTML/CSS

226 points | kelvinzhang | 2 years ago | htmldocs.com

htmldocs is an Overleaf-style editor for typesetting documents using HTML/CSS, which provides the same benefits as LaTeX while being more accessible, customizable, and familiar.

I built this because I wanted to programmatically generate invoices and automatically tailor my resume to jobs, but had no good way of generating well-formatted PDFs. I ended up building a pipeline from a templating engine to Chromium rendering to generate PDFs and, given the amount of engineering effort involved, turned it into a tool for others who might want to do the same. There's a built-in API (https://htmldocs.com/docs/documents) that you can call to turn JSON into PDFs in a single call.
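
To make the "JSON into a document" step concrete, here is a minimal sketch of the templating half of such a pipeline, using only the Python standard library. It is not the htmldocs API: the template, the field names (`number`, `customer`, `items`), and the `render_invoice` helper are all assumptions made up for illustration.

```python
import json
from html import escape
from string import Template

# Hypothetical invoice template; the placeholder names are invented for this sketch.
INVOICE_TEMPLATE = Template("""<!doctype html>
<html>
  <body>
    <h1>Invoice $number</h1>
    <p>Billed to: $customer</p>
    <table>
$rows
    </table>
    <p>Total: $total</p>
  </body>
</html>""")


def render_invoice(payload: str) -> str:
    """Turn a JSON invoice payload into an HTML document string."""
    data = json.loads(payload)
    # Escape user-supplied text so it cannot break the markup.
    rows = "\n".join(
        f"      <tr><td>{escape(item['name'])}</td><td>{item['price']:.2f}</td></tr>"
        for item in data["items"]
    )
    total = sum(item["price"] for item in data["items"])
    return INVOICE_TEMPLATE.substitute(
        number=escape(str(data["number"])),
        customer=escape(data["customer"]),
        rows=rows,
        total=f"{total:.2f}",
    )


example = (
    '{"number": 42, "customer": "Acme & Co", '
    '"items": [{"name": "Widget", "price": 9.5}, {"name": "Gadget", "price": 20}]}'
)
html_doc = render_invoice(example)
```

The resulting HTML string would then be handed to whatever renderer produces the PDF.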

htmldocs is different from other tools like Wkhtmltopdf and Weasyprint in that it uses Chromium to generate PDFs, meaning that it supports the most modern CSS features and there's minimal drift between the rendered HTML document and PDF.
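
As a rough illustration of Chromium-based PDF generation in general (not necessarily how htmldocs does it), headless Chromium can print a page to PDF from the command line with `--headless` and `--print-to-pdf`. The sketch below only builds and runs that command; the binary name `chromium` and the file names are assumptions and vary by platform.

```python
import shutil
import subprocess
from pathlib import Path


def chromium_pdf_command(html_path: str, pdf_path: str, binary: str = "chromium") -> list[str]:
    """Build the argument list for a headless Chromium print-to-PDF run."""
    return [
        binary,  # assumed binary name; may be "google-chrome", "chromium-browser", etc.
        "--headless",
        f"--print-to-pdf={pdf_path}",
        Path(html_path).resolve().as_uri(),  # Chromium expects a URL, so use file://
    ]


def html_to_pdf(html_path: str, pdf_path: str, binary: str = "chromium") -> None:
    """Render html_path to pdf_path; raises if the binary is missing or exits non-zero."""
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(chromium_pdf_command(html_path, pdf_path, binary), check=True)


# Building the command does not require Chromium to be installed.
cmd = chromium_pdf_command("invoice.html", "invoice.pdf")
```

Calling `html_to_pdf("invoice.html", "invoice.pdf")` would actually run the browser, assuming it is installed under that name.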

Will also consider open sourcing if there's enough interest in the project!

89 comments

[+] chrismorgan|2 years ago|reply
> htmldocs is different from other tools like Wkhtmltopdf and Weasyprint in that it uses Chromium to generate PDFs, meaning that it supports the most modern CSS features and there's minimal drift between the rendered HTML document and PDF.

WeasyPrint is implemented as a from-scratch and specific-purpose rendering engine, so yeah, it’s different. But wkhtmltopdf uses WebKit, meaning it’s much the same as htmldocs, just backed by a different browser engine.

It’s important to realise, though, that using an existing browser engine doesn’t mean everything’s hunky-dory: in fact, when it comes to some of the things you care about with producing PDFs, some things will be worse, and the WeasyPrint approach has significant advantages. Because browsers don’t care about your use case at all. From time to time they’ll improve things incidentally, but all browsers are missing things like a lot of CSS Paged Media stuff, stuff that’s often been specified for a decade or more, where things like WeasyPrint have had them for years and years.

[+] monax|2 years ago|reply
As the maintainer of wkhtmltopdf @ Odoo I can tell you it's not WebKit. Instead it's outdated WebKit from 2014 running on top of QT4 '^^
[+] pitdicker|2 years ago|reply
Prince has been doing this for 20 years and is in my opinion the gold standard, with good support for footnotes, endnotes, page headers and other little extensions that are only relevant for printing. https://www.princexml.com/

But I'll be giving this a try!

[+] tmcdos|2 years ago|reply
I totally agree - PrinceXML is what I use for PDF generation.
[+] cubefox|2 years ago|reply
See also native CSS support for paged media:

https://www.w3.org/TR/css-page-3/

When I looked into it several years ago, browser support for some critical features wasn't there yet. Not sure whether this has improved.

In principle, this would be a great alternative to proprietary PDF rendering libraries which require you to design your document completely in code (e.g. Java), and to the typical LaTeX approach. You really appreciate the elegance of HTML+CSS once you've had the misfortune of having to do a simple fucking table in LaTeX.

[+] lelanthran|2 years ago|reply
I dunno about that.

For simple stuff, sure HTML and CSS is great, but when I want something print-perfect I use LaTeX.

A simple table in LaTeX is no harder than in HTML, assuming the same level of knowledge in the tool.

But then when you throw in a simple requirement like "left margins on even pages and right margins on odd pages need to be larger", HTML becomes hell to work in.

HTML and CSS, even with a media query for print, sucks.
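
For reference, the CSS Paged Media spec linked upthread does define `:left` and `:right` page selectors for this kind of mirrored margin; whether a given renderer honours them is a separate question. A minimal sketch that generates such a stylesheet follows. The helper name and the margin values are assumptions, and it assumes a left-to-right document whose first page is a right-hand page, so even pages are `:left`.

```python
def mirrored_margins_css(wide: str = "30mm", narrow: str = "15mm") -> str:
    """Return @page rules: wider left margin on left pages, wider right margin on right pages."""
    return (
        # Left-hand (even) pages: larger left margin.
        f"@page :left {{ margin-left: {wide}; margin-right: {narrow}; }}\n"
        # Right-hand (odd) pages: larger right margin.
        f"@page :right {{ margin-left: {narrow}; margin-right: {wide}; }}\n"
    )


css = mirrored_margins_css()
```

The generated rules can be dropped into a `<style>` block or a print stylesheet.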

[+] karteum|2 years ago|reply
Some people might be interested in https://weasyprint.org/
[+] __jonas|2 years ago|reply
Interesting, I'm looking for a solution to render PDFs at the moment, it looks like that one actually does not use chromium, unlike most tools that I've seen that render HTML to PDFs. What's the advantage? Is it more lightweight / faster / reliable?
[+] dsr_|2 years ago|reply
Your terms of service appear to be bogus:

> You acknowledge that the Website is a general-purpose search engine and tool. Specifically, but without limitation, the Website allows you to search multiple websites for music. Moreover, the Website is a general-purpose tool that allows you to convert audio files from videos and audio from elsewhere on the Internet. The Website may only be used in accordance with law. We do not encourage, condone, induce or allow any use of the Website that may be in violation of any law.

[+] kelvinzhang|2 years ago|reply
Nice catch, I copy/pasted that from a previous project and forgot to remove it - should be fixed now.

TOS is pretty general and just there for legal reasons, tl;dr feel free to use as you see fit as long as it's not for anything illegal.

[+] amadeuspagel|2 years ago|reply
Trying to log in with Google, I get this error:

> htmldocs.com has not completed the Google verification process. The app is currently being tested, and can only be accessed by developer-approved testers. If you think you should have access, contact the developer.

[+] leetrout|2 years ago|reply
I have been building a custom PDF generator for my Remarkable tablet using jsPDF. Something that has surprised me is how hard it is to keep the file size small with tools like this.

It's fun and fast to generate things, but when you get to 1000+ pages for something like a year-long planner, the document can quickly balloon past 70 MB.

So far I have kept my 750-1300 page planner between 3-7 MB.

I will give this a try and compare.

[+] dotancohen|2 years ago|reply
Please report back and let us know!
[+] QuiCasseRien|2 years ago|reply
@kelvinzhang

Quite frankly, htmldocs is exactly the project I've been looking for for months. I'm tired of Word and similar alternatives and wanted something where I can write HTML and CSS3 and convert it to PDF.

You do that, and in a beautiful way!

Some questions: I want to use your product, but I also need to be sure my docs will be available in the future. What's your plan?

- open source?

- community/enterprise?

- closed source, but with a Docker version to run on-premises?

Thanks for your answer, and by the way, very good work!

[+] kelvinzhang|2 years ago|reply
The long-term goal of this project is to make HTML/CSS the de facto way of typesetting PDF documents. I’ll definitely keep it running for as long as I can.

The web version is just the initial step, and I'll likely open source it for people who want to self-host and to increase adoption. For pricing, I'll probably adopt a model similar to Overleaf where it’s free for most users and maybe charge for team collaboration or have an enterprise license.

[+] matricaria|2 years ago|reply
How does this compare to Typst?[1]

What I like about Typst is that I can use it completely offline and with my editor of choice. Is this planned for htmldocs too?

[1] https://typst.app/

[+] kelvinzhang|2 years ago|reply
Offline isn’t a priority at the moment but can be done in the future by packaging with Electron.

Typst is great but there are a few key differences:

- full CSS support, allowing for more customizability and familiar styling without a learning curve

- less tailored towards academic use-cases and more towards personal/business use-cases like resumes, invoices, reports. Also adding a template gallery in the near future.

- has a templating engine and API baked in

- can use JS packages and ecosystem (ex. icon sets, fonts, etc.)

[+] tagyro|2 years ago|reply
Great project and great work!

I had a similar itch to scratch and I found quarto (https://quarto.org/) - free, open-source and doesn’t depend on chrome (admittedly it has other dependencies, but at least not chrome).

[+] RobKohr|2 years ago|reply
You use the word documents on the site, but you don't make it clear that it is for making pdfs. I thought that this was just an html/css webpage editor.

"Typeset and Generate PDFs with HTML/CSS"

really should be the H1 on that page for clarity.

[+] kelvinzhang|2 years ago|reply
This is a great suggestion and I feel dumb for not thinking of this earlier. Thanks!
[+] nsim|2 years ago|reply
I've used Flying Saucer PDF[1] for this in the past, but the missing piece always seems to be a decent WYSIWYG template editor, either open source or paid.

Any suggestions on a web solution that allows non-devs to make great templates would be appreciated.

Historically I've built something simple with Tiny and added a preview button to render, but that's super clunky.

[1] https://github.com/flyingsaucerproject/flyingsaucer

[+] mstijak|2 years ago|reply
My firm is currently in the process of creating a similar product that features a sophisticated visual page editor. While it's not officially launched yet, the product is about 95% complete. At the moment, we are on the lookout for early adopters.

For those interested in giving it a try, please contact us at support at cx-reports dot com. We are providing complimentary licenses to users who are eager to collaborate with us during this phase.

Screenshot: https://pasteboard.co/th5f4s0uVWJH.png

[+] joshmarinacci|2 years ago|reply
Oh wow. I started Flying Saucer 20 years ago. I’m amazed it’s still in use.
[+] kayo_20211030|2 years ago|reply
This might be great, and it might not. It's hard to tell, as a lot of the documentation is missing or has broken links. It would be really nice if the docs were more complete and didn't require a sign-up to kick the tires. Is it fundamentally better than, say, Apache FOP? Multi-page tables with overflow/keep-together control are hard. Does this do it better? I can't tell, as all the templates seem to be single page.
[+] jppittma|2 years ago|reply
Oh, is this like a paid version of pandoc?
[+] kelvinzhang|2 years ago|reply
You can probably get away with Pandoc depending on your use-case, this is primarily built for:

- having a web editor interface without needing to re-run a CLI command

- people who don't want to deal with packages/dependencies on their own system

- users that just want an affordable and easy-to-use API to generate PDF documents instead of having to set up their own server <> task queue <> worker system which would usually cost more

- in the future, teams that want to collaborate together on a document
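
For a sense of what the "server <> task queue <> worker" setup involves, here is a minimal in-process sketch using only the standard library. It is an illustration under stated assumptions, not how htmldocs is built: `render_pdf` is a hypothetical stand-in that returns placeholder bytes, and a real deployment would typically use a separate queue service and worker processes.

```python
import queue
import threading


def render_pdf(job: dict) -> bytes:
    """Stand-in for the real rendering step (hypothetical; returns placeholder bytes)."""
    return f"%PDF-placeholder for {job['template']}".encode()


def worker(jobs: queue.Queue, results: dict, lock: threading.Lock) -> None:
    """Pull jobs until a None sentinel arrives, storing output by job id."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            jobs.task_done()
            break
        pdf = render_pdf(job)
        with lock:
            results[job["id"]] = pdf
        jobs.task_done()


jobs: queue.Queue = queue.Queue()
results: dict = {}
lock = threading.Lock()
threads = [threading.Thread(target=worker, args=(jobs, results, lock)) for _ in range(2)]
for t in threads:
    t.start()

# The "server" side: enqueue one job per incoming request.
for i, template in enumerate(["invoice", "resume", "report"]):
    jobs.put({"id": i, "template": template})

# One sentinel per worker shuts the pool down cleanly.
for _ in threads:
    jobs.put(None)
for t in threads:
    t.join()
```

Even this toy version needs shutdown handling and shared-state locking, which hints at why a hosted API can be attractive.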

[+] amcaskill|2 years ago|reply
This looks awesome, congratulations.

I think you’re onto something here.

I work on an OSS reporting and analytics tool (https://github.com/evidence-dev/evidence) and the amount of time and effort that goes into a really good “print pdf”, and how valuable people find it, has been one of the more surprising parts of the project to me.

[+] cholmon|2 years ago|reply
What is the pricing like, e.g., per-template created, per-PDF rendered, or per-API call? I don't see any pricing info on the public site.
[+] kelvinzhang|2 years ago|reply
Pricing info is in a modal under the user dropdown, it's currently based on a monthly credit system. If there's another pricing model you'd prefer, let me know!
[+] heads|2 years ago|reply
It’s a good idea. You’ll want to support GitHub flavoured markdown as an input, if you don’t already. Be sure to allow an inline <style> so documents can still be customised using code.

I’d also like to have a mode — the default? — where all rendering was being done locally. There are documents I write that just can’t be stored on a third party computer.

Overleaf, to some extent, has shown the appeal for letting people focus on writing 100% of the time and running software 0% of the time. I’d prefer to use your tool though to write a quick letter to go in a parcel as a packing slip.

By the way: on mobile, your animated headline causes the page to jump up and down.