Show HN: PDF API – Generate, convert, and modify PDF documents
204 points | arkgil | 4 years ago
Arek here. We’re super excited to officially launch PSPDFKit API [1].
PSPDFKit API is a collection of HTTP APIs that enable you to convert, generate, and edit documents without running any service on your infrastructure.
What differentiates our API from others is that you can chain together multiple “actions” as part of a single API request. For example, you can convert, OCR, watermark, edit, and flatten a document — all in one call.
Available actions [2]:
- PDF Generator
- PDF Converter
- Image Converter
- OCR
- Watermark
- Merge
- Split
- Duplicate
- Delete
- Flatten
Our documentation includes sample code for JavaScript [3], Python [4], Java [5], C# [6], PHP [7], and the command line. We also have a Postman collection [8].
Let us know what you think or if you have any questions.
[2] https://pspdfkit.com/api/documentation/tools-and-api/
[3] https://pspdfkit.com/api/tools/javascript/
[4] https://pspdfkit.com/api/tools/python/
[5] https://pspdfkit.com/api/tools/java/
[6] https://pspdfkit.com/api/tools/csharp/
[7] https://pspdfkit.com/api/tools/php/
[8] https://pspdfkit.com/api/documentation/getting-started/postm...
lgvld|4 years ago
Quite expensive though. When I use an API, I usually assume (1) there will be some significant base volume, and (2) this volume has no upper bound, depending on my users' behavior. For ~750 € you can only process 1k documents during the month... hard-capped? The pricing schemes seem to target enterprises, but enterprises usually have bigger volumes than that. (But maybe I'm confusing API calls with document processing in your product?)
But it’s nice you released SDKs in several common languages.
Good luck!
TheJoeMan|4 years ago
I’ve been looking for an easy OCR solution, considering I have about 10,000 one-page documents a month (invoices). For comparison, Amazon Textract is ~$0.05/pg for key-value pairs, but it involves more programming to set up.
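For a rough sense of scale, here is a back-of-the-envelope sketch that uses only the figures quoted in this thread (the "~" values are approximations, and the two prices are in different currencies, so they are not converted or compared directly). The function and constant names are made up for illustration.

```python
from decimal import Decimal

# Figures quoted in this thread (approximate); names are illustrative only.
TEXTRACT_USD_PER_PAGE = Decimal("0.05")  # "~$0.05/pg for key-value pairs"
PAGES_PER_MONTH = 10_000                 # "about 10,000 one-page documents a month"
PLAN_EUR = Decimal("750")                # "~750 €" plan mentioned upthread
PLAN_DOCUMENTS = 1_000                   # "1k documents during the month"


def monthly_cost(price_per_unit: Decimal, units: int) -> Decimal:
    """Total monthly cost for a per-unit price and a monthly volume."""
    return price_per_unit * units


def price_per_document(plan_price: Decimal, documents: int) -> Decimal:
    """Effective per-document price of a capped flat-rate plan."""
    return plan_price / documents


textract_monthly_usd = monthly_cost(TEXTRACT_USD_PER_PAGE, PAGES_PER_MONTH)
plan_eur_per_document = price_per_document(PLAN_EUR, PLAN_DOCUMENTS)

print(f"Per-page OCR at this volume: about ${textract_monthly_usd} per month")
print(f"Capped plan: about {plan_eur_per_document} € per document")
```

Decimal is used instead of float so the cents do not pick up binary rounding noise.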
fullyforged|4 years ago
At this point in time the price is per generated document, irrespective of how complicated the operation is.
Because you can combine operations in one HTTP call, you're incentivised to do that rather than performing separate calls, which increase the possibility of errors and the cost for all sides.
Happily taking feedback though. Your comment about the hard cap is definitely sound, for example.
victop|4 years ago
Source: I work at PDF Blocks
gyulai|4 years ago
fullyforged|4 years ago
chadash|4 years ago
Most organizations these days are developing in the cloud. I assume you don't mean "offline" but rather performing these functions yourself.
I've tried, but it's a huge pain in the butt. PDFs are very quirky. Things work well 95% of the time and the 5% takes a lot of time to figure out.
When trying to do this myself in an app deployed to AWS, I've had many issues with getting all characters in different languages to work. Every few months, some new thing in a PDF file throws an error and the file won't generate. You get weird file size errors. And the quality of PDF generation varies a lot by language. I'd much rather have an API that I can just call from any of my code if it JUST WORKS.
Now, their pricing is strange and might be a dealbreaker for me. I'd like to see an option to pay per transaction with no cap without having to negotiate with their sales team.
> there's a huge security/privacy headache there
eh, maybe. Some use cases don't require privacy. In my case, I'm mostly assembling PDFs from various sources with my company's documents. No, I don't want a vendor that is going to post my documents to twitter, but I can sleep at night if I have some kind of assurance that they don't use or sell my data.
arkgil|4 years ago
[1] https://pspdfkit.com/server/processor/
simion314|4 years ago
chrisseaton|4 years ago
unknown|4 years ago
[deleted]
matchagaucho|4 years ago
yyyk|4 years ago
So I asked myself: if this kit existed at the time, would we have used it? I don't think so. For all specified pricing plans, the document limit is way too low for what $COMPANY or its clients do. Judging by the progression of the costs, we'd have gone in-house instead of negotiating an Enterprise plan.
If you don't want to adjust pricing, perhaps you could add a consumption based plan. The plan could have a much larger limit, but the client also pays per API call.
arkgil|4 years ago
m12k|4 years ago
zubspace|4 years ago
Sometimes text is positioned absolutely, relative to the page border; sometimes it is positioned relative to other elements, where moving a word shifts all following elements around. There can be multiple matrices involved in positioning text elements. Sometimes text elements are all positioned independently, sometimes by using newlines with a custom size. Text elements can span multiple lines or words, but sometimes each letter is a single text element, where it is hard even to determine which letters go together or whether there's meant to be a space. Additionally, fonts can be subsetted, which makes it impossible to use letters outside the subset without knowing the original font. And then there can be OCR'ed PDFs, where an image of scanned text is overlaid on top of the real text. Oh, and there can be clipping paths: rectangles which erase all text below.
And each PDF producer creates a different PDF structure.
For reading, PDFs are awesome. For editing, PDFs are a nightmare.
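To make the "multiple matrices" point concrete, here is a minimal sketch of how a text position ends up on the page: PDF affine matrices are written as six numbers `[a b c d e f]`, and the text matrix is multiplied with the current transformation matrix (CTM) before anything is drawn. This is a simplification that assumes font size, horizontal scaling, and text rise are left out; the helper names and the sample values are made up for illustration.

```python
from typing import Tuple

# A PDF affine matrix [a b c d e f], standing for the 3x3 matrix
# [[a, b, 0], [c, d, 0], [e, f, 1]] applied to row vectors [x, y, 1].
Matrix = Tuple[float, float, float, float, float, float]


def multiply(m: Matrix, n: Matrix) -> Matrix:
    """Return m x n, i.e. apply m first and then n."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def apply(point: Tuple[float, float], m: Matrix) -> Tuple[float, float]:
    """Transform a point (x, y) by the matrix m."""
    x, y = point
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


# Illustrative values: the text matrix moves the text origin to (72, 700),
# while the surrounding CTM scales by 2 and shifts by (10, 20).
text_matrix: Matrix = (1, 0, 0, 1, 72, 700)
ctm: Matrix = (2, 0, 0, 2, 10, 20)

combined = multiply(text_matrix, ctm)
origin_on_page = apply((0, 0), combined)
print(combined, origin_on_page)
```

The same glyph can therefore land in a different place depending on matrices set far earlier in the content stream, which is part of why moving "one word" is rarely a local edit.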
laurent123456|4 years ago
danielrhodes|4 years ago
martin_a|4 years ago
* (becomes harder when the font is not embedded/existent as a subset, but Acrobat lets you choose another font, so no big deal.)
citruscomputing|4 years ago
voiper1|4 years ago
I had to use different OSS tools to do everything I wanted. I was able to access three from within Node.js without touching the disk:
1) LibreOffice CLI for converting doc/docx to PDF. It handled the formatting remarkably well. WARNING: you must have the fonts on the system doing the generating or it will substitute "similar" fonts! NPM: libreoffice-convert
2) NPM pdfjs-dist from Mozilla for extracting text and finding page numbers.
3) NPM pdf-lib for manipulating PDFs: deleting pages, adding pages from other PDF files (even to the middle of a PDF).
4) PDFjam on the command line for resizing a PDF: `pdfjam --keepinfo --outfile "${path}.resized.pdf" --paper letterpaper "${path}"`
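The pdfjam call in item 4 could be wrapped roughly as follows. This is a sketch that reuses only the flags shown in the command above; the function names are made up, and the arguments are passed as a list so that paths containing spaces need no shell quoting.

```python
import subprocess
from typing import List


def build_pdfjam_args(path: str) -> List[str]:
    """Build the argument list for the resize command quoted above."""
    return [
        "pdfjam",
        "--keepinfo",
        "--outfile", f"{path}.resized.pdf",
        "--paper", "letterpaper",
        path,
    ]


def resize_to_letter(path: str) -> str:
    """Run pdfjam (assumed to be installed and on PATH) and return the output path."""
    subprocess.run(build_pdfjam_args(path), check=True)
    return f"{path}.resized.pdf"
```

Calling `resize_to_letter("invoice.pdf")` would write `invoice.pdf.resized.pdf` next to the input, and `check=True` turns a non-zero pdfjam exit code into an exception instead of a silent failure.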
danielrhodes|4 years ago
ianhawes|4 years ago
I've connected with several other customers of PSPDFKit over the years and they almost all have much more reasonable pricing.
Beware!
ethotool|4 years ago
user_7832|4 years ago
upbeat_general|4 years ago
Their prices may be too high, so you declined to pay. I don’t see that as predatory.
unknown|4 years ago
[deleted]
etothepii|4 years ago
Have you written your own code from scratch?
arkgil|4 years ago
koinexpert|4 years ago
Consider perhaps adding one more call, one to remove passwords. Not brute force, although that would be cool too, but just one that lets you specify the password and makes the PDF no longer require a password.
lgvld|4 years ago
throw03172019|4 years ago
smashah|4 years ago
yashg|4 years ago
roschdal|4 years ago
zihotki|4 years ago
Long story short, instead of that we spent a few two-week sprints with a two-person team and were able to fulfill our needs using open source software. $Company saved hundreds of thousands per year. We also tried to persuade the company to donate to OSS; that unfortunately never happened, but that's another story.
So please be aware of vendor lock-in and of possible price increases. Always think of a plan B.
tonyedgecombe|4 years ago
I don't blame you for going down that route. But it feels to me that open source is devaluing our work. PDF is a big and complex specification; there must be thousands of hours of work in the software you chose, and yet you are getting all that value for free. Is there any other industry that does this to itself?
wfn|4 years ago
Question: do you have or plan to support PDF signatures? This may be then useful for us[1], we issue qualified certificates and eIDAS-compliant legal qualified electronic signatures which often need to then be embedded into PDFs.
[1]: https://www.zealid.com/en/
arkgil|4 years ago
brudgers|4 years ago
For my personal needs, I use pdftk from the command line.
vmchale|4 years ago
fullyforged|4 years ago
mdellavo|4 years ago
fullyforged|4 years ago
arkgil|4 years ago
SilentM68|4 years ago
fullyforged|4 years ago
We have some work planned in that direction, but nothing close to release at this stage.
passenger09|4 years ago
Good documentation you have there, though!
steerablesafe|4 years ago
fullyforged|4 years ago
somehowadev|4 years ago
kvz|4 years ago
https://transloadit.com offers similar composable workflows in a single request, and supports more file types besides PDFs.
Disclosure, I am a founder :)
sideproject|4 years ago
https://www.scholars.io
It's a tool for reading research papers (PDFs) together with your colleagues. You can read, annotate, comment etc.
Needless to say, it led me down quite deep into the PDF world and it.. was interesting.
wolverine876|4 years ago
As far as I know, PDF/A is the only format that fits the first two specs. I know annotations are in the PDF specs but is it reasonable to think that annotations I make today will be readable - and updatable - in (e.g.,) 30 years?
stevenminhhh|4 years ago
I am also dealing with some clients that are struggling with processing handwriting in their document, but I guess it will be a little far fetched.
fullyforged|4 years ago
At the moment we don’t include Japanese and Korean, but I’ll make a note of your questions.
Handwriting is definitely a different beast, that’s not supported.
BOOSTERHIDROGEN|4 years ago
renato_casutt|4 years ago
Maybe PSPDFKit is interested in integrating Bionic Reading into their products? Take a look at the website (bionic-reading.com) to see if BR can add value to your users.
Let me know if you are interested and best regards from the Swiss Alps, Renato
jwillmer|4 years ago
gw67|4 years ago
fullyforged|4 years ago
martin_a|4 years ago
fullyforged|4 years ago
whylo|4 years ago
kareemm|4 years ago
Or filling out pdfs programmatically (a docspring alternative)?
fullyforged|4 years ago
Filling out forms is not supported, but I'll take a note. The engine can do it, but we haven't got it exposed via the API.
amluto|4 years ago
But now apparently you’re supposed to use a web API and depend on an external service. This has all kinds of downsides: it has latency (and potentially tail latency). It has larger security issues. It doesn’t work in many sandboxes. It requires an asynchronous call. Callers have to handle timeouts and retries. (If you left pad a string with a normal library, it either works or it doesn’t. With a web service, it can fail transiently or give wrong answers transiently.) It updates on its own schedule, without notice, and cannot be rolled back. And it can charge an utterly outrageous per-call price, so instead of merely profiling and debugging slowness due to making too many calls, developers also have to worry about inadvertently spending hundreds of thousands of dollars.
Replace “left pad a string” with “generate a PDF” and you get this. Why is this desirable?
I suppose things like this may partially explain the stunning slowness of bank websites.
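To illustrate the "callers have to handle timeouts and retries" burden, here is a minimal sketch of the wrapper every caller of a remote service ends up writing, which a local library call never needs. It is not tied to any particular vendor: `TransientError`, `call_with_retries`, and the default values are all hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx responses)."""


def call_with_retries(
    request: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying transient failures with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Double the wait each time and add jitter so that many clients
            # do not all retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Note that even this only covers transient failures; it does nothing about wrong answers, unannounced behaviour changes, or the bill for the retried calls.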
danielrhodes|4 years ago
I used to work on a browser-based document management system, and I would have used (or at least tried) all of these APIs without hesitation. PDFs are a pain and the mishmash of poorly functioning tools that exist provides a constant headache.
1) OCR'ing of a PDF is difficult. The only good service is Google, but requires that you break it into pages as images to be performant. This would have simplified things greatly. Even if the PDF has text inside and is not an image, it can be wrong or not laid out in a linear way, so you have to OCR it. Command line tools do not get you very far. An example: if you OCR or text extract a PDF with multiple columns of text, does it handle the columns well?
2) People want searchable OCR'd PDFs where you can highlight the text, even when it's a bitmap underneath. This requires a technique where you overlay transparent text in the exact position of text in the bitmap. This does not come for free and I've only seen this done on proprietary Windows-only software. This alone would be worth it.
3) Office to PDF is an extremely standard need, especially if you want to display them online. But it's not easy. You have to hack together a headless OpenOffice to have it work at all, but it doesn't do a great job. It's difficult to do well because Office docs are like HTML pages in that it greatly depends on the renderer, not to mention the fonts. Microsoft does not offer a service to do this, unfortunately. If you think anything will do, it really won't: when people see their PDF looks very different than what they saw on Word, they get upset.
4) Table extraction APIs are super important, especially if you are trying to automatically extract data from PDFs (e.g. analyze financial disclosures). There have been whole startups dedicated to this.
5) HTML to PDF is also a pain: you have to set up an instance that is running headless Chromium, which can be quite slow. This has become the de facto standard for quickly creating complex PDFs. Having a simple API wrapper around this is just one less thing to manage.
The rest of the APIs, like the merging/splitting/watermarking etc., are pretty standard and you do not need APIs if you already have access to the PDF on a server. But if you were in a browser or on mobile, you might not.
ho_schi|4 years ago
newlisp|4 years ago
unknown|4 years ago
[deleted]
unknown|4 years ago
[deleted]
jfk13|4 years ago
Oh, looks like they're the exact same thing: a webpage-to-PDF service.
Then there are a whole bunch of "PDF Converter" options, including "HTML > PDF", which seems to be yet another name for the same thing.
For me, all this has a whiff of SEO spam that I find quite distasteful. Just tell me what the product does. Don't try to list it under a collection of different titles in the hope of catching more search terms, it just makes you sound like a snake-oil salesman.
arkgil|4 years ago
Of course the downside is as you've pointed out - it can be seen as distasteful and in some cases confusing to our users. We will review this on our side and see if it makes sense to remove some of those tools to reduce confusion.
unknown|4 years ago
[deleted]
TheRealNGenius|4 years ago
[deleted]