
OpenPDF – A free Java library for creating and editing PDF files

213 points | roschdal | 7 years ago | github.com

48 comments

[+] WorkLifeBalance|7 years ago|reply
I was excited to hear about a PDF library not based on iText, but alas, this is just another iText fork. In fact, the full fork chain seems to be:

    LibrePDF/OpenPDF 
    forked from rtfarte/OpenPDF
    forked from kulatamicuda/iText-4.2.0
    forked from daviddurand/iText-4.2.0
    forked from ymasory/iText-4.2.0
[+] Waterluvian|7 years ago|reply
I had that same frustration this week. Trying to generate geotiffs from arrays in Python. I found four libraries, all of which just wrap GDAL, all of which have the same quirky issues. Alas. :)
[+] jahewson|7 years ago|reply
If you’re after an open source PDF library in Java, check out Apache PDFBox. It’s actively maintained and has a ton of features.

http://pdfbox.apache.org

[+] kovrik|7 years ago|reply
Would recommend!

Used PDFBox a couple of projects ago. It was great! Very nice API, works like a charm.

[+] dlandis|7 years ago|reply
Why is this being upvoted now? People have been forking iText for years, ever since the license was changed to AGPL. This repo doesn't even look very active and the last release was last year.
[+] pmarreck|7 years ago|reply
People just can't stand the idea of writing a Forth parser, I guess lol

https://www.prepressure.com/postscript/basics/history

[+] stevekemp|7 years ago|reply
FORTH is a wonderful language, I've written a couple of toy implementations. I just wish I could use it for something "real".
[+] Devagamster|7 years ago|reply
PDF is a bit more than a Forth implementation. There is a reason there are so few PDF parsers out there.
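For context on the Forth comparison above: PostScript is a stack-based, postfix language, and PDF content streams keep the same operands-then-operator order. A minimal Java sketch of that evaluation model might look like the following. The class name `PostfixSketch` and its `eval` method are made up for illustration, every number is treated as a double for simplicity, and only a handful of PostScript-style operator names (`add`, `sub`, `mul`, `dup`, `exch`) are handled. It is not a PostScript or PDF parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only: a tiny postfix (stack-based) evaluator.
// It is NOT a PostScript or PDF parser; all names here are assumptions.
class PostfixSketch {

    // Evaluates whitespace-separated tokens and returns the value
    // left on top of the operand stack.
    static double eval(String program) {
        Deque<Double> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            switch (token) {
                case "add":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "mul":
                    stack.push(stack.pop() * stack.pop());
                    break;
                case "sub": {
                    // Operand order matters: "a b sub" means a - b.
                    double b = stack.pop();
                    double a = stack.pop();
                    stack.push(a - b);
                    break;
                }
                case "dup":
                    // Duplicate the top operand.
                    stack.push(stack.peek());
                    break;
                case "exch": {
                    // Swap the two topmost operands.
                    double b = stack.pop();
                    double a = stack.pop();
                    stack.push(b);
                    stack.push(a);
                    break;
                }
                default:
                    // Anything that is not a known operator is treated as a number.
                    stack.push(Double.parseDouble(token));
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval("3 4 add 2 mul")); // prints 14.0
    }
}
```

This covers only the operand-stack idea. As the comment above says, a real PDF parser has to handle much more than postfix operators, such as the object and cross-reference structure of the file itself.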
[+] Lord_Nightmare|7 years ago|reply
Is the LGPL part of the license actually LGPL 2+, 2.1+ or 3.0+?

At least some of the file headers say 2+, the readme says 3.0+, but this note was added after the fork, with commit https://github.com/daviddurand/iText-4.2.0/commit/312abf7b12... so it may be in error?

If it was in error, does this affect any code or PRs submitted or merged after the readme was changed, or do the per-file license headers take precedence?

[+] ternaryoperator|7 years ago|reply
I have some experience with iText, so possibly I can answer. iText never released a version 4.2.0 for Java. The original forker misunderstood iText versioning and used the number of the C# version. At the time, iText versions for Java were 2.x, and C# versions were 4.x. The code in the forked version is actually the last release of the Java version of iText 2.x, which was the last Java release under the LGPL/Mozilla license.

Thankfully, iText later coordinated release numbers for both platforms to 5.x numbering when they changed the license to AGPL.

So, you should be safe using this version of iText under LGPL or Mozilla, but it's a fairly old release of the library. If you're just using iText for your hacking projects, you're probably better off going with a more recent version.

[+] banach|7 years ago|reply
I hope somebody makes an Eclipse plugin out of this, and integrates it into the TeXlipse LaTeX IDE. It is a sad state of affairs that this aging project (last release was in 2011) is still the most versatile TeX editing environment. For example, it is the only one that I am aware of, beyond command-line editors, that lets you open up an arbitrary number of views of the same file, which is often needed in large TeX projects. Its current PDF viewer (Pdf4Eclipse) is broken on Hi-DPI displays.
[+] waynenilsen|7 years ago|reply
LibreOffice Draw does a nice job at editing PDF files as well
[+] Scarbutt|7 years ago|reply
Can this convert an HTML/CSS document to PDF? Or is headless Chrome (ignoring overhead) the way to go these days?
[+] sheeshkebab|7 years ago|reply
To some extent, using com.lowagie.text.html.simpleparser.HTMLWorker. If you have really simple HTML and CSS, you can generate a PDF document from them, which works well for various reports.

Although if you are looking to convert real web pages to PDF, then this is not a good choice.

[+] sk5t|7 years ago|reply
Prince (commercial) is outstanding, and supports a wide array of issues specific to printed media.
[+] xvilka|7 years ago|reply
Does it support Unicode in PDF forms? That is a feature Poppler still lacks (in 2018!).
[+] justbaker|7 years ago|reply
How well does it handle merging PDFs? I’ve not found an efficient way to merge many small PDFs in PDFBox/iText without cranking up memory settings.
[+] ognarb|7 years ago|reply
What is so bad about the AGPL that it requires a fork?
[+] DannyB2|7 years ago|reply
Imagine this scenario. You want to write your own code ProductX and link it with some GPL code. This brings ProductX also under the GPL license.

You can put ProductX on a server and never actually distribute ProductX to anyone. By not distributing ProductX, you don't have to distribute the ProductX source code to anyone -- although technically it IS under the GPL. Since you're not distributing the binary, you don't have to distribute the source. Yet the public can interact with your server and make use of ProductX's services.

The AGPL prevents that. If you write your code ProductX and link it with AGPL code, then ProductX comes under the AGPL license. But now, merely letting the public interact with ProductX on a server requires you to distribute the AGPL source code to ProductX. Now anyone else has the ProductX source code and can compete with you.

The AGPL is also a way for the author of an AGPL library to make money. A library such as iText. If you want to use iText with a proprietary ProductX, then you need to buy a separate commercial license for iText.

If the developer's intent in ProductX is to keep the source code private, then the developer cannot link ProductX with any AGPL licensed code such as iText -- unless the developer is willing to pay for a commercial license to iText.

A developer should never use code that they didn't write themselves, such as a library, unless that library is under a license that the developer can always remain in full compliance with. That includes proprietary and commercial libraries as well as open source libraries. While the AGPL is "open source", if you cannot comply with its terms, then don't use it.

Finally, avoid any code you find on the internet that has "no license". If you use such code, the copyright owner of that code could sue you for copyright infringement. "But wait!" they say. "I would never sue anybody, I just want as many people as possible to use my code, so I don't put it under any license." I say: If you're not going to sue me, then put that promise in writing. It's called a license. If you're such a good guy and aren't going to sue me, then put it in writing like all other open source licensed code.

[+] adrianlmm|7 years ago|reply
I believe it would obligate the developer to release the server-side code that uses the library.
[+] ICush8ph|7 years ago|reply
The AGPL is an interesting idea in principle, but it is an ecosystem-level coordination problem if you apply it to libraries. Applications that include those libraries also need to be AGPL, which means any existing application that wants to use one must be re-licensed to AGPL and only use libraries that are AGPL-compatible. Nobody wants to be a first mover in that direction.

Plus it is untested in court just how far its provisions go.

[+] cntlzw|7 years ago|reply
Why the fork? ... oh LICENSE change
[+] kontzBern|7 years ago|reply
Took a glance at this one not so long ago. It's quite a nice tool for auto-generating PDF docs, yet not that good at editing. For that purpose I still use PDFfiller https://w9.pdffiller.com but it's a paid one and written in JS. Nevertheless, it's well-crafted enough to do all the work on visuals and overall layout.