One thing I think is really lacking from the PDF ecosystem is good open-source tools around signing. PDF signatures are something that is legally important in a lot of the world with regulations like eIDAS. Unfortunately it's extremely difficult to cryptographically sign PDF documents with tools other than Adobe's and some other (often more sketchy) proprietary tools. Even if you figure out how to use stuff like LibreOffice or poppler to sign, you'll struggle to obtain certs that will validate without spending an arm and a leg.
I really hope that someone will decide to step in and become the Let's Encrypt of PDF and S/MIME certs, because that will improve public trust significantly.
You’re really right. I asked my IT guy, who’s a Windows Server wizard, what it would take to implement basic PKI for internal document signing, and he looked at me like I had two heads.
I host this at home. I don't use it myself, since I can use the Linux CLI tools that it's based on. But I prefer that my wife use this to convert, split, etc. her PDF files instead of some random website or app (that uses the same CLI tools anyway).
She doesn't mind either way. Seems to work well enough for her use cases.
> Originally developed entirely by ChatGPT, this locally hosted web application has evolved to encompass a comprehensive set of features, addressing all your PDF requirements.
I wonder what the point of that sentence is - to get picked up by HN? Kinda like how any product or service even tangentially related to data suddenly has 'by the way also AI and stuff' added somewhere on their landing page.
You don't see 'Developed using intellisense' used in READMEs.
I'd love it if this could be integrated into Paperless. Every now and then one of my scanned documents goes in upside down and I need to rescan. Clicking rotate, maybe reordering and letting it be rescanned would be great.
One of Firefox's more recent updates added a PDF editor.
I think people's perception of Firefox dates from several versions ago. As a daily user throughout its history, I've seen Firefox make a lot of progress over the years, IMO.
From the README: “Stirling PDF does not initiate any outbound calls for record-keeping or tracking purposes”. Beyond auditing the code, how could a potential user verify this claim in advance, and how can a web-based app help support such a claim (in particular when the app does need to make some web requests to operate, but only to a restricted list of URLs that might be listed in a manifest along the lines of a Content-Security-Policy for instance)?
This is a concrete problem when deploying apps that need the user to “upload” some sensitive content.
If you're self-hosting on Kubernetes, you can set up a network policy with a deny-all egress rule for this deployment/pod. This blocks all outbound network calls.
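A minimal sketch of such a policy, built as a Python dict and dumped to JSON (which `kubectl apply -f` accepts as well as YAML). The policy name, the `default` namespace, and the `app: stirling-pdf` label are assumptions for illustration; match them to your own deployment. Listing `Egress` in `policyTypes` with no egress rules denies all outbound traffic from the selected pods, DNS lookups included, and it only takes effect if the cluster's network plugin enforces NetworkPolicy.

```python
import json


def deny_all_egress_policy(name: str, namespace: str, pod_labels: dict) -> dict:
    """Build a NetworkPolicy manifest that blocks all egress from matching pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Which pods the policy applies to (hypothetical label).
            "podSelector": {"matchLabels": pod_labels},
            # Declaring Egress with an empty rule list means: allow nothing out.
            "policyTypes": ["Egress"],
            "egress": [],
        },
    }


policy = deny_all_egress_policy(
    "stirling-pdf-deny-egress", "default", {"app": "stirling-pdf"}
)
manifest_json = json.dumps(policy, indent=2)
print(manifest_json)
```

Saving the printed output to a file and applying it should be enough to test the idea; ingress to the pod is unaffected, so the web UI stays reachable.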
> in particular when the app does need to make some web requests to operate
A web app doesn't need to make outbound web requests to operate. The user interacting with the web app is the one initiating the requests.
You can provide access to the app through an HTTP proxy and filter out any outbound requests from the web app, or even leave network routing unconfigured on the server hosting the app. That leaves you with only the JS-initiated requests in the app's rendered pages.
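The filtering decision itself is simple. Here is a rough sketch of the allowlist check a filtering proxy might apply to each outbound URL; the `ALLOWED_HOSTS` set and the `is_allowed` helper are hypothetical names, not part of any particular proxy.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: only hosts the app is expected to talk to.
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}


def is_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only if the URL's host is explicitly allowlisted."""
    host = urlsplit(url).hostname  # None when the string has no host part
    return host is not None and host in allowed_hosts


for candidate in (
    "http://localhost:8080/merge",
    "https://tracker.example.com/pixel",
):
    print(candidate, "->", "allow" if is_allowed(candidate) else "block")
```

Denying by default and listing the exceptions means an unexpected tracking call gets blocked rather than quietly allowed.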
Just put a sniffer or network capture tool like Wireshark in between. Additionally, you could restrict the app's network access entirely to just your local home network.
I once used it on a PDF file containing a scanned text form, trying to increase the contrast because the scan was barely legible. What I needed was basically just to make the dark stuff darker (up to the point of making it black) and maybe a bit thicker to make it more visible.
The tool failed to help me with such a seemingly menial task; the improvement was very small. I even tried repeating the step multiple times, but after the 2nd pass or so there were no visual differences anymore (though the file's size actually kept changing).
That's not really a PDF thing. In that case the PDF is a thin wrapper around raster images. Extract the images, increase the contrast with gimp or imagemagick, and make a new PDF. You could script this if you had a lot of them.
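To show what "increase the contrast" means at the pixel level, here is a pure-Python sketch of a levels stretch on grayscale values. It is a stand-in for what an image editor does, not how gimp or imagemagick implement it; the `stretch_levels` name and the 80/200 cut-offs are assumptions to tune per scan.

```python
def stretch_levels(pixels, black_point=80, white_point=200):
    """Stretch grayscale values (0-255) between two cut-offs.

    Values at or below black_point become pure black, values at or above
    white_point become pure white, and everything between is scaled linearly.
    """
    span = white_point - black_point
    out = []
    for value in pixels:
        if value <= black_point:
            out.append(0)
        elif value >= white_point:
            out.append(255)
        else:
            out.append(round((value - black_point) * 255 / span))
    return out


# A faint scan: "dark" ink is only mid-grey, paper is off-white.
faint_scan = [60, 110, 120, 210, 250]
print(stretch_levels(faint_scan))
```

For real files you would run the equivalent adjustment over each extracted page image and then reassemble the PDF.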
At least it is open source. It never hurts to have something web-based, but I prefer an application. I use a commercial product for Linux: Master PDF Editor. It is good, but their copy protection sucks. Don't forget to "deregister" it before wiping your hard drive, otherwise your code won't work. But there is always customer service....
I generally agree, but I also very often need to quickly do something basic to a PDF from my phone (like split a 2-column layout to single-column so it's actually readable) or when doing something on someone else's computer and little web tools like this are great for that.
And yes, Master PDF Editor is an amazing piece of software! It makes creating PDF forms so easy that every time I get a PDF to fill out, I make it a form and send back an empty copy too so whoever sent it to me can use that instead. I've gotten a few smaller organisations to start using mine instead.
Can someone turn this mumbo jumbo of Docker, API endpoints, and local environments into working software that people can actually install and use on their PC/Mac/phone?
This seems too complicated for simple tasks like split, merge, and edit, not to mention the GBs of space that Docker and the dependencies will take.
The creator is in the comments, it wasn’t entirely developed with ChatGPT. The first spike in 24h was. The rest of the past year of development was human.
On the topic of PDFs, one thing I have always wondered is why there aren't any open-source PDF editors that are comparable to Adobe or Foxit. Does anyone know?
Probably for the same reason there aren't any open-source equivalents to Word, Excel, Outlook, AutoCAD, QuickBooks, or any other flagship productivity software.
I feel like every time I log onto this site I encounter a finished project that's very similar to something I'm working on. I'm too slow (or project hop too much).
Not that PDF related tools are uncommon but yeah I think people understand the sentiment.
I'm also very surprised that <redacted> for-profit companies in the document manipulation/signing/storage space still exist outside of niche industries (healthcare, government, law) that require audit trails and other regulatory specifics. I guess SEO still rules. If anyone wants to make some money, call up the biggest real estate firms in your area, ask them how much they spend on contract signing and related services (it's a lot), and offer to do it for half the amount. (I can sign 20-30k documents for <$100 a month, and it could probably be cheaper.) Your average real estate firm is paying $0.50-$1 a signature if it is uninformed, and there are a lot of uninformed firms.
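A back-of-the-envelope check of those numbers, assuming one signature per document (the comment doesn't say) and treating the $100 a month as an upper bound:

```python
monthly_cost = 100.0                      # "<$100 a month", used as an upper bound
docs_low, docs_high = 20_000, 30_000      # "20-30k documents"
market_low, market_high = 0.50, 1.00      # "$0.50-$1 a signature"

# Assumption: one signature per document.
self_cost_high = monthly_cost / docs_low   # cost per document at 20k documents
self_cost_low = monthly_cost / docs_high   # cost per document at 30k documents

print(f"self-hosted: ${self_cost_low:.4f} to ${self_cost_high:.4f} per document")
print(f"market rate is at least {market_low / self_cost_high:.0f}x that")
```

Even at the low end of the quoted market price, the gap is roughly two orders of magnitude, which is why the "half the amount" offer still leaves plenty of margin.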
Manipulation and signing is the easy part. Robust storage, retrieval and deletion aren't so easy.
Reputation often matters more than price in this area, because the pricey services amount to peanuts compared to the business as a whole. It's like doing price optimisation on toilet paper in the office. And reputation is generally interpreted as a proxy for reliability and a guarantee that there will be someone to sue if things go bad.
I’m not sure if this uses unoconv with LibreOffice, but my heart goes out to anyone trying to develop PDF/document manipulation tools with the current Python libraries. A thankless task.
The more you work on this stuff the more you hate proprietary formats as well as having to rely on open source repos operated at the whim of a few good people.
I would be careful with such wording, as one could easily come to the conclusion that this tool was developed by the ChatGPT team. Besides, this software certainly wasn't developed entirely by ChatGPT, which is technically not possible, but WITH the assistance of an AI tool.
On a related note, does anyone have a good solution for highlighting arbitrary spans of text in PDFs? Trying to make viewing search results easier, but most of the solutions I've found are pretty lousy.
I think you can hack together something in pdf.js, but you have to deal with the pain of digging through its code. I’m working on something of the sort in an application I’ve built, but it’s very much a “nice to have”.
If you’re talking about raw PDFs, then you are at the whim of the encoding, surely? I’ve always found Adobe etc. to have utterly crap searches.
Out of curiosity (and self interest [1]), what is your use case for digital signing and verification?
My understanding is that it’s more about trust (DocuSign being the leader) than anything else: one can provide certificate signing and verification, but the trust in the owner of the certificate is the crux of the matter.
I have some questions about the GitHub star history. It's very unusual to see a ~1 year old repo with 20k+ stars.
It went from 6k to 15k+ stars in a few days around Christmas 2023, when HN/GitHub/Reddit traffic is usually lowest, and I didn't see a corresponding social media post or announcement around that time that would explain that kind of traffic.
If I'm wrong and there is some big social media post / promo that I missed, I apologize, I'll eat my shorts!
noodlesUK|1 year ago
JumpCrisscross|1 year ago
You’ll be surprised how far you can go pasting a picture of your signature in Preview.
wfn|1 year ago
Two references which I promise will be interesting (re: qcerts and QES tooling):
- excellent open source library for working with PDFs and digital signatures (incl. PDF ones): https://github.com/MatthiasValvekens/pyHanko
- European Commission's DSS Tool (you can submit one PDF only, don't need both original and signed one): https://ec.europa.eu/digital-building-blocks/DSS/webapp-demo...
[1]: https://www.zealid.com/en/ - you can onboard remotely for free, download your qualified certs at https://my.zealid.com/en - upload, QES sign, download PDFs (all of these free) - or use our APIs to integrate into us (get in touch with us if you'd like the latter).
[2]: opinions are my own.
TheJoeMan|1 year ago
perfmode|1 year ago
maweki|1 year ago
smartmic|1 year ago
I am interested in this part. Here is what I found: https://pdfbox.apache.org/2.0/commandline.html
Since PDFBox is a Java application, it should work cross-platform, not just Linux. Please correct me if you mean something else.
nashashmi|1 year ago
anotherhue|1 year ago
Well that's that then.
frooodle|1 year ago
It was initially created as a 24-hour challenge to make a full app with ChatGPT 3.0 within a set time limit, to test what ChatGPT was like last year.
I posted it on Reddit, it got lots of demand, and I turned it into a full app. The only part that was fully ChatGPT was the first 24 hours; it's over a year later now.
jpnc|1 year ago
BonusPlay|1 year ago
1) don't expose it to public internet
2) don't give it untrusted input
Which highly reduces the usability factor for me.
arthurcolle|1 year ago
nolongerthere|1 year ago
petepete|1 year ago
moritzruth|1 year ago
Source: https://docs.paperless-ngx.com/changelog/#paperless-ngx-270
Odenwaelder|1 year ago
nashashmi|1 year ago
babox|1 year ago
tacocataco|1 year ago
Give it another shot if it's been a while.
emarsden|1 year ago
huygens6363|1 year ago
[1] https://www.obdev.at/products/littlesnitch/index.html
Edit: LS is macOS-oriented. I'm sure there are others, but I'm not into it. I feel it should be an OS-level feature, but who am I.
arcastroe|1 year ago
justsomehnguy|1 year ago
TheCapeGreek|1 year ago
Open source runs on a large amount of trust, and we're all complicit.
apexalpha|1 year ago
mstijak|1 year ago
CxReports: Self-hosted, web-based PDF reporting tool.
https://www.cx-reports.com
llagerlof|1 year ago
I really hope it's better now.
mathfailure|1 year ago
jacob019|1 year ago
junto|1 year ago
Beijinger|1 year ago
franga2000|1 year ago
bustedagain|1 year ago
Thank you
thrdbndndn|1 year ago
GlacierFox|1 year ago
cjblomqvist|1 year ago
Update: As is evident from the author's comment below, it's definitely not made by ChatGPT anymore (in any major way).
tyre|1 year ago
shiftingleft|1 year ago
Alifatisk|1 year ago
On Windows, okular. But honestly, pirating Acrobat is the best way if you have a tough economy.
kapildev|1 year ago
trueismywork|1 year ago
tiahura|1 year ago
TechDebtDevin|1 year ago
cess11|1 year ago
jiriro|1 year ago
Jackson_Fleck|1 year ago
ofrzeta|1 year ago
https://pdfsam.org/de/
https://github.com/torakiki/pdfsam
siva7|1 year ago
showerst|1 year ago
Jackson_Fleck|1 year ago
2Gkashmiri|1 year ago
Backend has Python and preferably agpl
haidev|1 year ago
frooodle|1 year ago
sidcool|1 year ago
exac|1 year ago
https://github.com/Stirling-Tools/Stirling-PDF/blob/7f577a60...
n3storm|1 year ago
nip|1 year ago
[1] I’m the developer behind SimplePDF.eu
exe34|1 year ago
brnt|1 year ago
nikisweeting|1 year ago
https://star-history.com/#Stirling-Tools/Stirling-PDF&Date
https://www.google.com/search?q=%22stirling%22+%22PDF%22&sca...
Froodle|1 year ago
frooodle|1 year ago