Show HN: Pdf2llmtools – I built an API to convert PDF files into text for LLMs

3 points | MadMatt13 | 1 year ago | pdf2llmtools.com

Hello, I'm Matt, a 30yo software developer. I've been building several little POCs over the past few months, playing with generative AI and particularly LLM APIs such as the ones from Claude or OpenAI.

I was frustrated that none of them support PDF file inputs out of the box, especially since LLMs are so good at working from PDF files when used in their web portals.

I thought I would easily be able to convert PDF files in my current apps, but I realized that it was actually a pain to do any kind of PDF processing with the tech stacks I'm used to.

After struggling for a bit and almost giving up on PDF features in my other projects, I managed to build a PDF to HTML API around pdf2htmlEX, wrapped in an Elixir app. I figured I'm probably not alone in wanting to send PDFs to OpenAI or any LLM without the headache.

So I'm making this available for anyone to play with, and I'm thinking of adding some cool features to make it more LLM friendly and not just do a dumb HTML conversion.
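
To give a rough idea of the pipeline, here is a minimal sketch in Python (not the actual Elixir service code). It assumes the `pdf2htmlEX` binary is installed and on the PATH, and it uses only the standard library to strip the generated HTML down to plain text. The function names `build_command`, `html_to_text`, and `pdf_to_text` are made up for this example.

```python
import subprocess
from html.parser import HTMLParser
from pathlib import Path


def build_command(pdf_path, dest_dir):
    """Build the pdf2htmlEX command line (assumes the binary is on PATH)."""
    out_name = Path(pdf_path).stem + ".html"
    # Usage: pdf2htmlEX [options] <input.pdf> [<output.html>]
    return ["pdf2htmlEX", "--dest-dir", dest_dir, pdf_path, out_name]


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <style> and <script> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside style/script blocks.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML document, one fragment per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def pdf_to_text(pdf_path, dest_dir="out"):
    """Convert a PDF to HTML with pdf2htmlEX, then reduce it to plain text."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_command(pdf_path, dest_dir), check=True)
    html_file = Path(dest_dir) / (Path(pdf_path).stem + ".html")
    return html_to_text(html_file.read_text(encoding="utf-8"))
```

Calling `pdf_to_text("report.pdf")` would return text you can paste into a prompt. Since pdf2htmlEX positions text in many small elements, the output comes back as short fragments, so a real LLM-friendly version would probably need to merge lines and detect headings or tables.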

I would appreciate any constructive feedback.

Cheers

2 comments

noashavit|1 year ago

I also see white text on your site (and my machine defaults to dark mode).

Agone|1 year ago

I don't know if it's a bug, but the design is all white, even the titles.