Interesting, I wanted to do this for a personal use case (mostly learning), but with PDFs. What's the tech stack? I have explored using the AWS AI tools, but they seem a bit overkill for what I want to do.
Tech stack is a mix of serverless Laravel, with Cloudflare and AWS functions, and some Pinecone for vector storage. Still experimenting on a few things but don't want to over-engineer unless I know where I'm going.
Given that Cloudflare spies on traffic and reports to multiple agencies on its findings, perhaps a breakdown of the chain and the privacy implications of each block in the stack would be beneficial?
kordlessagain|1 year ago
https://github.com/MittaAI/SlothAI/blob/main/SlothAI/lib/pro...
https://github.com/MittaAI/mitta-community/tree/main/service...
There's code in there that just reads PDF metadata as well, but you can't always guarantee the metadata is actually present in a given PDF.
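Since the metadata may be missing, a fallback chain is one way to cope. Here is a minimal sketch, not the code from the linked repos: `pick_title` is a hypothetical helper, and it assumes you already have the info dictionary (keyed like `"/Title"`) and the first page's text from whatever PDF library you use.

```python
from pathlib import Path


def pick_title(metadata, first_page_text, filename):
    """Return a best-effort title for a PDF (hypothetical helper).

    metadata: dict-like of PDF info entries (e.g. {"/Title": "..."}), or None.
    first_page_text: text extracted from page 1 (may be empty).
    filename: path or name of the file, used as a last resort.
    """
    # 1. Prefer the embedded metadata title, if it exists and is not blank.
    title = (metadata or {}).get("/Title")
    if isinstance(title, str) and title.strip():
        return title.strip()

    # 2. Otherwise use the first non-empty line of the first page.
    for line in first_page_text.splitlines():
        if line.strip():
            return line.strip()

    # 3. Last resort: the file name without its extension.
    return Path(filename).stem
```

For example, `pick_title(None, "", "docs/paper-2023.pdf")` falls all the way through to the file name and returns `"paper-2023"`.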
lou1306|1 year ago
tompec|1 year ago
stevenicr|1 year ago
oneshtein|1 year ago