
pierre | 1 year ago

The RAG CLI from llamaindex lets you run it 100% locally when used with ollama or llamacpp instead of OpenAI.

https://docs.llamaindex.ai/en/stable/getting_started/starter...
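For readers unfamiliar with what a RAG pipeline actually does under the hood, here is a minimal standard-library sketch of the retrieval step. This is an illustration only, not llamaindex's API: the `embed` function is a toy bag-of-words stand-in for the real embedding model a local setup would serve via ollama or llamacpp.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words term-frequency vector.
    # A real local pipeline would call an embedding model instead.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, top_k=1):
    # Rank document chunks by similarity to the query; the winners
    # get stuffed into the LLM prompt as context.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

chunks = [
    "Ollama serves local LLMs over a simple HTTP API.",
    "PDF parsing often breaks on multi-column layouts.",
    "Markdown is easy to split into chunks by heading.",
]
print(retrieve("how do I run a local LLM?", chunks)[0])
```

The only OpenAI-specific pieces in a hosted setup are the embedding and generation calls, which is why swapping in local model servers makes the whole loop run offline.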


nl | 1 year ago

Does the llamaindex PDF indexer correctly deal with multi-column PDFs? Most I've seen don't, and you get very odd results because of this.

rspoerri | 1 year ago

I've made quite good conversions from PDF to Markdown with https://github.com/VikParuchuri/marker . It's slow but worth a shot. Markdown should be easy for a RAG pipeline to parse.
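One reason Markdown output is convenient for RAG is that heading structure makes chunking trivial. A hypothetical standard-library sketch (not marker's or llamaindex's API) that splits a converted document into one chunk per heading section:

```python
import re

def chunk_markdown(md):
    # Split a Markdown document at each heading line, so every
    # chunk carries its own heading as context for retrieval.
    parts = re.split(r"(?m)^(?=#{1,6} )", md)
    return [p.strip() for p in parts if p.strip()]

doc = """# Intro
Some overview text.

## Setup
Install the tool.

## Usage
Run it on a PDF.
"""
for chunk in chunk_markdown(doc):
    print(chunk)
    print("---")
```

Multi-column PDFs that were converted badly lose exactly this structure, which is why the parsing step matters so much downstream.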

I'm trying to get a similar system set up on my computer.

pierre | 1 year ago

Locally you can choose pypdf or mupdf, which are good but not perfect. If you can send your data online, llamaparse is quite good.

ekianjo | 1 year ago

llamaindex has a horrible API, very poor docs, and is constantly changing. I do not recommend it.