(no title)

Royce-CMR | 3 months ago

Super noob in vector embeddings here: I never considered that tables would be a complicating factor (beyond defining them in a parseable format for ingestion).

Do vector databases do better with long grouped text vs table formats?

Oras|3 months ago

The issue is the ingestion (extracting the right data in the right format). This is mainly a problem with PDFs, and sometimes with DOCX files too when tables are embedded as images. You need a mix of text extraction and OCR to get the data out correctly before you start chunking and generating embeddings.
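To make the "mix of text and OCR" idea concrete, here is a rough sketch of the fallback logic. Everything in it is an assumption for illustration: `extract_page`, `chunk`, the `min_chars` threshold, and the two extractor callables are hypothetical names, and the real extractors would come from whatever PDF and OCR libraries you use. It only shows the shape of the pipeline: try the text layer first, fall back to OCR when a page has little or no text (e.g. a table pasted in as an image), then chunk the result before embedding.

```python
from typing import Any, Callable, List

# Hypothetical type: any function that turns a page object into a string.
Extractor = Callable[[Any], str]


def extract_page(page: Any, text_extractor: Extractor,
                 ocr_extractor: Extractor, min_chars: int = 20) -> str:
    """Prefer the embedded text layer; fall back to OCR if it looks empty.

    min_chars is an arbitrary illustrative threshold, not a recommended value.
    """
    text = text_extractor(page).strip()
    if len(text) >= min_chars:
        return text
    # Little or no text layer: the page is probably an image (scan, table screenshot).
    return ocr_extractor(page).strip()


def chunk(text: str, max_chars: int = 200) -> List[str]:
    """Group paragraphs into chunks of at most roughly max_chars characters.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for p in paragraphs:
        # +2 accounts for the blank line used to join paragraphs.
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be passed to your embedding model. Table-aware extraction (keeping rows and headers together) would slot in as another extractor, but that part depends heavily on the tooling.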