item 42045576

dmpetrov | 1 year ago
I guess it involves splitting a file into smaller document snippets, getting page numbers and such, and calculating embeddings for each snippet. That's the usual approach. Specific signals vary by use case.

Hopefully, @jerednel can add more details.
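For anyone who wants to see the split-and-embed approach dmpetrov describes in concrete terms, here is a rough sketch. It assumes the file has already been read into a list of page strings. `Snippet`, `toy_embedding`, and `split_pages` are made-up names for illustration, and the hashed bag-of-words vector is only a stand-in for whatever embedding model you would actually call.

```python
from __future__ import annotations

import hashlib
import math
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    page: int  # 1-based page number the snippet came from
    embedding: list[float]


def toy_embedding(text: str, dim: int = 64) -> list[float]:
    """Hashed bag-of-words vector, L2-normalized.

    Assumption: this is a placeholder for a real embedding model.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def split_pages(pages: list[str], max_words: int = 50) -> list[Snippet]:
    """Split each page into snippets of at most max_words words.

    The page number is kept as metadata next to each snippet's embedding.
    """
    snippets = []
    for page_number, page_text in enumerate(pages, start=1):
        words = page_text.split()
        for start in range(0, len(words), max_words):
            chunk = " ".join(words[start:start + max_words])
            snippets.append(Snippet(chunk, page_number, toy_embedding(chunk)))
    return snippets
```

Fixed-size word windows are the simplest possible splitter. Which signals to keep besides the page number depends on the use case, as the comment says.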
jerednel | 1 year ago
For HTML it's markup tags: h1's, page title, meta keywords, meta descriptions.

My retriever functions will typically use metadata in combination with the similarity search to impart some sort of influence or for reranking.
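To make jerednel's description a bit more concrete, here is a minimal sketch of pulling those HTML signals with the standard-library `html.parser` and letting them nudge a similarity score. This is not jerednel's actual retriever. `MetadataParser`, `extract_metadata`, `rerank`, and the `boost` weight are all hypothetical, and the similarity scores are assumed to come from a vector search done elsewhere.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the page title, h1 text, and meta keywords/description."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": "", "h1": [], "keywords": "", "description": ""}
        self._current = None  # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
            if tag == "h1":
                self.metadata["h1"].append("")
        elif tag == "meta" and attrs.get("name") in ("keywords", "description"):
            self.metadata[attrs["name"]] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.metadata["title"] += data
        elif self._current == "h1":
            self.metadata["h1"][-1] += data


def extract_metadata(html):
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return parser.metadata


def rerank(query, hits, boost=0.1):
    """Reorder hits given as (similarity, metadata) pairs.

    Assumption: each query term that appears in the metadata text adds
    `boost` to the similarity score. The weight is arbitrary.
    """
    terms = set(query.lower().split())

    def score(hit):
        similarity, meta = hit
        text = " ".join(
            [meta["title"], meta["keywords"], meta["description"], *meta["h1"]]
        ).lower()
        matches = sum(1 for term in terms if term in text)
        return similarity + boost * matches

    return sorted(hits, key=score, reverse=True)
```

A flat additive boost is just one way for metadata to influence the ranking. Filtering on metadata before the similarity search, or feeding it to a separate reranking model, would fit the same description.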