
RAG Using Structured Data: Overview and Important Questions

5 points | semihsalihoglu | 2 years ago | kuzudb.com

4 comments

semihsalihoglu|2 years ago

This is the first post in a series I plan to write on the role that graph DBMSs and knowledge graphs play in LLM applications, and on the recent text-to-high-level-query-language work I read up on over the holiday season.

These posts have two goals:

(i) give an overview of what I learned as an outsider looking for technical depth; (ii) discuss some avenues of work that I ran into and that looked important.

This first post is on "Retrieval Augmented Generation using structured data", i.e., private records stored in relational or graph DBMSs. The post is long and full of links to some of the important material I read (given my academic background, many of these are papers), but it should be an easy read, especially if you are an outsider intimidated by this fast-moving space.

tl;dr for this post:

- I provide an overview of RAG.
- Compared to pre-LLM work, the simplicity and effectiveness of developing a natural language interface over your database using LLMs is impressive.
- There is little work that studies LLMs' ability to generate Cypher or SPARQL. I also hope to see more work on nested, recursive and union-of-join queries.
- Everyone is studying how to prompt LLMs so they generate correct DBMS queries. Here, I hope to see work studying the effects of data modeling (normalization, views, graph modeling) on the accuracy of LLM-generated queries.

Hope some find this interesting.

emmanueloga_|2 years ago

So this post is about using RAG/LLM to generate queries (Cypher in this case, to be consumed by Kuzu). That way you could ask natural-language questions to be answered by the result of the query.
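To make that loop concrete, here is a minimal sketch of it in Python. Everything in it is an assumption for illustration and not taken from the blog post: the schema text, the `build_prompt` and `answer_question` helpers, and the two stand-in functions. The LLM and the database are passed in as plain callables, so no real model or Kuzu API is invoked here.

```python
from typing import Callable, List, Tuple

# Hypothetical schema description (an assumption, not from the post).
SCHEMA = (
    "Node tables: Customer(name STRING), Product(name STRING)\n"
    "Relationship tables: Bought(FROM Customer TO Product)"
)


def build_prompt(schema: str, question: str) -> str:
    """Combine the database schema and the user's question into one prompt."""
    return (
        "Translate the question into a single Cypher query.\n"
        f"Schema:\n{schema}\n"
        f"Question: {question}\n"
        "Cypher:"
    )


def answer_question(
    question: str,
    llm: Callable[[str], str],
    run_query: Callable[[str], List[list]],
    schema: str = SCHEMA,
) -> Tuple[str, List[list]]:
    """Ask the LLM for a query, run it, and return both the query and its rows."""
    cypher = llm(build_prompt(schema, question)).strip()
    rows = run_query(cypher)
    return cypher, rows


# Stand-in for a real LLM call: always returns one canned query.
def fake_llm(prompt: str) -> str:
    return (
        "MATCH (c:Customer)-[:Bought]->(p:Product) "
        "WHERE c.name = 'Alice' RETURN p.name"
    )


# Stand-in for a real DBMS client: always returns canned rows.
def fake_run_query(cypher: str) -> List[list]:
    return [["Laptop"], ["Phone"]]


if __name__ == "__main__":
    query, result = answer_question("What did Alice buy?", fake_llm, fake_run_query)
    print(query)
    print(result)
```

In a real setup, `llm` would wrap a model API call and `run_query` would wrap the DBMS client. The rows could also be fed back to the LLM to phrase a natural-language answer.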

I wonder if you could comment about other areas of AI+Graphs (I think this is mostly Graph Neural Networks, not sure if anything else?).

For instance, I found PyG and Deep Graph Library, but the use cases are so jargon-heavy [1], [2] that I'm not sure about the real-world applications, in layman's terms.

--

1: https://pytorch-geometric.readthedocs.io/en/latest/tutorial/...

2: https://docs.dgl.ai/tutorials/blitz/index.html