You seem to have done some research already, but let me answer briefly: GNNs and what I covered in the blog post, "RAG over structured data", are not connected. They are approaches to two different problems.
GNNs: Let's forget about LLMs completely. GNN is the term for a specific class of ML models whose layers follow a graph structure. Suppose you have data representing real-world entities, and for each entity you have features, i.e., a vector of floating-point numbers describing its properties. Now suppose you want to run a predictive task on these entities: say your entities are customers and products, and you want to predict who might buy a product so you can recommend products to customers. There is a whole suite of ML tools you can use if you can represent your entities as vectors, e.g., you can use distances between these vector representations as an indication of closeness/similarity, and recommend to a customer A the products bought by customers whose vectors are close to A's. This is what embedding these entities in a vector space means.
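To make the "close vectors → similar customers" idea concrete, here is a minimal sketch with made-up customer embeddings and purchase histories (all names and numbers are hypothetical, just to illustrate cosine-similarity-based recommendation):

```python
import numpy as np

# Hypothetical embeddings: each customer is a point in a shared vector space.
customers = {
    "A": np.array([0.9, 0.1, 0.0]),
    "B": np.array([0.8, 0.2, 0.1]),  # close to A
    "C": np.array([0.0, 0.1, 0.9]),  # far from A
}
purchases = {"B": ["socks"], "C": ["kayak"]}  # toy purchase history

def cosine(u, v):
    # Cosine similarity: 1.0 for identical directions, ~0 for unrelated ones.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def recommend(target, k=1):
    # Rank the other customers by similarity to the target's embedding,
    # then suggest what the k closest ones bought.
    others = sorted(
        (c for c in customers if c != target),
        key=lambda c: cosine(customers[target], customers[c]),
        reverse=True,
    )
    recs = []
    for c in others[:k]:
        recs.extend(purchases.get(c, []))
    return recs

print(recommend("A"))  # B's vector is nearest to A's, so A gets B's purchases
```

The same nearest-neighbour logic works regardless of where the embeddings came from; the interesting question is how to produce embeddings that put genuinely similar entities close together.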
One way to embed your entities is to run them through an ML model that takes their features as input and produces another set of vectors. (You could use the raw features themselves as embeddings, but they are not trained and often have a higher dimension than the embedded vectors.) GNNs are a specific family of such models where the entities and the relationships between them are modeled as a graph, and the model's architecture, i.e., the operations it performs on the feature vectors, depends on the structure of this graph. In short, GNNs are not deeply connected to LLMs.
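Here is a rough sketch of what "the architecture depends on the graph" means: one message-passing layer where each node averages its neighbours' feature vectors before a shared linear map. The graph, features, and weights below are all invented for illustration (and the weights are random, not trained):

```python
import numpy as np

# Toy graph as an adjacency list over 4 nodes, each with a 3-dim feature vector.
adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
X = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))  # shared weight matrix (random here, not trained)

def gnn_layer(X, adj, W):
    # One message-passing step: average each node's own features with its
    # neighbours' features, then apply the shared linear map and a ReLU.
    H = np.zeros_like(X)
    for node, neighbours in adj.items():
        H[node] = X[[node] + neighbours].mean(axis=0)
    return np.maximum(H @ W, 0.0)

Z = gnn_layer(X, adj, W)
print(Z.shape)  # (4, 2): each node now has a 2-dim embedding
```

Note that which rows get averaged together is dictated entirely by the edges in `adj`; change the graph and you change the computation, which is the defining property of a GNN. Stacking several such layers lets information flow across multi-hop neighbourhoods.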
GNNs became very popular several years ago because they were the only ML architectures that let you incorporate into the model and training objective not just the features but also the connections between entities. They dominated academia until LLMs took over. In practice, I don't think they're as popular as they are in academia, but afaik several major companies, such as Pinterest, have based their recommendation engines on models with GNN architectures.
But one can imagine building applications that mix these technologies. You can use GNNs to create embeddings of KGs and then use those embeddings to extract information during retrieval in a RAG system. All these combinations are possible.
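The retrieval half of that combination is just nearest-neighbour search over the KG node embeddings. A minimal sketch, assuming you already have embeddings per KG entity (the entity names, vectors, and query below are all hypothetical):

```python
import numpy as np

# Hypothetical KG node embeddings (as a trained GNN might produce), per entity.
node_emb = {
    "Alan Turing": np.array([0.9, 0.1]),
    "Enigma":      np.array([0.8, 0.3]),
    "Mount Fuji":  np.array([0.1, 0.9]),
}

def retrieve(query_emb, k=2):
    # Rank KG nodes by cosine similarity to the query embedding; the
    # returned entity names would then be expanded into context text
    # and placed into the LLM prompt by the RAG system.
    scored = sorted(
        node_emb,
        key=lambda n: float(query_emb @ node_emb[n])
        / (np.linalg.norm(query_emb) * np.linalg.norm(node_emb[n])),
        reverse=True,
    )
    return scored[:k]

query = np.array([1.0, 0.2])  # pretend embedding of the user's question
print(retrieve(query))  # the two KG nodes closest to the query
```

In a real system the query would be embedded into the same space as the nodes (which is its own modeling problem), and the lookup would use a vector index rather than a sort, but the shape of the pipeline is the same.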