Ask HN: Extracting Knowledge Graphs from LLMs
7 points | lagrange77 | 2 years ago
Now I thought about automating this and programmatically 'scanning' an LLM's implicit knowledge (of a specific domain) and compiling it into some kind of knowledge graph, e.g. an entity relationship diagram of physics concepts.
Could be interesting. With the right scanning technique, it may be possible to extract a semantic representation of all of an LLM's 'knowledge', or of the information in a document.
Have any of you already dealt with something like this?
simonmesmith | 2 years ago
For example, if you ask an LLM to create a graph of all proteins related to X disease, and show how they interact, it will oblige. (You can try this yourself easily in the OpenAI playground. Just ask it to send you back a list like X -> Y -> Z or whatever. Or an array of source/target/relation triplets.)
The challenge is that what you get will be very dependent on how you phrase your request. So you’ll never know if you’re getting a “complete” graph or just the most probable graph for the request you made. If you’re an expert in the domain, you’ll know, but if you’re an expert you might not need the graph in the first place.
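The triplet approach described above can be sketched without committing to any particular API: ask the model for `source -> relation -> target` lines, then parse them into edges. A minimal sketch (the model reply below is made up for illustration; only the parsing step is concrete):

```python
def parse_triplets(text):
    """Parse lines like 'TP53 -> regulates -> MDM2' into
    (source, relation, target) tuples; ignore malformed lines."""
    triplets = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("->")]
        if len(parts) == 3 and all(parts):
            triplets.append(tuple(parts))
    return triplets

# Hypothetical model output, for illustration only:
reply = """\
TP53 -> regulates -> MDM2
MDM2 -> degrades -> TP53
(some prose the model added that is not a triplet)
"""
edges = parse_triplets(reply)
# edges == [('TP53', 'regulates', 'MDM2'), ('MDM2', 'degrades', 'TP53')]
```

Robust parsing matters exactly because of the phrasing-dependence mentioned above: the model will sometimes wrap the list in prose, so malformed lines are silently skipped here.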
james-revisoai | 2 years ago
LLMs are very good at dealing with contextual polysemy; the catch is that an embedding of, say, a topic will be far from that topic in its different possible contexts. So a knowledge graph would be possible, but how you would find these possible areas, or why you would constrain it, is another question.
Now if you are just asking about education, you can get it to generate lists of relations and map those into concept maps etc. (quite a few tools do this), but that's pretty superficial, as such...
FWIW, back when LLMs were more prone to hallucination, I worked on a quiz-generating application in 2020/21 doing this: generations whose embeddings fell inside the convex hull of the embeddings of known ground-truth statements were more likely to be truthful and relevant than those outside of it.
In my opinion though you should try to embrace this malleable nature rather than constrain it...
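The convex-hull filter described above can be illustrated in 2D (real embedding spaces are high-dimensional, where you would test hull membership with a small linear program instead; the points below are made up). A sketch using Andrew's monotone chain plus a point-in-convex-polygon test:

```python
def cross(o, a, b):
    """2D cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_hull(p, hull):
    """True if p lies inside or on the hull (CCW vertex list)."""
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], p) >= 0 for i in range(n))

# Made-up 2D 'embeddings' of ground-truth statements:
truth = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
hull = convex_hull(truth)
in_hull((0.5, 0.5), hull)  # inside the hull: keep the generation
in_hull((2.0, 2.0), hull)  # outside the hull: treat as suspect
```

The idea transfers directly: embed the candidate generation with the same encoder as the ground-truth statements, and use hull membership (or distance to the hull) as a cheap plausibility signal.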
lagrange77 | 2 years ago
> Now if you are just asking about education, you can get it to generate lists of relations and map those into concept maps etc. (quite a few tools do this), but that's pretty superficial, as such...
Right, this would use the LLM's inherent tendency to produce the most probable output for the prompt, and hence could hide or hallucinate info.
What I am after is a more systematic and reliable approach to 'scrape' the model's knowledge, without relying on its best guess for a broad prompt like 'Compile a concept map of classical mechanics.'
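One more systematic variant (a sketch, not a solution to the completeness problem): seed the scan with a few concepts, ask narrowly about each concept's direct relations, and expand breadth-first, so coverage comes from many small prompts rather than one broad one. `ask_relations` here is a hypothetical stand-in for an actual LLM call:

```python
from collections import deque

def scan_graph(seeds, ask_relations, max_nodes=100):
    """Breadth-first expansion: ask_relations(concept) should return
    (relation, target) pairs; newly seen targets are queued in turn."""
    edges, seen = [], set(seeds)
    queue = deque(seeds)
    while queue and len(seen) < max_nodes:
        concept = queue.popleft()
        for relation, target in ask_relations(concept):
            edges.append((concept, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return edges

# Hypothetical stand-in for the LLM, for illustration only:
fake_llm = {
    "force": [("equals", "mass times acceleration")],
    "mass times acceleration": [],
}
edges = scan_graph(["force"], lambda c: fake_llm.get(c, []))
# edges == [('force', 'equals', 'mass times acceleration')]
```

This doesn't make the output any less probabilistic per prompt, but it does make the scan reproducible and bounded, and repeated runs can be merged to estimate which edges are stable.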
ash-ishh | 2 years ago
Sample: https://twitter.com/yoheinakajima/status/1706848028014068118
lagrange77 | 2 years ago
Spires: Building structured knowledge bases from unstructured text using LLMs
[https://news.ycombinator.com/item?id=37929351]
birdplanellama | 2 years ago