prasmuss15 | 1 year ago
Currently, we do have some ways of helping the graph understand what nodes and edges "really mean." In addition to the name of the relationship, our edges also store a "hydrated" version of the fact triple. For example, if Alice and Bob are siblings, you might see an edge with the name IS_SIBLING_OF between the two. The edge also stores the fact: "Alice is the sibling of Bob". This way, we store much of the semantic context on the nodes and edges themselves, in addition to the graph structure.
We also support ingesting structured JSON, and in those cases the edges will be exactly the properties in the JSON doc.
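As a rough illustration of the two ideas above (an edge that carries both a relationship name and a hydrated fact, and JSON properties becoming edges), here is a minimal sketch. The `Edge` class, the `edges_from_json` helper, and the wording of the generated fact string are all hypothetical and are not graphiti's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical edge: a relationship name plus a readable fact."""
    source: str
    name: str    # relationship label, e.g. IS_SIBLING_OF
    target: str
    fact: str    # "hydrated" natural-language version of the triple


def edges_from_json(entity: str, doc: dict) -> list[Edge]:
    """Turn each top-level property of a flat JSON doc into one edge.

    The property key is used unchanged as the edge name. The fact
    format below is an assumption made for this example only.
    """
    return [
        Edge(entity, key, str(value), f"{entity} {key}: {value}")
        for key, value in doc.items()
    ]


# Edge extracted from text: the name and the fact travel together.
sibling = Edge("Alice", "IS_SIBLING_OF", "Bob", "Alice is the sibling of Bob")

# Edges derived from structured JSON: one edge per property.
json_edges = edges_from_json("Alice", {"age": 30, "city": "Paris"})
```

Keeping the fact string on the edge means the semantic context survives even when the edge name alone is ambiguous.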
spothedog1|1 year ago
I'm looking for something like graphiti that can take in a text block and when creating the relationships, automatically know to use the `fam:` ontology when creating familial relationships. The vast majority of people don't feel like defining schemas for every little thing and they're basically the same across all systems except for custom proprietary ones you define as your IP.
Their ontology would have OWL rules like `fam:isChildOf` `owl:inverseOf` `fam:isParentOf`, so running an OWL reasoner over the graph would generate the inverse triples as well.
So if I had the text `Joe is Bob's dad` and input it into graphiti, I would get the triples
person:Joe fam:isParentOf person:Bob
person:Bob fam:isChildOf person:Joe
and the edge would come from a definition shared amongst all graphiti users. The LLM can be fine-tuned to recognize exactly what `fam:isParentOf` means, so there is no ambiguity. Right now I'm guessing graphiti could spit out edges like `IS_SIBLING_OF`, `SIBLING`, `SISTER`, `BROTHER`, etc. It's not standardized, which makes it difficult to interact with computationally if, say, I wanted to input a bunch of random text and then run pre-trained graph models of family networks.
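To make the request concrete, here is a minimal sketch of the two steps described above: mapping free-form edge labels onto a shared vocabulary, then generating inverse triples the way an `owl:inverseOf` rule would. It uses plain Python dictionaries instead of a real OWL reasoner. The synonym table and the `fam:isSiblingOf` predicate are assumptions made for illustration; only `fam:isParentOf` and `fam:isChildOf` come from the comment.

```python
Triple = tuple[str, str, str]

# Hypothetical synonym table: free-form labels -> canonical predicates.
# Collapsing SISTER/BROTHER into one predicate drops gender information,
# which is a simplification for this example.
CANONICAL = {
    "IS_PARENT_OF": "fam:isParentOf",
    "IS_CHILD_OF": "fam:isChildOf",
    "IS_SIBLING_OF": "fam:isSiblingOf",
    "SIBLING": "fam:isSiblingOf",
    "SISTER": "fam:isSiblingOf",
    "BROTHER": "fam:isSiblingOf",
}

# Stand-in for `fam:isChildOf owl:inverseOf fam:isParentOf`.
INVERSE_OF = {
    "fam:isParentOf": "fam:isChildOf",
    "fam:isChildOf": "fam:isParentOf",
}


def canonicalize(triples: list[Triple]) -> list[Triple]:
    """Replace known labels with canonical predicates; keep unknown ones."""
    return [(s, CANONICAL.get(p, p), o) for s, p, o in triples]


def add_inverses(triples: list[Triple]) -> list[Triple]:
    """Append the inverse of every triple whose predicate has one."""
    seen = set(triples)
    result = list(triples)
    for s, p, o in triples:
        inverse = INVERSE_OF.get(p)
        if inverse and (o, inverse, s) not in seen:
            seen.add((o, inverse, s))
            result.append((o, inverse, s))
    return result


extracted = [("person:Joe", "IS_PARENT_OF", "person:Bob")]
graph = add_inverses(canonicalize(extracted))
```

With the single extracted edge for `Joe is Bob's dad`, `graph` holds both the `fam:isParentOf` triple and the generated `fam:isChildOf` triple.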
prasmuss15|1 year ago
I don't think I was clear enough when I mentioned our plans to add custom schema. The way we are thinking of implementing this idea is by allowing end users to provide specific node types and edge types between those nodes. Then we can pass that information on to the LLM and instruct it to extract only nodes and edges that conform to the provided schema. We would also have methods to verify the output before adding it to the graph.
So in this scenario you could input something like: { NodeType: Person, EdgeTypes: [IS_PARENT_OF, IS_CHILD_OF] }
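The "verify the output before adding it to the graph" step could look roughly like the sketch below. It assumes a schema shaped like the example input above; the `Schema` and `ExtractedEdge` classes and the `conforms` function are hypothetical names, not part of graphiti.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schema:
    """Hypothetical user-provided schema: allowed node and edge types."""
    node_types: frozenset[str]
    edge_types: frozenset[str]


@dataclass(frozen=True)
class ExtractedEdge:
    """Hypothetical shape of one edge returned by the LLM."""
    source_type: str
    name: str
    target_type: str


def conforms(edge: ExtractedEdge, schema: Schema) -> bool:
    """True only if both endpoint types and the edge name are allowed."""
    return (
        edge.source_type in schema.node_types
        and edge.target_type in schema.node_types
        and edge.name in schema.edge_types
    )


schema = Schema(
    node_types=frozenset({"Person"}),
    edge_types=frozenset({"IS_PARENT_OF", "IS_CHILD_OF"}),
)

candidates = [
    ExtractedEdge("Person", "IS_PARENT_OF", "Person"),  # conforms
    ExtractedEdge("Person", "BROTHER", "Person"),       # unknown edge type
    ExtractedEdge("Person", "IS_CHILD_OF", "Company"),  # unknown node type
]

# Only edges that pass the check would be added to the graph.
accepted = [edge for edge in candidates if conforms(edge, schema)]
```

Rejecting non-conforming edges after extraction gives a safety net for cases where the LLM ignores the schema instruction.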
Always creating inverse relationships as well isn't something we've discussed yet, but I think it's a great idea. Happy to hear any other thoughts you have, or whether you see a flaw in our custom schema approach as a first step toward solving the issue you've raised.
Edit: I think part of what you are saying just clicked for me. I think you're suggesting that the graphiti team choose some open source taxonomy (like Google's or Amazon's) as our core taxonomy, then fine-tune an LLM on that data and open-source it? Then users could choose to use that fine-tuned LLM and get consistent schema relationships across the board? I think that is a really cool idea, but probably not something we would be able to do in the foreseeable future. We don't want the graphiti open source project to be that opinionated, and we want to allow users to choose or fine-tune their own LLMs for their specific use cases.
chatmasta|1 year ago
I’m not affiliated (in fact they launched around the same time that my co-founder and I launched Splitgraph with the same “Git for data” pitch), but I find their technology very intriguing.
Knowledge graphs are on the cusp of revival after being in stasis for 20 years. They’re a perfect match for LLMs and I’m excited to see how the field adopts them.
[0] https://terminusdb.com/