top | item 41450939

prasmuss15 | 1 year ago

Thanks for the follow-up and the in-depth example and explanation. Like you said, supporting ontologies is definitely a core use case of KGs, and there are also many standard preexisting taxonomies for different things (Google and Amazon both famously have taxonomies that try to cover everything, and there are many other specialized ones as well).

I don't think I was clear enough when I mentioned our plans to add custom schema. The way we are thinking of implementing this idea is by allowing end users to provide specific node types and edge types between those nodes. Then we can pass that information on to the LLM and instruct it to extract only nodes and edges that conform to the provided schema. We would also have methods to verify the output before adding it to the graph.

So in this scenario you could input something like: { NodeType: Person, EdgeTypes: [IS_PARENT_OF, IS_CHILD_OF] }
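To make the idea concrete, here is a minimal sketch of what schema-constrained extraction could look like. All names here are hypothetical illustrations, not graphiti's actual API: the schema maps a node type to its allowed edge types, and extracted edges that don't conform are dropped before being added to the graph.

```python
# Hypothetical sketch: verify LLM-extracted edges against a user-provided
# schema of node types and allowed edge types (not graphiti's real API).

ALLOWED_SCHEMA = {
    "Person": {"IS_PARENT_OF", "IS_CHILD_OF"},
}

def validate_edges(extracted_edges, schema=ALLOWED_SCHEMA):
    """Keep only edges whose source node type and edge type conform to the schema."""
    valid = []
    for edge in extracted_edges:
        allowed = schema.get(edge["source_type"], set())
        if edge["edge_type"] in allowed:
            valid.append(edge)
    return valid

edges = [
    {"source_type": "Person", "edge_type": "IS_PARENT_OF", "source": "Ann", "target": "Bob"},
    {"source_type": "Person", "edge_type": "WORKS_AT", "source": "Ann", "target": "Acme"},
]
validate_edges(edges)  # only the IS_PARENT_OF edge survives
```

The same schema dict could also be serialized into the LLM prompt so the model is instructed to extract only conforming nodes and edges, with this check acting as a final guard.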

Always extracting inverse relationships as well isn't something we've discussed yet, but I think it's a great idea. Happy to hear any other thoughts you have, or whether you think there's a flaw in our custom-schema approach to begin solving the issue you've raised.
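The inverse-relationship idea could be sketched like this, assuming users declare which edge types are inverses of each other (again, purely illustrative names, not an existing graphiti feature):

```python
# Hypothetical sketch: materialize the inverse of each extracted edge,
# given a user-declared mapping of inverse edge types.

INVERSES = {
    "IS_PARENT_OF": "IS_CHILD_OF",
    "IS_CHILD_OF": "IS_PARENT_OF",
}

def with_inverses(edges, inverses=INVERSES):
    """Return the original edges plus a reversed edge for every mapped type."""
    out = list(edges)
    for e in edges:
        inv = inverses.get(e["edge_type"])
        if inv:
            out.append({"edge_type": inv, "source": e["target"], "target": e["source"]})
    return out

edges = [{"edge_type": "IS_PARENT_OF", "source": "Ann", "target": "Bob"}]
with_inverses(edges)  # adds the IS_CHILD_OF edge from Bob to Ann
```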

Edit: I think part of what you are saying just clicked for me. I think you're suggesting that the graphiti team choose some open-source taxonomy (like Google's or Amazon's) as our core taxonomy, then fine-tune an LLM on that data and open-source it? Then users could choose that fine-tuned LLM and get consistent schema relationships across the board? I think that is a really cool idea, but probably not something we would be able to do in the foreseeable future. We don't want the graphiti open source project to be that opinionated, and we want to allow users to choose or fine-tune their own LLMs for their specific use cases.

mehh | 1 year ago

Yeah, but you’re kinda missing the point: there is an existing ecosystem of ontologies and technologies built on RDF, with no need to reinvent something likely not as well thought out.

prasmuss15 | 1 year ago

I'm not quite sure I follow. Today, graphiti extracts entities as nodes and facts between those nodes as edges. The nodes and edges store semantic data, like summaries of entities and facts representing the relationships between them (in addition to other metadata). Our searches are also based on this semantic data, and we aren't intending the extracted edge names to be used as filters, as we are not doing any taxonomical classification of nodes and edges.
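A toy illustration of what "searches based on semantic data" means: rank stored edges by similarity between the query and each edge's fact text, rather than filtering on edge-type labels. A real system would use learned embeddings; a bag-of-words Jaccard overlap stands in here, and all names are illustrative rather than graphiti's actual API.

```python
# Toy stand-in for semantic search over stored facts: score each edge's
# fact text against the query and return edges ranked by similarity.

def score(query, fact):
    """Jaccard word overlap as a crude proxy for embedding similarity."""
    q, f = set(query.lower().split()), set(fact.lower().split())
    return len(q & f) / (len(q | f) or 1)

def search(query, edges):
    return sorted(edges, key=lambda e: score(query, e["fact"]), reverse=True)

edges = [
    {"fact": "Ann is the parent of Bob"},
    {"fact": "Acme acquired Globex in 2019"},
]
search("who is the parent of Bob", edges)  # the parenthood fact ranks first
```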

In the near future, we intend to allow users of graphiti to input a custom schema (ontology), and we would use it to enforce classification of the extracted nodes and edges. In this case, we are unopinionated about which custom schema is provided. You would be able to use an ontology made in-house or one of the many open-source ones that exist in whatever field you are working in.

In neither case are we trying to create our own custom ontology or reinvent the wheel on how things are classified.