

jeromechoo | 1 year ago

There are two paths to knowledge graph (KG) generation today, and both are problematic in their own ways: (1) natural language processing (NLP) and (2) LLMs.

NLP is fast but requires a model that is trained on an ontology that works with your data. Once you have one, it’s simply a matter of feeding the model your bazillion CSVs and PDFs.
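To make "an ontology that works with your data" concrete, here is a minimal sketch of what a fixed ontology buys you: every extracted triple can be checked against a known set of relations and entity types. The relation names, entity types, and the `is_valid_triple` helper are all made up for illustration and do not come from any particular NLP library.

```python
from dataclasses import dataclass

# Hypothetical toy ontology: relation name -> (subject type, object type).
ONTOLOGY = {
    "works_for": ("Person", "Company"),
    "located_in": ("Company", "City"),
}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str


def is_valid_triple(subj: Entity, relation: str, obj: Entity) -> bool:
    """Return True only if the relation exists and both entity types match it."""
    signature = ONTOLOGY.get(relation)
    if signature is None:
        return False  # relation is not part of the ontology at all
    return (subj.type, obj.type) == signature


ada = Entity("Ada", "Person")
acme = Entity("Acme", "Company")
print(is_valid_triple(ada, "works_for", acme))   # relation and types match
print(is_valid_triple(acme, "works_for", ada))   # types are reversed
```

The hard part is not this check but deciding what goes into the ontology in the first place.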

LLMs are slow but way easier to start with, as ontologies can be generated on the fly. This is a double-edged sword, however, as LLMs tend to lose fidelity and consistency on edge naming.
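As an illustration of the edge-naming problem, an LLM may label the same relationship "works for", "employed by", or "WORKS_AT" across different documents. One common mitigation is to map free-form labels onto a small canonical set after extraction. This is only a sketch under assumptions: the canonical names and alias lists below are invented, and a real pipeline would likely need fuzzy or embedding-based matching rather than exact alias lookup.

```python
import re
from typing import Optional

# Hypothetical canonical edge names with hand-written aliases.
CANONICAL_EDGES = {
    "works_for": {"works for", "employed by", "works at", "is employee of"},
    "located_in": {"located in", "based in", "headquartered in"},
}


def _normalize(label: str) -> str:
    """Lowercase and collapse punctuation, underscores, and spaces to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", label.lower()).strip()


# Reverse index: normalized alias (or canonical name) -> canonical name.
ALIAS_TO_CANONICAL = {
    _normalize(alias): canonical
    for canonical, aliases in CANONICAL_EDGES.items()
    for alias in aliases | {canonical}
}


def canonical_edge(label: str) -> Optional[str]:
    """Map a free-form edge label to its canonical name, or None if unknown."""
    return ALIAS_TO_CANONICAL.get(_normalize(label))


for raw in ["WORKS_AT", "Employed-By", "based in", "acquired"]:
    print(raw, "->", canonical_edge(raw))
```

Labels that come back as `None` are the interesting ones: each is either a new alias to add or a sign that the on-the-fly ontology has drifted.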

I work in NLP, which is the approach most used in practice because it’s far more consistent and explainable on very large corpora. But the difficulty of starting a fresh ontology dead-ends many projects.
