kmerroll | 1 year ago

Good article on the high-level concepts of a knowledge graph, but some concerning mischaracterizations of core functions of ontologies supporting the class schema and continued disparaging of competing standards-based (RDF triple-store) solutions. That the author omits the updates for property annotations using RDF* is probably not an accident and glosses over the issues with their proprietary clunky query language.

While knowledge graphs are useful in many ways, personally I wouldn't use Neo4J to build a knowledge graph as it doesn't really play to any of their strengths.

Also, I would rather stab myself with a fork than try to use Cypher to query a concept graph when better standards-based options are available.

CharlieDigital|1 year ago

> While knowledge graphs are useful in many ways, personally I wouldn't use Neo4J to build a knowledge graph as it doesn't really play to any of their strengths.

I'd strongly disagree. The built-in Graph Data Science package has a lot of nice graph algos that are easy to reach for when you need things like community detection.

The ability to "land and expand" efficiently (my term for how I think about KGs in Neo4j) is quite nice with Cypher. Retrieval performance with "land and expand" is, however, highly dependent on your initial processing to build the graph and how well you've teased out the relationships in the dataset.
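For anyone unfamiliar with the idea, here is a minimal sketch of "land and expand" in plain Python rather than Cypher. The toy graph, the `land` and `expand` helper names, and the substring match used to pick entry nodes are all assumptions made for illustration; a real setup would land via an index or vector search and expand with a Cypher pattern.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relationship, neighbor) pairs.
GRAPH = {
    "Neo4j": [("SUPPORTS", "Cypher"), ("SHIPS", "Graph Data Science")],
    "Cypher": [("INFLUENCED", "GQL")],
    "Graph Data Science": [("PROVIDES", "community detection")],
    "GQL": [],
    "community detection": [],
}


def land(graph, term):
    """Land: pick entry nodes whose name contains the search term."""
    return [node for node in graph if term.lower() in node.lower()]


def expand(graph, seeds, max_hops):
    """Expand: breadth-first walk outward from the seeds, up to max_hops."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    edges = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not follow relationships past the hop limit
        for rel, neighbor in graph.get(node, []):
            edges.append((node, rel, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return edges


seeds = land(GRAPH, "neo4j")
context = expand(GRAPH, seeds, max_hops=2)
for edge in context:
    print(edge)
```

The quality of what comes back depends entirely on which relationships exist in `GRAPH`, which is the point about up-front processing.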

> I would rather stab myself with a fork than try to use Cypher to query a concept graph when better standards-based options are available.

Cypher is closely related to the GQL standard, which grew out of Cypher itself and, subsequently, the openCypher working group: https://opencypher.org/

More info:

https://neo4j.com/blog/gql-international-standard/

https://neo4j.com/blog/cypher-gql-world/

enragedcacti|1 year ago

> That the author omits the updates for property annotations using RDF* is probably not an accident and glosses over the issues with their proprietary clunky query language.

Not just that, w.r.t. reification they gloss over the fact that neo4j has the opposite problem. Unlike RDF it is unable to cleanly represent multiple values for the same property and requires reification or clunky lists to fix it.
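To make the difference concrete, here is a rough sketch in plain Python, not tied to any real triple store or to Neo4j. The `ex:` identifiers, the email values, and the `HAS_EMAIL` relationship name are hypothetical, and the dictionaries only approximate each data model.

```python
# RDF-style: a graph is a set of (subject, predicate, object) triples,
# so repeating a predicate with a different object is just another triple.
triples = {
    ("ex:alice", "ex:email", "alice@example.org"),
    ("ex:alice", "ex:email", "alice@work.example"),
}


def objects(triples, subject, predicate):
    """Return every object recorded for a subject/predicate pair."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# Property-graph-style: a node's properties form a key -> value map,
# so assigning the same key twice keeps only the last value.
node = {"labels": ["Person"], "properties": {}}
node["properties"]["email"] = "alice@example.org"
node["properties"]["email"] = "alice@work.example"  # overwrites the first

# Workaround 1: store a list under the single key.
node_with_list = {
    "labels": ["Person"],
    "properties": {"email": ["alice@example.org", "alice@work.example"]},
}

# Workaround 2: promote each value to its own node plus a relationship
# (HAS_EMAIL is a made-up relationship type for this sketch).
email_edges = [
    ("alice", "HAS_EMAIL", "alice@example.org"),
    ("alice", "HAS_EMAIL", "alice@work.example"),
]

print(objects(triples, "ex:alice", "ex:email"))
print(node["properties"]["email"])
```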

CharlieDigital|1 year ago

> clunky lists

Not sure what the problem is here. The nodes and relationships are represented as JSON so it's fairly easy to work with them. They also come with a pretty extensive set of list functions[0] and operators[1].

Neo4j's UNWIND makes it relatively straightforward to manipulate the lists as well[2].
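As a rough illustration of what UNWIND does, here is a plain-Python emulation of its row-expanding behaviour. This is not Neo4j's implementation or driver API; the `unwind` function, the row dictionaries, and the column names are assumptions for the sketch.

```python
def unwind(rows, list_key, alias):
    """Emit one output row per list element, like UNWIND on a list column."""
    for row in rows:
        for item in row[list_key]:
            # Copy the original row and bind the current element to the alias.
            yield {**row, alias: item}


rows = [
    {"name": "alice", "emails": ["alice@example.org", "alice@work.example"]},
    {"name": "bob", "emails": []},  # an empty list produces no rows
]

expanded = list(unwind(rows, "emails", "email"))
for row in expanded:
    print(row["name"], row["email"])
```

Note that a row with an empty list disappears from the result, which matches how unwinding an empty list in Cypher reduces the row count to zero.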

I'm not super familiar with RDF triplestores, but what's nice about Neo4j is that it's easy enough to use as a generalized database, so you can store your knowledge graph right alongside your entities and use it as the primary/only database.

[0] https://neo4j.com/docs/cypher-manual/current/functions/list/

[1] https://neo4j.com/docs/cypher-manual/current/syntax/operator...

[2] https://neo4j.com/docs/cypher-manual/current/clauses/unwind/...

whakim|1 year ago

While I'm all for standards-based options, I think the fetishization does a disservice to anyone dipping their toes into graph databases for the first time. For someone with no prior experience, Cypher is everywhere and implements a ton of common graph algorithms which are huge pain points. AuraDB provides an enterprise-level fully-managed offering which is table stakes for, say, relational databases. Obviously the author has a bias, but one of the overarching philosophical differences between Neo4J and a Triple Store solution is that the former is more flexible; that plays out in their downplaying of ontologies (which are important for keeping data manageable but are also hard to decide and iterate on).

9dev|1 year ago

I can attest to that, or at least to the inverse situation. We have a giant data pile that would fit well onto a knowledge graph, and we have a lot of potential use cases for graph queries. But whenever I try to get started, I end up with a bunch of different technologies that seem so foreign to everything else we’re using, it’s really tough to get into. I can’t seem to wrap my head around SPARQL, Gremlin/TinkerPop has lots of documentation that never quite answers my questions, and the whole Neo4J ecosystem seems mostly a sales funnel for their paid offerings.

Do you by chance have any recommendations?

alexchantavy|1 year ago

I enjoy Cypher: it's like drawing ASCII art to describe the path you want to match on, and it gives you what you want. I was under the impression that, with things like openCypher, Cypher was becoming (if not already) the main standard for interacting with a graph database (but I could be out of date). What are the better standards-based options you're referring to?
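To show what the ASCII-art style buys you, here is a small Python sketch that imitates a two-hop pattern over an in-memory edge list. The edge data and the `match_two_hops` helper are hypothetical, the Cypher line in the comment is for reference only and is not executed, and the sketch ignores details such as Cypher not traversing the same relationship twice within one pattern.

```python
# The Cypher pattern this sketch imitates (reference only, not executed):
#   MATCH (a)-[:KNOWS]->(b)-[:KNOWS]->(c) RETURN a, b, c
EDGES = [
    ("alice", "KNOWS", "bob"),
    ("bob", "KNOWS", "carol"),
    ("bob", "WORKS_WITH", "dave"),
]


def match_two_hops(edges, rel_type):
    """Find every a -> b -> c chain where both hops have the given type."""
    typed = [(s, t) for s, r, t in edges if r == rel_type]
    return [(a, b, c) for a, b in typed for b2, c in typed if b == b2]


paths = match_two_hops(EDGES, "KNOWS")
print(paths)
```

The one-line pattern replaces the explicit filter and self-join that the Python version has to spell out.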

westurner|1 year ago

W3C SPARQL; SPARUL, which is now SPARQL 1.1 Update; SPARQL-star; and GQL.

GraphQL is a JSON HTTP API schema (2015): https://en.wikipedia.org/wiki/GraphQL

GQL (2024): https://en.wikipedia.org/wiki/Graph_Query_Language

W3C RDF-star and SPARQL-star (2023 editors' draft): https://w3c.github.io/rdf-star/cg-spec/editors_draft.html

SPARQL/Update implementations: https://en.wikipedia.org/wiki/SPARUL#SPARQL/Update_implement...

/? graphql sparql [ cypher gremlin ] site:github.com inurl:awesome https://www.google.com/search?q=graphql+sparql++site%253Agit...

But then you need data validation everywhere. For language-portable JSON-LD RDF validation, there are many implementations of JSON Schema for fixed-shape JSON-LD messages; there's the W3C Shapes Constraint Language (SHACL); and json-ld-schema combines JSON Schema and SHACL.

/? hnlog SHACL, inference, reasoning; https://news.ycombinator.com/item?id=38526588 https://westurner.github.io/hnlog/#comment-38526588

andersonvaz|1 year ago

Do you mind mentioning some of the options available that you consider better than Cypher?

jsemrau|1 year ago

>better standards-based options are available.

Which ones would you recommend?