How NASA Is Using Graph Technology and LLMs to Build a People Knowledge Graph

[+] behnamoh|10 months ago|reply

> What made you choose memgraph? ... And then Memgraph showed me the cost. That kind of sold me for time for us to be able to do that.

It's an ad post about memgraph.

[+] ctxc|10 months ago|reply

Yes, the domain is Memgraph's, and it seems to be a marketing case study.

[+] timewizard|10 months ago|reply

> It’s [sql] just not built for the complex relationships that exist in a massive organizations like NASA.

This is an absurd claim.

> Extracted Skills from Team Resumes

> Extracted Skills

> Subject Matter Experts Finder Question: designed to identify employees with expertise in specific domains or mission-critical capabilities.

I can't think of anything that screams "incompetent management" more than this. So, to find a subject matter expert, you're going to "extract skills" and "extract resumes" to answer abstract questions about your staff... without ever once.. just _talking_ to your staff?

What a cold and bizarre future these people think we want to live in.

Meanwhile can we use technology to improve the level of connectivity I have and experience as an employee? Can you please stop asking LLMs to "extract" things about me into goofy automated pipelines? If you want skilled workers then you need to demonstrate skilled management. This is all the exact opposite of that.

[+] throwaway0123_5|10 months ago|reply

It doesn't seem like it is intended for a manager to use to get information about their direct reports, but rather org-wide information?

NASA apparently has ~18k employees, it seems like it might be useful to be able to query "Who at NASA has X, Y, Z skills that can help us with this project." Then you can speak to some of those people face-to-face. It won't be perfect but certainly sounds like a useful tool in principle.
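A minimal sketch of that kind of query, assuming the people-to-skills links are available as plain (person, skill) pairs. The names, skills, and the `people_with_skills` helper are hypothetical and not taken from the article:

```python
from collections import defaultdict

# Hypothetical toy data: (person, skill) edges, invented for illustration.
HAS_SKILL = [
    ("alice", "thermal analysis"),
    ("alice", "python"),
    ("bob", "python"),
    ("bob", "orbital mechanics"),
    ("carol", "python"),
    ("carol", "thermal analysis"),
    ("carol", "orbital mechanics"),
]


def people_with_skills(edges, required):
    """Return the people linked to every skill in `required`, sorted by name."""
    skills_by_person = defaultdict(set)
    for person, skill in edges:
        skills_by_person[person].add(skill)
    needed = set(required)
    # A person qualifies only if the required skills are a subset of theirs.
    return sorted(p for p, skills in skills_by_person.items() if needed <= skills)


print(people_with_skills(HAS_SKILL, ["python", "thermal analysis"]))
```

The output is only a shortlist of people to go talk to; it says nothing about how deep anyone's expertise actually is.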

[+] mbuda|10 months ago|reply

This is like saying: "Here we have a rocket, but let's keep trying to go to the moon by bike." xD

What's wrong with attempting to better understand a given organization using LLMs or any other tech? Ofc, great managers will try as hard as possible to talk face to face as much as possible.

[+] browningstreet|10 months ago|reply

Every large consulting org does this. It’s a market for enterprise and home-grown solutions.

[+] kendallgclark|10 months ago|reply

The use case at NASA isn’t even new. We built this precise thing in 2008. All standards-based.

See https://www.w3.org/2001/sw/sweo/public/UseCases/Nasa/ for a public case study.

This work led to Stardog.

Which is used extensively in NASA today:

https://gpdisonline.com/wp-content/uploads/2019/09/StardogNA...

https://www.informationweek.com/machine-learning-ai/stardog-...

[+] inerte|10 months ago|reply

I know it’s a marketing case study, but:

> Ever wondered how NASA identifies its top experts, forms high-performing teams, and plans for the skills of tomorrow?

Here’s another resource on that: https://appel.nasa.gov/2010/02/18/aa_2-7_f_nasa_teams-html/ (the book “How NASA Builds Teams: Mission Critical Soft Skills for Scientists, Engineers, and Project Teams”).

[+] jerryseff|10 months ago|reply

Memgraph is laughably expensive - I honestly wonder what anyone actually uses it for outside of companies that just don't care about infra spend.

[+] mbuda|10 months ago|reply

DISCLAIMER: The co-founder and CTO of Memgraph here.

To add more context, Memgraph Enterprise pricing is explained under https://memgraph.com/pricing: "Starting at $25,000 per year for 16 GB, Memgraph has an all-inclusive, simple pricing model that scales with your workload without restrictions. No charge for compute. No charge for replicas. No charge for algorithms. No Surprises.".

In addition, Memgraph Community is free (standard BSL license, which turns into Apache 2.0 four years after the release date, https://github.com/memgraph/memgraph/blob/master/licenses/BS...), and it has many features that are usually considered enterprise (users, replication, no degradation in performance or scale, etc.).

Please elaborate more about why the pricing seems expensive, or put it into the infra-cost perspective :pray:

[+] smarx007|10 months ago|reply

If you want a production-grade graph DBMS, you don't have that many OSS options that are reliable and well-supported.

In the relational space, it took OSS options like Postgres many decades (and somehow paid-for person-years) to get to a place where enterprises seriously consider migrating off Oracle to it.

[+] jandrewrogers|10 months ago|reply

> The current graph has about 27K nodes and 230K edges

That is tiny even by historical standards. I was expecting there to be some type of technology here. Why is this interesting?

[+] demaga|10 months ago|reply

> 27K nodes and 230K edges

This is such overkill for that kind of data. Even if they do plan to "scale up significantly", I doubt that they'll actually see any benefit from a graph DB.

[+] mmooss|10 months ago|reply

Why do you say that?

[+] smarx007|10 months ago|reply

> "To make sure everyone understands that, I prefer label property graphs over RDF."

I have two major issues with virtually all graph DBMSs that are not RDF/SPARQL-based:

1) They do not allow structure-preserving querying. That is, I query a graph and want the results to be a smaller graph. This is trivial in SQL: you just 'SELECT * FROM x WHERE ...' and the result set you get is tabular, just like the table x. In SPARQL, there are CONSTRUCT/DESCRIBE queries that do just that: they give you the results as a graph.

2) They don't use any (internationally recognized) standard to represent graph data. RDF is the only such format known to me (ignore all the semantic web stuff associated with it and just consider the format).
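To illustrate what "structure-preserving" means in point 1, here is a small plain-Python sketch. It is not SPARQL and not any real library's API: the `construct` helper and the triples are hypothetical stand-ins for a CONSTRUCT-style query, where the result has the same shape as the input and can be queried again.

```python
# A graph as a set of (subject, predicate, object) triples.
# All names here are hypothetical, invented for illustration.
GRAPH = {
    ("alice", "worksOn", "lunar_lander"),
    ("alice", "hasSkill", "python"),
    ("bob", "worksOn", "lunar_lander"),
    ("bob", "hasSkill", "thermal_analysis"),
    ("carol", "worksOn", "rover"),
}


def construct(graph, s=None, p=None, o=None):
    """Return the subgraph of triples matching the pattern; None is a wildcard."""
    return {
        (ts, tp, to)
        for (ts, tp, to) in graph
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    }


# Graph in, graph out: the result is a smaller set of triples.
lander_team = construct(GRAPH, p="worksOn", o="lunar_lander")

# Because the result is itself a graph, the same function applies to it.
alice_only = construct(lander_team, s="alice")
```

The point is the closure property: a table query returns a table, and here a graph query returns a graph, so queries compose.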

230k edges is peanuts for a graph db. It's like when the number of rows times columns in your SQL DB is 230k. NASA could (should?) have just used Oxigraph, RDF4J, or Jena. Stardog and Ontotext are the paid options. However, it is quite nice to see more interest in graph-based DBMSs in general!

> “Which employees have cross-disciplinary expertise in AI/ML?”

Regarding the study itself, I did not understand who the target user of this is. I would be more interested in the Lessons Learned 2.0 study (I understand it was attempted once before [1]). I don't think the study at hand would be able to correctly answer questions about expertise.

On the technical side, as far as I understand, the cosine similarity was computed per triplet? In that case, I could see how pgvector could be used for this. Relevance expansion is the only thing in the article that made me think that it would be cool if it works well. But I could see how in a combo of a regular RDF DBMS + pgvector, one could first do a cosine similarity query via pgvector and then compute an (S)CBD [2] of the subject (the from node) of the triplet.
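A rough sketch of that two-step idea, under loud assumptions: the pure-Python `cosine` function stands in for a pgvector similarity query, the three-dimensional embeddings are made up, and `bounded_description` is a simplified description that only collects triples sharing the subject (a full CBD also follows blank nodes recursively, and the symmetric variant adds inbound triples).

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical triples, each with a made-up embedding of the whole triple.
TRIPLES = [
    (("alice", "hasSkill", "machine_learning"), [0.9, 0.1, 0.0]),
    (("alice", "worksOn", "rover"), [0.1, 0.8, 0.1]),
    (("bob", "hasSkill", "welding"), [0.0, 0.1, 0.9]),
]


def best_triple(query_vec):
    """Step 1: pick the triple whose embedding is most similar to the query."""
    return max(TRIPLES, key=lambda item: cosine(query_vec, item[1]))[0]


def bounded_description(subject):
    """Step 2 (simplified): every triple whose subject is `subject`."""
    return [triple for triple, _ in TRIPLES if triple[0] == subject]


query = [1.0, 0.0, 0.0]  # pretend this is the embedded user question
hit = best_triple(query)
context = bounded_description(hit[0])
print(hit)
print(context)
```

The similarity search finds one relevant triple, and the description step expands it into the surrounding facts about the same subject, which is roughly what the article's relevance expansion appears to aim for.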

[1]: https://youtu.be/QEBVoultYJg?t=1653

[2]: https://patterns.dataincubator.org/book/bounded-description....

[+] UltraSane|10 months ago|reply

"They do not allow structure-preserving querying. That is, I query a graph and want the results to be a smaller graph."

I'm not sure what you mean by this. The result of a query in neo4j is a set of nodes with specified relations linking them. It is much more flexible than the way SQL can only return a single table.

[+] gitroom|10 months ago|reply

Man, love seeing pushback on automated skill matching. Sometimes it feels like tech folks keep inventing new tools just to dodge actual conversations. Ever wonder if all this automation just makes things colder instead of smarter?

[+] rage4774|10 months ago|reply

And slower, if we’re honest, since they never solve the issues they intend to solve 100% and human attention is still needed (which is good).

[+] dpflan|10 months ago|reply

As an alternative to a pure graph DB (e.g. here, Memgraph), has anyone here used Apache AGE, the graph-database extension for PostgreSQL? For making a knowledge graph that can live alongside SQL?

[+] dgllghr|10 months ago|reply

I believe AGE has unfortunately been defunded: https://github.com/apache/age/discussions/2150. It’s a shame because it seemed like being able to query data across multiple paradigms would be really useful.

[+] mistrial9|10 months ago|reply

NASA announced LLMs in the early days (years ago) - it seemed like they wanted to understand their own document libraries! What else could be inferred here? Mass layoffs plus "people substitutes"? Is there a more diplomatic way to see this?

[+] karamanolev|10 months ago|reply

The only connection between "they wanted to understand their own document libraries" and "mass layoffs" is potentially "increased efficiency leads to needing fewer people for the same job". If there's anything else, please let me know.

And if it's that, then are you suggesting to not implement a certain technological efficiency tool in order to keep (now clearly redundant) jobs? That has never worked long-term in the history of mankind, AFAIK.

[+] cebert|10 months ago|reply

I think I found a place DOGE can save some money. Memgraph pricing is ridiculous.

[+] patcon|10 months ago|reply

Even paying a college grad to babysit a server costs more than their yearly rate. I assume you're speaking as someone who loves to host everything for themselves, but the logic is surely different in enterprise/government, no?

[+] citizenpaul|10 months ago|reply

My experience with tools like this is that they have only one outcome: piling work onto the most talented or desperate (i.e. needing money or a visa) people until they leave the org/company. Eventually that leads to total skill erosion and a very low average skill/productivity across the company as people leave or hide their abilities.

Why? Because there is never a reward attached. Oh, you want to make me the AI resource for the agency but not remove former duties or increase my pay? Ummmm, no thanks. Also, things tend to happen in waves (e.g. "AI"), so everyone needs a lot from a very few people at the same time. No one ever asks how those people can be empowered. Just how can we put the screws to them so they work harder.

HR and Mgmt can f-off with their "skill resource bank" or whatever nonsense they call it this year. My skills are what I was hired for on the job description. If you want to discuss a new position or higher pay for different skills, I'm very happy to talk about how I can work with the org to make that happen. That's never the case though.

[+] thumbsup-_-|10 months ago|reply

Seems like a very simple use case, given that it will barely be used at scale. A few thousand employee entries and read QPS in the tens? What’s so special about it that it merits a post?

[+] PeterStuer|10 months ago|reply

HP tried this over 20 years ago. It got stranded in HR and union disputes.

My own take is this will just be gamed to the max by ladder climbers looking out for number 1.

[+] dcreater|10 months ago|reply

Talks extensively about the details of the thing.

But doesn't actually show the thing.

That's an AI hype-cycle signal for a probably bullshit/defective thing.

[+] neets|10 months ago|reply

Y'all know that Apache AGE is a thing for running Cypher in Postgres, right?

[+] truetaurus|10 months ago|reply

Interesting, I am just going to plug something I just built around this concept: https://skillriskaudit.com/

Would love for some feedback!

[+] rkwz|10 months ago|reply

Any idea which LLM they're using?
