top | item 41314492

Degrees of Kevin Bacon Using Postgres

87 points| pramsey | 1 year ago |crunchydata.com

17 comments

order

its_bbq|1 year ago

They call the dijkstra implementation slow but that's because they aren't using the full information it presents. Dijkstra gives shortest paths from one node to every other node in the graph, so you run it once and materialize it and now you have a full Kevin Bacon database

compsciphd|1 year ago

Eh? you have a full graph of one node to every other node. not shortest path between every pair (which I assume is what a full kevin bacon database would be).

jtwaleson|1 year ago

Couldn’t you “just” create a nullable distance column and update anything that’s null and reachable by Bacon and set it to 1. Then update anyone still null and reachable by 1 to 2, etc. Then you have a materialized view of everyone’s Bacon number.

Not a generic solution of course, but seems extremely effective for the Bacon case.

mdp2021|1 year ago

Very, very good: perfect for didactic. I thought I'd try this on SQLite, but I am in a rush right now... So I checked the documentation to refresh my memory about the extent of relevant SQL implementation in SQLite, and...

TIL that the 'WITH' clause in SQLite can draw the classic Mandelbrot Set:

https://www.sqlite.org/lang_with.html#outlandish_recursive_q...

(And there is a Sudoku puzzle solver below that.)

jnordwick|1 year ago

Now do the Erdos number: defined the same way but for coauthors of papers linked to the absurdly production (and always high on amphetamines) great Paul Erdos. The most glorious of all, the Erdos-Bacon number, is defined as the sum of your Bacon number and Erdos number.

Mathematicians Daniel kleitman and Bruce Reznik both tie for top spot with a number of 3 (Erdos 1s and Bacon 2s). Danica McKellar and Elon both in there with 6s. And Mayim Bialik, Natalie Portman, and Kristen Stewart all up there with 7s.

https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93Bacon_numbe...

mr_mitm|1 year ago

My ex had an erdos-bacon number of six or seven when she was in grad school. One paper with your thesis advisor and having one role as an extra gets you there quickly.

robryk|1 year ago

I don't see why one has to create the actor-actor relation table. I would rather search for shortest paths in the bipartite graph where nodes are movies and actors, and divide the results by 2.

harel|1 year ago

That's all very cool. As an aside I always remembered the bacon number was between Bacon and anyone else, Hollywood actor or not. Keeping it withing Hollywood makes a lot more sense.

defrost|1 year ago

Hollywood bleeds out across the boundaries, quoting the Head of the Theoretical Physics Group at Imperial College (2011-2016):

    I have an Erdos-Bacon number of six (having written a paper with Shing-Tung Yau and appeared in the film Windrider with Nicole Kidman).
https://profiles.imperial.ac.uk/j.gauntlett

orthoxerox|1 year ago

Oracle's CONNECT BY is much easier to write and understand than recursive CTE. It's a pity it's not standart.

anotherevan|1 year ago

For slow changing, read-heavy scenarios like this, make your recursive CTE a materialised view and whack some indexes on it for big performance gains.

mr_mitm|1 year ago

Is there something like Cypher that can be leveraged to make the queries against a postgres graph a bit more digestible?

willf|1 year ago

Let’s normalize recursive CTEs.