top | item 16469114

(no title)

sinaa | 8 years ago

Great work!

Do you simply do a BFS to find the shortest paths? If so, are you doing any tricks to avoid the path explosion problem?

discuss

order

jwngr|8 years ago

Thanks! I'm glad you asked. I actually do what I call a bi-directional breadth first search[1]. The gist of it is that instead of just doing a BFS from the source node until I reach the target node, I do a reverse BFS from the target node as well and wait until the two searches overlap. That helps with the exploding path problem, although that still becomes an issue for longer paths (>= 5 degrees generally). I also pre-compute all the incoming and outgoing links for each page when I create the database[2] so I don't need to do that upon every search, which resulted in a huge performance boost.

[1] https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a4641630... [2] https://github.com/jwngr/sdow/blob/a2699dc95d884ec64a4641630...

vorpalhex|8 years ago

When I was first playing with this, I actually really expected you to be using an expensive performant solution like neo4j. When I read that you were using sqlite, I didn't believe it at first and thought it was a mistype from dev until I looked over the source.

That's an impressive and well thought out performance enhancement, and that the app runs so blazingly fast on sqlite is very impressive.

nly|8 years ago

The downside of bidirectional seems to be that things like place names are linked via short paths through boring articles like "Census designated place" or "City"