markjgx | 1 year ago
I would prefer a CLI tool with partial gather support: something I could easily set up on a cheap instance somewhere to scrape all my data continuously at set intervals, and then give me the data in the most readable format possible through an easy access path. I've been thinking of making something like that, but with https://github.com/microsoft/graphrag at the center of it: a continuously rebuilt GraphRAG of all your data.
madamelic|1 year ago
It builds an entire ecosystem around your data, where the data is accessible programmatically rather than just dumped into text files. The point of HPI is to build your own stuff on top of it, and it all integrates seamlessly into one Python package.
The next stop after Karlicoss is https://github.com/seanbreckenridge/HPI_API which creates a REST API on top of your HPI without any additional configuration.
If you want to get fancier (and somewhat antithetical to HPI), you can use https://github.com/hpi/authenticated_hpi_api or https://github.com/hpi/hpi-graph so you can theoretically expose it to the web (I am squatting the HPI org; I am not the creator of HPI). I used JWTs for authentication, so you can issue a JWT that grants access to only certain services' data. (Beware: hpi-graph is very out of date and I haven't touched it lately, but my HPI stuff has been chugging away downloading data.)
Some of the /hpi stuff I made is a bit of a mishmash because it was rip-and-replace from a project I was making, so you'll see references to "Archivist" or things that aren't local-first and depend on Vercel applications.
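To make the per-service JWT idea concrete, here is a rough stdlib-only sketch of an HS256 token that carries a list of allowed services. This is not the code from authenticated_hpi_api; the `services` claim name, the function names, and the example service names are all assumptions made up for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url removed before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret: bytes, signing_input: str) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(secret: bytes, services: list[str], ttl_seconds: int = 3600) -> str:
    """Issue an HS256 JWT whose hypothetical `services` claim lists allowed services."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"services": sorted(services), "exp": int(time.time()) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(secret, signing_input))}"


def allowed(secret: bytes, token: str, service: str) -> bool:
    """Return True only if the token is valid, unexpired, and covers `service`."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = _sign(secret, f"{header_b64}.{payload_b64}")
        # Constant-time comparison; reject before trusting any claim.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return False
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Covers malformed structure, bad base64, and bad JSON.
        return False
    if not isinstance(header, dict) or not isinstance(payload, dict):
        return False
    if header.get("alg") != "HS256":
        return False
    if payload.get("exp", 0) < time.time():
        return False
    return service in payload.get("services", [])
```

Something like `issue_token(secret, ["reddit"])` would then produce a token that passes `allowed(secret, token, "reddit")` but fails for any other service, so each consumer only sees the data it was issued a token for. In a real deployment you would probably reach for a maintained JWT library rather than hand-rolling this.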
bdcravens|1 year ago
michaelmior|1 year ago
slalani304|1 year ago