top | item 38416739

unforgivenpasta | 2 years ago

I've looked into archiving all the pages I visit as well, and warcprox[1] has been bookmarked for a while now.

Hard drive storage being so cheap, in the ~$15/TB range, makes this more feasible, even for video archival.

[1] https://github.com/internetarchive/warcprox
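For a rough sense of what ~$15/TB means in practice, here is a back-of-the-envelope sketch. Only the $15/TB figure comes from the comment above; the function names and any daily volume you plug in are hypothetical, and it ignores redundancy, backups, and drive replacement.

```python
# Decimal terabyte, the unit drive vendors use on the label.
TB = 10**12


def storage_cost_usd(n_bytes: float, usd_per_tb: float = 15.0) -> float:
    """Raw disk cost of storing n_bytes at a flat price per terabyte."""
    return n_bytes / TB * usd_per_tb


def yearly_cost_usd(bytes_per_day: float, usd_per_tb: float = 15.0,
                    days: int = 365) -> float:
    """Disk cost of one year of archiving at a constant daily volume."""
    return storage_cost_usd(bytes_per_day * days, usd_per_tb)


if __name__ == "__main__":
    # Hypothetical daily volumes; replace them with your own measurements.
    for label, per_day in [("pages only", 10**9), ("pages + video", 20 * 10**9)]:
        print(f"{label}: ${yearly_cost_usd(per_day):.2f} per year")
```

The price is a parameter, so the same sketch works for whatever your drives actually cost.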

alchemist1e9|2 years ago

Excellent pointer with warcprox, I hadn’t seen it. I’m noticing that mitmproxy, warcprox, webcrystal, and of course offpunk are all written in Python.

It seems there should be some mashup of them all that can produce a solution, one that also involves using offpunk to access the archive in the terminal.

Mitmproxy caught my eye with its transparent mode [1] and the idea that the client/user VM may not even need any configuration in my setup, which uses the vfio-pci GPU passthrough desktop OS approach. The archive/cache produced by the archiver VM could just be NFS-mounted over a private bridge interface between the desktop VM and the archive VM.

[1] http://docs.mitmproxy.org.s3-website-us-west-2.amazonaws.com...
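To make the idea concrete, here is a minimal sketch of a mitmproxy addon that writes each successful GET response body to disk. It is not what warcprox or webcrystal do (it produces plain files, not WARC). The `ARCHIVE_ROOT` variable, the `path_for` helper, and the host/hash layout are all assumptions made for illustration. The only mitmproxy pieces it relies on are the `response` hook and the `flow.request` / `flow.response` attributes.

```python
"""Sketch: save response bodies seen by mitmproxy as plain files.

Assumed invocation on the archive VM:
    mitmdump --mode transparent -s archive_addon.py
"""
import hashlib
import os
from urllib.parse import urlsplit

# Hypothetical archive root; in the setup described above this would be the
# directory that the archive VM exports over NFS.
ARCHIVE_ROOT = os.environ.get("ARCHIVE_ROOT", "archive")


def path_for(url: str, root: str = ARCHIVE_ROOT) -> str:
    """Map a URL to <root>/<host>/<sha256 of the full URL>."""
    host = urlsplit(url).hostname or "unknown-host"
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(root, host, digest)


class Archiver:
    def __init__(self, root: str = ARCHIVE_ROOT):
        self.root = root

    def response(self, flow) -> None:
        # mitmproxy calls this hook once a full HTTP response has been read.
        if flow.request.method != "GET" or flow.response.status_code != 200:
            return
        body = flow.response.content
        if not body:
            return
        target = path_for(flow.request.pretty_url, self.root)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(body)


# mitmproxy picks up addon instances from this module-level list.
addons = [Archiver()]
```

Hashing the URL sidesteps filename-length and query-string problems, at the cost of files you cannot browse by name. A real mashup would probably want WARC output or a layout that offpunk can read directly.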

ploum|2 years ago

You can now start accessing the raw cache with "netcache --offline". Or you can access it by hand: the cache is only made of files stored in folders.
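Since the cache is just files in folders, ordinary tools can read it. A small sketch that reports how many bytes sit under each top-level folder; the default path `~/.cache/offpunk` and the `CACHE_DIR` variable are assumptions here, so point it at wherever your cache actually lives.

```python
import os
from collections import Counter


def summarize_cache(cache_dir: str) -> Counter:
    """Return the number of bytes stored under each top-level folder."""
    sizes = Counter()
    for dirpath, _dirnames, filenames in os.walk(cache_dir):
        rel = os.path.relpath(dirpath, cache_dir)
        # Files directly in cache_dir are grouped under ".".
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):  # skip broken symlinks and the like
                sizes[top] += os.path.getsize(path)
    return sizes


if __name__ == "__main__":
    # Assumed default location; override with the CACHE_DIR variable.
    default = os.path.join(os.path.expanduser("~"), ".cache", "offpunk")
    cache_dir = os.environ.get("CACHE_DIR", default)
    for top, size in summarize_cache(cache_dir).most_common():
        print(f"{size:>12}  {top}")
```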