top | item 36366528

citelao | 2 years ago

I would be extremely interested in a Google Takeout viewer if you ever end up releasing one.

I dealt with Google Takeout, trying to export my photos to Apple Photos (when Google was planning to charge money for old Google Workspace accounts), and I found it extremely difficult to deal with the file format. The script I wrote (https://github.com/citelao/google_photos_takeout_to_apple_ph...) ended up being decently reliable, but there were a ton of weird mismatches between the EXIF data in Google Photos metadata and the EXIF data in the photos themselves. Although some of that wonkiness was Apple Photos, not Google.
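For anyone hitting the same wall: the sidecar JSON that Takeout writes next to each photo carries the taken-time as epoch seconds, which you then have to reconcile with whatever EXIF says. A minimal sketch of reading it (the field names reflect the Takeout format as I saw it, so treat them as assumptions):

```python
import json
from datetime import datetime, timezone

def taken_time(sidecar_json: str) -> datetime:
    """Parse photoTakenTime out of a Google Takeout sidecar JSON blob."""
    meta = json.loads(sidecar_json)
    epoch = int(meta["photoTakenTime"]["timestamp"])  # epoch seconds, stored as a string
    return datetime.fromtimestamp(epoch, tz=timezone.utc)

# Minimal sidecar with only the field we care about
sidecar = '{"photoTakenTime": {"timestamp": "1577836800"}}'
print(taken_time(sidecar).isoformat())  # 2020-01-01T00:00:00+00:00
```

Comparing that value to the photo's own EXIF DateTimeOriginal (via an EXIF library) is where the mismatches show up.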

I'd love to see software that could wrangle the mess :)

mholt | 2 years ago

It's called Timelinize (though I might rename it), and you can follow it here: https://twitter.com/timelinize. Click "Media" on the Twitter account to see a few screenshots as a preview, with more to come. (There's no website or project page yet because I've been busy developing.)

If you want an invite to try out an early dev preview today, follow @timelinize on Twitter and tweet at it, and I'll see about getting you into the Discord.

Some background:

Saving a local copy of my Google Photos has been a passion project of mine since ~2014 (before Google Photos even existed!). For years it was focused only on downloading the data using APIs -- but then we found out that Google strips location data (from your own photos!) when you use the API, so I added Takeout support.

The problem was that there was no viewer. So in 2019 I finally started working on one. It has evolved a lot since then, because it's a very ambitious project and there's nothing quite like it.

It's not just Google Photos: it's any photos and videos. It's also for your text messages and emails. And your location history. And contact list. And chat apps. And really, any files you have. It supports Facebook, Twitter, and Instagram account exports too. Oh, and iPhone backups.

Timelinize is entity-aware, and it can map identities across data sources (with enough info, with a manual mapping, or with some optional heuristics). It's not just a photo gallery.

It's basically a really detailed view of your life and online history. It's neat because I have my family pictures and the text messages between me and my wife from when we were dating (and after, of course), and there are different views to explore: map, timeline, conversations, gallery, and more to come (calendar, etc.).

We can even place non-geolocated data on a map, since we can correlate timestamp and entity. So for our honeymoon, I can see the text messages we received from friends while we were driving to the beach.
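The correlation idea can be sketched roughly like this: keep a time-sorted location track, and snap each non-geolocated item to the nearest track point within some gap tolerance. (Made-up coordinates and a hypothetical locate() helper -- this is not Timelinize's actual code.)

```python
import bisect
from datetime import datetime, timedelta

# (timestamp, lat, lon) location history, sorted by time -- hypothetical data
track = [
    (datetime(2023, 6, 1, 9, 0), 40.7128, -74.0060),
    (datetime(2023, 6, 1, 10, 30), 40.5795, -74.1502),
    (datetime(2023, 6, 1, 12, 0), 40.2204, -74.0121),
]

def locate(ts, max_gap=timedelta(hours=1)):
    """Return the nearest known location for a timestamp, or None if too far off."""
    times = [t for t, _, _ in track]
    i = bisect.bisect_left(times, ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(track)]
    best = min(candidates, key=lambda j: abs(track[j][0] - ts))
    if abs(track[best][0] - ts) > max_gap:
        return None
    return track[best][1], track[best][2]

# A text received at 10:45 snaps to the 10:30 track point
print(locate(datetime(2023, 6, 1, 10, 45)))  # (40.5795, -74.1502)
```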

It's really quite immersive, almost magical; I haven't seen anything else like it.

And everything is stored on your own computer: it's a GUI app, so you need enough local space to hold your stuff. The data is just organized as files within a folder on disk, with a SQLite DB holding the index and the small textual items.
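To illustrate that "files on disk + SQLite index" pattern (a hypothetical schema for illustration only -- not Timelinize's actual layout):

```python
import sqlite3

# Hypothetical index: big blobs live on disk and are referenced by path,
# small textual items are stored inline in the row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id        INTEGER PRIMARY KEY,
    entity_id INTEGER,        -- who this item belongs to / is from
    timestamp INTEGER,        -- unix epoch seconds
    lat REAL, lon REAL,       -- nullable; may be filled in by correlation
    data_file TEXT,           -- relative path on disk for large blobs (photos, videos)
    data_text TEXT            -- small textual items (messages) stored inline
);
CREATE INDEX idx_items_time ON items (timestamp);
""")
conn.execute(
    "INSERT INTO items (entity_id, timestamp, data_text) VALUES (?, ?, ?)",
    (1, 1685606400, "hey, almost at the beach"),
)
row = conn.execute("SELECT data_text FROM items WHERE entity_id = 1").fetchone()
print(row[0])  # hey, almost at the beach
```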

chaxor | 2 years ago

What is the correct tool to properly merge a large set of tar.gz files which may have an enormous overlap of similar files, including some that have been altered just slightly?

Git plus some parsing seems close to that space: analyzing the files to build a dendrogram-like tree of likely alterations over time by Levenshtein distance could approximate a commit history. However, no tool seems to exist or be popular for this. There's vimdiff or meld, but they are so manual and tedious that they're pointless to attempt for something like a large history of Takeout tar.gz's.
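As a sketch of the building block I mean: pairwise Levenshtein distances between hypothetical snapshots of one file across archives. A real tool would feed a full distance matrix into hierarchical clustering to get the dendrogram; this just shows the distance step.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute (0 if equal)
        prev = cur
    return prev[-1]

# Hypothetical snapshots of one file across three Takeout archives
versions = {
    "2021.tar.gz": "hello world",
    "2022.tar.gz": "hello brave world",
    "2023.tar.gz": "hello brave new world",
}
names = sorted(versions)
for x, y in zip(names, names[1:]):
    print(x, "->", y, levenshtein(versions[x], versions[y]))
```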

If you throw in the towel completely, borgfs can help reduce the space they take via block-level de-duplication, but that's a terrible solution because it doesn't really track file changes in any reasonable way. It is useful for keeping the files without the tar or gz wrapping, but that also raises the question of how to appropriately organize the directory structure over the history.
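The basic bookkeeping I'm after can be sketched by hashing members across archives in order, flagging each file as new, unchanged, or changed (hypothetical helper names, Python stdlib only -- far short of a real merge tool):

```python
import hashlib
import tarfile

def file_hashes(archive_path):
    """Map member name -> sha256 of its content for one tar.gz."""
    out = {}
    with tarfile.open(archive_path, "r:gz") as tf:
        for member in tf.getmembers():
            if member.isfile():
                data = tf.extractfile(member).read()
                out[member.name] = hashlib.sha256(data).hexdigest()
    return out

def diff_archives(paths):
    """Walk archives in order, classifying each member as new/unchanged/changed."""
    seen = {}
    report = []
    for path in paths:
        for name, digest in sorted(file_hashes(path).items()):
            if name not in seen:
                status = "new"
            elif seen[name] == digest:
                status = "unchanged"
            else:
                status = "changed"
            seen[name] = digest
            report.append((path, name, status))
    return report
```

The "changed" entries are the ones that would then need a similarity pass rather than plain de-duplication.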

Any thoughts or projects that do a better job of this?

mholt | 2 years ago

> What is the correct tool to properly merge a large set of tar.gz files for which may have an enormous overlap of similar files, and some that have been altered just slightly?

Can you elaborate on this? My understanding is that they should all extract into the same target folder without issues, because each archive's set of files is distinct. But maybe that's a wrong assumption?

Also, what exactly is your goal? It sounds like you are trying to find and de-duplicate visually similar images. What do you mean by "enormous overlap" or "altered just slightly"?