alexwlchan's comments

alexwlchan | 1 year ago | on: Creating a Safari webarchive from the command line

1/ Why not wget?

For this project I wanted a consistent file format for my entire collection.

I have a bunch of stuff I want to save which is behind paywalls/logins/clickthroughs that are tricky for wget to reach. I know I can hand wget a cookies file, but that’s mildly fiddly. I save those pages as Safari webarchive files, and then they can drop in alongside the files I’ve collected programmatically. Then I can deal with all my saved pages as a homogeneous set, rather than being split into two formats.

Plus I couldn't find anybody who'd done this, and it was fun :D

This is only for personal stuff where I know I'll be using Safari/macOS for the foreseeable future. I don't envisage using this for anything professional, or a shared archive -- you're right that a less proprietary format would be better in those contexts. I think I'm in a bit of a niche here.

(I'm honestly surprised this is on the front page; I didn't think anybody else would be that interested.)

2/ Proprietary format: it is, but before I started I did some experiments to see what's actually inside. It's a binary plist and I can recover all the underlying HTML/CSS/JS files with Python, so I'm not totally hosed if Safari goes away.

Notes on that here: https://alexwlchan.net/til/2024/whats-inside-safari-webarchi...
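A minimal sketch of that kind of recovery, using only Python's standard library. The key names (`WebMainResource`, `WebSubresources`, `WebResourceURL`, `WebResourceMIMEType`, `WebResourceData`) are an assumption about the webarchive layout rather than something stated in this comment, and `list_resources` is a hypothetical helper name. It ignores nested subframe archives.

```python
import plistlib


def list_resources(webarchive_bytes: bytes) -> list[tuple[str, str, bytes]]:
    """Return (url, mime_type, data) for each resource in a webarchive.

    Assumes the archive is a plist whose top-level dict holds one
    "WebMainResource" and an optional list of "WebSubresources".
    """
    # plistlib.loads detects binary vs. XML plists automatically.
    archive = plistlib.loads(webarchive_bytes)

    resources = [archive["WebMainResource"]]
    resources.extend(archive.get("WebSubresources", []))

    return [
        (
            res["WebResourceURL"],
            res["WebResourceMIMEType"],
            res["WebResourceData"],  # raw bytes of the HTML/CSS/JS/image
        )
        for res in resources
    ]
```

Reading a saved file's bytes and passing them to `list_resources` should give back the main HTML page first, followed by any stylesheets, scripts and images stored alongside it.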

alexwlchan | 2 years ago | on: Making a PDF that's larger than Germany

Argh! I knew I was going to make a numerical mistake somewhere, thanks for spotting it! :D Correction will be up shortly.

And thanks for the text example! This looks like what I was trying, but clearly I had a mistake somewhere.

alexwlchan | 2 years ago | on: Cleaning up my 200GB iCloud with some JavaScript

Yeah, that was what I thought when I first worked with these APIs! But when you use PhotoKit, you have to explicitly opt-in to downloading files from iCloud.

AFAICT, PHAsset is only metadata. When I'm downloading the full-sized images, I use PHImageManager.requestImage() and pass in the PHAsset I'm looking at [1][2]. I know there's something similar for video, but I've never used it.

You can control the behaviour by passing a PHImageRequestOptions instance [3]. This includes an isNetworkAccessAllowed bool, which controls whether Photos.app will download the file from iCloud if it isn't present locally; it defaults to false.

[1]: https://developer.apple.com/documentation/photokit/loading_a...

[2]: https://developer.apple.com/documentation/photokit/phimagema...

[3]: https://developer.apple.com/documentation/photokit/phimagere...

alexwlchan | 2 years ago | on: Cleaning up my 200GB iCloud with some JavaScript

> I think that all "photos" or "videos" are just a view of the underlying "photo or video object". If you crop a video, the full-size video will remain. Only if you export the video, it will be cropped and the smaller file size will manifest.

Yup, the Photos app keeps the unmodified original file, and then any edits/crops are stored separately. You can always revert to the original file and redo your edits. So they might be storing multiple copies of the same image, with and without edits.

Which API were you looking at for "file size"?

I was able to get the size data from Photos.app with the PhotoKit API [1]. I've only tested it with my library of ~26k items, but it was useful for getting an indicator of the biggest items. (Although I didn't think to check whether exporting a 1GB video caused my iCloud usage to drop by 1GB.)

[1]: https://alexwlchan.net/2023/finding-big-photos/

alexwlchan | 2 years ago | on: Ask HN: Could you share your personal blog here?

https://alexwlchan.net/writing/

I passed 400 posts a month or so ago; been writing for about a decade. It's a mix of programming, arty stuff, digital preservation, personal thoughts – the first link describes the sort of writing I do, and examples of each.

Some favourites:

* https://alexwlchan.net/2022/screenshots/ – You should take more screenshots, a perennial darling of HN

* https://alexwlchan.net/2022/marquee-rocket/ – Launching a rocket in the worst possible way, aka abusing the <marquee> tag

* https://alexwlchan.net/2022/bure-valley/ – A day out at the Bure Valley Railway, trains!

* https://alexwlchan.net/2022/snapped-elastic/ – Finding a tricky bug in Elasticsearch 8.4.2, the sort of deep-dive debugging I don’t do often enough

(And a fairly basic post about prime factorisation with Python has been on the HN front page several times, for reasons I do not understand at all)

alexwlchan | 8 years ago | on: The Design and Use of QuickCheck

For CI systems like Travis, people add it to the cached directories, and it's shared between runs. I know Travis, Circle and AppVeyor all have some way to cache data between runs – nominally for dependencies, but .hypothesis works too.

According to our docs (http://hypothesis.readthedocs.io/en/latest/database.html?hig...), you can check the examples DB into a VCS and it handles merges, deletes, etc. I don't know anybody who actually does this, and I've never looked at the code for handling the examples database, so I have no idea how (well) this works.

If tests do throw up a particularly interesting and unusual example, we recommend explicitly adding it to the tests with an `@example` decorator, which causes us to retest that value every time. Easier to find on a code read, and won't be lost if the database goes away.
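A small sketch of what pinning an example looks like. The function under test, `normalise_whitespace`, and the pinned value are made up for illustration; only `given`, `example` and `strategies` come from Hypothesis itself.

```python
from hypothesis import example, given, strategies as st


def normalise_whitespace(text: str) -> str:
    """Hypothetical function under test: collapse runs of whitespace."""
    return " ".join(text.split())


@given(st.text())
# An explicitly pinned input: it is retried on every run, independent
# of whatever is (or isn't) stored in the .hypothesis examples database.
@example("\u00a0 leading non-breaking space")
def test_normalising_is_idempotent(text):
    once = normalise_whitespace(text)
    assert normalise_whitespace(once) == once
```

The pinned value lives in the test file, so it shows up in code review and survives a wiped CI cache.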

(Disclaimer: I'm a Hypothesis maintainer)

alexwlchan | 11 years ago | on: Truecrypt report

There's a paragraph in the Phase I Audit Report (published a year ago) which includes a checksum:

> The iSEC team reviewed the TrueCrypt 7.1a source code, which is publicly available as a zip archive (“truecrypt 7.1a source.zip”) at http://www.truecrypt.org/downloads2. The SHA1 hash of the reviewed zip archive is 4baa4660bf9369d6eeaeb63426768b74f77afdf2.

The Phase II report (today's release) claims to be auditing 7.1a, so I assume it's exactly the same version and ZIP file.

Last June, they published "a verified TrueCrypt v. 7.1 source and binary mirror", including file hashes, on GitHub: https://github.com/AuditProject/truecrypt-verified-mirror

I just cloned that repo and inspected the source ZIP; the SHA1 sum matches what they quote in the report.
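For anyone who wants to repeat the check, a rough sketch in Python. The expected hash is the one quoted from the Phase I report above; the helper names are hypothetical, and the path to the ZIP inside the cloned repo is left for the caller to supply.

```python
import hashlib

# SHA1 quoted in the Phase I Audit Report for the reviewed source ZIP.
EXPECTED_SHA1 = "4baa4660bf9369d6eeaeb63426768b74f77afdf2"


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def matches_report(path, expected=EXPECTED_SHA1):
    """True if the file's SHA1 equals the hash quoted in the report."""
    return sha1_of_file(path) == expected.lower()
```

Calling `matches_report(...)` with the path to the source ZIP in the mirror should return `True` if the file is the one iSEC reviewed.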

alexwlchan | 13 years ago | on: John Siracusa's OS X 10.8 Mountain Lion Review

Even if you could raise the money, I doubt that he would do it, or that it would be comparable to his OS X articles.

If you listen to Hypercritical (his weekly 5by5 podcast), you'll have heard that he struggles just getting the OS X reviews out the door. Since Apple is trying to move to a yearly release cycle, that’s just going to get harder. When’s he going to get the time to write an Ubuntu review?

It’s also worth considering that, “He has been a Mac user since 1984” (from his Ars bio). Part of what makes his reviews so good is his deep-rooted knowledge of the Mac platform, and having watched OS X (and previous versions of Mac OS) “grow up”, so to speak. I don’t know how much experience he has with Ubuntu, but I bet it’s not as extensive as OS X.

And that’s putting aside all the arguments of whether it’s a good thing to do, or whether the Ubuntu community would want him to write such a thing.
