
Iron Mountain: It's Time to Talk About Hard Drives

197 points | severine | 1 year ago | mixonline.com

201 comments


simonw|1 year ago

My understanding is that the only reliable way of long-term digital archival storage is to refresh the media you are storing things on every few years, copying the previous archives to the fresh storage.

Since storage constantly gets cheaper, 100GB first stored in 2001 can be stored on updated media for a fraction of that original cost in 2024.
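As a back-of-envelope sketch (the per-GB prices here are rough assumptions for illustration, not historical quotes):

```python
# Cost of keeping 100 GB as drive prices fall; prices are assumed, not quoted.
def storage_cost(gb, price_per_gb):
    return gb * price_per_gb

cost_2001 = storage_cost(100, 5.00)    # roughly $5/GB for HDDs circa 2001 (assumed)
cost_2024 = storage_cost(100, 0.015)   # roughly $0.015/GB circa 2024 (assumed)
print(f"2001: ${cost_2001:.2f}, 2024: ${cost_2024:.2f}")
# Each refresh cycle costs a fraction of the previous one, so the total
# cost of refreshing forever converges instead of growing without bound.
```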

Loic|1 year ago

Long term archival is successive short/middle term archival.

I think I read this quote on Tim Bray's blog[0], but I am not sure anymore. This is now my approach, my short/middle term archival is designed to be easily transferred to the next short/middle term store on a regular basis. I started with 500GB drives, now I am at 14TB.

[0]: https://www.tbray.org/ongoing/

abracadaniel|1 year ago

Pretty much. You see hobbyists getting data off of 30+ year old hard drives for the novelty of it, but I can’t imagine relying on that as a preservation copy. Optical media rots, magnetic media rots and loses magnetic charge, bearings seize, flash storage loses charge, etc. Entropy wins, sometimes much faster than you’d expect.

cherrycherry98|1 year ago

M-Disc is a digital optical medium for archival storage. Not indestructible but more resilient to degradation than the typical BD/DVD-R.

dathinab|1 year ago

Interestingly, this is more or less how long-term cold tape storage works. (Tapes have somewhat different failure characteristics, so it's more like a "check read" at least every so often, and on checksum errors you rewrite to a new tape, restoring from "RAID"-style duplicates. But conceptually it's the same idea.)

sam_goody|1 year ago

For a while we were being sold CDs as a more permanent medium, such as M-Discs.

Assuming you store your own players, and have a converter from USB to whatever exists in fifty years, is that a real solution?

Hard_Space|1 year ago

I have multiple 5TB external disks attached to my main tower, which (among other things) serves up Plex content. I switch each one out every year, for the equivalent of about a hundred dollars each. I try to find a compromise between the amount of read requests and availability for these disks, but in the end, if they're read often enough, they die soon enough.

What killed the last one was an experiment with installing Emby. Like many similar systems, it bewilderingly has no rate-limiting function, and will thrash a disk to within an inch of its life in order to index it. That's what did in the most recent of my external Plex drives, with multiple series and movies on it.

So yes, just keep refreshing the media, at reasonable intervals.

PS Yes, I know this is a poor method of content storage. NAS is looming up for me one of these days.

hooli42|1 year ago

If it doesn't have to be offline for long durations, software RAID, adding a new drive every once in a while, and discarding failing drives is pretty foolproof.

AFAIK large data centers automate something like this.

sgarland|1 year ago

Related, CD-Rs. When I left my submarine in 2013, they (by which I mean the entire Virginia class) were still using them to store archived logs, despite my explanation that they’d be lucky to get a decade out of them. The first chosen storage location was literally the hottest part of the engine room, right in between the main engines. Easily 120+ F at all times. After protest, we moved ours to a somewhat cooler location. Still hot, and still with atmospheric oil and other fun chemicals floating around.

I look forward to the first time logs from a few decades ago are required, and the media is absolutely dead.

EDIT: they weren’t even Azo dye, they were phthalocyanine. A decade was probably generous.

6510|1 year ago

I was curious how some of the more wealthy yacht owners solved the marine puzzle. What kind of computer would they use? What kind of parts would go in? What would a basic system cost? So I asked one, and he opened up a compartment with a stack of cheap Acer laptops vacuum-sealed in bags. They last 2 to 6 months; when they stop working he throws them away. Each sealed one has everything installed and a full battery, and will sync as soon as internet becomes available. When plugged into something, the new laptop is never the problem. He spent a small fortune arriving at this solution.

1oooqooq|1 year ago

That sounds like it was very much by design and nobody wanted those logs to survive

hunter-gatherer|1 year ago

Knowing nothing of submarines or seafaring, I'm genuinely curious as to what is logged on a ship that may be necessary a decade later?

retrochameleon|1 year ago

> Atmospheric oil and fun chemicals

Sounds nice and healthy

Clamchop|1 year ago

They have lots of problems:

1. Incomplete copies with missing dependencies.

2. Old software and their file formats with a poor virtualization story.

3. Poor cataloging.

4. Obsolete physical interfaces, file systems, etc.

5. Long-term cold storage on media neither proven nor marketed for the task.

Managing archives is just a cost center until it isn't, and it's hard to predict what will have value. The worst part of this is that TFA discusses mostly music industry materials. Outside parties and the public would have a huge interest in preserving all this, but of course it's impossible. All private, proprietary, copyrighted, and likely doomed to be lost one way or another.

Oh well.

cookiengineer|1 year ago

Related documentary that comes to mind: Digital Amnesia (2014) [1]

It broke my heart seeing those librarians in disbelief when their national library was sold off to the highest bidder. When they said "It seems our country does not value our own culture anymore".

Books last hundreds of years. Good luck trying to read a floppy from the 90s, or even DVDs, which are a very recent medium yet already beyond their lifetime.

It gets worse when you read the fine print of the SSD specifications, wherein they state that an SSD may lose all its data after 2 weeks without power, and data retention rates are at less than 99%, meaning they will degrade after the first year of use. And don't get me started on SMR HDDs, I lost enough drives already :D

Humanity has a backup problem. We surely live in Orwellian times because of it.

[1] https://youtube.com/watch?v=NdZxI3nFVJs

lizknope|1 year ago

Tape doesn't last forever either.

https://en.wikipedia.org/wiki/Linear_Tape-Open#Generations

LTO-1 started in 2000 and the current LTO-9 spec is from 2021. But it only has backwards compatibility for 1 to 2 generations. You can't read an LTO-6 tape in an LTO-9 drive.

https://en.wikipedia.org/wiki/Sticky-shed_syndrome

> Sticky-shed syndrome is a condition created by the deterioration of the binders in a magnetic tape, which hold the ferric oxide magnetizable coating to its plastic carrier, or which hold the thinner back-coating on the outside of the tape.[1] This deterioration renders the tape unusable.

Stiction Reversal Treatment for Magnetic Tape Media

https://katalystdm.com/digital-transformation/tape-transcrip...

> Stiction can, in many cases, be reversed to a sufficient degree, allowing data to be recovered from previously unreadable tapes. This stiction reversal method involves heating tapes over a period of 24 or more hours at specific temperatures (depending on the brand of tape involved). This process hardens the binder and will provide a window of opportunity during which data recovery can be performed. The process is by no means a permanent cure nor is it effective on all brands of tape. Certain brands of tape (eg. Memorex Green- see picture below) respond very well to this treatment. Others such as Mira 1000 appear to be largely unaffected by it.

Data migration and periodic verification is the answer but it requires more money to hire people to actually do it.

I've got files from 1992 but I didn't just leave them on a 3.5" floppy disk. They have migrated from floppy disk -> hard drive -> PD phase change optical disk -> CD-R -> DVD-R -> back to hard drive

I verify all checksums twice a year and have 2 independent backups.
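That twice-yearly checksum pass is only a few lines of Python. A minimal sketch (the manifest layout here is my own, not the commenter's actual setup):

```python
import hashlib
import os

def sha256_of(path, bufsize=1 << 20):
    """Stream a file through SHA-256 without loading it all into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest()

def verify_manifest(manifest):
    """manifest: dict of path -> expected hex digest.
    Returns the list of missing or corrupted paths."""
    bad = []
    for path, expected in manifest.items():
        if not os.path.exists(path) or sha256_of(path) != expected:
            bad.append(path)
    return bad
```

Run it against each independent backup copy; any non-empty result means that file should be restored from the other copy before the rot spreads.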

wazoox|1 year ago

A few years ago I restored several hundred LTO-1 and LTO-2 tapes using an LTO-3 drive. If you keep the drives around and run Linux (which supports obsolete hardware better), keeping LTO tapes for 10, 15, 20 years is not a problem at all.

A few weeks ago I wrote for a customer a restore utility for LTO-4/5/6 tapes written by a now-defunct archival system from a defunct software company. Most of these tapes are up to 16 years old, have been kept in ordinary office cupboards, and work perfectly fine.

But you're right that archival isn't much about the media, but is a process. "Archive and forget" isn't the way.

adrian_b|1 year ago

While tape does not last forever, the LTO tapes are specified for at least 30 years.

The more serious problem is, as you say, that the older drives become obsolete. Even so, if you start with an up-to-date LTO format, you can expect suitable new tape drives to remain available to buy for at least 10 years.

For HDDs, the most that you can hope is a lifetime of 5 years, if you buy the HDDs with the longest warranties.

bell-cot|1 year ago

Policy at $Job - all important data is backed up to a rotation of high-quality hard drives, which are stored off-site, powered down. Every N weeks, each one of them is powered up (in an off-line system) and checked, both with the SMART long test and with `zpool scrub` (which verifies ZFS's additional anti-bit-rot checksums for the data).

Yes, it's a bit of a PITA. OTOH, modern HDDs are huge, so relatively few are needed. And we've lost 0 bits of our off-site data in our >25 years of using that system.

bogwog|1 year ago

This article is too vague. It sounds like they're talking about the physical drive not working, but they're giving examples where you can't play back the material because you need to install the correct old software, plugins, etc., which doesn't have anything to do with hard drives.

So what's actually wrong with hard drives for archival? Do they deteriorate? Do they "rot" like DVDs/blurays/etc have been known to do? Or is this just an ad for their archival service?

mystified5016|1 year ago

Magnetic HDDs do suffer bit rot, yes. But perhaps more importantly the mechanisms suffer physical failure over time. You can't just pop the platters into a new drive, even if you had an identical model.

That's really the main disadvantage of hard drives: the media is permanently coupled to the drive. If your tape drive fails, you can just pop the tape into a working drive and still get your data back.

wmf|1 year ago

Hard drives are known to suffer "stiction", where the heads get stuck to the platter and either the drive won't spin up, or it spins up and the heads damage the platters. I imagine hard drives could also have bad capacitors, but I haven't heard of that happening.

antisthenes|1 year ago

What's the scenario where you cannot take the old 1990's hard drive and back up its data in multiple cloud service providers cold storage (Azure/AWS/GCP) and have to keep the obsolete physical media on hand?

I'm struggling to understand why these miles of shelves filled with essentially hardware junk haven't been digitized at the time when this media worked and didn't experience read issues.

The article doesn't really provide an explanation for this other than incompetence and the business biting off more than it can reasonably chew. I'd be furious if I paid for a service that promised to archive my data, and 10-15 years later told me 25% of it was unreadable. I mean it's not like it was a surprise either. These workflows became digital 2-3 decades ago. There was plenty of time to prepare and convert this.

That's kind of what I'm paying you for.

As always, seems like the simple folk of /r/datahoarder and other archivist communities are more competent than a legacy industry behemoth.

Cheer2171|1 year ago

> I'd be furious if I paid for a service that promised to archive my data, and 10-15 years later told me 25% of it was unreadable.

The article is very vague on this, but I thought this company was first doing something like a bank safety deposit box. Send us your media in whatever format and we will keep it secure in a climate controlled vault. They don't offer to archive your data, they offer to store your media. Now it seems they pivoted to archiving data. This is an ad for their existing media storage clients to buy their data archive service:

> Iron Mountain would like to alert the music industry at large to the fact that, even though you may have followed recommended best practices at the time, those archived drives may now be no more easily playable than a 40-year-old reel of Ampex 456 tape.

kmeisthax|1 year ago

It's not a matter of incompetence, it's a matter of being very, very cheap.

Artistic endeavors are a unique blend of "extremely chaotic workflows nobody bothers to remember the moment the work is 'done'", "90% of our output doesn't recoup costs so we don't want to burn cash on data storage", and "that one thing you made 20 years ago is now an indie darling and we want to remaster it". A lot of creatives and publishers were sold on the promise of digital 30-odd years ago. They recorded their masters - their "source code" - onto formats they believed would be still in use today. Then they paid Iron Mountain to store that media.

Iron Mountain is a safe deposit box on steroids, they use underground vaults to store physical media. You store media in Iron Mountain if you want that specific media to remain safe in any circumstance[0], but that's a strategy that doesn't make sense for electronic media. There is no electronic format that is shelf-stable and guaranteed to be economically readable 30 years out.

What you already know works is periodic remigration and verification[1], but that's an active measure that costs money to do. Publishers don't want to pay that cost, it breaks their business model, 90% of what they make will never be profitable. So now they're paying Iron Mountain even more for data recovery on the small fraction of data they care about. The key thing to remember is that they don't know what they need to recover at the time the data is being stored. If they did, publishers wouldn't be spending money on risky projects, they'd have a scientific formula to create a perfect movie or album or TV show that would recoup costs all the time.

[0] The original sales pitch being that these vaults were nuke-proof.

[1] Your cloud provider does this automatically and that's built into the monthly fees you would pay. People who are DIYing their storage setup and using BTRFS or ZFS are using filesystems that automate that for online disks, but you still pay for keeping the disks online.

0cf8612b2e1e|1 year ago

It depends on what specifically Iron Mountain is selling you. A place to store your physical data device or are they promising to keep your data available? The former sounds cheaper and easier for Iron Mountain. Given Iron Mountain started in the 1950s, redundantly backing up customer data was infeasibly expensive for most of the company’s lifetime.

akira2501|1 year ago

> the business biting off more than it can reasonably chew

It's hoarding behavior. They paid "a lot" of money for it, have no idea how to further exploit it, but can't shake the feeling that it might be massively valuable one day.

The only difference is they pay someone to hold their hoard for them.

bob1029|1 year ago

Iron Mountain also provides services like source code escrow.

With 2 parties involved in the data, you may want to impose additional restrictions regarding how and when it can be replicated. The party requesting escrow clearly has interest in the source being as durable as possible, but the party providing the source may not want it to be made available across an array of dropbox-style online/networked systems just to accommodate an unlikely black swan event.

A compromise could be to require that the source reside on the original backup media with multiple copies and media types available.

tecleandor|1 year ago

Also cases where you don't render the tracks pre- and post-processing, and just leave them alongside the ProTools project files. I don't know who expects to open a ProTools project with a bunch of unknown plugins after some years have passed...

surgical_fire|1 year ago

I mean, even if by contract they were only supposed to store physical media with the backups, it is still horrible incompetence not to have the same data backed up twice and to test the disks for failure from time to time, rebuilding the backup from one of the copies when a disk fails.

It would be extremely unlikely for both disks to fail together.

What I'm describing is the bare minimum. This is their job, by all accounts. Amazing.

MisterTea|1 year ago

Makes me wish we hadn't stopped advancing optical media technology before reaching cheap, reliable archival-quality 1TB discs for a few bucks each. I guess LTO is the best option for personally controlled archival.

0cf8612b2e1e|1 year ago

We haven’t, but sadly the technology is locked to big tech.

Microsoft has demoed some cool technology where they store data in glass, Project Silica. Sadly, it seems unlikely this will ever be available to consumers. One neat aspect of the design is that writing data is significantly higher power than reading. So you can keep your writing devices physically separated from the readers and have no fear that malicious code could ever overwrite existing data plates.

Some blurbs

  Project Silica is developing the world’s first storage technology designed and built from the media up to address humanity’s need for a long-term, sustainable storage technology. We store data in quartz glass: a low-cost, durable WORM media that is EMF-proof, and offers lifetimes of tens to hundreds of thousands of years. This has huge consequences for sustainability, as it means we can leave data in situ, and eliminate the costly cycle of periodically copying data to a new media generation.

  We’re re-thinking how large-scale storage systems are built in order to fully exploit the properties of the glass media and create a sustainable and secure storage system to support archival storage for decades to come! We are co-designing the hardware and software stacks from scratch, from the media all the way up to the cloud user API. This includes a novel, low-power design for the media library that challenges what the robotics and mechanics of archival storage systems look like.
https://www.microsoft.com/en-us/research/project/project-sil...

Twirrim|1 year ago

Optical media is neat, but has a number of drawbacks when it comes to large scale operations.

What you're talking about already sort of exists, albeit the media never reached "cheap", because the manufacturing scale wasn't there. People weren't interested enough in it. Archival Disc was a standard that Sony and Panasonic produced: https://en.wikipedia.org/wiki/Archival_Disc. Before the standard was retired you could buy gen-3 cartridges with 5.5TB of capacity: https://pro.sony/ue_US/products/optical-disc-archive-cartrid...

LTO tape was already at 15TB by the time their 300GB discs came out, and reached 45TB (compressed) capacity 3 years ago. Tape is still leaps and bounds ahead of anything achievable in optical media, and isn't write-once. (https://en.wikipedia.org/wiki/Linear_Tape-Open)

Part of the problem is you can't just store and forget, you have to carry out fixity checks on a regular basis (https://blogs.loc.gov/thesignal/2014/02/check-yourself-how-a...). Same thing as with your backups, backups that don't have restores tested aren't really backups, they're just bitrot. You want to know that when you go to get something archived, it's actually there. That means you're having to load and validate every bit of media on a very regular basis, because you have to catch degradation before it's an issue. That's probably fine when you're talking a handful of discs, but it doesn't scale that well at all.

The amount of space that it takes for the drives to read the optical disc, the machinery to handle the physical automation of shuffling discs around etc. combined with the costs of it, just make no sense compared to the pre-existing solutions in the space. You don't get the effective data density (GB/sq meter) you'd need to make it make sense, nor do the drives come at any kind of a price point that could possibly overcome those costs.

To top it all off, the storage environment conditions of optical media isn't really any different from Tape, except maybe slightly less sensitive to magnetic interference.

netrap|1 year ago

Unfortunately, recordable optical is on its way out. Sony recently slashed the staff at the Japan plant that makes BD-Rs (BD-R XLs). CMC still makes CD-Rs, DVD-Rs, and BD-Rs, though.

shiroiushi|1 year ago

No, LTO isn't a viable option at all for most people: it's simply far too expensive. The drives themselves cost thousands of dollars each.

akira2501|1 year ago

> It may sound like a sales pitch, but it’s not; it’s a call for action

Your entire article sounds like a sales pitch. Your solution is, well, it's bad, but trust us, we can maybe recover it anyways. Otherwise your article fails to convey anything meaningful.

derefr|1 year ago

No, the call-to-action being referenced in the article is "stop archiving to hard drives" (and use tape instead, every other industry does.)

alchemist1e9|1 year ago

Does it mean LTO tape for the win then?

We’re about to start a project to build an LTO-9 based in-house backup system. Any suggestions for DIY Linux based operation doing it “correctly” would be appreciated. Preliminary planning is to have one drive system on in our primary data center and another offsite at an office center where tapes are verified before storage in locked fireproof storage cabinet. Tips on good small business suppliers and gear models would be great help.

bluedino|1 year ago

Tapes are fun. You can fit a petabyte of data in a bankers box!

The problem quickly becomes:

- Do we have a drive that can read this tape?

- Do we have server we can connect it to?

- Do we have storage we can extract it to? (go ask your internal IT team for 10TB of drivespace...)

- What program did we create this tape with? Backup Exec, Veritas, ArcServe, SureStore

- You have the encryption keys, right?

- How much of this data already exists on the previous months backup?

- Who's going to pay for the storage to move it to Glacier/etc?

- How long is it going to take to upload?

gosub100|1 year ago

Make sure the bandwidth exists to keep up with the write speed of the LTO drive. For instance, the write speed for LTO-6 (which I own as a hobbyist) is around 300MB/s, but line speed of gigabit Ethernet is about 100MB/s. Translate those numbers to LTO-9 and make sure that the NAS, network, or local storage can keep up. It's not a deal-breaker to underflow the drive, but it causes the tape to stop, rewind, and re-buffer (called shoe-shining) which takes more time and causes unnecessary wear on the drive and cartridges.
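That sanity check is a single comparison. In this sketch the speeds are approximate (the LTO-6 figure is the one above; the LTO-9 native speed of ~400 MB/s is my assumption, so check your drive's spec sheet):

```python
# Will the source keep the tape drive streaming, or cause shoe-shining?
# Native write speeds in MB/s; both figures are approximate assumptions.
DRIVE_WRITE_MBPS = {"LTO-6": 300, "LTO-9": 400}

def will_shoeshine(drive, source_mbps):
    """True if the source can't sustain the drive's native write speed."""
    return source_mbps < DRIVE_WRITE_MBPS[drive]

# ~100 MB/s of real gigabit-Ethernet throughput underfeeds either drive;
# a 10 GbE link (~1,100 MB/s) keeps even an LTO-9 drive streaming.
print(will_shoeshine("LTO-6", 100), will_shoeshine("LTO-9", 1100))
```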

akira2501|1 year ago

> fireproof storage cabinet.

Nothing is fire proof. Is the cabinet "fire suppression system liquid" proof?

> Tips on good small business suppliers and gear models would be great help.

Hire an auditor would be my advice. Every business is different.

I am, just now, having flashbacks of when I was in a SOX environment and had to regularly contract with them... and while the experience can be somewhat unpleasant I've often found good auditors to be extremely knowledgeable about solutions and their practical implementation considerations.

wazoox|1 year ago

Archival is more of a process than only a question of media. First you must create a proper database of your archived data. Maybe you want to do 3 copies, not two. Maybe you want to use two different archival formats such as tar and LTFS, just in case. Maybe you want to source your media from both available producers (Sony and Fuji) because in the long run, maybe one or the other may grow some funky error mode or corruption problem. Etc.

Also check my tape management primer: https://blogs.intellique.com/tech/2022/01/27#TapeCLI

adrian_b|1 year ago

LTO-9 tapes can be easily found on Amazon in many countries, made by IBM, HP, Quantum or Fuji.

The vendor does not matter, whichever happens to be cheaper at the moment is fine.

For the tape drives, the internal drives can be cheaper by around 10%, but I prefer the tabletop drives, because they are less prone to accumulate dust, especially if you switch them on only when doing a backup or a retrieval. The tape drives have usually very noisy fans, because they are expected to be used in isolated server rooms.

I believe the cheapest tape drives from a reputable manufacturer are those from Quantum. I have been using a Quantum LTO-7 tape drive for about 7 or 8 years and have been content with it. Looking at prices now, it should be possible to find a tabletop LTO-9 drive for no more than $5000. Unfortunately, prices for tape drives have been increasing; when I bought my LTO-7 tabletop drive many years ago, it was only slightly more than $3000.

The tapes are much cheaper and much more reliable than hard disks, but because of the very expensive tape drive you need to store a few hundred TB to begin to save money over hard disks. You should normally make at least two copies of any tape that is intended for long-term archiving (to be stored in different places), which will shorten the time until reaching the threshold of breaking even with HDDs.
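The break-even point is easy to estimate. In this sketch every price is a rough assumption drawn from the figures above (a ~$5,000 drive, 18TB native LTO-9 tapes at ~$85 each, HDDs at ~$15/TB, and two copies of everything):

```python
import math

# Rough break-even between LTO-9 and HDD archiving; all prices assumed.
def tape_total(tb, drive_cost=5000, tape_tb=18, tape_cost=85, copies=2):
    """Total cost of archiving `tb` terabytes on tape, including one drive."""
    return drive_cost + math.ceil(tb / tape_tb) * copies * tape_cost

def hdd_total(tb, cost_per_tb=15, copies=2):
    return tb * cost_per_tb * copies

# First archive size (in TB) at which tape becomes cheaper than disk:
breakeven = next(tb for tb in range(1, 2000) if tape_total(tb) < hdd_total(tb))
print(breakeven, "TB")  # lands in the "few hundred TB" range the comment cites
```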

Although there are applications that simulate a file system on a tape, which even a naive user can use to just copy files to tape as if copying between disks, they are quite slow and inefficient compared to using raw tape commands with the traditional UNIX utility "mt".

It is possible to write some very simple scripts that use "mt" and which allow the appending of a number of files to a tape or the reading of a number of consecutive files from a tape, starting from the nth file since the beginning of a tape. So if you are using only raw "mt" commands, you can identify the archived files only by their ordinal number since the beginning of the tape.

This is enough for me, because I prepare the files for backup by copying them in some directory, making an index of that directory, then compressing it and encrypting it. I send to the tape only encrypted and compressed archive files, so I disable the internal compression of the tape drive, which would be useless.

I store the information about the content of the archives stored on tapes (which includes all relevant file metadata for each file contained in the compressed archives, including file name, path name, file length, modification time, a hash of the file content) in a database. Whenever I need archived data, I search the database, to determine that it can be found, for instance in tape 63, file 102. Then I can insert the corresponding cartridge in the drive and I give the command to retrieve file 102.
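A minimal version of that index fits comfortably in SQLite. This sketch (the table layout and names are mine, not the commenter's) maps a path to its tape number and ordinal file number:

```python
import sqlite3

# In-memory for the example; use an on-disk (and backed-up!) file in practice.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE archive (
    path TEXT, size INTEGER, mtime TEXT, sha256 TEXT,
    tape INTEGER, file_no INTEGER)""")

def record(path, size, mtime, sha256, tape, file_no):
    """Register one archived file and where it landed on tape."""
    db.execute("INSERT INTO archive VALUES (?, ?, ?, ?, ?, ?)",
               (path, size, mtime, sha256, tape, file_no))

def locate(path):
    """Return (tape, file_no) for a path, or None if it was never archived."""
    return db.execute("SELECT tape, file_no FROM archive WHERE path = ?",
                      (path,)).fetchone()

record("/movies/example.mkv", 60_000_000_000, "2024-01-01", "ab" * 32, 63, 102)
print(locate("/movies/example.mkv"))  # -> (63, 102): insert tape 63, read file 102
```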

I consider FreeBSD's "mt" utility much better than Linux's. The Linux magnetic tape utilities have seen little maintenance for many years.

Because of that, when I make backups or retrievals they go to a server that runs FreeBSD, on which the SAS HBA card is installed. When a tabletop drive is used, the SAS HBA card must have external SAS connectors, to allow the use of an appropriate cable. I actually reboot that server into FreeBSD for doing backups or retrievals, which is easy because I boot it from Ethernet with PXE, so I can select remotely what OS to be booted. One could also use a FreeBSD VM on a Linux server, with pass-through of the SAS HBA card, but I have not tried to do this.

My servers are connected with 10 Gb/s Ethernet links, which does not differ much from the SAS speed, so they do not slow much the backup/retrieval speed. I transfer the archive files with rsync over ssh. On slow computers and internal networks one can use rsync without ssh. I give the commands for the tape drive from the computer that is backed up, as one line commands executed remotely by ssh.

The archive that is transferred is stored in a RAM disk before being written to the tape, to ensure that the tape is written at maximum speed. I write to the tape archive files that usually have a size of up to about 60 GB (I split any files bigger than that; e.g. there are BluRay movies of up to 100 GB). The server has 128 GB of memory, so I can configure a RAM disk of up to 80 GB on it without problems. This method can be used even with a slow 1 Gb/s or 2.5 Gb/s network, but then uploading a file through Ethernet would take much more time than writing or reading the tape.

There is one weird feature of the raw "mt" commands, which is poorly documented, so it took me some time to discover it, during which I have wasted some tape space.

When you append files to a partially written tape, you first give a command to go to the end of the written part of the tape. However, you must not start writing there, because the head is not positioned correctly. You must go 2 file marks backwards, then 1 file mark forwards. Only then is the head positioned correctly so you can write the next archived file. Otherwise there would be 1 empty file intercalated at each point where you finished appending files, rewound the tape, and later appended more files at the end.
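That repositioning dance can be wrapped in a small helper. A sketch only: the device name is an example, and the subcommand spellings differ slightly between GNU mt-st and the BSD mt (check your man page before trusting these):

```python
import subprocess

TAPE = "/dev/nst0"  # example non-rewinding tape device (Linux naming)

def mt(*args, run=subprocess.run):
    """Issue one mt command against the tape; `run` is injectable for dry runs."""
    return run(["mt", "-f", TAPE, *args], check=True)

def position_for_append(run=subprocess.run):
    """Seek to end of data, then back 2 filemarks and forward 1,
    mirroring the repositioning quirk described above."""
    mt("eod", run=run)       # go to end of recorded data (GNU mt-st spelling)
    mt("bsf", "2", run=run)  # backward-space 2 filemarks
    mt("fsf", "1", run=run)  # forward-space 1 filemark: now safe to write
```

With the head positioned this way, the next write appends cleanly without leaving the stray empty file in between.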

hinkley|1 year ago

The advice I got long ago from an IT guy was: if you wait long enough, tape will be on top again.

That was a long time ago but I’ve peeked in at backup systems in the intervening years and it does seem to hold true over time.

But it really depends how much data you have. My ex dropped a single HDD in a safety deposit box at CoB, N times per week and fetched back the oldest disk. I don’t think she ever said how many were in there but I doubt it was more than three. I think the CTO took one home with him once per week.

The silly thing about most of this set up is that the office, the bank, and the data center were all within half a kilometer of each other. If something bad happened to that part of town they only had the infrequent offsite backup.

WaitWaitWha|1 year ago

Q: Why not archival M-Disc?

antisthenes|1 year ago

What's your budget?

mercurialuser|1 year ago

There are several articles about film preservation in digital format. Every X years all the data is "upgraded", from LTOn to LTOn+1 or +2.

So it may sound like a sales pitch, but I consider it more a warning notice.

esafak|1 year ago

That means the hardware and the file format.

steve3055|1 year ago

I once pressed my boss into having off-premises storage of documents so we could still manufacture product in a new facility if the current facility burnt down. Unfortunately, someone started the habit of sending the primary documents to that same off-premises facility once a product was deprecated. One day, the off-premises facility burnt down and all the contents were lost. I think it was a regular self-storage space.

That aside, this sounds extremely old-fashioned, but it seems to me that the only media that is acceptable for long-term storage is going to be punched paper tape. How long does paper last? How long do the holes in it remain readable? Can it be spliced and repaired?

hilbert42|1 year ago

"Of the thousands and thousands of archived hard disk drives from the 1990s that clients ask the company to work on, around one-fifth are unreadable."

Why is this surprising?

It's been known for decades that magnetic media loses remanence at several percent a year. It's why old sound tape recordings sound noisy or why one's family videotapes of say a wedding are either very noisy or unreadable 20 or so years later.

Given that, and the fact that hard disks are already on the margin of noise when working properly, it's hardly surprising.

The designers of hard disks go to inordinate lengths to design efficient data separators. These circuits just manage to separate the hardly-recognizable data signal from the noise when the drive is new and working well so the margin for deterioration is very small.

The solution is simple, as the data is digital it should be regenerated every few years.

Frankly I'm amazed that such a lax situation can exist in a professional storage facility.

Edit: has this situation developed because the digital world doesn't know or has forgotten that storing data on magnetic media is an analog process and such signals are deteriorated by analog mechanisms?
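hilbert42's "regenerate every few years" amounts to a periodic scrub. A minimal sketch of that idea (my own illustration, not something from the thread), using only Python's standard library: record a SHA-256 manifest when the archive is written, then on each refresh cycle re-read everything and flag mismatches before copying to fresh media.

```python
import hashlib
import json
import os


def sha256_of(path, chunk=1 << 20):
    """Hash a file in 1 MiB chunks so large archives don't exhaust memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()


def build_manifest(root):
    """Record a checksum for every file under root, at archive time."""
    return {
        os.path.relpath(os.path.join(d, name), root):
            sha256_of(os.path.join(d, name))
        for d, _, files in os.walk(root)
        for name in files
    }


def verify(root, manifest):
    """Re-read each file; return the paths whose contents no longer match."""
    return [rel for rel, digest in manifest.items()
            if sha256_of(os.path.join(root, rel)) != digest]
```

Run `verify()` on a schedule; any path it returns should be restored from a redundant copy before the whole set is migrated to new media. The manifest itself can be persisted with `json.dump` alongside the archive.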

kkfx|1 year ago

People should learn a thing: data are not tied to the physical media hosting them, like words on paper, and the sole way to preserve data is migrating them from one physical support to another regularly, also converting their formats sometimes, because things change and an old format could end up unreadable in the future.

We can't preserve bits like books.

Thrymr|1 year ago

> We can't preserve bits like books.

The only reason we have any copies of "books" (i.e. long written works) from the ancient world is that they were painstakingly copied over centuries from one medium to another, by hand for most of that time.

TZubiri|1 year ago

I'm launching a competitor for Iron Mountain. It's called DevNull LLC. Just send us your files! We'll take care of it don't worry.

hulitu|1 year ago

> Of the thousands and thousands of archived hard disk drives from the 1990s that clients ask the company to work on, around one-fifth are unreadable

Some 25 years ago, the hardest part in booting some Apollo workstations, was to make hard drives spin.

somat|1 year ago

For long term archiving, the fundamental hard problem is the storage density. The further the storage unit size drifts below human scale, the harder it is to archive long term.

I think for the average person, the best thing to do for long term archives is to take advantage of Sturgeon's law: "90 percent of everything is crap". Triage the things you want to archive down to a minimum, then print them out, at human scale, on paper. Have physical copies of the photos you want to keep, listings of the code you are proud of, correspondence that is dear to you.

This will last, with no intervention, a very long time. Because as is increasingly becoming obvious, once the format drifts below human scale the best way to preserve data is to manage the data separately from the medium it is stored on, with a constant effort to move it to a current medium; it easily evaporates once vigilance drifts.

jwsteigerwalt|1 year ago

I grew up in Pittsburgh. When I was flying in and out of the Pittsburgh airport (usually to Atlanta) during and after college, I would often see uniformed Iron Mountain employees waiting for standby seats, carrying their little Pelican cases…

nayuki|1 year ago

I had to do a double take on this, as I associate Iron Mountain as the brand that shreds papers and hard drives as their most common service.

1oooqooq|1 year ago

Well, this one is storing spinners POWERED DOWN... So it's pretty much a slower data wiping service :shrug_emoji

RobRivera|1 year ago

The second M doesn't help.

Almost like the title was purposely crafted to mislead you to draw eyeballs.

jcpham2|1 year ago

A hard drive from 1995 will most likely be formatted FAT16 or FAT32, and the last Windows OS that reads that filesystem by default is Windows XP - I keep an old XP workstation operational as a test rig for reading data on old hard drives.

Or I have to mount them in another OS that isn't Windows. It's more than just adjusting DAW settings and updating plugins at this point; you need to know that around 2000 the filesystems completely changed with NTFS, which added security that wasn't present before.

By the time of Vista/7, FAT hard drive support is gone from Microsoft land. There are of course add-ons and such, but you still need to _know_ this happened, and FAT drives look unformatted in modern Windows.
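One way around "looks unformatted" is to skip the OS driver entirely and inspect the raw bytes. A sketch (my own, assuming the old drive has first been imaged to a file with a tool like ddrescue) that checks the boot-sector signature and the filesystem-type labels that FAT formatters conventionally write:

```python
def fat_type(image_path):
    """Best-effort guess at the FAT variant from a raw disk/partition image.

    Checks the 0x55AA boot signature at offset 510, then the type strings
    that FAT formatters conventionally write at offset 0x36 (FAT12/FAT16)
    and offset 0x52 (FAT32). Returns None if nothing matches.
    """
    with open(image_path, "rb") as f:
        sector = f.read(512)
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        return None  # no boot signature: blank, damaged, or not a boot sector
    label16 = sector[0x36:0x3E].decode("ascii", "replace").strip()
    label32 = sector[0x52:0x5A].decode("ascii", "replace").strip()
    if label32.startswith("FAT32"):
        return "FAT32"
    if label16.startswith("FAT"):
        return label16  # typically "FAT12" or "FAT16"
    return None
```

Caveat: per the FAT specification those label fields are informational, not authoritative - real detection counts clusters - but this is enough to tell an intact FAT boot sector from a genuinely blank drive.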

0ID|1 year ago

An interesting topic would be the debate on how to store data for extremely long periods of time, several hundred thousand years, for example documentation of nuclear waste sites from power plants... Any ideas?

elzbardico|1 year ago

The solution is metallic punch cards.

crazygringo|1 year ago

As counterintuitive as it may be, it seems to me like the only reliable long-term storage for data is with commercial cloud providers.

Any time you're physically warehousing old hard drives and whatnot, they're going to be turning into bricks.

Whereas with cloud providers, they're keeping highly redundant copies and every time a hard drive fails, data gets copied to another one. And you can achieve extreme redundancy and guard against engineering errors by archiving data simultaneously with two cloud providers.

Is there any situation where it makes sense to be physically hosting backups yourself, for long-term archival purposes? Purely from the perspective of preserving data, it seems worse in every way.

thadt|1 year ago

^ This. Physical media is continuously degrading. Large storage systems work by regularly reading, verifying, and replicating data - it is always doing backups and restores. If this isn't happening actively and regularly, your data will cease to exist at some point in time.

Whether we collectively need to store all these things is another question entirely. But if we want to keep it - we'll have to do the work to keep it maintained.

otabdeveloper4|1 year ago

> they're keeping highly redundant copies and every time a hard drive fails, data gets copied to another one

Or so they say. It's not like you can double-check.

> Is there any situation where it makes sense to be physically hosting backups yourself, for long-term archival purposes?

Yes, political and legal risks. There's no guarantee your cloud won't terminate your account for any of a thousand reasons in the future.

steve3055|1 year ago

It sounds like the only reliable backup media is punched paper tape.