Vellum works and has been proven to work for hundreds of years. Archival paper is interesting but largely unproven tech. On paper it works (no pun intended) but we really haven't seen if it will last even 200 years, and we should before we start printing our most treasured documents on it.
Digital long-term storage is just, no. Look, all key UK documents are ALREADY digital; they're available to download right now. However, we aren't talking about digital vs. non-digital, we're talking about what to store BACKUPS on.
Digital long-term storage is something people who frankly know little about technology point to. It is up to IT-types to figure out the costs and the "how?" from then on out. There is no digital storage medium that will still exist reliably in 200-400 years: not CDs, not HDDs, and not SSDs. Tapes are the closest, and they won't last that long.
So we aren't talking about slapping it on a CD and storing it in some archive, we're talking about slapping it on a piece of digital media today and then moving it every one hundred years without fail, or we lose it forever... If one government or generation loses interest then future generations suffer (see the current government as an example: moving away from reliable Vellum to unproven papers).
Plus with digital you also have to worry about: solar flares/EMP, format knowledge (both of the file system & file format), media protocol knowledge (SATA or CDs might be forgotten technology), and the ability to alter historical documents without detection (not that vellum is immune, but it requires more skill/time).
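On the "alter without detection" worry, one common mitigation is to record a cryptographic digest of each document and check it again later. The sketch below only illustrates the idea; the function names and the sample text are made up for this example, and it assumes the recorded digest is kept somewhere an attacker cannot also rewrite (otherwise it proves nothing).

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the document bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """True if the bytes still hash to the digest recorded earlier."""
    return fingerprint(data) == recorded_digest


# Placeholder text standing in for an archived document.
original = b"No free man shall be seized or imprisoned..."
recorded = fingerprint(original)  # stored separately from the document

# Simulate a small, silent alteration of the stored copy.
tampered = original.replace(b"free", b"rich")

print(is_unaltered(original, recorded))
print(is_unaltered(tampered, recorded))
```

This detects changes to the bytes; it says nothing about whether anyone can still read the format, which is the separate problem raised above.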
PS - The "Digital Preservation Coalition" are a group of historians and archivists. They aren't technologists, they just point blindly at digital without explaining the nitty gritty of HOW.
Archiving things for future civilizations to dig up is one thing. Then you may be talking about hundreds or thousands of years that the data just sits there. Then you have to worry about the longevity of bits, and whether the technology and knowledge to read the data will exist in the future. Digital is very hard for that use case.
On the other hand, storing data for an active/continuously used archive doesn't seem so hard. You just keep the data digitally on some media and every few years you upgrade the tech to whatever technology and formats are "current" before the old format and tech are completely forgotten. Keeping digital data around for a decade is pretty easy. If you do that 100 times in a row you have kept it for a thousand years and you never had a file format or a storage medium older than a decade. You could "forget" to upgrade the archive, but you could also "forget" to scribble on the calves some time.
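A rough way to put numbers on "do that 100 times in a row": if each decade's migration succeeds with some probability p, and the migrations are treated as independent (a strong simplifying assumption; real failures such as funding cuts or wars are likely correlated), the chance that the whole chain holds is p raised to the number of migrations. The probabilities below are illustrative guesses, not measured values.

```python
def chain_survival(per_migration_success: float, migrations: int) -> float:
    """Probability that every one of `migrations` independent migrations succeeds."""
    return per_migration_success ** migrations


# Hypothetical per-decade success rates, 100 migrations = 1000 years.
for p in (0.90, 0.99, 0.999):
    print(f"p = {p}: chance of an unbroken chain = {chain_survival(p, 100):.3f}")
```

Under these assumptions, even a 99% success rate per decade leaves only about a one-in-three chance of getting through a thousand years, which is the "without fail" concern in numeric form. The same reasoning applies to any medium that needs periodic recopying.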
Entirely agree with you about the current lack of longer term stable data storage options.
That said, waiting the full duration of a material's projected life span before actual use of that material would limit options hugely, especially with the massively increasing corpus of generated data.
In this sort of context (amount of data) one can spread one's bets and use vellum, as well as other methods, until more stable forms of storage are easily available.
> Archival paper is interesting but largely unproven tech. On paper it works (no pun intended) but we really haven't seen if it will last even 200 years, and we should before we start printing our most treasured documents on it.
No no. This doesn't make sense. If we're not going to put our treasured documents on archival paper until we've proved that archival paper really will last for centuries, they're never going to be put on archival paper, no matter how great it might or might not be, because we'll all be dead.
You wanted to say we should see how archival paper pans out before we stop printing our treasured documents on other types of material.
While there's much truth in your statement, I wouldn't be so quick to discount the knowledge and experience of archivists, who have collectively as a profession put a tremendous amount of time and effort into researching archival theory and practice, both physical and digital, and who count among them extremely knowledgeable technologists. They are acutely aware of the fact that you can't just burn a cd and call it a day. With that said, I don't know anything about the Digital Preservation Coalition; you may be right about them in particular. [Edited for clarity]
Yes, also makes complete sense from a cultural perspective. If there is any sort of process involved in making a copy on new formats as they emerge, then you have to maintain a culture of people that are willing and able to execute the process. Far simpler to use a technology that is proven to work over several hundred years. Exhibit A: The Magna Carta [1]. With care, you can pick it up and read it, even today.
Automation won't hack it either. At the moment, the most complex machine that we know how to make persist into deep time is a very simple clock [2], and even that requires human intervention.
In Anathem [3], Neal Stephenson speculates on what this combo of culture and deep-time technology might look like.
> Archival paper is interesting but largely unproven tech.
Archival paper is tested by artificially aging it (e.g. exposure to heat and moisture to simulate the passage of time). Some long-term tests are underway to determine whether accelerated aging tests accurately simulate the natural aging process.[1] And there are associated standards, e.g. ISO 5630-3.[2] But you are correct, we really have no proof of how modern archival paper will perform over a multi-century timeline.
Actually, archivists and technologists have been working together to solve the problems of digital preservation for a long time. There is an ISO standard for digital preservation systems, ISO 14721:2012, originally authored by and still maintained by the Consultative Committee for Space Data Systems (CCSDS, of which NASA is a member). There are commercial systems that were developed to this standard, among them Preservica (http://preservica.com/), which is used in many archives in Europe.
Archivists are acutely aware of the obsolescence of digital formats and media. And the solution cannot be just printing records out on vellum or archival paper, because increasingly records are being created digitally that cannot be meaningfully or adequately converted into printed formats. I am talking about records that are audio, video, high resolution photographs (like those that NASA is seeking to preserve digitally) etc. Even emails printed out onto paper would be missing a lot of contextual information that would have been kept if you preserved them in a digital format like EML or MSG.
So, existing digital preservation systems may still be imperfect, but nevertheless we still need to find improved solutions for preserving such records digitally.
I don't work for CCSDS or Preservica. I work for a national archive as a specialist on digital records and archives, but my background is in IT having spent about 15 years in software development.
Doing this doesn't cost much, and it's a basic archive of the business of the people worth keeping in human readable form.
At the end of the day, tech is good, but the business of people needs to be human readable, accessible in a court, trusted by all. This is a basic requirement humans have, and without it in place, we are forced to trust and depend on tech for law, and its real consequences.
Nothing wrong with using tech, mind you. We can, should and do. It's more efficient.
But, say something bad happens. That's what the vellum is for.
There is no such medium as digital. Digital is the encoding method. All media have a lifespan, a certain durability, and that determines their future survivability. A digital format could be coded onto vellum, but plain text is more reliable. This is the typical enthusiasm for "technology" by leaders who know only how to get re-elected.
This is a good point, for more reasons than mere pedantry. Describing storage methods as "digital" also encourages people to conflate two equally important but distinct concerns:
1. Durability of the physical medium over long periods of time, and
2. Our ability to decode and extract information from the media even if they do physically survive. (I remember this being a hot issue several years ago, but one that seems to have faded a bit recently.)
I wonder what the maintenance costs would be to keep a digital document storage system running robustly and effectively for 800 years (as long as the Magna Carta document lasted)?
(Not trying to imply that vellum should be the only storage method but that sounds like cheap robustness, and I'd not want to throw that away just because you want to be capital-D Digital.)
The closest equivalent I can think of immediately is Amazon Glacier, which is intended for 'long-term' storage where they take care of transferring the data as necessary to preserve its accessibility. Obviously they're thinking years-to-decades, not centuries, but as a ballpark it may work.
Their current pricing is $0.007 per GB-month, or $0.000064 per KB for 800 years in today's dollars. Since the Latin text of Magna Carta is 28 KB of UTF-8 text, that works out to $0.0018 to store the text for 800 years. You'd obviously also want a high-resolution scan of the original document. I don't know how big that would be. If it was a 100MB image, it would cost $6.66 to store for 800 years at Amazon's current Glacier prices.
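For anyone who wants to redo the arithmetic, here is a small sketch. It assumes a flat rate of $0.007 per GB-month (the rate that the quoted per-KB figure implies), binary units (1 GB = 1024 × 1024 KB), and no price changes, retrieval fees or inflation over the 800 years. All names are made up for this example.

```python
RATE_PER_GB_MONTH = 0.007  # assumed USD rate, held constant for the whole period
YEARS = 800
KB_PER_GB = 1024 ** 2      # binary units; decimal units shift the results slightly


def storage_cost(size_kb: float, years: int = YEARS,
                 rate: float = RATE_PER_GB_MONTH) -> float:
    """Total cost in USD of keeping `size_kb` kilobytes stored for `years` years."""
    months = years * 12
    return size_kb / KB_PER_GB * rate * months


per_kb = storage_cost(1)              # one kilobyte
latin_text = storage_cost(28)         # 28 KB of text
scan = storage_cost(100 * 1024)       # a 100 MB image

print(f"per KB:     ${per_kb:.6f}")
print(f"28 KB text: ${latin_text:.4f}")
print(f"100 MB scan: ${scan:.2f}")
```

With these assumptions the per-KB and text figures match the ones quoted, and the scan lands in the mid-$6 range; the exact value depends on whether you count a gigabyte in binary or decimal units.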
Obviously we're making immense assumptions about the reliability of the service, of price stability to deliver the quality of service, and so on. And perhaps the biggest issue with this analysis is that we're also relying on huge economies of scale for it to be profitable to deliver an archival service like this over centuries.
If this sounds crazy cheap to you, you're not alone. As technologists we're used to dealing with huge volumes of data, so it's easy to forget that historically these volumes simply did not exist.
Our research group is looking at this exact question, and it turns out that there are several factors that affect the balance between cost and reliability/longevity in an archive.
Paper: http://www.ssrc.ucsc.edu/Papers/gupta-mascots14.pdf
It would require a trust, a fund and a dedicated organisation, possibly using lawyers. Legal documents relating to land and titles have survived for hundreds of years in this manner.
It's a social not a technological problem.
But how to do it is much easier to solve.
Why to do it is the actual problem. What to keep and when to keep it without knowing the future.
How does anyone know that the 100 photos that a person from Morocco put onto Flickr 5 years ago would have value in the future? Why should those be archived? In 200 years these photos could be immensely valuable.
How can this potential future value ensure the archiving of the photos today? Is there economics for this?
Hard to say, considering tech will continue to evolve over that time. Who knows, 100 years from now we might have found an extremely reliable way to store digital information long-term, thus we'd only have to do the "copy from one medium to another" dance for a few generations. Or we might not.
Reminds me of the 'leafs' from Anathem[0]. For sure none of the current digital storage artifacts will be readable in 500+ years the way scrolls are. The continuous upgrades required by digital will always have a cost, and only 'certain' items will end up surviving, where the criteria for choosing will vary with fashion, religion, politics. I wonder if archaeologists in 2500 CE will reveal "500-year-old file discovered in digital archives" and then, a year later, announce the successful decoding of an original ca. 2016 animated cat GIF...
Anathem was inspired by the Long Now, which is trying to build the 10,000 year Clock and Library. They've been doing a lot of work to try to preserve things for that long. One attempt is the Rosetta Project, which is text and info on 1500 languages in 13000 pages etched and electroformed onto a nickel platter. [1]
There was a post here just a few days ago about the difficulties in accessing the data from the BBC's Domesday Project just 30 years later. The original Domesday Book was written on vellum and copies have survived almost a millennium.
First job out of college, I worked as an oil data digital archivist, recovering data from old tapes. The data was geological shot data (seismic data collected by blowing dynamite, ground thumping trucks and explosive gas ships).
There are likely thousands of different combinations of hardware, data and formats that we could support.
This was just from a 50 year period.
I've no doubt that data rot is one of the hardest challenges for the future.
If it's really important then you better put it on something physical that lasts.
It seems to me that a lot of the issues people are bringing up in relation to this story were handled in the age of vellum's primacy by orders of monks or the like?
It would seem to me that the place to look for knowledge about digital archiving is those who have taken up that call in modern times, such as the American examples of ibiblio.org and archive.org. I'm fairly confident that there exist European equivalents, as well. Why not look to the research, as opposed to breathless naysaying?
It's certainly not a solved problem, but it's definitely not one that's being handwaved away, either.
Archiving isn't a one and done. Anyone who sells a single solution that doesn't include costs for migration over time has never dealt with data for very long or doesn't have any foresight at all.
This has really got me wondering: What would be a format that is fundamentally readable by both humans and computers? And what does that question really mean?
[+] [-] Someone1234|10 years ago|reply
[+] [-] alkonaut|10 years ago|reply
[+] [-] imprecision|10 years ago|reply
Interesting piece released yesterday related to long term data storage; a projected lifespan of 13.8 billion years doesn't sound too bad... University of Southampton - http://www.southampton.ac.uk/news/2016/02/5d-data-storage-up...
[+] [-] thaumasiotes|10 years ago|reply
[+] [-] gglitch|10 years ago|reply
[+] [-] KineticLensman|10 years ago|reply
[1] https://en.wikipedia.org/wiki/Magna_Carta
[2] https://en.wikipedia.org/wiki/Clock_of_the_Long_Now
[3] https://en.wikipedia.org/wiki/Anathem
[+] [-] flashman|10 years ago|reply
[1] https://www.loc.gov/preservation/scientists/projects/100-yr_...
[2] http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_...
[+] [-] tsaixingwei|10 years ago|reply
[+] [-] ddingus|10 years ago|reply
[+] [-] venomsnake|10 years ago|reply
[+] [-] CapitalistCartr|10 years ago|reply
[+] [-] pdabbadabba|10 years ago|reply
[+] [-] rileymat2|10 years ago|reply
http://dictionary.reference.com/browse/digital
5. available in electronic form; readable and manipulable by computer: Scan these two pages so you'll have them as a digital document.
[+] [-] lhnz|10 years ago|reply
[+] [-] a-priori|10 years ago|reply
[+] [-] avani|10 years ago|reply
[+] [-] beejiu|10 years ago|reply
[+] [-] chippy|10 years ago|reply
[+] [-] richmarr|10 years ago|reply
Not to mention supporting the UK's tourism brand... castles and queens and whatnot.
[+] [-] profmonocle|10 years ago|reply
[+] [-] rusanu|10 years ago|reply
[0] https://en.wikipedia.org/wiki/Anathem
[+] [-] rtkwe|10 years ago|reply
[1] http://rosettaproject.org/disk/concept/
[+] [-] ascorbic|10 years ago|reply
[+] [-] forinti|10 years ago|reply
[+] [-] Someone1234|10 years ago|reply
The end result is almost indistinguishable from the paper version of the same, unless you touch it.
[+] [-] convivialdingo|10 years ago|reply
[+] [-] jameslk|10 years ago|reply
[+] [-] vetrom|10 years ago|reply
[+] [-] iolothebard|10 years ago|reply
[+] [-] borkabrak|10 years ago|reply
[+] [-] ape4|10 years ago|reply
[+] [-] J_Darnley|10 years ago|reply
[+] [-] ranko|10 years ago|reply
[+] [-] fimbaz|10 years ago|reply
[deleted]