

starsinspace | 6 years ago

Although efforts like the Internet Archive are noble (and I occasionally find it useful), I'm not sure it's always so great that everything anyone does online will be permanently archived.

I know many people feel that everything should be available forever. But for me... it's pushing me away from doing much on the web. I liked it in the 90s when things were more ephemeral. When you could make mistakes and not have them easily found by anyone with a few clicks, forever.


VonGuard|6 years ago

You're right, let's burn the library down because one book has a libelous chapter in it.

This argument is so horrible as to be actively harmful to Archive's work. Jason Scott is a god, and if we didn't have him, we'd have to invent him.

WE DO NOT GET TO CHOOSE WHAT THE FUTURE FINDS INTERESTING.

We live in the only point in human history where we can actually save all of humanity's knowledge and culture, and we can do so without having to worry about physical space or staff to work the "library." It's a remarkable time we live in, and yet, 99% of our society either doesn't care, thinks this work is stupid, or actively works against it through horrific copyright laws.

We know more about how Rembrandt painted and lived than we do about how Atari 2600 programmers worked and lived. I can go to Rembrandt's house and see where he lived, where he painted, how he worked, where he slept and ate and mixed his paints and taught his classes.

Atari's old HQ is just another office building. The source code to those games is mostly gone (thankfully, it's assembly and easier to disassemble). We need to save our culture and digital heritage, else we forget where we come from.

Deleting some old tweets is one thing, but actively worrying about Archive's work is just harmful to us all. We need 10,000 more Archives, dammit. It's supremely important work that is helping stem the tide of lost culture due to stock market forces. Geocities is gone forever because Yahoo! didn't find it profitable. This cannot keep happening.

ballenf|6 years ago

I’m not convinced it’s dangerous to explore whether there are benefits to ephemerality.

I’m also not sure your Rembrandt example shows what you suggest it does. The average Atari 2600 programmer would be more equivalent to the hundreds of now unknown artists in Rembrandt’s time. The John Carmacks of today will be remembered in detail with or without blanket archive efforts.

Maybe, just maybe, Rembrandt’s status in our minds is a result of generations of people each seeing the individual value in his work. That is, each generation does indeed get to decide what future generations remember. Or at least it used to be true until the digital age.

Maybe the change is an improvement. But maybe not.

And libraries are the epitome of what you’re fighting against. They are by definition works chosen by humans based on judgment calls of their perceived value.

Let’s at least acknowledge that blanket archive efforts are a fundamental change in themselves and a departure from the human status quo for thousands of years. Then let’s debate whether the change is an unalloyed good.

gumby|6 years ago

> We live in the only point in human history where we can actually save all of humanity's knowledge and culture,

Because of the welter of proprietary, undocumented formats, media bitrot and the like, we are actually moving away from such a point.

Turns out historians may not be so upset. You can be a historian of early medieval France and have a chance of reading 100% of the surviving documentation. Too much data can obscure the story.

Of course historically you could get a PhD for compiling a concordance to Shakespeare, something that can now be done mechanically in seconds. Future historians could (and will) apply the same tools to today's surviving documentation. But I don't believe there'll be as much of it as you seem to think.

starsinspace|6 years ago

> You're right, let's burn the library down because one book has a libelous chapter in it.

Jeez... You may want to re-read my comment. I have written no such thing, and it is not my opinion at all.

tptacek|6 years ago

This comment would be better without the first three paragraphs, which actively detract from what you're trying to say.

probably_wrong|6 years ago

> You're right, let's burn the library down because one book has a libelous chapter in it.

I feel you got the comment backwards: a better analogy would be "if a used-books store full of Dan Browns were to burn down, would we regret the loss of maybe one chapter that has some value?"

Your position seems to be "yes", but I wouldn't dismiss so easily the opposite view: that 90% of everything is crap, and that keeping everything forever "just in case" sounds surprisingly similar to hoarding.

I do not oppose "purposeful archiving" - as someone mentioned, saving outgoing Wikipedia links seems smart. But my old twitter account, where I kept track of missed trains? There are better sources for that, and no one missed it when it was gone.

pergadad|6 years ago

I think a better analogy than a library would be your average day in the office: would you want everything you say and do in the office recorded for eternity? Sure it would help, say, catch fraudsters, track responsibility and credit, allow sociologists fascinating analysis - but is that worth it? The >1GB of Google+ is a good example. Probably many interesting posts from people that are the core experts on topic X - and many nonsensical Twitter-like posts of people sharing whatever they encountered or thought that day.

at-fates-hands|6 years ago

>> We live in the only point in human history where we can actually save all of humanity's knowledge and culture

Playing devil's advocate here for a moment...

Considering we as humans learn little from our past, keeping all of this knowledge is a benefit to whom then? Some people who feel nostalgic about Sony's first Walkman? Or maybe people using it for nefarious reasons? If humans continue to make the same historical mistakes over and over, what benefit does the human race gain from cataloging all this information? I would venture to guess it's more plausible it will be used against us instead of furthering our own culture.

>> We know more about how Rembrandt painted and lived than we do about how Atari 2600 programmers worked and lived

There is a huge difference between saving all of Rembrandt's stuff and saving that of some 22-year-old college-dropout programmer who created a video game in the heyday of a long-forgotten startup company. And yeah, there have been numerous documentaries and articles written about Atari in those early days. Who would want to save a dilapidated roller rink on the grounds that a great and noble video game company used it as their HQ for a few years??

https://www.polygon.com/2018/7/6/17542154/atari-book-valley-...

> But then this roller rink down the block became available: 10,000 square feet! I mean, we were just jam-packed, and we had people on roller skates actually running around on the roller-skate rink building Pongs.

While I do think leaving certain things to the sands of time is a good thing, vacuuming up everything is just as worrisome. Are we going to be hoarders of a bygone technological past where a large majority of the "stuff" we save will be of little, if any, use to anybody anymore??

Having a background in anthropology, I find it fascinating there will be many generations of kids who leave no physical trace of their existence since a large majority will be in electronic form. Just imagine how people's lives are in a sort of suspended animation after passing away and having their Facebook pages live on forever.

It's a bit hard to wrap your head around tbh.

SkyBelow|6 years ago

>You're right, let's burn the library down because one book has a libelous chapter in it.

It is more like, either you burn the library down or everything you have written in your private journal is now available to be checked out by anyone.

It really shouldn't be that way and I think we should fix the problem of holding people responsible for bad behavior in the past. But how do we draw lines (for example, what about holding people responsible for past crimes)?

>We need to save our culture and digital heritage, else we forget where we come from.

I agree, but we also need to ensure this is done without costing individuals. Technology has advanced, but society has not. Our technology outpacing our culture has hurt, and will continue to hurt, many people, and we should try to find a way to fix it.

GhettoMaestro|6 years ago

> Atari's old HQ is just another office building. The source code to those games is mostly gone (thankfully, it's assembly and easier to disassemble). We need to save our culture and digital heritage, else we forget where we come from.

Very good point!!!

In my view the Internet Archive should be the digital equivalent of the National Register of Historic Places (NRHP). Shepherds of documentation, to give it a cool-ish sounding name.

gravitas|6 years ago

My personal, obscure ISP user page (think the ~user/ era) from 1995 is preserved in all its drop shadow blink tag marquee glory at archive.org with me doing nothing; it was just captured by whatever natural processes. The things I said on mailing lists, random forum posts etc. - it's all archived. That 90s stuff isn't/wasn't as ephemeral as folks think, in my opinion; it's out there somewhere. $0.02 :)

DuskStar|6 years ago

> I'm not sure it's always so great that everything anyone does online will be permanently archived.

But you see, even if the Internet Archive didn't exist, someone would probably still be saving a copy of the things you do. It'd just be a megacorp or surveillance agency instead of a more egalitarian organization.

So the choice isn't "things on the internet are ephemeral" or "things on the internet are available forever to everyone", it's that or "things on the internet are available forever to some subset of the rich and powerful".

garaetjjte|6 years ago

Maybe if everything were archived forever, we could understand that everybody makes mistakes, and stop paying so much attention to old posts? Though I admit this is a very optimistic view of human behavior.

baroffoos|6 years ago

Ancient posts rarely do get any attention though, unless you are a politician, and even then most people agree it's worthless information. There was the recent event with the guy from Canada having photos of him wearing blackface almost 20 years ago, and most people agreed that something so long ago is totally irrelevant to today.

Avamander|6 years ago

Some mistakes are also worth recording. I like seeing bad predictions of the 2000s from the 1970s for example.

Not to mention that quite a lot of what is archived today has been made by companies, and there's no "right to be forgotten" that companies could ever deserve. For example, I've uncovered quite a few mistakes in currently public datasets/websites based on archived sites; who knows how many mistakes are made now and never fixed because we lose the original sources. Point being that the lack of an original source doesn't mean the information gets lost, it just becomes a big version of the kids' game "telephone" where everyone recites what they heard and it gets distorted in the end.

strenholme|6 years ago

>I'm not sure it's always so great that everything anyone does online will be permanently archived

The real problem here is the runaway cancel culture, where we attack people for things they said or did years or decades ago which were (at the time) perfectly acceptable and reasonable.

The most egregious example I have seen so far is cancel culture advocates who think we should disregard the late Richard Feynman’s legacy because he said some rude things to a lady back in 1946, even though the lady herself was not offended, since she did sleep with him later that same evening.

There’s a point where we just have to say “That was a long time ago, no one at the time was offended, get over it.”

philpem|6 years ago

Indeed. Context matters, and societies evolve over time. Opinions which we'd consider abhorrent today may, once upon a time, have been acceptable.

A comment made years ago is only a reflection of a person's opinions at that point in time; opinions which may have changed since.

Bartweiss|6 years ago

The consolidation and permanence of the web are definitely concerning.

Moving from "somebody knows this happened" or "this is in a file drawer somewhere" to "there's a searchable record of this" expands everyone's access to the info, and can do a lot to stave off forgetfulness and bit rot. But the people who gain the most access are the ones who weren't involved in the first place, and the intersection of "uninvolved" and "cares enough to check" tends to be people who are actively hostile. Hence doxxing, stolen photos, and callouts over years-old tweets.

But that's a broad result of digitization. If a reporter or opposition researcher wants to embarrass someone, they can already look through digitized student newspaper essays, find interview subjects off class rolls, or simply comb through Twitter for long-forgotten offenses. (This holds for both good and ill - it applies to both serious skeletons and misleading or trivial issues.)

The Internet Archive, then, seems like sousveillance offsetting surveillance. For those who can point time, money, and connections at a target, it's enough that evidence exists, and more than enough that it's available online. But for the general public, it's much harder to keep track of countless sources or publicize news. If you can't dedicate interns and an archive to tracking every news story you read, you can't find or prove edits. (And while most newspapers noted corrections or morning/evening revisions, silently changing online stories has become common practice even for the likes of the BBC.) If you can't point out a webpage or tweet to thousands of people at once, the evidence is likely to be taken down before it's recognized. There are a lot of dedicated sites like NewsDiffs working on this problem, but Internet Archive provides a general-purpose answer to "let an average person see the history of a page or create a trusted record of it".

I worry that this just amounts to an eye for an eye, and still increases the total amount of scrutiny we're all under. But as long as more content is becoming permanent, it still seems better to have symmetrical access to it.

krapp|6 years ago

It's not actually true that everything anyone does online will be permanently archived. If it were, there would be no need for the Internet Archive.

The truth is, only the things someone has an interest in archiving will be archived, and only so long as someone has an interest in maintaining those archives. Just look at the recent announcement about Yahoo Groups... no one was, and likely no one is, going to permanently archive most of that. Sites, content and history get lost all the time.

NeedMoreTea|6 years ago

I think it would be reasonable to establish a bar, similar to offline, where everything above it is archived and everything below is optional opt-in.

In the offline world the National Libraries get a copy of every book, magazine and newspaper published, by law. At least that's the way the UK and US do it. They archive a lot of other stuff as well, including music, audio, adverts, but that's more informal, and there is no requirement to preserve.

Personally I'd like things politicians and personalities (by dint of having chosen to live large) say online archived, all business (to later hold them to account) along with the sites of anyone in the business of influence - think tanks, parties, lobbyists, activists, "grass roots" organisations etc. Individuals, anon forums, HN and reddit subs and other places of shooting the breeze should be allowed to stay ephemeral. In fact I think conversation is freer that way - some will choose to say less, say different, or say nothing if all everyone says is forever...

sarbaz|6 years ago

In the US, mandatory deposit technically applies to any copyrighted work of any kind. We should fund LoC to enforce mandatory deposit on digitally published works as well.

In a sense this is also a good demarcation point. If something is serious enough to be worthy of copyright protection, it's probably worth archiving.

ryandrake|6 years ago

Funny you mention HN and Reddit as ephemeral because I always thought they were more permanent than most. While you can email the mods and ask that a particular post of yours be removed, I don’t think they will wholesale scrub your content out of the archives if you request, or help to anonymize them in any way.

I consider HN pretty much permanent and tread carefully with controversial opinions or things that might one day be considered not-PC.

njharman|6 years ago

What law in the US? In ancient times, like 40+ years ago, before the first big extension and before copyright was automatically granted at the moment of creation, you were required to send a copy to the LoC to earn the right to enforce copyright.

I've published several books the LoC does not have.

Also, the national libraries aren't the sole archives of culture. University and private libraries preserve all the important stuff the government has neither the interest nor the budget for.

enumjorge|6 years ago

Not only that, but I also wonder if we're overestimating the value of keeping all of this data around. Who's going to have the time to search and curate these mountains of information when we're generating tons more of it every day? I imagine the ideal goal is to allow future historians to learn about our past selves, but I think there's a tipping point where only those with lots of resources can afford to meaningfully consume it. Those typically are wealthy companies or individuals, and I'm generally less excited about what they do with our information.

Obviously there's value in archiving some information, but a save-all or even save-most approach starts sounding a little hoarder-ish. Sure, you might one day make use of that 1997 November TV guide, but chances are you won't, and in the meantime you're paying the opportunity cost of storing it.

Maybe we need to take a page from Marie Kondo and only keep that which sparks joy and learn to let go of the rest. There's a chance someone will need a bit of info that no longer exists, but we'll probably be ok.

kevingadd|6 years ago

Part of the challenge here is that it's hard to know in advance what is or isn't worth archiving. It may only be clear a few years later that some big chunk of now-dead data was important.

In that sense, curating all of it doesn't really matter as long as you archived it. Someone trying to find the data later (or curate it!) can find their way to the right URLs using other sources, and then begin the process of curating this archived data after-the-fact.

baroffoos|6 years ago

The Internet Archive is most useful when you click a link and it is dead, which happens very often. The Wikipedia references are filled with dead links which now point to IA.

There is probably a lot of junk data on IA though, especially video site archives, but it's worth keeping stuff that isn't needed if it means keeping stuff that was useful.

saagarjha|6 years ago

> Who's going to have the time to search and curate these mountains of information when we're generating tons more of it every day?

Presumably some sort of search engine, not a person.

big_chungus|6 years ago

Well, there are some tools that have been developed that have pretty amazing capabilities to crunch through staggering quantities of data and come up with useful insights. It's basically big g's core competency, and there are tons of other companies that do the same thing, as well as open-source solutions that can be used.

crucialfelix|6 years ago

In the not so distant future many people will record their entire lives: movements, utterances, biometrics, audiovisual and sensory data. Then they are going to freak out when dead people's lives start getting deleted because nobody is going to pay to host all this crap.

kirstenbirgit|6 years ago

I was having the same thoughts. Most of what I've used Archive for is to look up e.g. old blog posts for personalities that show their hypocrisy compared to today, for example. Or someone posted something daft when they were 15 and their handle was leaked and now it's out there forever, and we can laugh at how stupid they were.

I'm sure glad I went to efforts to scrub my personal sites I made when I was a teenager!

I don't think all blogs and personal content (for lack of a better word) should just be archived. You should need consent. Most people have no idea it's going on. Or it should be very easy to delete something from the archive.

nicky0|6 years ago

I'm convinced that this instinct to preserve everything forever is psychologically connected with the denial of mortality. (Edit: I'm not saying this is a bad thing, just suggesting the phenomena may be connected.)

bjornjaja|6 years ago

I think it’s actually simply evolution at work. That’s part of our evolutionary process as humans, building on historical achievements of our ancestors.

bluntfang|6 years ago

It's good to note that there's a difference between archive and (big A) Archive, as a practice and discipline. As far as I can tell, Archivists (like, people who went to school to be an Archivist), don't really agree with Jason Scott's agenda and approach.

lidHanteyk|6 years ago

On one hand, sure, library science and forensic analysis are extremely important, and nothing lasts forever, especially without the care of curators. We aren't dismissing traditional or classical archival methods, and they have already taught us much about how to do digital archiving. [0]

On the other hand, clearly the Internet Archive is a competent digital archiver, and they've earned the capital-A "Archive". They publish a larger digital commons than anybody else, I think, especially at the low low price of gratis.

It sounds like your entire complaint is in two points. First, that IA doesn't ask (much) permission, which is unsurprising. The history of libraries is not one of asking permission, but of simply doing it. The public has been convinced repeatedly, over the decades, that libraries are good for them, and this public support helps insulate librarians from corporate interests.

Second, that IA doesn't employ enough women. I can't help you with that, but you are free to improve yourself.

[0] https://en.wikipedia.org/wiki/Disc_rot

toomuchtodo|6 years ago

Why is going to school for a subject a proxy for competency? Jason works for the Internet Archive. Perhaps "archivists" might consider more real world experience versus academic exercises?

asimpletune|6 years ago

Do you know more about what they think?

tcd|6 years ago

I'm inclined to agree with this position. Does every Youtube, Reddit, Twitter, Hackernews, Facebook comment need to be archived and stored for the next thousand years?

I'd argue no, and there's a huge amount of waste in there - so many bot posts, or just spam.

But here we are, people want to archive every byte of information that traverses through the internet.

These days you have to approach cautiously, as everything you do or post may be archived.

bitwize|6 years ago

There are huge privacy concerns in the present day, but there's the other side of it too: today we consider it a treasure when we find "Maximus sucks a big dong" type graffiti on some wall in Pompeii. The glimpse into the life of the Romans is itself exhilarating. A thousand years hence you won't be around to care about how your embarrassing posts affect your reputation, and those who find it might be more grateful for the glimpse into early 21st century life than inclined to snigger or cringe at your comment.

mirimir|6 years ago

> These days you have to approach cautiously, as everything you do or post may be archived.

That was obvious in the 90s.

But sadly enough, it hasn't worked out that way. Before the Internet Archive anyway.

redisman|6 years ago

They don't archive social media content so this whole point is moot.