
Aaronsw indicted for hacking MIT network to download millions of JSTOR docs

603 points | Estragon | 14 years ago | documentcloud.org

313 comments

[+] _delirium|14 years ago|reply
The repeated use of "stole" in the indictment is interesting, even beyond the usual metaphorical usage to discuss copyright infringement.

In this case, the indictment alleges that the documents were stolen from JSTOR, which does not even own them! In the vast majority of cases JSTOR scanned documents whose copyright is owned by someone else, and acquired or was donated a non-exclusive license to distribute copies via its service. In many cases the documents are even public domain. The indictment continues the theft metaphor by discussing the effort and expense JSTOR incurred in scanning the documents, and the alleged attempt to render this less valuable by redistributing "its" documents, analogizing this to the loss someone suffers in a theft.

But effort expended to build a private repository consisting of copies of things you don't own doesn't give you ownership of the result, any more than Google Books doing the same has given them ownership of the documents that they've scanned. If you scraped Google and "stole" their scans, you would be violating Google's Terms of Service, and Google might indeed feel subjectively like you've taken something of value (their exclusive access to this repository of scans), but I think it would be a stretch to say that you've "stolen" "their" documents.

[+] Alex3917|14 years ago|reply
They are talking about theft of services, not copyright infringement. In any event these charges are going to be very difficult to beat since they're federal, even though there are some obvious holes in the indictment. It will be almost impossible to get any of the evidence thrown out even if there was an illegal search and seizure. His best bet is probably to get the Harvard legal team to go to bat for him, although it's difficult to say how likely that is.
[+] sp332|14 years ago|reply
Yeah, that bothered me too. Especially on page 14, where they demand that he give back the "proceeds obtained." How is that going to be determined in such an unrealistic sense?
[+] DenisM|14 years ago|reply
You're precisely incorrect as far as the law is concerned.

If I make a painting, I own the copyright. If you take a picture of said painting, you own the copyright on the picture. If someone makes a collage of your picture they own the copyright on the collage. Both of you are liable for copyright infringement against my rights, but this is independent of your own rights as described above.

This is not legal advice; talk to a lawyer.

[+] Apocryphon|14 years ago|reply
Let's try another analogy.

What if someone hacked into Netflix and downloaded copies of all of the media that Netflix offered?

[+] dgreensp|14 years ago|reply
What Aaron did sounds seriously sketchy (sneaking into MIT wiring closets, trying to download the entire database, etc.), a fact that Demand Progress and several commenters here seem to be ignoring.

Defending his actions would require a very strong, multi-pronged version of the argument "if it's physically / technologically possible, it must be ok." Can MIT legally limit guest access to its network? Can JSTOR limit access to its content? Well, technically, their software didn't limit it, right? He just changed his IP address and they let him right back on, gave him permission. And then he had to change his MAC address. And then physically move to a different building.

But it doesn't matter anyway, because legal restrictions are legal restrictions. It's impossible to enforce every legal restriction in software. Put another way, we don't have to read JSTOR's server code to figure out if there's a violation of policy here -- the policy is written out as a legal document.

In the hacker world, there's a tendency to think that if something's possible, even easy, then it shouldn't be considered "breaking in" or "stealing." If my Gmail password is "password," then of course you're going to read my email! I had it coming. In the real world, though, this is still a crime.

[+] neilk|14 years ago|reply
Right, because the standard penalty for trespass onto campus property is a federal indictment. Good thing no MIT student has ever snuck into a restricted area before.
[+] herdrick|14 years ago|reply
Thanks for the dissenting opinion.

Is this worse than the sort of thing that goes on in the early days of most startups, including our most revered? People around here have a lot of respect for pg, rtm, tlb and their startup Viaweb - go back to "Founders at Work" and read how they got computer time needed to get the startup going. That kind of thing is practically universal in startups. The good ones, anyway.

So Aaron appears to have cut some corners in getting an interesting project off the ground. Slap him on the wrist.

[+] njharman|14 years ago|reply
sketchy != illegal

Trying to download entire databases!!! Oh my. Won't someone please think of the children!!!

[+] runningdogx|14 years ago|reply
This is the most technically competent charging document I've ever read. I guess there must have been some hackers on the grand jury.

Paragraph 35 & 36: which "protected computer" on MIT's network did he access? Certainly they're not trying to claim his laptop was a protected computer? Are they talking about the DHCP server or whatever registration frontend MIT has for the DHCP assignments? I have trouble with the concept that a violation of a computer use agreement (when there are no operative security barriers in place) constitutes a violation of the computer fraud and abuse act. Then again, I've always thought that act was vague and therefore overbroad.

Obviously what he did was bad in some sense (at least from the perspective of JSTOR and MIT), but even if it should be a crime rather than a civil dispute or internal disciplinary action at MIT, I don't like the fact that just about any misbehavior on the internet becomes a federal case because the probability of no interstate resources being used is very low.

Finally, I take issue with the notion that someone who is accessing a service through a public interface is criminally responsible for downtime if too high an access rate causes service degradation or an outage. The claims that JSTOR's servers were overloaded and (one?) even went down at some point are clearly there to set up a later claim of damages. Haven't they heard of rate limiting (in this case, since it was a rogue laptop stashed in a data closet, rate limiting by IP)? That wouldn't work against a concerted denial of service attack, but this was no denial of service attack. JSTOR seems to have been relying on manual intervention to stop article leeching that could lead to a (partial) outage. That's naive, and not a good idea.
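For readers unfamiliar with the idea, per-IP rate limiting is often sketched as a token bucket keyed by client address. The following is only an illustration under stated assumptions: the class name, parameters, and limits are hypothetical, and nothing here reflects how JSTOR's or MIT's systems actually worked.

```python
import time


class PerIpRateLimiter:
    """Token-bucket limiter keyed by client IP (hypothetical sketch)."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens refilled per second, per IP
        self.burst = burst         # maximum tokens an IP can accumulate
        self.clock = clock         # injectable clock, useful for testing
        self._buckets = {}         # ip -> (tokens, last_seen_timestamp)

    def allow(self, ip):
        """Return True if a request from `ip` may proceed right now."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[ip] = (tokens - 1.0, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False
```

Because the bucket is keyed by IP address, a client that obtains a new address starts with a full bucket, so a scheme like this throttles sustained bulk downloading from one address but does not by itself stop someone who keeps changing addresses.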

[+] wiredfool|14 years ago|reply
More than likely, the document was written by the prosecutor's office.

The procedure as I understand it is:

* Prosecutor assembles evidence, writes indictment.

* Prosecutor presents evidence to Grand Jury. This may include witnesses or documents.

* Grand Jury votes on whether there is enough there to approve the indictment.

If they've got a computer crimes division, then they're going to have hacker types in the prosecutor's office to do this stuff and get the details right.

The indictment is going to be the most slam dunk part of the evidence that there is, as it's written by the prosecutor and there's no counter to it. If it doesn't look airtight, then it's probably a very weak case.

Though, looking at it here, it's not looking very good for aaronsw. The combination of MAC address spoofing and a locked wiring cabinet shows physical and electronic security that was bypassed, repeatedly. That's easy to explain to a jury.

[+] mbreese|14 years ago|reply
> no operative security barriers in place

I don't know... Even if my front door was open, you still aren't allowed to enter my house without my permission.

[+] pak|14 years ago|reply
>I take issue with the notion that someone who is accessing a service through a public interface is criminally responsible for downtime if too high an access rate causes service degradation or an outage

Ah... well surely you've heard of people being prosecuted for denial of service attacks? Most recently, members of anon getting raided because they used LOIC? If you use a network in a way that is intended to degrade others' quality of service, even if you are just accessing things via normal protocols at a really high rate, you are breaking the law. In this case, it does not look like they are alleging that he intended to cause a service disruption, but they claim that he repeatedly circumvented measures that JSTOR put in place to halt his unauthorized activities, which caused service disruptions for other legitimate users and therefore denial of service.

[+] rryan|14 years ago|reply
I think 18 USC 1030 is pretty broad in its definition of a "computer". Back in the MBTA hacking case, MBTA claimed a magnetized piece of paper was a computer under this clause, and the first judge that looked at it bought that (sanity prevailed and that decision was later overruled). I wouldn't be surprised if they are treating the MIT network, or at least the routers that were configured to prevent his access, as the protected computer in this case.
[+] 3pt14159|14 years ago|reply
I suppose the router could be construed as a protected computer, since it had blacklisted his MAC address and it is a computer with an OS.
[+] redthrowaway|14 years ago|reply
>This is the most technically competent charging document I've ever read.

I'm guessing MIT had a hand in penning it, or at least provided someone who could easily explain the relevant material to the DA/Grand Jury.

[+] nitrogen|14 years ago|reply
> I don't like the fact that just about any misbehavior on the internet becomes a federal case because the probability of no interstate resources being used is very low.

So is that why Comcast routes traffic to networks 30 miles away across three states and back?

[+] mukyu|14 years ago|reply
http://blog.demandprogress.org/2011/07/federal-government-in...

“It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal added.

Nowhere do they say he did not do it however.

cached: http://webcache.googleusercontent.com/search?q=cache:http://...

[+] guywithabike|14 years ago|reply
This is almost too good:

"As Swartz entered the wiring closet, he held his bicycle helmet like a mask to shield his face, looking through ventilation holes in the helmet."

[+] mbreese|14 years ago|reply
This all hinges on what he was going to do with the documents. If he was looking to perform some large-scale analysis (such as he has done before) and publish the results academically, then this would fall under the academic mission of MIT, and therefore be legit. But if this were the case, why go through the hassle of hacking the system? Why not just ask JSTOR for cooperation? Or maybe he did, and they rejected it?

There has got to be more to this story, because I just can't for the life of me believe that he would download the documents to "free" them on the internet (as is alleged).

[+] GHFigs|14 years ago|reply
A detail you missed is that Swartz was not a student or faculty member at MIT. He was a fellow at Harvard.
[+] carbonica|14 years ago|reply
I wonder what they'll push for. He sounds pretty screwed if this evidence pans out. Looks like he could even end up with a few years' time if the prosecutors want.

1. Wire fraud maxes out at 20 years outside of a presidentially-declared emergency. No fine cap, it seems. http://uscode.house.gov/download/pls/18C63.txt

2. Computer fraud under 1030(a)(4) caps out at 5 years with no prior offense, no fine cap. http://uscode.house.gov/download/pls/18C47.txt

3. 1030(a)(2), (c)(2)(B)(iii) looks to be another cap of 5 years. Ibid.

4. 1030(a)(5)(B), (c)(4)(A)(i)(I),(VI) looks like another cap of 5 years. Ibid.

IANAL, just trying my best to read the code itself.

[+] snikolic|14 years ago|reply
Ignoring legality, Aaron's actions, case specifics, etc., I have to admit: I really wish that the data in question was free and publicly available.
[+] acangiano|14 years ago|reply
When someone risks 35 years in jail for something like this, you know your justice system is broken.

I know he won't get 35 years, but it's nevertheless outrageous that it could happen.

[+] mukyu|14 years ago|reply
The title is inaccurate.

It is alleged that he signed up for guest accounts on their network with different laptops, changed his MAC address and re-registered if the IP he was using was blocked (by JSTOR) or cut off of the network (by MIT), and finally connected a laptop in a basement networking closet.

I guess you could say that is 'hacking' in the unauthorized access sense, but not in any meaningful sense. It isn't breaking and entering if someone repeatedly trespasses somewhere (say, banned from a store) even if they change their clothes to avoid detection.

[+] troutwine|14 years ago|reply
The indictment asserts that Mr. Swartz intended to distribute the files downloaded but did not substantiate this claim. I wonder what proof they have of this? (There are, of course, a great many laws dealing with probable intent that need only convince a jury of said intent without demonstrating its validity.)
[+] tessro|14 years ago|reply
Demand Progress PAC's website is down, but they released a statement:

(from: http://webcache.googleusercontent.com/search?q=cache:9k5ryiX... )

Cambridge, MA– Moments ago, Aaron Swartz, former executive director and founder of Demand Progress, was indicted by the US government. As best as we can tell, he is being charged with allegedly downloading too many scholarly journal articles from the Web. The government contends that downloading said articles is actually felony computer hacking and should be punished with time in prison.

“This makes no sense,” said Demand Progress Executive Director David Segal; “it’s like trying to put someone in jail for allegedly checking too many books out of the library.”

“It’s even more strange because the alleged victim has settled any claims against Aaron, explained they’ve suffered no loss or damage, and asked the government not to prosecute,” Segal added.

James Jacobs, the Government Documents Librarian at Stanford University, also denounced the arrest: “Aaron’s prosecution undermines academic inquiry and democratic principles,” Jacobs said. “It’s incredible that the government would try to lock someone up for allegedly looking up articles at a library.”

Demand Progress is collecting statements of support for Aaron on its website at …URL…

“Aaron’s career has focused on serving the public interest by promoting ethics, open government, and democratic politics,” Segal said. “We hope to soon see him cleared of these bizarre charges.”

Demand Progress is a 500,000-member online activism group that advocates for civil liberties, civil rights, and other progressive causes.

About Aaron

Aaron Swartz is a former executive director and founder of Demand Progress, a nonprofit political action group with more than 500,000 members.

He is the author of numerous articles on a variety of topics, especially the corrupting influence of big money on institutions including nonprofits, the media, politics, and public opinion. In conjunction with Shireen Barday, he downloaded and analyzed 441,170 law review articles to determine the source of their funding; the results were published in the Stanford Law Review. From 2010-11, he researched these topics as a Fellow at the Harvard Ethics Center Lab on Institutional Corruption.

He has also assisted many other researchers in collecting and analyzing large data sets with theinfo.org. His landmark analysis of Wikipedia, Who Writes Wikipedia?, has been widely cited. He helped develop standards and tutorials for Linked Open Data while serving on the W3C’s RDF Core Working Group and helped popularize them as Metadata Advisor to the nonprofit Creative Commons and coauthor of the RSS 1.0 specification.

In 2008, he created the nonprofit site watchdog.net, making it easier for people to find and access government data. He also served on the board of Change Congress, a good government nonprofit.

In 2007, he led the development of the nonprofit Open Library, an ambitious project to collect information about every book ever published. He also cofounded the online news site Reddit, where he released as free software the web framework he developed, web.py.

Press inquiries can be directed to [email protected] or 571-336-2637

[+] microarchitect|14 years ago|reply
“This makes no sense,” said Demand Progress Executive Director David Segal; “it’s like trying to put someone in jail for allegedly checking too many books out of the library.”

No it's not. It's like sneaking into the library at night and making photocopies of all the books. Then, upon getting caught, the perpetrator sneaks back into the library in a different disguise and continues to photocopy more books. Repeat this action of getting caught and sneaking back in a few more times and combine this with the fact that his downloading of documents affected JSTOR performance for other legitimate users of the archive and you get a sense of what he's really done.

How is this excusable?

I'm completely onboard with those who claim that we need some reform in scientific publishing, but Aaron's actions smack of low ethical standards to me, not to mention extremely poor judgement on his part.

EDIT: Hi downvoter! Can you please explain why you think I'm wrong?

[+] sp332|14 years ago|reply
How did Aaron get access to the for-pay articles (page 9)?

Also: nice going, Aaron! Drag research access into the 21st century, kicking and screaming!

Does anyone think it's odd that an Acer laptop could write these files to disk faster than JSTOR could serve them?

[+] SpikeGronim|14 years ago|reply
"Does anyone think it's odd that an Acer laptop could write these files to disk faster than JSTOR could serve them?"

Nope. I bet the JSTOR servers are serving many concurrent requests. If he had the servers to himself then yes, that would be surprising.

[+] rryan|14 years ago|reply
Anyone on MITNet has access to JSTOR articles free of charge to the user. Similarly the ACM, IEEE, etc. all have agreements like this with major universities.
[+] jdvolz|14 years ago|reply
Yeah, I was laughing about how an Acer laptop took down their service and did damage to their network. If I was JSTOR, I wouldn't prosecute just because it makes our company look ridiculous.
[+] yid|14 years ago|reply
> Does anyone think it's odd that an Acer laptop could write these files to disk faster than JSTOR could serve them?

Why would that be odd? SATA = 3 Gbps throughput with minimal overhead, Ethernet = 1 Gbps with lots of overhead (IP headers, Ethernet headers, HTTP headers)
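A rough back-of-the-envelope version of that comparison, as a sketch only: the function name and the `efficiency` parameter are hypothetical, and the link rates are the nominal figures quoted above, not measurements of the actual laptop or network. Interface rates are upper bounds; a real drive or network path may be slower.

```python
def seconds_to_move(size_bytes, link_gbps, efficiency=1.0):
    """Time to move `size_bytes` over a link with the given nominal rate.

    `efficiency` is an assumed fraction of the nominal rate left over
    after protocol overhead (1.0 means no overhead at all).
    """
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9 * efficiency)


one_gigabyte = 1_000_000_000

# Nominal figures from the comment: 3 Gbps SATA vs 1 Gbps Ethernet.
sata_seconds = seconds_to_move(one_gigabyte, 3.0)
ethernet_seconds = seconds_to_move(one_gigabyte, 1.0)

print(f"SATA:     {sata_seconds:.2f} s per GB")
print(f"Ethernet: {ethernet_seconds:.2f} s per GB")
```

Lowering `efficiency` for the Ethernet case (to account for header overhead) only widens the gap, which is the point being made: under these nominal numbers the network, not the disk interface, is the bottleneck.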

[+] tghw|14 years ago|reply
> Does anyone think it's odd that an Acer laptop could write these files to disk faster than JSTOR could serve them?

Not at all. Writing to the hard drive is going to take much less time than downloading the articles from a remote server.

[+] rryan|14 years ago|reply
"Aaron Swartz ... was a fellow at Harvard University’s Center for Ethics"
[+] llimllib|14 years ago|reply
It is not impossible to imagine a code of morality which views his alleged actions as ethical.
[+] flocial|14 years ago|reply
It's hard to trivialize downloading 4 million articles using a web scraper's bag of tricks and then some. If the information was publicly accessible these charges wouldn't stand unless he tried to distribute it. If it was something so commendable, why would you cloak your activities or go to a different university to do your dirt instead of Harvard (where you're a fellow of some sort) or Stanford (where you attended)? Regardless of the motives and ideals or the excess of the charges, this isn't one of those hapless grandma versus the RIAA stories. He must have known what he was doing.

The pricing of and restrictions on the dissemination of academic papers are by any rational evaluation nothing short of ridiculous and contradict the academic ideal of free exchange of ideas for the advancement of knowledge. However, the history of scholarship is also a history of patronage, academic politics and in-fighting for greater prestige.

It's sad that someone like Aaron has to be treated like a domestic terrorist. It's sad that we have a vindictive justice system willing to flout the Constitution in this day and age with what effectively amounts to cruel and unusual punishment so they can "make an example" out of someone.

However, it's no one's fault that Aaron was so emboldened to take this initiative without sufficiently ensuring that he would be free from criminal prosecution.

Am I alone in thinking that these "hacktivists" will only prompt government to push more frivolous data theft laws and heavier punishment for offenses that may one day victimize hapless, innocent people? It's going to get a lot worse before it gets better.

[+] jacquesm|14 years ago|reply
> However, it's no one's fault that Aaron was so emboldened to take this initiative without sufficiently ensuring that he would be free from criminal prosecution.

Maybe he did it knowing full well what the consequences might be. He seems to be a pretty principled guy.

[+] sigil|14 years ago|reply
Consider showing your support for Aaron here:

http://act.demandprogress.org/sign/support_aaron/

Demand Progress is an organization Aaron co-founded. They've done some great watchdog work on things like PROTECT IP, the Patriot Act, the Internet Blacklist Bill etc.

[+] Aloisius|14 years ago|reply
There is a riser closet in my office with various internet service providers' wiring in it, feeding the entire building.

If I were to enter this riser closet and plug my laptop into one of these lines, I would be charged with theft of service and deservedly be sent to jail. It doesn't matter if the door is locked or not. It doesn't matter what kind of security they put in place or not. It doesn't matter if I only sent a few bytes of data on their network and didn't harm anyone else's service. It is still theft of service.