If linking to copyrighted data 'should' be illegal (SOPA), then what about descriptions of that data that are sufficient to identify the original, but not reconstruct it (magnet links)?
And if those were made illegal, then what about descriptions of those descriptions? You can recurse infinitely on this.
Beyond mere amusement, after just one or two recursions, you get to the point where it would be difficult to write a law that would criminalize magnet links without also criminalizing people who link to a Sparknotes-like summary or commentary for a piece of media.
You're thinking about it the wrong way. From a programmer's POV, the link to the torrent can be abstracted endlessly into new and distinct forms, each of which you believe needs to be legislated away in turn. From a lawyer's point of view, the specifics are really not important, but rather the end result: is the user illegally procuring copyrighted material, or is the distributor providing them with a readily accessible means of doing so?
Law has certain resemblances to regular code, but folks here seem to think that if something isn't properly specified, the law will break in the same way that a program will fail to compile or run properly. But that's not how it works. Poorly drafted laws can fail, certainly, but it's not that hard to draft something that focuses on the end result.
Consider ordinary offences, such as robbery. You wouldn't get anywhere by arguing that you're alleged to have put your right hand in your pocket and pulled out a small hatchet, and that since there's no law specifically forbidding right-hand wielding of hatchets, you should go free. The technicalities of how you committed the robbery are irrelevant as long as it can be established that you took someone's property in a violent fashion. I'm a little perplexed as to why folks think torrenting/piracy/filesharing etc. is so different that it can't be addressed legally. Sure, the law needs to be clear and logical, but only up to a point. It doesn't need to be absolutely exhaustive, and 'beyond a reasonable doubt' has never meant 'beyond any imaginable possibility'. People do make arguments like that in criminal defense cases from time to time, but they typically fail because the doubts they attempt to raise are absurdly far-fetched.
A description of copyrighted data would be the torrent file, which content producers would probably like to argue is infringing.
A magnet link is a hash of the torrent file, so it's already two steps removed.
Of course, the Pirate Bay magnet dump is itself a torrent, so it's a hash of a hash of a hash of copyrighted data.
And that torrent itself has a magnet link: 938802790a385c49307f34cca4c30f80b03df59c is a hash of a hash of a hash of a hash of copyrighted data. (In the MPAA/RIAA's ideal world, I've just committed criminal copyright infringement with damages reaching into the $billions.)
Theoretically, the Pirate Bay dump could include the torrent for the Pirate Bay dump, and be an infinitely recursive description of itself... but that's probably an intractable cryptographic process.
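For anyone curious how the "hash of the torrent file" step works in practice: strictly speaking, the btih value in a magnet link is the SHA-1 of the bencoded "info" dictionary inside the .torrent, not of the whole file. Below is a minimal sketch of that derivation. The toy info dictionary and its file name are made up for illustration, and the encoder only covers the four bencode types.

```python
import hashlib

def bencode(value):
    """Encode ints, byte strings, lists and dicts in BitTorrent's bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))

# Hypothetical single-file torrent: one 11-byte file that fits in one piece.
content = b"hello world"
info = {
    "name": "example.txt",
    "length": len(content),
    "piece length": 16384,
    "pieces": hashlib.sha1(content).digest(),  # hash of the data itself
}

# Hash of the description of the data: this is what the magnet link carries.
info_hash = hashlib.sha1(bencode(info)).hexdigest()
magnet = "magnet:?xt=urn:btih:" + info_hash
print(magnet)
```

Each level is a fixed 20 bytes no matter how large the thing below it is, which is why none of these levels can be used to reconstruct the one beneath it.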
The more one studies computer science, the less one believes that information can be owned in any meaningful way. I remember reading a paper on a "lightnet" that basically XORed arbitrary blocks of data together, then produced a recipe on how to recover some original data by XORing the appropriate blocks. With just this recipe you could ask other nodes for the random blocks, then recover the original data locally. The paper made a good argument that however you try to define ownership of a block, it will lead to some contradiction. The system was implemented but I can't recall its name. I think it was hosted on SourceForge.
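A rough sketch of the XOR idea described above, not the actual system from that paper: the data is XORed with random blocks, every stored block looks like noise on its own, and the "recipe" is just the list of block keys to fetch and XOR together. The function names, the in-memory dictionary standing in for other nodes, and the use of SHA-1 as a block key are all assumptions made for illustration.

```python
import hashlib
import os

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def publish(data, store, pads=2):
    """Store `pads` random blocks plus one derived block; return the recipe."""
    blocks = [os.urandom(len(data)) for _ in range(pads)]
    final = data
    for block in blocks:
        final = xor_bytes(final, block)
    blocks.append(final)  # data XOR pad1 XOR pad2 ..., also looks random
    recipe = []
    for block in blocks:
        key = hashlib.sha1(block).hexdigest()
        store[key] = block  # `store` stands in for the other nodes
        recipe.append(key)
    return recipe

def recover(recipe, store):
    """Fetch every block named in the recipe and XOR them all together."""
    out = bytes(len(store[recipe[0]]))  # all-zero starting value
    for key in recipe:
        out = xor_bytes(out, store[key])
    return out

store = {}
recipe = publish(b"some original data", store)
print(recover(recipe, store))
```

No single stored block depends on the original more than any other, and the same random block could appear in recipes for unrelated files, which is where the ownership question gets slippery.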
You're thinking legality, but I'm thinking efficiency. So we can now distribute 1.5 million torrents (of a total of several thousand TBs, no doubt) in a file 90MB big, in a torrent which itself has a magnet address that takes up...20 bytes?
The size savings as you go up the tree are incredible. I see no reason why you couldn't create an almost-entirely distributed torrent site in this way.
Think about it: Torrent discovery could be done by regular distribution of index torrents, and the clients use that to find out what can be downloaded and where.
In fact, in the world of magnet addresses, "uploading" a torrent would be as simple as requesting that its URI be put in the day's index. So running a torrent site would be as simple as curating a list of magnet URIs each day into an index, then publishing that torrent's URI somewhere. Like Twitter. You could run a torrent site entirely from Twitter.
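A quick back-of-the-envelope check of the sizes mentioned above, assuming one 20-byte SHA-1 info hash per torrent. The 1.5 million figure comes from the thread; the hex-plus-newline layout is only an assumed text format for comparison.

```python
TORRENTS = 1_500_000
INFO_HASH_BYTES = 20  # a SHA-1 digest is 160 bits

# Bare binary hashes, nothing else.
raw_index_bytes = TORRENTS * INFO_HASH_BYTES

# One hash per line as 40 hex characters plus a newline (assumed layout).
hex_index_bytes = TORRENTS * (2 * INFO_HASH_BYTES + 1)

def megabytes(n):
    return n / 1_000_000

print(megabytes(raw_index_bytes), "MB of raw hashes")
print(megabytes(hex_index_bytes), "MB as hex text")
```

So the hashes alone account for roughly 30 MB in binary form or about 60 MB as hex text; presumably the rest of a 90MB file is names and other per-torrent metadata.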
Someone could suggest a system whereby spelling mistakes are used to encode partial information about a magnet link.
One switched pair could give you the position in the magnet-link key, the other switched pair could give you the value. That way, you could never pin down exactly who gave you what information.
Lawmakers can easily avoid this meta situation by writing simpler, more encompassing laws. So rather than being specific about how the pirated material is accessed, they can write a more open law along the lines of "a site whose main use is assisting the distribution of pirated material".
Hello, I am the author of the scrape. I did it mostly to try it out, but who knows, maybe it will be useful to someone.
I went through the description pages like http://thepiratebay.se/torrent/$i by increasing $i and saving the magnet link if the Pirate Bay didn't return a 404 error. I went through the pages as a logged-out user, though, so that might be the reason why I got only 1.5m torrents.
I didn't know the Pirate Bay has hidden porn torrents; there is TONS of porn in the scrape already.
The script is in Perl; I will post it to pastebin in a moment.
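The loop described above (walk the ids, skip anything that returns 404, keep the magnet link) could look roughly like this. This is not the author's Perl script, just a sketch in Python: the `fetch` callable, the regular expression and the fake pages are all hypothetical stand-ins, so nothing here touches the network.

```python
import html
import re

# Assumption: the description page carries the magnet URI in an href attribute.
MAGNET_RE = re.compile(r'href="(magnet:\?xt=urn:btih:[^"]+)"')

def scrape(ids, fetch):
    """Yield (id, magnet) for every id whose page exists and has a magnet link.

    `fetch` is a hypothetical callable returning (status_code, page_text).
    """
    for i in ids:
        status, page = fetch(i)
        if status == 404:  # gap in the id sequence, move on
            continue
        match = MAGNET_RE.search(page)
        if match:
            # "&" is written as "&amp;" inside HTML attributes; undo that.
            yield i, html.unescape(match.group(1))

# Stand-in for real HTTP requests: ids 1 and 3 exist, 2 does not.
FAKE_PAGES = {
    1: '<a href="magnet:?xt=urn:btih:aaaa&amp;dn=first">get</a>',
    3: '<a href="magnet:?xt=urn:btih:cccc&amp;dn=third">get</a>',
}

def fake_fetch(i):
    if i in FAKE_PAGES:
        return 200, FAKE_PAGES[i]
    return 404, ""

results = list(scrape(range(1, 4), fake_fetch))
print(results)
```

Passing the fetcher in as an argument keeps the id-walking logic separate from whatever does the actual requests, retries or parallelism.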
The porn torrents are only hidden from naive searchers; all the pages for them are still accessible if you've got a direct link to them, so your scraper should've picked all of them up.
I tried to run the script; however, I get an error (I added diagnostics for more info, so line 13 refers to line 11 of your script, and line 27 to line 25):
Can't use an undefined value as an ARRAY reference at
piratebay_magnet_scrape.pl line 13 (#1)
(F) A value used as either a hard reference or a symbolic reference must
be a defined value. This helps to delurk some insidious errors.
Uncaught exception from user code:
Can't use an undefined value as an ARRAY reference at piratebay_magnet_scrape.pl line 13.
at piratebay_magnet_scrape.pl line 13
main::__ANON__(20697, 0, undef, 0, 0) called at /usr/share/perl5/Parallel/ForkManager.pm line 354
Parallel::ForkManager::on_finish('Parallel::ForkManager=HASH(0x9cd7ac8)', 20697, 0, undef, 0, 0) called at /usr/share/perl5/Parallel/ForkManager.pm line 333
Parallel::ForkManager::wait_one_child('Parallel::ForkManager=HASH(0x9cd7ac8)', undef) called at /usr/share/perl5/Parallel/ForkManager.pm line 285
Parallel::ForkManager::start('Parallel::ForkManager=HASH(0x9cd7ac8)') called at piratebay_magnet_scrape.pl line 27
The Pirate Bay front page claims 4,187,907 torrents, but this 164MB is only ~1.5 million torrents. Is the discrepancy from exclusion of the porn torrents? I'm guessing this guy's scrape missed them; you have to be logged in to TPB to see them.
When TPB had to be blocked in the Netherlands and they switched to recommending magnet links instead of torrents (the two happened in quick succession), I thought someone would have done this sooner. But it's here now, and proxies do their job just fine ^^. (I couldn't load the page directly as it's blocked here.)
magnet of the magnets: magnet:?xt=urn:btih:938802790a385c49307f34cca4c30f80b03df59c&dn=The+whole+Pirate+Bay+magnet+archive&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80
Hope the community doesn't think I've hijacked the thread for my own purposes. I just thought it was an interesting little discussion and wanted to point it out.
I wonder if some self-updating mechanism could be implemented in magnet links. Something like an additional signature part in the magnet URL, so the owner could inform other peers that the content has changed and needs to be updated.
I was inspired to steal and pirate the above magnet link into this quick vigilantist internet liberation site: http://yason.kapsi.fi/piratebay.html. I would be positively surprised to catch the interest of even a single MAFIAA party, though.
sorry if I'm being daft, but can't you get this down to a magnet link itself? The page linked does just that:
magnet:?xt=urn:btih:938802790a385c49307f34cca4c30f80b03df59c&dn=The+whole+Pirate+Bay+magnet+archive&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80&tr=udp%3A%2F%2Ftracker.ccc.de%3A80
but Wikipedia has an example of a magnet link like this:
magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C
so could we get the magnet link to ALL of the magnet-hashes for ALL torrents on the Pirate Bay down to, what is that, 35 characters plus the magnet cruft "magnet:?xt=urn:sha1:"?
You can leave out the "dn" (just a name), but if you leave out the tr links to trackers, the peers using that magnet link will have a harder time finding each other, and will often end up partitioned.
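To make the parts concrete, here is a small sketch that pulls the magnet link from this thread apart with Python's standard library. `parse_magnet` and `minimal_magnet` are just names picked for this example.

```python
from urllib.parse import parse_qs

MAGNET = (
    "magnet:?xt=urn:btih:938802790a385c49307f34cca4c30f80b03df59c"
    "&dn=The+whole+Pirate+Bay+magnet+archive"
    "&tr=udp%3A%2F%2Ftracker.openbittorrent.com%3A80"
    "&tr=udp%3A%2F%2Ftracker.publicbt.com%3A80"
    "&tr=udp%3A%2F%2Ftracker.ccc.de%3A80"
)

def parse_magnet(uri):
    """Split a magnet URI into topic (xt), display name (dn) and trackers (tr)."""
    prefix = "magnet:?"
    if not uri.startswith(prefix):
        raise ValueError("not a magnet URI")
    params = parse_qs(uri[len(prefix):])  # decodes "+" and %XX escapes
    return {
        "xt": params.get("xt", []),
        "dn": params.get("dn", [None])[0],
        "tr": params.get("tr", []),
    }

def minimal_magnet(uri):
    """Keep only the first xt parameter, dropping the name and the trackers."""
    return "magnet:?xt=" + parse_magnet(uri)["xt"][0]

parts = parse_magnet(MAGNET)
print(parts["dn"])
print(parts["tr"])
print(minimal_magnet(MAGNET))
```

The stripped-down form still identifies the same torrent; it just leaves the client to find peers without tracker hints, which is the trade-off described above.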
Maybe I'm being a bit black-and-white on this, but while the meta and the philosophy are interesting to talk about, no one is mentioning the morality. Stealing is wrong. You're taking someone else's work and not compensating them for it. I think it's sad that we're all so worried about the law when, in reality, you shouldn't pirate music for the same reason you don't steal a candy bar from the grocery store or snag 5 dollars out of your coworker's wallet or hack into Dropbox to get extra storage for free. It's wrong.
I'm getting a bit sick of the 'piracy is theft' nonsense. It isn't. Nobody is deprived of their possession. Me copying your song doesn't result in you no longer having a song.
Piracy is far closer to plagiarism, but even then only to a point. In plagiarism, one attempts to pass off the work of another as one's own. In piracy, one simply copies another's work for one's own use. They are fundamentally different.
This is why piracy is as prevalent as it is: it simply is not as bad as plagiarism, let alone theft. Most people have an intuitive understanding of this, and those who pirate do so without the cognitive dissonance that comes with acting against their moral code. It might be "wrong" in an abstract sense, sort of like lying on your resume is "wrong", but it's not wrong in the absolute sense of harming another person's body or property.
[+] [-] chimeracoder|14 years ago|reply
[+] [-] anigbrowl|14 years ago|reply
[+] [-] Cushman|14 years ago|reply
[+] [-] kinghajj|14 years ago|reply
[+] [-] phillco|14 years ago|reply
[+] [-] tectonic|14 years ago|reply
[+] [-] wisty|14 years ago|reply
Or maybe I shouldn't suggest it?
[+] [-] Gustomaximus|14 years ago|reply
[+] [-] netshroud|14 years ago|reply
Such as the name of the copyrighted work?
[+] [-] allisfine|14 years ago|reply
edit: all right, the script itself is here http://pastebin.com/8RXXthXB
as you can see, it's not very complicated.
[+] [-] redthrowaway|14 years ago|reply
It might be an idea to release a diff against this once a week, and write a quick script to grab it, keeping the list up-to-date.
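A minimal sketch of what such a weekly diff could look like, assuming the list is just one entry per line and order doesn't matter. The function names and the toy snapshots are made up for illustration.

```python
def diff_snapshots(old_lines, new_lines):
    """Return (added, removed) entries between two snapshots of the list."""
    old, new = set(old_lines), set(new_lines)
    return sorted(new - old), sorted(old - new)

def apply_diff(lines, added, removed):
    """Bring an old snapshot up to date using a published diff."""
    return sorted((set(lines) - set(removed)) | set(added))

last_week = ["hash-a", "hash-b", "hash-c"]
this_week = ["hash-b", "hash-c", "hash-d"]

added, removed = diff_snapshots(last_week, this_week)
print(added, removed)
print(apply_diff(last_week, added, removed))
```

Only the added and removed entries need to be distributed each week, which should be far smaller than the full list.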
[+] [-] makomk|14 years ago|reply
[+] [-] youwish|14 years ago|reply
[+] [-] joejohnson|14 years ago|reply
[+] [-] schiffern|14 years ago|reply
The IDs are sequential, but there are substantial gaps. Removed spam torrents, most likely.
[+] [-] lucb1e|14 years ago|reply
[+] [-] denysonique|14 years ago|reply
[+] [-] devindotcom|14 years ago|reply
[+] [-] cabirum|14 years ago|reply
[+] [-] ak2012|14 years ago|reply
[+] [-] haakon|14 years ago|reply
[+] [-] jeremysalwen|14 years ago|reply
[+] [-] Mithrandir|14 years ago|reply
[+] [-] cinch|14 years ago|reply
[+] [-] dmn001|14 years ago|reply
[+] [-] kokey|14 years ago|reply
[+] [-] Bogdanp|14 years ago|reply
[+] [-] allisfine|14 years ago|reply
Just one thing: I see you are splitting by |, and some torrents (very few, but some) have | in their name (I didn't bother with escaping that).
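One way around the unescaped | problem, sketched under a loud assumption: that each line looks like id|name|size|seeders|leechers|hash, with the name as the only free-text field. If the real layout differs, the field counts below need adjusting. Splitting once from the left and a fixed number of times from the right leaves any extra | inside the name.

```python
def parse_line(line):
    """Parse one line of the assumed layout id|name|size|seeders|leechers|hash."""
    ident, rest = line.rstrip("\n").split("|", 1)
    # Take the four trailing fields from the right; whatever is left is the name.
    name, size, seeders, leechers, info_hash = rest.rsplit("|", 4)
    return {
        "id": int(ident),
        "name": name,
        "size": int(size),
        "seeders": int(seeders),
        "leechers": int(leechers),
        "hash": info_hash,
    }

print(parse_line("42|Some | odd name|1024|5|2|abcdef\n"))
```

This only works because exactly one field can contain the separator; two free-text fields would need real escaping.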
[+] [-] yason|14 years ago|reply
[+] [-] tiku|14 years ago|reply
[+] [-] vidarh|14 years ago|reply
[+] [-] xpose2000|14 years ago|reply
[+] [-] its_so_on|14 years ago|reply
but wikipedia has an example of a magnet link like this:
magnet:?xt=urn:sha1:YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C
so could we get the magnet link to ALL of the magnet-hashes for ALL torrents on the Pirate Bay down to, what is that, 35 characters plus the magnet cruft "magnet:?xt=urn:sha1:"?
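On the character count: a SHA-1 digest is 20 bytes, which comes to 40 characters in hex (the form used in the Pirate Bay link) or 32 in base32 (the form in the Wikipedia example). A short check using only the standard library follows. One caveat worth hedging: as far as I know, urn:sha1 names a hash of a file's contents while urn:btih names a torrent's info hash, so the two prefixes are not interchangeable even though the digests are the same length.

```python
import base64

hex_hash = "938802790a385c49307f34cca4c30f80b03df59c"  # from the thread
wiki_example = "YNCKHTQCWBTRNJIV4WNAE52SJUQCZO5C"       # from the Wikipedia link

raw = bytes.fromhex(hex_hash)
b32_hash = base64.b32encode(raw).decode("ascii")

print(len(raw), "bytes")
print(len(hex_hash), "hex characters")
print(len(b32_hash), "base32 characters")
print(len(wiki_example), "characters in the Wikipedia example")
```

So the identifier itself can shrink from 40 to 32 characters by switching encodings, but not below that without dropping bits of the hash.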
[+] [-] blhack|14 years ago|reply
[+] [-] JoshTriplett|14 years ago|reply
[+] [-] allisfine|14 years ago|reply
[+] [-] mahannay|14 years ago|reply
[+] [-] redthrowaway|14 years ago|reply