BitTorrent file sharers are heavily monitored, study finds
Large internet companies logging IP addresses involved in file exchanges on behalf of copyright enforcers
If you've downloaded even one movie, song or TV show using the BitTorrent file-sharing system, chances are, it didn't go unnoticed.
A U.K. study has found that pretty much all files shared with the help of popular torrent sites like The Pirate Bay are monitored — mostly by large internet service companies likely acting on behalf of copyright enforcers or private corporations.
Researchers at the University of Birmingham examined the 100 most popular files in every content category on The Pirate Bay and found that the IP addresses of file sharers around the world were being tracked by a number of monitors posing as file sharers themselves.
Music and movie files were the most heavily monitored.
"We picked The Pirate Bay as the biggest illegal downloading site and one which is getting a lot of attention at the moment," said lead study author Tom Chothia, a computer scientist at the University of Birmingham.
"What we've shown is that there is very large–scale monitoring going on. There could well have been a lot of monitoring which we didn’t see, as well."
The Pirate Bay is one of the file-sharing sites most actively pursued by authorities. It made headlines again this week after one of its co-founders was arrested in Cambodia and threatened with deportation to Sweden, where he has been convicted of copyright violations and faces a one-year prison sentence.
The most high-profile court cases against file sharing have generally been those targeting the administrators of large file-sharing sites like The Pirate Bay, isoHunt and Megaupload, but there have also been numerous attempts to sue individual users for illegal downloading activity.
"The work we did partly resulted out of these court cases where people have been threatened with legal action," Chothia said. "We wondered what kind of evidence people would actually need to take action against someone."
Monitoring being done on behalf of others
Monitoring of file-sharing sites has been documented in the past, but Chothia and his colleagues wanted to get a better sense of the extent and type of monitoring going on.
Over the two years between 2009 and 2011 when they conducted their research, they found that those doing the heaviest monitoring of file sharing activity were large internet service providers that rent out server space, host websites and offer other computer services for big clients.
"We speculate that copyright enforcement companies are using these hosting companies as a front to disguise their identities," the researchers write in a paper presented at the SecureComm computer security conference in Padua, Italy, this week.
Four of the six largest monitors the study identified were based in the U.S., one was in Brazil and another in Ireland, but the parties actually collecting the data — and the people being monitored — could be anywhere in the world.
The six largest monitors identified in the paper are:
- Speakeasy Inc.
- Qwest LLC
- HEAnet Ltd.
The monitoring these companies are doing differs from what ISPs like Rogers or Bell do when they monitor their own customers' bandwidth use on their internal networks or turn over the names of clients whose IP addresses have been linked to file sharing.
The companies identified in the study are running file-sharing programs on behalf of third-party customers who want to detect file sharers.
"These are all businesses who rent out computing space and internet space," Chothia said. "The jobs of those hosting companies are to run these kind of computations for other people."
Who exactly the monitoring companies are working for and what their clients plan to do with the information collected is uncertain.
One possibility is that the third parties are copyright enforcement agencies or companies that plan to use the information as evidence of illegal file sharing in court cases.
Another possibility is that they are companies that want the data for commercial purposes.
"It could just be collecting marketing information," said Chothia. "So, if it's information about who's downloading what files where in the world, it's actually quite valuable, because it shows how popular various copyrighted material, various music and films are in different territories."
Chothia and his study co-authors also identified a few small-scale hosting and computer security companies in the U.S. and Europe monitoring file sharing, but those tended to look at particular files or subjects rather than perform mass monitoring.
Some of the smaller monitors, like one called Checktor, openly identified themselves as providers of commercial BitTorrent monitoring services or were known copyright enforcement agencies (Peer Media Technologies, for example), while others did not "publicly acknowledge monitoring BitTorrent," the researchers write.
BitTorrent users have tried to thwart monitoring by compiling so-called blocklists of suspect IP addresses, including those associated with law enforcement agencies, which BitTorrent software then rejects. But the researchers found that those lists miss many of the biggest and most active monitors.
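To make the blocklist idea concrete, here is a minimal Python sketch of how a client might reject connections from listed addresses. The ranges below are reserved documentation addresses chosen purely for illustration, and the `is_blocked` helper is a hypothetical name; this is not the format or logic of any particular blocklist or BitTorrent client.

```python
import ipaddress

# Hypothetical blocklist entries. These are documentation-reserved ranges,
# not addresses of any real monitor.
BLOCKLIST = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.64/26"),
]

def is_blocked(ip):
    """Return True if the address falls inside any blocklisted range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in BLOCKLIST)

blocked = is_blocked("192.0.2.55")    # inside the first range
allowed = is_blocked("203.0.113.5")   # not in any listed range
```

A list like this only helps against addresses someone has already identified, which is consistent with the study's finding that many of the most active monitors were not on the lists.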
For the most popular files, it took an average of three hours for a person's file-sharing activity to come to the attention of a monitor, the study found, and monitors did not differentiate between habitual users sharing large numbers of files and those who shared a single file for a brief period.
"All the illegal files we looked at were monitored, but some of the least popular ones, it would take a day, two days for them to actually be connected to," Chothia said.
Chothia and his colleagues also looked at some sites that facilitate legal sharing of copyright-free content such as open source software but found that those types of files were not monitored.
Dummy torrent client helped snag monitors
The researchers were able to monitor the monitors by setting up dummy BitTorrent client software, the program needed to initiate and manage the file-sharing process. Their fake client acted like a regular file sharer in all ways but one — it never actually shared any files.
The BitTorrent system of sharing files works by having users, called peers, exchange pieces of a file over a network that uses something called a tracker to facilitate communication between peers.
File sharers download fragments of files from multiple users based around the world and simultaneously share them with other users, making the downloading process faster than a simple peer-to-peer exchange.
Trackers act as directories of users, identified only as IP addresses, sharing a particular file.
Sites like The Pirate Bay aggregate links to trackers, organizing information by names of files — in most cases, names of TV shows, movies, songs and albums — and acting as de facto search engines for torrent files.
"The process starts by a single peer who has a complete file telling a tracker it has a complete file," Chothia said. "Then, that peer just waits online for another peer to connect, and the file is transferred directly between the peers.
"The tracker just has a big list of IP addresses for each file so whenever any of those other clients ask, the tracker says, 'These IP addresses are all sharing the file, if you want a copy, go and talk to them.' "
Direct vs. indirect monitoring
One way copyright enforcement agencies collect evidence of file sharing is simply to track the tracker and not individual users.
Past studies have looked at this type of indirect monitoring and found it to be an unreliable way of tracking file sharing since trackers can contain random IP addresses put there to thwart monitors or IP addresses that were assigned to one user but later were reassigned to another.
"When a tracker says this IP address is sharing a file, there's absolutely no guarantee that that IP address actually is," Chothia said. "There can be all kinds of IP addresses in there – wrong IP addresses — so that kind of work … certainly wouldn't stand up in court and certainly couldn't be used as real evidence of file sharing."
Chothia and his team looked at direct monitoring, whereby a monitor first gets the list of file sharers from the tracker database, then connects directly to individual peers to verify they are, indeed, sharing the file. It does this by masquerading as a regular file-sharing peer without ever actually completing the downloading process.
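The difference between the two approaches can be illustrated with a short Python sketch. It is a simulation under stated assumptions: the function names are hypothetical, the `responds` callable stands in for an actual network connection to a peer, and the addresses are reserved documentation addresses.

```python
def indirect_monitor(tracker_ips):
    """Indirect monitoring: take the tracker's list at face value."""
    return set(tracker_ips)

def direct_monitor(tracker_ips, responds):
    """Direct monitoring: keep only the peers that answer a direct connection.

    `responds` is a stand-in for a real connection attempt to the peer.
    """
    return {ip for ip in tracker_ips if responds(ip)}

# The tracker's list includes one bogus entry (203.0.113.99).
tracker_ips = ["192.0.2.10", "198.51.100.7", "203.0.113.99"]
live_peers = {"192.0.2.10", "198.51.100.7"}

indirect = indirect_monitor(tracker_ips)
direct = direct_monitor(tracker_ips, lambda ip: ip in live_peers)
```

In this simulation the indirect monitor records the bogus address, while the direct monitor drops it because nothing answers there.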
It was by first studying how normal file sharers behave and then looking for unusual patterns that the researchers were able to spot the spies in the file-sharing swarm.
"When our client [software] pretended to share illegal content, it would quite quickly get connected to by other alleged file sharers from particular IP addresses who would keep checking back with us every so often," Chothia said. "However long we pretended to be sharing, they'd keep checking us, and those clients themselves would never actually download.
"When we asked them what pieces [of a file] they were sharing, they would always report something random, never consistent, so, clearly, were not actually downloading themselves."
Legality of data collection in question
Still, even direct monitoring has its flaws. While carrying out their sting operation, the Birmingham researchers found that although the monitors would connect with the peers sharing a particular file, they never verified whether those peers truly had the file.
"For this study, we never had any illegal content, but the monitors never actually checked if we had any, so our fake clients would have looked just as guilty as people who are really file sharing," Chothia said.
Chothia said he assumes the monitors are logging not only IP addresses but also the names of shared files and the dates and times they were shared, information that can later be linked to actual individuals.
But it's unclear whether such evidence could ever be relied on in court or even whether the act of monitoring itself is legal, meaning the ISPs providing the service could be subject to inquiries by privacy regulators.
"In the EU, there are quite strong data protection laws, and people who store personal data have to fulfil a lot of criteria, and this could definitely be looked on as personal data about the people being monitored," Chothia said.
"Without knowing where the companies are based, it's hard to say if it's legal for them or not."
No clear winner in content 'arms race'
Chothia figures the information collected through direct monitoring definitely provides enough evidence to threaten to sue someone "but not necessarily enough to follow through."
In the meantime, the cat-and-mouse game between those who believe content should be free and those who want to keep it confined within the statutes of copyright law continues — with no clear winner in sight.
"[File sharers] can make it harder to be monitored," says Chothia. "They can only share for short periods of time or change their IP address frequently. It's possible they can use proxy services to hide their identity.
"But then that gets into a bit of an arms race: illegal sharers can do something to make it harder to track them; the monitors can have more advanced systems to track the sharers. It could definitely go both ways."