The internet is undoubtedly one of the greatest inventions of the modern age. Never in all of history have so many people had access to so much information, easily and - for the most part - for free. Projects such as Wikipedia have shown what can be achieved by a desire to do good that harnesses the power of the masses working together towards a common goal.
Where previous generations in need of knowledge turned to the Encyclopedia Britannica (if they were fortunate enough to own the multi-volume repository), people now turn to its online counterparts - Google, Wikipedia, or simply a question asked on Twitter. Even Britannica itself has replaced the famous leather tomes with a subscription-based website that’s better equipped to keep up with the public’s constant thirst for knowledge.
Social media has grown at a phenomenal rate and with it transformed our ability to communicate on a wide scale. Tracking down old friends and colleagues is no longer the preserve of amateur detectives with a battered book of phone numbers and addresses circa 1993.
Now it’s simply a case of going on Facebook and typing in their name - unless of course they happen to be called John Smith or Patrick Murphy, then you might still be in for a spot of sleuthing.
We have available to us an incredible number of online services such as Skype, Gmail, Google Docs, Twitter, Facebook, Dropbox, and innumerable websites that have now become an essential part of people’s lives - all of which offer their wondrous bounty for the princely sum of naught. Truly this is a golden age for technology and those that would have the good fortune to use it. But how does this make any sense at all?
We know that businesses need to make money to even exist, let alone thrive. Google’s data centres are famously home to thousands of computers, each holding fragments of the world wide web within them. YouTube users upload over 48 hours of video to the site every minute, which works out to a staggering 8 years of content every day. Facebook is currently home to over 900 million users, giving it a population greater than most countries.
We don’t pay to search, upload badly taken photographs of school plays, or watch little pandas sneezing, yet Google is one of the most profitable companies in the world and when Facebook recently went public it was valued at a whopping £66 billion.
So how do they do it? What’s the secret to their success? Well, technical brilliance aside, the answer is very simple. It’s you. Or more accurately, what you like. Or even more accurately, what you are likely to buy.
The free services we access on a daily basis are watching where we go and what we do, and using that information to provide their advertisers, or in some cases other people’s advertisers, with profiles that enable them to sell to us more effectively and increase the chance that we will click the ‘add to basket’ button. As the saying goes: ‘If you’re not paying for it, you’re not the customer; you’re the product being sold’.
During his recent TED talk ‘Tracking the Trackers’, Mozilla CEO Gary Kovacs discussed the idea of Behaviour Tracking and its proliferation across the web. In essence, when you visit a website a cookie is created in your browser which allows the site to know you are there and helps perform basic tasks such as maintaining the contents of your shopping basket while you continue to browse the site.
Cookies can also gather information on the pages you visit and items you click on, so that the content you are offered is more relevant to your tastes. Generally they enhance the browsing experience (just try disabling all cookies in your browser privacy settings to see how clunky the web can really be) and often save us from having to log in or set preferences every time we visit a favourite site.
All of this is perfectly acceptable; in fact it’s helpful. But the issue Kovacs was discussing happens after you leave the site in question and go elsewhere.
Traditionally you would expect a site to retain the information on the cookie for the duration of your stay and then for it to become inert once you leave. But that isn’t always the case. Third-party cookies, set by companies other than the site you are visiting, are now very likely to be watching our movements too, sometimes across several sites, and without our ever having given consent.
The effects are easy to observe; in fact you’ve probably already seen them several times. Browse or search for details on, say, the new Batman film and it won’t be long until related products begin appearing in the advertisements on other sites you visit, sometimes with unnerving accuracy.
This is made possible by the relationships between the host sites and online advertising companies such as Scorecard Research, Tribal Fusion and Google’s own DoubleClick. The idea is to provide you with a more tailored experience online, and of course tempt you to click through to the item in question and make a purchase.
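For the technically curious, the mechanism described above can be sketched in a few lines of Python. This is a toy simulation rather than how any real advertising company works: the `AdNetwork` class, the `adnetwork.example` cookie domain and the three host sites are all invented for illustration. It shows why a single cookie, sent back to the same ad network from every site that embeds its adverts, is enough to stitch separate visits into one profile.

```python
import uuid
from collections import defaultdict


class AdNetwork:
    """Hypothetical third-party ad network embedded on many sites."""

    def __init__(self):
        # Maps a cookie ID to the list of (site, page) pairs seen with it.
        self.profiles = defaultdict(list)

    def serve_ad(self, cookie_jar, site, page):
        """Handle the request a browser makes when a host page loads an ad."""
        # The browser stores one cookie for the ad network's domain and sends
        # it back no matter which host site embedded the ad.
        cookie_id = cookie_jar.get("adnetwork.example")
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex
            cookie_jar["adnetwork.example"] = cookie_id
        self.profiles[cookie_id].append((site, page))
        return cookie_id


network = AdNetwork()
browser_cookies = {}  # one visitor's cookie jar, keyed by cookie domain

# Three unrelated host sites that all embed the same (made-up) ad network.
network.serve_ad(browser_cookies, "news.example", "/film/batman-review")
network.serve_ad(browser_cookies, "shop.example", "/dvd/batman-box-set")
visitor = network.serve_ad(browser_cookies, "sport.example", "/football")

# The network now holds one profile covering all three sites.
for site, page in network.profiles[visitor]:
    print(site, page)
```

None of the three host sites shares anything with the others directly; the link between the visits exists only on the ad network's side.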
Revenue generated from these advertisements is quite staggering. In 2011 Google was reported to have made $37 billion, with nearly all of it coming from its AdWords and AdSense income streams. Facebook also raked in a highly respectable $3.7 billion with 85% coming from advertising. As you can see from the figures it’s no surprise that companies are very keen to know what we want to buy, and the most effective way to place those products in front of us.
The idea of our habits being monitored, especially by those that would seek to profit from it, is an uncomfortable truth that now accompanies our heavily interconnected online world. Gary Kovacs puts it very eloquently: "We are like Hansel and Gretel, leaving bread crumbs of our personal information everywhere we travel through the digital woods".
To reveal the scale of the tracking that goes on, Mozilla has developed the Collusion project and an accompanying Firefox plug-in, which displays the tracking cookies in a fashion similar to cells under a microscope - and it doesn’t take long for those cells to multiply.
"On a technical level, Collusion plugs into your browser and watches all of these requests to web sites and third parties involving cookies", explains Mozilla's Ryan Merkley. "Firefox already makes a record of some of this (your browsing history) and Collusion is recording a little bit more so it can be drawn on your screen. The more you browse, the more Firefox and Collusion accumulate about the site relationships, and your graph gets larger."
That’s something of an understatement. In an experiment to see how widespread the behaviour was we cleared the browser history on one of our machines, installed Collusion, then visited some normal, everyday favourites from our bookmarks: The Guardian, Football365, Facebook, Twitter, and a few others. After viewing just nine sites, and spending less than 20 minutes online, we had been tracked by thirty-five third-party cookies.
In another test we reset the browser and visited the site of a popular high-street video game retailer - the results were startling. By just landing on the home page we saw eleven third-party links appear on our Collusion graph. Ryan Merkley’s initial experience of the application was equally concerning.
"I first tried Collusion when one of our engineers shared an early proof of concept," he says. "I was shocked at the number of trackers, and most of all, by the number of times a very small group of trackers showed up. Those few trackers know more about my combined browsing habits than any website ever would. It made me want to know how they use that data, and have a tool to decide for myself whether they would be able to collect my data at all."
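The kind of graph Merkley describes can be approximated with a very small sketch. The code below is not Collusion's own code; the request log and every domain in it are invented. It simply records which third parties were contacted from which sites, then ranks the third parties by how many sites they appeared on, which is the "very small group of trackers" effect in miniature.

```python
from collections import defaultdict

# Hypothetical log of (site visited, third-party domain contacted) pairs.
requests_seen = [
    ("news.example", "tracker-a.example"),
    ("news.example", "tracker-b.example"),
    ("shop.example", "tracker-a.example"),
    ("social.example", "tracker-a.example"),
    ("social.example", "tracker-c.example"),
]

# Edges of the graph: each third party maps to the set of sites it was seen on.
sites_by_tracker = defaultdict(set)
for site, tracker in requests_seen:
    sites_by_tracker[tracker].add(site)

# Trackers present on the most sites see the largest share of the browsing.
ranked = sorted(sites_by_tracker.items(), key=lambda item: (-len(item[1]), item[0]))
for tracker, sites in ranked:
    print(tracker, len(sites), sorted(sites))
```

In this made-up log one tracker turns up on all three sites, so it alone can see the whole browsing session, while the other two each see only a single site.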
It’s not just the secretive third parties that watch what we do; sometimes it’s sites that we trust. In late 2011 blogger Nik Cubrilovic published a post that showed how Facebook was using persistent cookies that could track web use even after users had logged out of its site.
The news that the social media giant might be quietly watching exterior online behaviour quickly spread across the internet and brought angry responses on blog posts and forums (which to be fair is not an unusual location for that sort of reaction).
Facebook immediately addressed the issue and went to lengths to reassure people that it hadn’t gathered information but instead the cookies were used as a form of security against spammers and unauthorised log-ins. Others worked with the ‘Like’ functions found on various sites around the web. Within two days of his post Facebook fixed the apparent bugs and Cubrilovic confirmed this on his blog along with thanks for the speedy resolution.
Then shortly afterwards Cubrilovic was contacted by a friend on Twitter who had found a third-party site on which Facebook had set one of the previously offending ‘datr’ cookies, only now it was capable of sending information back to Facebook without the user ever having logged in.
The cookie worked behind the ‘Like’ function on the page and was able to identify the user even if they didn’t interact with the widget. Cubrilovic investigated further and found several other sites that now had these cookies active.
Facebook was once again quick to respond, saying that this wasn’t a re-enabling of the cookies but rather a bug that affected certain sites that called the API in a non-standard way. Once again it fixed the issue and assured users that it didn’t build profiles using this kind of data.
It’s reasonable to accept what Facebook says; after all it did move to plug the gaps quickly and was public about its reasons for using cookies. However, this isn’t the only occasion on which its attitude to the privacy of user data has been brought into question.
Several times in the past few years it has introduced new functions to the site and automatically opted users in, often making data that was previously private suddenly public - at least until tech-savvy users sent around instructions on how to reverse the problem.
The latest instance was in June when Facebook replaced every user’s email address with an @facebook.com alternative without asking their permission or even letting them know that it had happened. A story also emerged in July that revealed the existence of a Facebook ‘Data Science department’ which analyses the information the company gathers on its users to search for patterns that it may be able to use.
In an article on MIT’s Technology Review site, Tom Simonite reported that one of the team’s data scientists - Eytan Bakshy - had conducted a subtle experiment. According to Simonite, Bakshy ‘messed with how Facebook operated for a quarter of a billion users.
Over a seven-week period, the 76 million links that those users shared with each other were logged. Then, on 219 million randomly chosen occasions, Facebook prevented someone from seeing a link shared by a friend.
Hiding links this way created a control group so that Bakshy could assess how often people end up promoting the same links because they have similar information sources and interests.’ The theory might be interesting, and the results possibly useful, but the methods of obtaining the information remain highly questionable.
Who else is tracking you?
Of course it’s not only Mark Zuckerberg and his social scientists that are watching our clicks with interest. Twitter was in the news recently when it was revealed that the micro-blogging company had sold two years' worth of archived Tweets to data research company DataSift.
Social media app Path was found to be uploading the entire contact list from iPhones without the consent of their owners. Android phones (mainly in the US) were being sold with Carrier IQ software installed that some analysts believed was capable of tracking keystrokes and text messages.
Last February, The Wall Street Journal reported that Google had been tracking users of Apple’s mobile Safari browser through cookies that acted as if the user had granted permission for ads to be displayed. During the investigation it was discovered that a few other large advertising companies were also using similar coding to capitalise on the loophole in Safari.
Google responded that the newspaper mischaracterized what happened and said in a statement that it ‘used known Safari functionality to provide features that signed-in Google users had enabled. It's important to stress that these advertising cookies do not collect personal information.’
Whatever the justification Google promptly disabled the code and Apple set about closing the loophole in its browser. As we go to press there is widespread reporting that the US Federal Trade Commission is about to fine Google $22.5 million for the breach of privacy. This would make it the single biggest penalty the government body has ever handed out.
Google wasn’t collecting more information, but due to the composite nature of the different sets of data there’s no doubt that the information was more valuable to advertisers. Recently at the Google I/O developers conference the company also revealed a new feature for its Android mobile operating system - Google Now.
This acts as a personal assistant similar to Apple’s Siri, but the aim of Now is for it to learn about your behaviours - where you live, how you travel, foods you like to eat, places you like to shop - and to provide you with relevant information by using location services to know where you are and what is around you. It’s hugely ambitious, possibly brilliant, but it also means that your privacy is once more brought into question when a device is essentially tracking how you live.
The short explanations very rarely mention anything about third-party cookies and often only appear the first time you visit the site, meaning that if users just continue as normal then they will be tracked in the same way they were before.
What you can do to avoid being tracked
For a more permanent solution it would be wise to consider using the ‘Private’ or ‘Incognito’ browsing modes, which discard cookies and web history once you close the window rather than storing them permanently.
Another sensible move is to visit the security settings in your browser of choice and look for the ‘Do Not Track’ or ‘Do not allow third-party cookies’ option. Plug-ins such as Ghostery are available for most browsers and offer an enhanced level of security, while Internet Explorer users can also use the ‘Tracking Protection Lists’ function, which lets them subscribe to lists of known offending sites and have the browser stop those sites placing cookies on their machine.
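It helps to know what the ‘Do Not Track’ option actually does: the browser adds a `DNT: 1` header to each request, and it is then up to the receiving site to honour it, which is voluntary. The sketch below shows how a well-behaved server might check for the header. The `choose_ad` function and the browser name are invented for illustration and are not taken from any real ad server.

```python
def wants_no_tracking(headers):
    """Return True if the request headers carry the Do Not Track signal."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("dnt") == "1"


def choose_ad(headers):
    """Hypothetical ad server: fall back to a generic ad when DNT is set."""
    return "generic ad" if wants_no_tracking(headers) else "targeted ad"


print(choose_ad({"User-Agent": "ExampleBrowser/1.0", "DNT": "1"}))
print(choose_ad({"User-Agent": "ExampleBrowser/1.0"}))
```

The important point is that the decision sits entirely with the server: a site that chooses to ignore the header can carry on exactly as before.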
Sadly even these settings can be overcome by something called Device Fingerprinting. Your system is made up of lots of small details - such as browser type, operating system, plugins, and even system fonts - that can be scanned and interrogated to reveal an individual digital ‘fingerprint’ that distinguishes your machine from any other on the web.
The Electronic Frontier Foundation is a digital rights group which campaigns against these invasions of civil liberties. To illustrate the effectiveness of Device Fingerprinting it set up the Panopticlick website which allows users to see how identifiable their systems really are. We tested our road-worn laptop and were slightly unnerved to discover that out of the 2,287,979 users that had scanned their machines on the site ours was unique and therefore trackable.
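To give a feel for how little it takes, here is a minimal sketch of the idea, not Panopticlick's actual method. It combines a handful of attributes into a single hash; the attribute values are made up, and the choice of SHA-256 and a 16-character identifier is ours. The last lines turn the visitor count quoted above into bits of identifying information, a common way of expressing how unique a fingerprint is.

```python
import hashlib
import math


def fingerprint(attributes):
    """Hash a dict of browser/system attributes into one short identifier."""
    # Sort keys so the same attributes always give the same fingerprint.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Made-up attribute values, for illustration only.
laptop = {
    "browser": "Firefox 14",
    "os": "Windows 7",
    "plugins": "Flash 11.3, Java 7, QuickTime 7.7",
    "fonts": "Arial, Calibri, Comic Sans MS, Verdana",
}
same_laptop_one_extra_font = dict(laptop, fonts=laptop["fonts"] + ", Wingdings")

print(fingerprint(laptop))
print(fingerprint(same_laptop_one_extra_font))

# Being unique among N visitors carries log2(N) bits of identifying information.
visitors = 2_287_979  # figure quoted in the article
print(round(math.log2(visitors), 1), "bits")
```

Installing a single extra font is enough to produce a completely different identifier, which is why long font and plug-in lists make a machine so easy to single out, and no cookie is involved at any point.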
Some would argue that the ability for companies to track our interests is a fair price to pay for the services we receive. After all, websites are only offering opportunities for advertisers to pitch at us. Television, radio, and cinema all do the same thing and use demographic profiling to promote certain products at calculated times - hence all the beer and crisp commercials during football coverage.
No-one forces us to use social networks although if you do it will invariably be one of the big three, as you’ve no hope of finding friends on any other. Plus, there are alternatives to all of Google’s offerings. Is it such a bad thing that we are being watched? Well it all depends on who is doing the watching...and why.
The law on privacy
Currently negotiating its way through parliament is the government’s ‘Communications Data Bill’, or ‘Snooping Bill’ as opponents prefer to call it. This radical restructuring of British law would force ISPs to retain communications data for all emails, web browsing, and even mobile phone use in the UK.
The authorities would then be able to access the information in a limited capacity at any time, or in its entirety when granted a warrant. The government is also claiming that it will be able to decode SSL encryption as part of the process.
For a nation that’s already often heralded as the most surveillance-heavy country in the world this seems to many like a step too far. What some privacy campaigners see as the real problem though isn’t necessarily the government prying into our digital lives (although that’s obviously a concern that they rate highly), but rather the fact that so much information about us will be collected and stored, with potentially dangerous consequences.
Most of us would agree that although we might not always like how our elected officials behave, we still have a reasonably accountable government. We could also agree that while companies like Facebook and Google make money from us in a less than transparent manner, they too are not out to harm us in any way. But the fact that they're all intent on assembling hugely detailed data sets about us poses a problem if that information is ever used by organisations or individuals that are not so benign.
In January 2010 Google announced that it had been the victim of a ‘sophisticated cyber attack originating in China’. The hackers had gone after the email addresses of human rights activists, journalists, and some senior US officials. Wikileaks then released leaked cable communications that suggested a senior member of the Chinese Politburo had been involved in the attacks.
Some reports suggested that the motives behind the attack were to identify and locate political dissidents who were speaking out against the government’s human rights record. 2011 saw the rise to prominence of the ‘Hacktivist’: modern-day protesters against what they see as abuses of civil liberties or digital freedom.
Among the high-profile victims of groups such as Anonymous and LulzSec are Sony, the US Bureau of Justice, our own Ministry of Justice, the Home Office, several Chinese government sites and even the Vatican...twice. In fact Verizon released figures reporting that over 100 million users’ data had been compromised by the groups in the past year.
Although the motivations of Hacktivists are generally more positive ones, the ease with which they seem to be able to either bring down or infiltrate supposedly secure sites should bring into question the wisdom of having so much personal information stored in one location, undoubtedly attached to the internet.
So what can we do? Well, at the moment the Communications Data Bill isn’t yet law, so contacting your MP would be a good place to start. In fact, getting as many people as possible to contact their MPs would be a better place.
Change your online habits
But a more immediate solution is that you need to be aware that whatever you put online is no longer private. Use tools such as Collusion to monitor who is watching you.
Change your browser security settings to stop these cookies being stored in the first place, and regularly clear out histories and cookies to remove those that slip through the net. Limit the use of location services, resist clicking the ‘Like’ button, and even use a different browser for your social networks entirely.
The more cautious could boot a Linux operating system from a USB flash drive when using public Wi-Fi, or download the new Wi-Fi Guard from AVG. In the end, though, the most potent weapon we can use against those who would try to track us and profit from it is an awareness of what might be happening.
Call in the professionals
Allow is a UK company that specialises in helping you manage your online profile. Its new subscription-based plan offers services such as searching the web to see what marketing databases your name appears on and helping remove you from them, a Google Chrome ad-blocking plug-in, and an email tool which randomly generates addresses for filling out forms or subscribing to websites anonymously, but which you can access from the Allow website.
There are also social media tools which analyse your privacy settings to warn you of any dangers to your data, and a profiling service that tells you how potential employers would view your public content. The whole package is new and could well prove useful. Prices are still being finalised, but expect to pay around £6 per month.
A few simple ways to protect yourself
- Limit the information you share publicly. If including your birthday in a social network profile, avoid putting the year - this is a key identifier which trackers use to pinpoint you in a crowd of John Smiths.
- Use a secondary email address when registering for websites, forums, competitions, or shopping. Hotmail and Yahoo offer free accounts which are quick to set up.
- Use one browser for general web-activities and another for more sensitive information. Also avoid the ‘Like’ or ‘+1’ buttons on sites.
- Change your passwords regularly and don’t use the same one on all your accounts. Use a password manager so you don't have to remember them all.
- Make sure any browser you use has the Privacy settings switched on. Generally their default setting is 'off'.
The internet is growing up and maturing into a stunningly powerful tool and an amazing place to explore. Like any scene of great adventure there are a few dangers to negotiate along the way.
We shouldn’t let the fear of a Big Brother state or shadowy data brokers deny us the advantages of services like Google or Facebook, and if we’re careful to limit the important data we share then we should be able to protect ourselves in some measure.
Ryan Merkley from Mozilla sums it up rather well, "Tracking is a complicated issue, and a no-tracking universe probably isn't the answer. We want users to be informed and in control of their web experience: the more they know, the less likely anyone can track them without their knowledge. In the long run, informed and empowered users has always been the best thing for the web."