
Net 'full of websites that are never visited'

Zscaler says chunks of the web remain untouched

A huge number of websites are never visited, says Zscaler.

Analysing visits to several million websites during the last quarter of 2009 for its State of the web report, the cloud security startup created a Hilbert curve-generated 'heatmap' of active and inactive IPv4 sites from real customer data.

As expected, the grid that emerged showed clusters of active sites as white dots and a large volume of reserved or non-routed addresses in grey, but it was the sea of dark that loomed largest of all.
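For readers curious how such a map is laid out, here is a minimal sketch in Python of the standard Hilbert curve index-to-coordinate conversion, applied to IPv4 addresses. It is not Zscaler's method, which the report summary does not describe. The function names `hilbert_d2xy` and `ip_to_cell`, and the choice of a 16x16 grid keyed on the first octet of the address, are assumptions made purely for illustration.

```python
import ipaddress


def hilbert_d2xy(n, d):
    """Convert distance d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so the sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def ip_to_cell(address):
    """Map an IPv4 address to a cell on a 16x16 grid by its first octet.

    Illustrative assumption: one cell per /8 block. A finer map would use
    more leading bits and a correspondingly larger grid.
    """
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    return hilbert_d2xy(16, first_octet)


if __name__ == "__main__":
    for addr in ("8.8.8.8", "9.0.0.1", "192.0.2.1"):
        print(addr, ip_to_cell(addr))
```

The appeal of the Hilbert curve for this kind of plot is that numerically adjacent address blocks land in neighbouring cells, so contiguous ranges show up as compact patches rather than scattered stripes.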

In the three months of the analysis, vast numbers of sites were not visited at all. On the assumption that Zscaler's customers are typical of internet users more generally, these are the web's lost continent: sites nobody ever visits, or visits so infrequently that it doesn't register.

"It's a fascinating view which exposes just how vast the internet truly is. Even when analysing traffic from millions of users over the course of three months, it can be seen that much of the internet remains untouched," say the report's authors.

Commentators often refer to the 'dark side of the web', meaning the criminal and unsavoury parts of the internet few normally look closely at, but what Zscaler has turned up on its map is dark in a more literal sense. Nobody looks at these sites or, if they do, it is incredibly hard to detect from the cloud.

Some of this 'unlit space' could, of course, be non-English speaking domains beyond the ken of Zscaler's customer base, which raises the possibility that there are several 'long dark tails' on the internet, depending on the point from which you measure the phenomenon.

Part of the explanation for what does not get visited in Zscaler's report might also lie in what does.

According to the company, even half a decade ago the web was just that, a space defined by HTML files.

Although many persist in seeing the web in this way, the file types moving across its servers have changed markedly. Now, more than half of such files are JPEGs or GIFs, with HTML files accounting for only 0.57 percent of files.

Popular domains also dominate the internet, hoovering up more and more of people's attention. LivePerson, Google, DoubleClick (the web ad distribution network), Yahoo, Facebook and a clutch of lesser-known but structurally important web domains took a large percentage of all web visits, a sign that the web is becoming more concentrated on fewer locations. This is the part of the internet that is growing.

Tellingly, a similar story of concentration is seen in terms of malware hosts, though with considerable fluctuations.

Depending on the particular type of scam being looked at, a huge number of malicious URLs emanate from a very small number of hosts.

Whether botnets, phishing websites, or malware servers, there is usually a single mega-source, one or two large sources, and a large number of sources with extremely small shares.

See also: UK Web Archive will offer just 1% of websites by 2011

