The Web is missing links

Web pages lead to cul-de-sacs

An in-depth study undertaken by AltaVista, Compaq, and IBM reveals that not all pages on the World Wide Web are as well connected as we think.

The Web is shaped like a large bow tie with many underconnected sites out on its hard-to-reach fringes, say the researchers, who hope to use their indexed results to design better search engines and help electronic-commerce sites get noticed.

To determine the Web's structure, the companies used the AltaVista search engine and Compaq AlphaServer hardware to perform two massive "crawls" of more than 200 million pages by following the 1.5 billion hyperlinks connecting them.
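A crawl of this kind can be pictured as a breadth-first walk over links. The following is only a minimal sketch on a made-up, in-memory set of pages; the `crawl` function, the `toy_web` dictionary, and the page names are hypothetical and are not taken from the AltaVista crawler.

```python
from collections import deque


def crawl(seed_pages, get_links):
    """Breadth-first crawl: visit every page reachable from the seeds.

    get_links is a hypothetical callback returning the pages a page links to.
    Returns the set of pages found and the number of links followed.
    """
    visited = set(seed_pages)
    queue = deque(seed_pages)
    link_count = 0
    while queue:
        page = queue.popleft()
        for target in get_links(page):
            link_count += 1
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return visited, link_count


# Toy link graph standing in for the Web (illustrative data only).
toy_web = {
    "home": ["news", "shop"],
    "news": ["home", "story"],
    "shop": ["home"],
    "story": [],
    "orphan": ["home"],  # links in, but nothing links to it
}

pages, links_followed = crawl(["home"], lambda page: toy_web.get(page, []))
print(sorted(pages), links_followed)
```

In this toy example the page named "orphan" is never found, because a crawl that only follows links cannot discover a page that nothing it has visited links to.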

Search engines normally perform crawls to create the indexes that help speed up searches, says Jim Schissler, an AltaVista spokesperson.

IBM researchers analysed the results and discovered that about a third of all Web pages are in a "strongly connected core", the knot of the figurative bow tie.

You can easily travel between those pages via hyperlinks. Meanwhile, one side of the tie, containing about a quarter of all Web pages, consists of "origination" pages that let you eventually get to the core, but can't be reached from it.

Likewise, "termination" pages on the other side of the tie can be reached from the core, but offer no route back to it. Finally, one-fifth of the pages can't be reached from the core at all, but only from origination or termination pages.
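The bow-tie regions described above can be sketched in code. This is a minimal illustration on a tiny made-up link graph, not the method the researchers used: it assumes one page known to be in the core, then compares what that page can reach with what can reach it. The function names, the `links` dictionary, and the page names are all hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Return every page reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def reverse(graph):
    """Return the same graph with every link pointing the other way."""
    rev = {}
    for source, targets in graph.items():
        rev.setdefault(source, [])
        for target in targets:
            rev.setdefault(target, []).append(source)
    return rev


def bow_tie(graph, core_page):
    """Split pages into bow-tie regions relative to one assumed core page."""
    forward = reachable(graph, core_page)             # pages the core can reach
    backward = reachable(reverse(graph), core_page)   # pages that can reach the core
    all_pages = set(graph) | {t for ts in graph.values() for t in ts}
    core = forward & backward
    return {
        "core": core,
        "origination": backward - core,
        "termination": forward - core,
        "other": all_pages - forward - backward,
    }


# Toy link graph (illustrative data only).
links = {
    "a": ["b"],
    "b": ["c"],
    "c": ["a", "out"],
    "in": ["a", "tendril"],
    "out": [],
    "tendril": [],
}

parts = bow_tie(links, "a")
for region, members in parts.items():
    print(region, sorted(members))
```

Here pages "a", "b" and "c" form the core, "in" is an origination page, "out" is a termination page, and "tendril" can be reached only from an origination page, never from the core.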

