UK National Archives to unlock file formats

Bid to open up legacy Microsoft files

The UK has embarked on a plan to make terabytes of government data locked up in mostly Microsoft proprietary file formats viewable to the public in their original form.

The National Archives, the repository for government records, has digital records in literally hundreds of esoteric file formats. However, the vast majority of the data is stored in legacy Microsoft Office formats, said Natalie Ceeney, chief executive.

Changes in software and operating systems have made viewing those documents in their original format impossible, said David Thomas, director of technology and chief information officer for the National Archives.

"We're not building a museum of old computers here," Thomas said. "We want to make it [National Archive content] readable on current desktop technology."

Microsoft has offered to help the National Archives use Virtual PC 2007, a virtualisation product that allows multiple OSes, including legacy OSes such as Windows 95, to run on a single piece of hardware. Microsoft is also providing older versions of Windows OSes as well as Office applications for the project.

The technology would offer the public the advantage of viewing documents in the form they were created, which can add context and depth, Ceeney said.

The National Archives receives much of its government information through a secure intranet, and that data is backed up to tape, Thomas said. Tape storage is the cheapest and the most robust way to keep data, so there are no old floppy disks around, he said. So far, the National Archives has about 580 terabytes of digital data.

Eventually, the National Archives envisions a system in which a citizen could use a computer at the National Archives running Virtual PC 2007 and view, for example, an older Microsoft Word document in its original form. A further step would be creating a way for people to do that over the Internet, he said.

At an event held at the National Archives, southwest of London, Microsoft didn't hold back in making a strong case for the default Office 2007 file format, Open XML (extensible markup language).

The format was approved in December 2006 as a standard by Ecma International, a European standards body. Microsoft has been trying to drum up support for it over the rival Open Document Format (ODF), an XML-based format used in free office suites such as Open Office.

For a long time, document file formats were mostly proprietary, and "we certainly weren’t the only ones who were doing it," said Gordon Frazer, managing director of Microsoft UK.

Frazer stressed that Open XML is no longer a proprietary or binary format controlled by Microsoft, but a standard controlled by Ecma. "We've worked very hard to embrace open standards," Frazer said.

Microsoft's attitude toward compatibility has been "a huge sea change" in recent years, said Adam Farquhar, head of e-architecture at the British Library, which is also working with Microsoft on digitising books in its collection.

"Microsoft has taken tremendous strides forward in solving the problem [of compatible file formats]," Farquhar said.

(Additional reporting by Leo King of Computerworld UK.)
