When it comes to digital data, we're churning out more than ever. But as new storage media and file formats emerge, making others obsolete, it becomes harder to access data we have previously archived. There's a good chance the digital data we are currently generating will become unusable within our lifetimes unless we take steps to preserve it. We look at the drawbacks of digital storage.
Looking to the future
All in all, those responsible for the stewardship of digital archives don't sound upbeat about the future.
"There is no answer to the core technology issue at the moment, which is that our infrastructure does not take the need for long-term preservation into account," says Maltz.
"The word is vigilance," says Faundeen at the USGS. "Preservation efforts must be ongoing. You cannot rest on past work. You must continuously look forward."
Says Le at the National Archives: "It's a never-ending process, and if anything the situation is getting worse." The number of data formats keeps proliferating, and the volume of data arriving at the National Archives could at any moment become overwhelming. Nonetheless, he says, "I am confident that the things we process will be preserved."
The last word, for now, goes to Coughlin. "If you want data to last, you can't just let it sit there," he says. "It has to be active. You have to care for it, or it may eventually get lost."
Archival standards in the making
An often-cited example of a group doing archival standardisation work is the Storage Networking Industry Association (SNIA, www.snia.org) in San Francisco. Wayne Adams, SNIA's chairman and a senior technologist at storage vendor EMC, says the association has developed the following three standards to address the issue:
- XAM (Extensible Access Method)
Adams says that this standard separates the application from the data and "lets you manage the data in its own right and not worry about the forward migration of the application. Otherwise, to use the data 15 years from now you'd have to put a whole system in a time capsule". According to SNIA, XAM contains metadata definitions to help archived data achieve application interoperability and to make it more searchable. SNIA's web site lists XAM-based products or services from 13 different organisations.
- SIRF (Self-contained Information Retention Format)
This standard should make it possible for future users to query archived data without having to use the original application. SNIA literature calls it "a specification that defines a logical container format appropriate for the long-term storage of digital information".
- CDMI (Cloud Data Management Interface)
This standard also defines metadata and other storage parameters and is therefore applicable to archiving, according to Adams.
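The idea running through all three standards, keeping descriptive metadata alongside the data so it can be understood without the original application, can be illustrated with a short Python sketch. This is not the XAM, SIRF or CDMI format: the record layout, the field names and the functions `build_archive_record` and `read_archive_record` are hypothetical, chosen only to show the principle of a self-describing record that carries its own checksum.

```python
import hashlib
import json
from typing import Tuple


def build_archive_record(payload: bytes, media_type: str, description: str) -> str:
    """Wrap raw bytes in a self-describing JSON record (hypothetical layout)."""
    record = {
        "metadata": {
            "media_type": media_type,        # tells a future reader how to interpret the bytes
            "description": description,      # human-readable context
            "size_bytes": len(payload),
            "sha256": hashlib.sha256(payload).hexdigest(),  # fixity check value
        },
        "payload_hex": payload.hex(),        # data stored as plain hex text
    }
    return json.dumps(record, sort_keys=True)


def read_archive_record(serialized: str) -> Tuple[dict, bytes]:
    """Return (metadata, payload), refusing data that fails its checksum."""
    record = json.loads(serialized)
    metadata = record["metadata"]
    payload = bytes.fromhex(record["payload_hex"])
    if hashlib.sha256(payload).hexdigest() != metadata["sha256"]:
        raise ValueError("payload does not match the recorded checksum")
    return metadata, payload


if __name__ == "__main__":
    stored = build_archive_record(b"survey results 2011", "text/plain", "Field survey notes")
    metadata, payload = read_archive_record(stored)
    print(metadata["media_type"], metadata["size_bytes"], payload.decode("ascii"))
```

Because the metadata travels with the payload, a future reader needs only a JSON parser and a SHA-256 implementation to find out what the data is and whether it has survived intact, rather than the program that originally wrote it.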