We are churning out more digital data than ever, but as new storage media and file formats emerge and make others obsolete, it becomes harder to access data we have previously archived. There's a good chance the digital data we are currently generating will become unusable within our lifetimes unless we take steps to preserve it. We look at the drawbacks of digital storage.
The Library of Congress itself has archived about 167 terabytes of digital content, including websites involved in national elections and information about major events like Hurricane Katrina. Like the National Archives, the Library of Congress keeps multiple copies and watches for signs of format obsolescence, says LeFurgy.
Thanks to its ongoing satellite surveys, the US Geological Survey (USGS) adds about 50TB per month to its archives and now manages about 4.5PB (counting copies), says John Faundeen, an archivist at the USGS Earth Resources Observation and Science Centre.
The centre has a three-copy storage strategy: the first copy is online, the second is near-line and the third is off-site. (This mirrors the storage strategy known as information life-cycle management that many enterprise IT shops are adopting.) The Earth Resources Observation and Science Centre tries to migrate to new media every three to five years. And it tries to track all of the media it's using by date, to avoid situations where it uses something that a vendor no longer supports, Faundeen explains. Every other year, the centre undertakes a study of the off-line media industry to see what's on the market.
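Tracking media by date, as Faundeen describes, could be sketched roughly as follows. This is only an illustration, not the centre's actual system: the `MediaRecord` fields, the tier names and the two thresholds are assumptions drawn from the three-to-five-year migration window mentioned above.

```python
from dataclasses import dataclass
from datetime import date

# Assumed thresholds, taken from the "three to five years" migration window.
REVIEW_AFTER_YEARS = 3   # start planning a migration
MIGRATE_BY_YEARS = 5     # media older than this is overdue


@dataclass
class MediaRecord:
    label: str        # hypothetical identifier for a tape, disk or volume
    tier: str         # "online", "near-line" or "off-site"
    in_service: date  # date the media was first put to use


def age_in_years(record: MediaRecord, today: date) -> float:
    """Approximate age of the media in years."""
    return (today - record.in_service).days / 365.25


def migration_status(record: MediaRecord, today: date) -> str:
    """Classify media as 'ok', 'due' or 'overdue' for migration."""
    age = age_in_years(record, today)
    if age >= MIGRATE_BY_YEARS:
        return "overdue"
    if age >= REVIEW_AFTER_YEARS:
        return "due"
    return "ok"


if __name__ == "__main__":
    # Made-up inventory with one entry per copy in a three-copy strategy.
    inventory = [
        MediaRecord("copy-1", "online", date(2022, 6, 1)),
        MediaRecord("copy-2", "near-line", date(2020, 6, 1)),
        MediaRecord("copy-3", "off-site", date(2018, 1, 1)),
    ]
    for record in inventory:
        print(record.label, record.tier, migration_status(record, date(2024, 1, 1)))
```

A real archive would also need to record the vendor and format of each item, since the point of the exercise is to spot media that is about to lose vendor support, not just media that is old.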