The old refrain that “storage is cheap, just keep everything” was never true. Recently the global market intelligence firm IDC estimated that the world’s demand for storage is increasing by 60% a year.
Given that market research firm IHS iSuppli estimates hard disk storage densities will improve by only 19% a year for the next five years, and that IT budgets are growing at an annual rate of between 0 and 2%, there is clearly a looming storage crisis.
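To get a feel for the size of that gap, the three growth rates can be compounded over five years. The sketch below is a back-of-envelope calculation only. It assumes (and this is an assumption, not a claim from IDC or IHS iSuppli) that the cost per byte falls in line with density, and that today's budget exactly covers today's demand. The function names are illustrative.

```python
# Annual growth rates quoted in the text.
DEMAND_GROWTH = 0.60   # IDC estimate: demand for storage
DENSITY_GROWTH = 0.19  # IHS iSuppli estimate: hard disk storage density
BUDGET_GROWTH = 0.02   # upper end of the 0-2% range for IT budgets
YEARS = 5


def compound(rate: float, years: int) -> float:
    """Growth factor after compounding `rate` annually for `years` years."""
    return (1 + rate) ** years


def affordable_share(years: int, budget_growth: float = BUDGET_GROWTH) -> float:
    """Fraction of demanded storage the budget could buy after `years`.

    Assumption (ours): cost per byte falls exactly as fast as density
    improves, and the budget covers 100% of demand in year zero.
    """
    supply = compound(DENSITY_GROWTH, years) * compound(budget_growth, years)
    demand = compound(DEMAND_GROWTH, years)
    return supply / demand


if __name__ == "__main__":
    print(f"Demand grows {compound(DEMAND_GROWTH, YEARS):.1f}x in {YEARS} years")
    print(f"Density grows {compound(DENSITY_GROWTH, YEARS):.1f}x in {YEARS} years")
    print(f"Share of demand affordable: {affordable_share(YEARS):.0%}")
```

Under these assumptions, demand grows more than tenfold in five years while density little more than doubles, so even the most optimistic budget would cover only about a quarter of what is wanted.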
The challenges involved in preserving the huge datasets created by governments, businesses and research institutions have prompted some dire predictions about the loss of digital history.
Doomsayers suggest the only solution is the frequent transfer of data from device to device. Some even propose conversion to paper or microfiche. But there are no paper or analogue equivalents for many forms of digital information, such as the data generated by environmental sensors (e.g. wave heights, river flows) and GPS tracking devices.
The challenges are both more complex and less daunting than depicted.
Clearly, not all digital information can or should be kept – much of it is “ephemeral” or of short-term value. But much is of continuing value as collective memory, and vital evidence of our identity and our past.
The first of the digital challenges we face relates to what to preserve as digital cultural heritage. Who decides, and using which criteria? Once we’ve decided, how can we preserve it, keep it secure, guarantee its authenticity, ensure its accessibility and maintain its meaning over long periods of time?
Trusted digital repositories around the world use a mix of strategies. They address pressing issues of software and hardware obsolescence, media degradation, and bit rot (changes at the level of individual bits).
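Bit rot is commonly detected with a "fixity check": a digest of the object is recorded when it enters the repository, then recomputed later and compared. The sketch below illustrates the general idea using SHA-256 from Python's standard library. It is not a description of any particular repository's method, and the function names and sample data are hypothetical.

```python
import hashlib


def fixity_digest(data: bytes) -> str:
    """Return a SHA-256 digest to be recorded alongside the stored object."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """True if the object still matches the digest recorded at ingest."""
    return fixity_digest(data) == recorded_digest


# Hypothetical sensor reading, digested at the time of ingest.
original = b"river flow, gauge 17: 3.42 cubic metres per second"
recorded = fixity_digest(original)

# Simulate bit rot by flipping a single bit in the first byte.
damaged = bytearray(original)
damaged[0] ^= 0b00000001

if __name__ == "__main__":
    print("Original intact:", verify_fixity(original, recorded))
    print("Damaged copy intact:", verify_fixity(bytes(damaged), recorded))
```

A failed check only reveals the damage. Repairing it depends on having another intact copy, which is one reason repositories keep several.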
Migration across systems and platforms is being coupled with conversion to more stable and long-lived data formats. This strategy lengthens the intervals between the format transfers necessary to avoid obsolescence.
Encapsulation techniques wrap digital objects in layers of metadata that identify, describe and index them. Accessibility and meaning are thereby maintained.
Audit trails of use, instructions that trigger migration, re-formatting and preservation action, and digital signatures that testify to an object’s authenticity are also captured.
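As a rough illustration of encapsulation, the sketch below wraps an object in descriptive metadata, seals the two together, and keeps an audit trail of events. Everything here is hypothetical: the field names are invented for the example, and the seal is a keyed hash (HMAC) standing in for a true public-key digital signature, which Python's standard library does not provide. Real specifications define their own package formats.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _seal(package: dict, key: bytes) -> str:
    """Keyed hash over metadata and content (stand-in for a signature)."""
    payload = json.dumps(
        {"metadata": package["metadata"], "content_hex": package["content_hex"]},
        sort_keys=True,
    ).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def encapsulate(content: bytes, metadata: dict, key: bytes) -> dict:
    """Wrap content in identifying metadata and seal the two together."""
    package = {
        "metadata": metadata,          # identifies and describes the object
        "content_hex": content.hex(),  # the object itself, hex-encoded
        "audit_trail": [],             # events recorded over the object's life
    }
    package["seal"] = _seal(package, key)
    return package


def record_event(package: dict, event: str) -> None:
    """Append a timestamped entry to the audit trail (not covered by the seal)."""
    package["audit_trail"].append(
        {"event": event, "at": datetime.now(timezone.utc).isoformat()}
    )


def verify(package: dict, key: bytes) -> bool:
    """True if metadata and content are unchanged since sealing."""
    return hmac.compare_digest(package["seal"], _seal(package, key))


if __name__ == "__main__":
    key = b"hypothetical-archive-key"
    pkg = encapsulate(b"wave height: 2.1 m", {"title": "Buoy reading"}, key)
    record_event(pkg, "accessed")
    print("Authentic:", verify(pkg, key))
    pkg["metadata"]["title"] = "Altered title"
    print("Authentic after tampering:", verify(pkg, key))
```

The seal deliberately covers the metadata as well as the content, so that a change to the description is detected in the same way as a change to the object it describes.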
Australasian archival institutions have been at the forefront of research and development efforts to preserve digital information.
The Australasian Digital Recordkeeping Initiative (ADRI) is a collaboration between all the government archives in Australia and New Zealand. Its innovative specification for transferring digital records between recordkeeping systems across organisations will be issued soon through CEN, the European standards body.
The Commonwealth Government initiative ANDS (the Australian National Data Service) is building the Australian Research Data Commons. In the Commons, once invisible, isolated and unmanaged data collections will be structured, connected, findable and reusable.
VERS, the Victorian Electronic Records Strategy, was developed by the Public Record Office of Victoria (PROV).
It was one of the first operational digital archives in the world, going live in 2005. Innovative features include use of digital signatures to guarantee authenticity.
Encapsulation of digital records with metadata carries their meaning and context forward through time.
In Europe and the UK, governments have supported the development of groundbreaking digital preservation tools and services. In 2008-10 the European Commission funded the PLANETS project, which developed a unique planning tool, PLATO, for managing digital preservation workflows.
Also in the UK, from 2005 to 2007, the JISC funded the PARADIGM project at the University of Oxford, which delivered an exemplar workbook for the preservation of digital personal archives. SCAPE, co-funded by JISC and the European Commission, is developing scalable tools to preserve large heterogeneous digital collections.
Cloud computing facilities are seen by some as the solution to the massive data storage problems being faced by many organisations. While current commercial business models may provide for short-term data storage, recent reports by archival authorities point to the associated risks.
They include lack of business continuity, technology obsolescence, threats to data integrity and security and high bandwidth costs.
Information may also be vulnerable to disclosure under legislation such as the US Patriot Act, which applies to data stored by any US-owned company wherever the storage facility is located.
In the current model, the risks in putting digital information of continuing value in the cloud are too high. But in the future the storage strategies employed in cloud-computing facilities (multiple locations, shared storage, technical adaptability) might combine creatively with the features being pioneered in trusted digital repositories (long-term access, authenticity, protection against technological obsolescence).
Trusted digital repository and archival services could then be delivered seamlessly via distributed systems.
The final challenge is perhaps the most daunting of all. Over the centuries societies have invested heavily in the GLAM sector, building the galleries, libraries, archives and museums that preserve our material heritage. Today huge amounts of money are being spent on the technologies that create the digital deluge in the short term.
But, so far, investment in the infrastructure needed to preserve our digital cultural heritage is a drop in the ocean.
Many of the solutions to the challenges of digital preservation have been developed or are in the pipeline.
More and sustained investment in digital continuity would enable scaling of these solutions to support a resilient digital cultural heritage.