Recommendations to Projects on Data Storage Technologies
Peter Allan, Head, Space Data Division
Space Science and Technology Department
CLRC/Rutherford Appleton Laboratory

Whatever the current level of technology for storing digital data, there will always be projects that need "large" amounts of storage. In this context, "large" means that the project needs to store more data than can be easily and cheaply held on a simple magnetic disk or tape system. Projects with this need typically look for the cheapest storage option that is consistent with their performance requirements. However, such a strategy often leads to the use of "bleeding edge" technology that can fall out of favour within a few years. If the data need to be preserved on longer timescales, which is usually the case, this can lead to unplanned long-term expenses, because equipment that has reached a technological dead end must be maintained or replaced.

I have a specific example in mind. Here at RAL, we have an archive of data from an astronomy mission, with the data held on optical disks and catalogued in a particular fashion. The optical disk system was never very reliable and it is now too expensive to maintain. The cost of moving the data to a different form of storage (we could easily put it on magnetic disk these days) and re-cataloguing it may be too high, so the data may die.

One approach to solving this problem is described in the paper "Digital Information Preservation Perspectives". I propose that a simpler solution should be considered in parallel with that one. People starting a project with large data storage requirements would be in a better position if they were more aware of the potential impact of choosing a particular storage technology. Enough projects have fallen into the "large data" category over the years that a study could now be done, and a report prepared, on the technologies that have been used in various projects and the ease with which they supported long-term storage or migration to another form of physical storage. The purpose is not to recommend a particular technology, but to warn of the cost of using non-mainstream solutions. The fact that a particular form of storage may now be obsolete does not disqualify it from the study; in fact, it must be included. The report would be a set of recommendations on what people should consider when evaluating large storage systems, with some quantification of the risks that go along with bleeding edge technology. All this information exists, but generally only in the form of anecdotes and of people asking friends how they did it.