Scientific Data Archives
Many scientific endeavors produce large quantities of heterogeneous data
to be analyzed by loose, distributed collaborations. There is also a call for
federally funded data to be made useful by being available to those who are not
experts in the meaning of the data. Scientific data, in contrast to text or image data, is
often useless without sophisticated, customized data mining and knowledge extraction
tools. Given these three conditions, there is an urgent need for software infrastructures
to create, maintain, evolve, and federate these active digital libraries of scientific
data: infrastructures that consider the newcomer learning to use the system as well as the
seasoned professional. CACR is involved in several projects that examine approaches to such active digital libraries.
CACR hosted the first Interfaces to Scientific Data Archives workshop in Pasadena in March 1998. The objective of this workshop was to examine approaches to such active digital libraries through case studies and tools. The interplay of case studies and tools illustrates the need for standards and abstraction, identifies similarities, keeps the focus on real-life problems, and thus curbs the excesses of theory. Using small-group discussion, we identified solutions, points of consensus, and challenges in creating and maintaining active digital libraries of scientific data. The goals were to ensure that an archive is flexible and extensible, that it is as easy as possible to learn to use, and that different groups can build on each other's development work instead of repeating it. The full report from the workshop is available from the web site below, with findings, recommendations, descriptions of case studies and tools, and a survey of scientific data archives.