Ten Simple Rules for Digital Data Storage

2016 ◽  
Vol 12 (10) ◽  
pp. e1005097 ◽  
Author(s):  
Edmund M. Hart ◽  
Pauline Barmby ◽  
David LeBauer ◽  
François Michonneau ◽  
Sarah Mount ◽  
...  
2016 ◽  
Author(s):  
Edmund Hart ◽  
Pauline Barmby ◽  
David LeBauer ◽  
François Michonneau ◽  
Sarah Mount ◽  
...  

Data is the central currency of science, but the nature of scientific data has changed dramatically with the rapid pace of technology. This change has led to the development of a wide variety of data formats, dataset sizes, data complexity, data use cases, and data sharing practices. Improvements in high-throughput DNA sequencing, sustained institutional support for large sensor networks, and sky surveys with large-format digital cameras have created massive quantities of data. At the same time, the combination of increasingly diverse research teams and data aggregation in portals (e.g., for biodiversity data, GBIF or iDigBio) necessitates increased coordination among data collectors and institutions. As a consequence, “data” can now mean anything from petabytes of information stored in professionally maintained databases, through spreadsheets on a single computer, to handwritten tables in lab notebooks on shelves. All remain important, but data curation practices must continue to keep pace with the changes brought about by new forms and practices of data collection and storage.


2015 ◽  
Author(s):  
Edmund Hart ◽  
Pauline Barmby ◽  
David LeBauer ◽  
François Michonneau ◽  
Sarah Mount ◽  
...  





2018 ◽  
Vol 6 (3) ◽  
pp. 359-363
Author(s):  
A. Saxena ◽  
S. Sharma ◽  
S. Dangi ◽  
A. Sharma ◽  
...  

1998 ◽  
Author(s):  
Kai-Oliver Mueller ◽  
Cornelia Denz ◽  
Torsten Rauch ◽  
Thorsten Heimann ◽  
J. Trumpfheller ◽  
...  

Author(s):  
Huan Liu

The amount of data has grown increasingly large in recent years as the worldwide capacity of digital data storage has increased significantly. As data grows, so does the demand for data reduction to keep data mining effective. Instance selection is one effective means of data reduction. This article introduces the basic concepts of instance selection, its context, necessity, and functionality, and briefly reviews state-of-the-art methods for instance selection. Selection is a necessity in the world around us; it stems from the simple fact that resources are limited, and data mining is no exception. Many factors give rise to data selection: data is not collected purely for data mining or for one particular application; there are missing data, redundant data, and errors introduced during collection and storage; and data can be too overwhelming to handle. Instance selection is one effective approach to data selection. It is the process of choosing a subset of data that achieves the original purpose of a data mining application. The ideal outcome of instance selection is a model-independent, minimal sample of data that can accomplish tasks with little or no performance deterioration.


2007 ◽  
Vol 43 (3) ◽  
pp. 1101-1111 ◽  
Author(s):  
Sebastien Tosi ◽  
Martin Power ◽  
Thomas Conway
