Abstract

Accepted manuscript version. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-24462-4_22. Biological data analysis is typically implemented as a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems that are well suited to processing large datasets. There are many infrastructure systems for such data-intensive computing. However, in our experience, most biological data analysis pipelines do not leverage these systems. We give an overview of data-intensive computing infrastructure systems and describe how we have leveraged them for: (i) scalable, fault-tolerant computing for large-scale biological data; (ii) incremental updates that reduce the resource usage required to update a large-scale compendium; and (iii) interactive data analysis and exploration. We provide lessons learned and describe problems we encountered during development and deployment. We also provide a literature survey on the use of data-intensive computing systems for biological data processing. Our results show how unmodified biological data analysis tools can benefit from infrastructure systems for data-intensive computing.

Document type: Part of book or chapter of book


Document information

Published on 01/01/2015

Volume 2015, 2015
DOI: 10.1007/978-3-319-24462-4_22
Licence: CC BY-NC-SA license
