Abstract

Research is increasingly data-driven, and the natural sciences are no exception. In both biology and medicine, we are observing exponential growth of structured data collections from experiments and population studies, enabling novel insights that would otherwise not be possible. However, these growing data sets pose a challenge for existing compute infrastructures, because data is outgrowing the limits of compute resources. In this work, we present the application of a novel approach, Memory-Driven Computing (MDC), in the life sciences. MDC is a data-centric approach that has been designed for growing data sizes and provides a composable infrastructure for changing workloads. In particular, we show how a typical pipeline for genomics data processing can be accelerated and which application modifications are required to exploit this novel architecture. Furthermore, we demonstrate how evaluating individual tasks in isolation misses significant overheads of typical pipelines in genomics data processing.


Original document

The different versions of the original document can be found in:

http://dx.doi.org/10.1007/978-3-030-50743-5_17 under the license http://www.springer.com/tdm
https://link.springer.com/content/pdf/10.1007%2F978-3-030-50743-5_17.pdf
https://academic.microsoft.com/#/detail/3035587886

Document information

Published on 01/01/2020

Volume 2020, 2020
DOI: 10.1007/978-3-030-50743-5_17
Licence: Other
