Abstract

Precision medicine brings the promise of more precise diagnosis and individualized therapeutic strategies from analyzing a cancer’s genomic signature. Technologies such as high-throughput sequencing enable cheaper data collection at higher speed, but rely on modern data analysis platforms to extract knowledge from the resulting high-dimensional datasets. Since this is a rapidly advancing field, new diagnoses and therapies often require tailoring of the analysis. These pipelines are therefore developed iteratively, with analysis parameters modified continuously before arriving at the final results. To enable reproducible results, it is important to record all the modifications and decisions made during the analysis process.

We built a system, walrus, to support reproducible analyses for iteratively developed analysis pipelines. The approach is based on our experiences developing and using deep analysis pipelines to provide insights and recommendations for treatment in an actual breast cancer case. We designed walrus for the single servers or small compute clusters typically available for novel treatments in the clinical setting. walrus leverages software containers to provide reproducible execution environments, and integrates with modern version control systems to capture provenance of data and pipeline parameters.

We have used walrus to analyze a patient’s primary tumor and adjacent normal tissue, including subsequent metastatic lesions. Although we have used walrus for specialized analyses of whole-exome sequencing datasets, it is a general data analysis tool that can be applied in a variety of scientific disciplines. We have open-sourced walrus along with example data analysis pipelines at https://github.com/uit-bdps/walrus.


Original document

The different versions of the original document can be found in:

http://dx.doi.org/10.1109/empdp.2019.8671623 under the license cc-by-nc-nd
https://academic.microsoft.com/#/detail/2810797492
http://dx.doi.org/10.1101/354811
https://biorxiv.org/content/biorxiv/early/2018/06/25/354811.full-text.pdf
https://academic.microsoft.com/#/detail/2952658960



Document information

Published on 01/01/2018

Volume 2018, 2018
DOI: 10.1101/354811
Licence: Other
