Abstract

Motivation

With its capacity for high-resolution data output in one region of interest, chromosome conformation capture combined with high-throughput sequencing (4C-seq) is a state-of-the-art next-generation sequencing technique that provides epigenetic insights and regularly advances current medical research. However, 4C-seq data are complex and prone to biases, and while specialized programs exist, an unbiased, extensive benchmarking is still lacking. Furthermore, neither substantial datasets with fully characterized ground truth nor simulation programs for realistic 4C-seq data have been published.

Results

We conducted a benchmarking study on 66 4C-seq samples from 20 datasets, and developed a novel 4C-seq simulation software, Basic4CSim, to allow for detailed comparisons of 4C-seq algorithms on 50 simulated datasets with 10–120 samples each. Simulations and benchmarking were adapted to address different characteristics of 4C-seq data. Simulated data were compared with published samples to validate simulation settings. We identified differences between 4C-seq algorithms in terms of precision, recall, interaction structure, and run time, and observed general trends. Novel differential pipeline versions of single-sample based 4C-seq algorithms were included in the benchmarking. While no single tool was optimally suited for both near-cis and far-cis, and both single-sample and differential analyses, choosing a high-performing algorithm variant did improve results considerably. For near-cis scenarios, r3Cseq, peakC and FourCSeq offered high precision, while fourSig demonstrated high overall F1 scores in far-cis analyses. Finally, 4C-seq simulations may aid in the development of improved analysis algorithms.

Availability and implementation

Basic4CSim is available at https://github.com/walter-ca/Basic4CSim.

Supplementary information

Supplementary data are available at Bioinformatics online.

Document type: Article

Original document

The different versions of the original document can be found in:

http://academic.oup.com/bioinformatics/article-pdf/35/23/4938/31963529/btz426.pdf,
http://dx.doi.org/10.1093/bioinformatics/btz426 under the license cc-by-nc
https://www.ncbi.nlm.nih.gov/pubmed/31134276,
https://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics35.html#WalterSRD19,
https://academic.microsoft.com/#/detail/2947289109 under the license http://creativecommons.org/licenses/by-nc/4.0/

Document information

Published on 01/01/2019

Volume 2019, 2019
DOI: 10.1093/bioinformatics/btz426
Licence: Other
