Cloud-based computing has created new avenues for innovative research. In recent years, numerous cloud-based data analysis projects have been implemented in the biomedical domain. As this field is likely to grow, there is a need for a unified platform for developing and testing advanced analytic and modeling tools, one that enables those tools to be easily reused for the analysis of biomedical data by a broad set of users with diverse technical skills. A cloud-based platform of this nature could greatly assist future research endeavors. In this paper, we take the first step towards building such a platform. We define an approach by which containerized analytic pipelines can be distributed for use on cloud-based or on-premises computing platforms. We demonstrate our approach by implementing a portable biomarker identification pipeline using a logistic regression model with elastic net regularization (LR-ENR) and running it on Google Cloud. We used this pipeline to diagnose Parkinson’s disease based on a combination of clinical, demographic, and MRI-based features and to identify the most predictive biomarkers.