As data science becomes increasingly mainstream, demand will keep growing for data science tools that are more accessible, flexible, and scalable. In response, automated machine learning (AutoML) researchers have begun building systems that automate the design and optimization of machine learning pipelines. In this chapter we present TPOT v0.3, an open source, genetic programming-based AutoML system that optimizes a series of feature preprocessors and machine learning models with the goal of maximizing classification accuracy on a supervised classification task. We benchmark TPOT on 150 supervised classification tasks and find that it significantly outperforms a basic machine learning analysis on 21 of them, while showing minimal degradation in accuracy on 4 of the benchmarks, all without any domain knowledge or human input. These results suggest that genetic programming-based AutoML systems hold considerable promise in the AutoML domain.
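
To make the idea of searching over pipelines concrete, the following is a minimal sketch of a simplified evolutionary search (mutation plus truncation selection) over short sequences of preprocessors followed by one classifier. It is not TPOT's implementation and does not use TPOT's API: the operator names, the synthetic data generator `make_data`, and all parameters are assumptions made up for illustration, and the tree-based genetic programming described in the chapter is replaced here by a flat list of steps.

```python
import random

def dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((p - q) ** 2 for p, q in zip(a, b))

# Hypothetical preprocessors: each is fitted on training rows and returns a transform.
def fit_standardize(rows):
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return lambda r: [(v - m) / s for v, m, s in zip(r, means, stds)]

def fit_minmax(rows):
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return lambda r: [(v - lo) / s for v, lo, s in zip(r, lows, spans)]

# Hypothetical models: each is fitted on (X, y) and returns a predict function.
def fit_centroid(X, y):
    cents = {}
    for label in set(y):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(c) / len(c) for c in zip(*pts)]
    return lambda x: min(cents, key=lambda lab: dist(x, cents[lab]))

def fit_1nn(X, y):
    return lambda x: min(zip(X, y), key=lambda p: dist(x, p[0]))[1]

PREPROCESSORS = {"standardize": fit_standardize, "minmax": fit_minmax}
MODELS = {"centroid": fit_centroid, "1nn": fit_1nn}

def accuracy(pipeline, train, valid):
    # A pipeline is (tuple of preprocessor names, model name).
    steps, model_name = pipeline
    Xtr, ytr = [list(x) for x, _ in train], [t for _, t in train]
    Xva, yva = [list(x) for x, _ in valid], [t for _, t in valid]
    for name in steps:
        transform = PREPROCESSORS[name](Xtr)  # fit on training rows only
        Xtr = [transform(r) for r in Xtr]
        Xva = [transform(r) for r in Xva]
    predict = MODELS[model_name](Xtr, ytr)
    return sum(predict(x) == t for x, t in zip(Xva, yva)) / len(yva)

def random_pipeline(rng):
    steps = tuple(rng.choice(list(PREPROCESSORS)) for _ in range(rng.randint(0, 2)))
    return (steps, rng.choice(list(MODELS)))

def mutate(pipeline, rng):
    # Swap the model, append a preprocessor, or drop the last preprocessor.
    steps, model = pipeline
    roll = rng.random()
    if roll < 0.4:
        model = rng.choice(list(MODELS))
    elif roll < 0.7 and len(steps) < 3:
        steps = steps + (rng.choice(list(PREPROCESSORS)),)
    elif steps:
        steps = steps[:-1]
    return (steps, model)

def evolve(train, valid, generations=5, population_size=8, seed=0):
    rng = random.Random(seed)
    population = [random_pipeline(rng) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda p: accuracy(p, train, valid), reverse=True)
        parents = ranked[: population_size // 2]  # keep the better half unchanged
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population_size - len(parents))]
        population = parents + children
    best = max(population, key=lambda p: accuracy(p, train, valid))
    return best, accuracy(best, train, valid)

def make_data(n, rng):
    # Synthetic two-class data: feature 0 is informative, feature 1 is large-scale noise.
    return [([rng.gauss(label * 2.0, 1.0), rng.gauss(0.0, 100.0)], label)
            for label in (rng.randint(0, 1) for _ in range(n))]

if __name__ == "__main__":
    data_rng = random.Random(1)
    train, valid = make_data(60, data_rng), make_data(40, data_rng)
    print(evolve(train, valid))
```

Because the better half of each generation is carried over unchanged, the best validation accuracy in the population never decreases from one generation to the next. A real system would also score candidates with cross-validation rather than a single held-out split, which this sketch omits for brevity.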

Document type: Part of book or chapter of book

Original document

The different versions of the original document can be found in:


https://link.springer.com/content/pdf/10.1007%2F978-3-030-05318-5_8.pdf under the license cc-by

http://link.springer.com/content/pdf/10.1007/978-3-030-05318-5_8, http://dx.doi.org/10.1007/978-3-030-05318-5_8 under the license https://creativecommons.org/licenses/by/4.0


Document information

Published on 31/12/18
Accepted on 31/12/18
Submitted on 31/12/18

Volume 2019, 2019
DOI: 10.1007/978-3-030-05318-5_8
Licence: CC BY-NC-SA license
