WiPP: workflow for improved peak picking for gas chromatography-mass spectrometry (GC-MS) data
In: Metabolites, MDPI, 2019, vol. 9 (2019-07-24), issue 9, p. 171
High false positive rates in GC-MS metabolomics peak detection are a common issue that impedes automated analysis of large-scale datasets. There is a growing need to improve the reliability and scalability of data analysis workflows. Many algorithms are available for peak detection [1], a crucial step in data analysis, but performance and outcome can differ widely depending on both the algorithmic approach and the data acquisition method. This makes it difficult to compare algorithms without extensive manual intervention.

We present a workflow for improved peak picking (WiPP), a parameter-optimizing, multi-algorithm peak detection workflow for GC-MS metabolomics that automatically evaluates the quality of detected peaks using machine learning-based classification. First, the classifier is trained to distinguish real compound-related peaks from false positive peaks. Then the algorithm parameters are scored based on the quality of the detected peaks and optimized accordingly. This procedure is repeated for two peak detection algorithms, and both algorithms are subsequently run in parallel on the entire data set with the optimized parameters. The qualitative information returned by the classifier for every peak is then used to merge the individual algorithm results into one final high-confidence peak set.

Using this approach, we show that automated detection and evaluation of peak quality is improved.
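
The parameter scoring step described above can be illustrated with a minimal sketch. This is not WiPP's implementation: the class labels, the weights in `CLASS_WEIGHTS`, the exhaustive grid search, and the `detect`/`classify` callables are all hypothetical stand-ins, chosen only to show how classifier output could turn a parameter set into a single score.

```python
from itertools import product
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical labels and weights (assumption): the abstract mentions seven
# peak classes but does not name them, so three invented ones are used here.
CLASS_WEIGHTS: Dict[str, float] = {"high_quality": 1.0, "acceptable": 0.5, "noise": -1.0}


def score_peaks(labels: Iterable[str]) -> float:
    """Sum the weight of every classified peak; unknown labels count as 0."""
    return sum(CLASS_WEIGHTS.get(label, 0.0) for label in labels)


def optimize_parameters(
    detect: Callable[[Dict[str, Any]], List[Any]],
    classify: Callable[[Any], str],
    grid: Dict[str, List[Any]],
) -> Tuple[Dict[str, Any], float]:
    """Try every parameter combination and keep the best-scoring one."""
    names = sorted(grid)
    best_params: Dict[str, Any] = {}
    best_score = float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        peaks = detect(params)  # run the peak picker with these parameters
        score = score_peaks(classify(peak) for peak in peaks)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy stand-ins: each "peak" is just a signal-to-noise ratio.
def toy_detect(params: Dict[str, Any]) -> List[float]:
    candidates = [1.0, 2.5, 4.0, 8.0, 9.5]
    return [c for c in candidates if c >= params["min_snr"]]


def toy_classify(peak: float) -> str:
    return "high_quality" if peak >= 5.0 else "noise"


best, score = optimize_parameters(toy_detect, toy_classify, {"min_snr": [1.0, 3.0, 6.0]})
print(best, score)
```

In this toy run the strictest threshold wins because it removes every peak the stand-in classifier labels as noise while keeping both high-quality peaks. A real scoring function would also need to account for true peaks lost to overly strict parameters.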
The additional quantitative and qualitative information generated by the classifier allows:

- a novel way to classify peaks into seven classes and thus to assess their quality objectively
- impartial performance comparison of different peak picking algorithms
- automated parameter optimization for each individual peak picking algorithm
- generation of a final, improved, high-quality peak list for statistical or further analyses.

WiPP achieves this while minimising the operator time required, by packaging these steps within a fully automated workflow. The modular design allows extension, adjustment and improvement of the workflow using different or additional peak detection algorithms and classifiers. Importantly, due to the fully automated implementation, the workflow is suitable for large-scale studies.

The pipeline supports the mzML, mzData and NetCDF formats and is implemented in Python using Snakemake, a reproducible and scalable workflow management system. It is available on GitHub (https://github.com/bihealth/WiPP).
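
As a rough illustration of how two algorithms' classified peaks might be combined into one final list, the sketch below merges peaks by retention time. The `Peak` fields, the numeric `confidence` value, the tolerance `rt_tol`, and the threshold `min_conf` are assumptions made for this example; the abstract states only that the classifier's per-peak information drives the merge, not how.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peak:
    rt: float          # retention time (unit is arbitrary in this sketch)
    confidence: float  # hypothetical classifier confidence in [0, 1]
    source: str        # name of the peak picking algorithm that found it


def merge_peaks(
    peaks_a: List[Peak],
    peaks_b: List[Peak],
    rt_tol: float = 1.0,
    min_conf: float = 0.5,
) -> List[Peak]:
    """Merge two peak lists, keeping one peak per retention-time cluster.

    Peaks below min_conf are dropped. Of the remaining peaks, any that lie
    within rt_tol of the previously kept peak are treated as the same
    compound, and the one with the higher confidence is kept.
    """
    candidates = sorted(
        (p for p in [*peaks_a, *peaks_b] if p.confidence >= min_conf),
        key=lambda p: p.rt,
    )
    merged: List[Peak] = []
    for peak in candidates:
        if merged and peak.rt - merged[-1].rt <= rt_tol:
            if peak.confidence > merged[-1].confidence:
                merged[-1] = peak  # replace with the more confident duplicate
        else:
            merged.append(peak)
    return merged


peaks_a = [Peak(10.0, 0.9, "A"), Peak(20.0, 0.4, "A")]
peaks_b = [Peak(10.3, 0.7, "B"), Peak(30.0, 0.8, "B")]
merged = merge_peaks(peaks_a, peaks_b)
for peak in merged:
    print(peak)
```

This greedy pass compares each peak only with the most recently kept one, so closely spaced peaks can chain together. A production merge would likely also compare mass spectra rather than rely on retention time alone.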
Title: | WiPP: workflow for improved peak picking for gas chromatography-mass spectrometry (GC-MS) data |
---|---|
Author / Contributor: | Guitton, Yann ; Sicard, Emilie ; Kirwan, Jennifer A. ; Borgsmüller, Nico ; Le Bizec, Bruno ; Migné, Carole ; Giacomoni, Franck ; Pétéra, Mélanie ; Blanc, Eric ; Gloaguen, Yoann ; Royer, Anne-Lise ; Opialla, Tobias ; Beule, Dieter ; Durand, Stéphanie ; Pujos-Guillot, Estelle ; Core Unit Bioinformatics ; Berlin Institute of Health (BIH) ; Berlin Institute of Health Metabolomics Platform ; Max Delbrück Center for Molecular Medicine [Berlin] (MDC) ; Helmholtz-Gemeinschaft = Helmholtz Association ; Berlin Institute for Medical Systems Biology ; Charité - UniversitätsMedizin = Charité - University Hospital [Berlin] ; Université Clermont Auvergne [2017-2020] (UCA [2017-2020]) ; MetaboHUB ; Unité de Nutrition Humaine (UNH) ; Institut National de la Recherche Agronomique (INRA)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020]) ; Laboratoire d'étude des Résidus et Contaminants dans les Aliments (LABERCA) ; Institut National de la Recherche Agronomique (INRA)-Ecole Nationale Vétérinaire, Agroalimentaire et de l'alimentation Nantes-Atlantique (ONIRIS) ; Max Delbrück Center for Molecular Medicine ; Charité - Universitätsmedizin Berlin / Charite - University Medicine Berlin ; 1019, unité de Nutrition Humaine ; Institut National de la Recherche Agronomique (INRA) ; Université Clermont Auvergne (UCA) ; Unité de Nutrition Humaine - Clermont Auvergne (UNH) ; Institut National de la Recherche Agronomique (INRA)-Université Clermont Auvergne (UCA) ; Institut National de la Recherche Agronomique (INRA)-École nationale vétérinaire, agroalimentaire et de l'alimentation Nantes-Atlantique (ONIRIS) |
Journal: | Metabolites, MDPI, 2019, vol. 9 (2019-07-24), issue 9, p. 171 |
Publication: | Cold Spring Harbor Laboratory Press, 2019 |
Media type: | unknown |
ISSN: | 2218-1989 (print) |