metaboCombineR

Abstract

Motivation: Meticulous selection of chromatographic peak detection parameters and algorithms is a crucial step in preprocessing LC-MS data. However, as mass-to-charge ratio (m/z) and retention time shifts are larger between batches than within batches, finding apt parameters for all samples of a large-scale multi-batch experiment with the aim of minimizing information loss becomes a challenging task. Preprocessing independent batches individually can curtail said problems but requires a method for aligning and combining them for further downstream analysis.

Results: We present two methods for aligning and combining individually preprocessed batches in multi-batch LC-MS experiments. The methods were tested on six sets of simulated and six sets of real datasets. Furthermore, by estimating the probabilities of peak insertion, deletion, and swap between batches in authentic datasets, we demonstrate that retention order swaps are not rare in untargeted LC-MS data.
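
To make the notion of a retention order swap concrete, here is a minimal Python sketch that counts feature pairs whose elution order differs between two batches. It is an illustration only, not the package's algorithm or API: the function name `count_order_swaps`, the feature labels, and the retention times are all hypothetical, and it assumes features have already been matched across batches by label.

```python
from itertools import combinations


def count_order_swaps(rt_a, rt_b):
    """Count feature pairs whose elution order differs between two batches.

    rt_a, rt_b: dicts mapping a (hypothetical) feature label to its
    retention time in batch A and batch B. Features missing from either
    batch (insertions/deletions) are ignored here.
    """
    shared = sorted(set(rt_a) & set(rt_b))
    swaps = 0
    for f, g in combinations(shared, 2):
        # A swap: f elutes before g in one batch but after g in the other,
        # so the two retention time differences have opposite signs.
        if (rt_a[f] - rt_a[g]) * (rt_b[f] - rt_b[g]) < 0:
            swaps += 1
    return swaps


# Made-up retention times (minutes) for illustration only.
batch_a = {"m1": 1.0, "m2": 2.0, "m3": 3.0, "m4": 4.0}
batch_b = {"m1": 1.1, "m2": 3.2, "m3": 3.0, "m5": 5.0}

print(count_order_swaps(batch_a, batch_b))  # m2 and m3 change order
```

In this toy input, `m4` and `m5` each appear in only one batch, which corresponds to the insertion and deletion events mentioned above; only the shared features contribute to the swap count.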

Paper can be found here
R package and tutorial are available at https://github.com/fMalinka/metaboCombineR
Raw data (in mzML format) of the six authentic LC-MS experiments are available for download:

metaboCombineR-rawdata-Klk8.zip (733M)
metaboCombineR-rawdata-Tmem60.zip (2.2G) - two experiments
metaboCombineR-rawdata-Trim9.zip (2.8G)
metaboCombineR-rawdata-Wiz.zip (3.4G) - two experiments

Contact: František Malinka, malinkaf(at)img.cas.cz

Batch alignment via retention orders for preprocessing large-scale multi-batch LC-MS experiments