Motivation: Careful selection of chromatographic peak detection parameters and algorithms is a crucial step in preprocessing LC-MS data. However, because mass-to-charge ratio (m/z) and retention time shifts are larger between batches than within batches, finding parameters that suit all samples of a large-scale multi-batch experiment while minimizing information loss is challenging. Preprocessing independent batches individually can mitigate these problems, but it requires a method for aligning and combining the batches for downstream analysis.
Results: We present two methods for aligning and combining individually preprocessed batches in multi-batch LC-MS experiments. The methods were tested on six sets of simulated and six sets of real data. Furthermore, by estimating the probabilities of peak insertion, deletion, and swap between batches in the real datasets, we demonstrate that retention order swaps are not rare in untargeted LC-MS data.