Using pattern recognition entropy to select mass chromatograms to prepare total ion current chromatograms from raw liquid chromatography-mass spectrometry data
journal contribution
posted on 2023-05-19, 22:10 authored by Chatterjee, S, Major, GH, Brett Paull, Estrella Sanz Rodriguez, Kaykhaii, M, Linford, MR
The total ion current chromatogram (TICC) obtained by liquid chromatography-mass spectrometry (LC-MS) is often extremely complex and ‘noisy’ in appearance, particularly when an electrospray ionization source is used. Accordingly, meaningful qualitative and quantitative information can be obtained in LC-MS by data mining processes. Here, one or more higher-quality mass chromatograms can be identified/extracted/isolated and combined to form a TICC, wherein much of the background mass noise is eliminated, and quantitative data for chromatographic peaks can be obtained. Pattern Recognition Entropy (PRE) is a new application of Shannon’s statistical concept of entropy. PRE is both a pattern recognition tool and a summary statistic that can be used to identify information-containing mass chromatograms, where higher quality data (higher signal-to-noise mass chromatograms) usually have lower PRE values. Reduced TICCs are obtained by first calculating the PRE values of the component mass chromatograms. A plot of PRE value vs. m/z for the mass chromatograms is then generated, and the resulting band of PRE values is fit to a piecewise spline polynomial. The distribution of the differences between the individual PRE values and the spline fit is then used to select ‘good’ mass chromatograms. For the data set considered herein, best results were obtained with a threshold of 0.5 standard deviations below the average value (value of the spline). PRE reduces the number of component mass chromatograms significantly (by an order of magnitude) and at the same time preserves most of the chemical information that is collectively in them. It can also distinguish between mass chromatograms of chemically similar species.
PRE is arguably a less computationally intensive alternative to the widely used CODA algorithm for variable reduction. It produces reduced TICCs of comparable if not higher quality, and it requires only a single user input for variable selection. Reduced TICCs generated by PRE can be smoothed to further improve their signal-to-noise ratios.
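The selection procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the PRE formula, so the code assumes PRE is the Shannon entropy of a mass chromatogram normalized to unit sum, and it substitutes a simple running mean for the piecewise spline fit to keep the example dependency-free. The function names (`pre`, `running_mean`, `select_chromatograms`, `reduced_ticc`) and the `half_window` parameter are hypothetical; only the 0.5 standard deviation threshold comes from the text.

```python
import math
import statistics


def pre(chromatogram):
    """Assumed PRE: Shannon entropy (bits) of intensities normalized to sum to 1."""
    total = sum(chromatogram)
    if total <= 0:
        return 0.0
    return -sum((x / total) * math.log2(x / total) for x in chromatogram if x > 0)


def running_mean(values, half_window):
    """Local baseline of PRE vs. m/z; a stand-in for the spline fit in the text."""
    baseline = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        baseline.append(sum(values[lo:hi]) / (hi - lo))
    return baseline


def select_chromatograms(chromatograms, k=0.5, half_window=5):
    """Return indices of chromatograms whose PRE lies more than k standard
    deviations (of the PRE-minus-baseline differences) below the baseline.

    `chromatograms` is a list of intensity lists, ordered by m/z.
    """
    pre_values = [pre(c) for c in chromatograms]
    baseline = running_mean(pre_values, half_window)
    residuals = [p - b for p, b in zip(pre_values, baseline)]
    sd = statistics.pstdev(residuals)
    return [i for i, r in enumerate(residuals) if r < -k * sd]


def reduced_ticc(chromatograms, selected):
    """Sum the selected mass chromatograms point by point to form a reduced TICC."""
    n_points = len(chromatograms[0])
    return [sum(chromatograms[i][t] for i in selected) for t in range(n_points)]
```

Under this assumed definition, a flat (noise-like) chromatogram of n points has the maximum entropy log2(n), while a chromatogram with a single sharp peak has an entropy near zero, which is consistent with the statement that higher-quality mass chromatograms usually have lower PRE values. The only tunable selection input here is `k`, mirroring the single user input mentioned in the abstract; `half_window` exists only because of the running-mean substitution.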
Publication title
Journal of Chromatography A
Volume
1558
Pagination
21-28
ISSN
0021-9673
Department/School
School of Natural Sciences
Publisher
Elsevier Science Bv
Place of publication
Po Box 211, Amsterdam, Netherlands, 1000 Ae
Rights statement
Copyright 2018 Elsevier B.V.
Repository Status
- Restricted