Example 5: Quantitation

General introduction to the FeatureFinder

For quantitation, the FeatureFinder tool is used. It extracts the features from raw or peak maps. The FeatureFinder offers different algorithms:

Algorithm: simplest
Input data: raw data

This is an algorithm for feature detection in raw data. The following components are used:

Seeding:
A "seed" is a starting point for finding a feature. In this algorithm, all raw data points above the noise threshold are sorted with respect to intensity. The strongest unused data point is used as a seed. (class SimpleSeeder)
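The seeding rule can be illustrated with a minimal Python sketch. It is not the SimpleSeeder implementation; the function names, the tuple layout (RT, m/z, intensity) and the example values are assumptions made for illustration only.

```python
def seed_order(points, noise_threshold):
    """Return indices of points above the noise threshold, strongest first.

    points: list of (rt, mz, intensity) tuples (hypothetical layout).
    """
    candidates = [i for i, p in enumerate(points) if p[2] > noise_threshold]
    return sorted(candidates, key=lambda i: points[i][2], reverse=True)


def next_seed(order, used):
    """Return the strongest point that has not been marked as used, or None."""
    for i in order:
        if i not in used:
            return i
    return None


# Illustrative data: four raw data points.
points = [
    (10.0, 500.0, 120.0),
    (10.0, 500.5, 900.0),
    (11.0, 500.5, 40.0),   # below the noise threshold, never a seed
    (11.0, 501.0, 300.0),
]
order = seed_order(points, noise_threshold=50.0)

used = set()
first = next_seed(order, used)
used.add(first)            # in the real algorithm the extension step marks points as used
second = next_seed(order, used)
```

Sorting once and skipping used points keeps the seed selection cheap even when many points are consumed by earlier features.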

Extension:
A "region" is extended around the seed. The region grows in RT and m/z simultaneously. These data points are marked as used. (class SimpleExtender)
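The following Python sketch shows one way such a region could grow in both dimensions at once. The text above does not state the stopping criterion, so the intensity cutoff used here, the grid layout and the function name are assumptions and not the behaviour of SimpleExtender.

```python
from collections import deque


def extend_region(grid, seed, min_intensity):
    """Grow a region around the seed by breadth-first search.

    grid: 2D list of intensities, rows = RT scans, columns = m/z bins.
    seed: (row, column) of the starting point.
    min_intensity: assumed cutoff; neighbours below it stop the growth.
    """
    rows, cols = len(grid), len(grid[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the neighbours in RT (rows) and m/z (columns).
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region and grid[nr][nc] >= min_intensity:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


grid = [
    [0, 1, 2, 1, 0],
    [1, 5, 9, 4, 0],
    [0, 3, 6, 2, 0],
    [0, 0, 1, 0, 7],
]
region = extend_region(grid, seed=(1, 2), min_intensity=3)
used = set(region)  # the points of the region are marked as used
```

The strong point at (3, 4) is not part of the region because it is not connected to the seed by points above the cutoff.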

Modelling:
A theoretical "model" is fitted to the data points of the region. Both dimensions (RT, m/z) are considered separately. This algorithm uses a bi-Gaussian model for the elution profile, i.e., two "half" Gaussians are chosen to represent the data points to the left and right of the maximum intensity. The fit is done by maximum likelihood estimation. (class BiGaussModel)

The m/z dimension is represented by an isotope model, which is essentially a mixture of Gaussians, one for each isotopic peak, whose relative intensities are fixed according to the "averagine" atomic composition. All isotopic peaks have the same width. As a special case, charge zero represents a single Gaussian. This is useful as a null hypothesis or if isotopic peaks cannot be resolved. The m/z model is fitted by a simple enumeration scheme. (class IsotopeFitter1D)
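The two model shapes can be written down in a few lines of Python. This is a sketch under stated assumptions, not the code of BiGaussModel or IsotopeFitter1D: the function names and parameters are hypothetical, the relative intensities are passed in rather than derived from averagine, and the isotopic peak spacing is taken as roughly 1 Da divided by the charge.

```python
import math


def bigauss(rt, apex, sigma_left, sigma_right, height=1.0):
    """Bi-Gaussian elution profile: a different width on each side of the apex."""
    sigma = sigma_left if rt < apex else sigma_right
    return height * math.exp(-0.5 * ((rt - apex) / sigma) ** 2)


def isotope_model(mz, mono_mz, charge, rel_intensities, sigma, spacing=1.0):
    """Mixture of equal-width Gaussians, one per isotopic peak.

    rel_intensities: fixed weights of the isotopic peaks (illustrative values,
    not computed from averagine here).
    spacing: assumed mass difference between isotopic peaks in Da.
    charge 0 is the special case of a single Gaussian.
    """
    if charge == 0:
        return math.exp(-0.5 * ((mz - mono_mz) / sigma) ** 2)
    total = 0.0
    for k, weight in enumerate(rel_intensities):
        center = mono_mz + k * spacing / charge
        total += weight * math.exp(-0.5 * ((mz - center) / sigma) ** 2)
    return total


# A tailing elution profile: narrow on the left, wide on the right.
left = bigauss(98.0, apex=100.0, sigma_left=2.0, sigma_right=5.0)
right = bigauss(105.0, apex=100.0, sigma_left=2.0, sigma_right=5.0)

# Second isotopic peak of a doubly charged ion lies 0.5 Th above the first.
second_peak = isotope_model(500.5, mono_mz=500.0, charge=2,
                            rel_intensities=[0.6, 0.3, 0.1], sigma=0.05)
```

An enumeration scheme as mentioned above could evaluate `isotope_model` for each candidate charge and keep the charge with the best fit.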

If the model fits the data well, we report a feature. Otherwise, the seed and the whole region are discarded, and the data points of the region are marked as unused again. (class ModelFitter)
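The accept-or-discard decision can be sketched as follows. The quality measure is not specified above; the Pearson correlation between observed and predicted intensities and the threshold value are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if undefined)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def accept_or_release(region, observed, predicted, used, min_quality):
    """Report a feature if the model fits; otherwise release the region's points."""
    quality = pearson(observed, predicted)
    if quality >= min_quality:
        return {"points": sorted(region), "quality": quality}
    used.difference_update(region)  # mark the data points as unused again
    return None


region = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)}
observed = [1.0, 4.0, 9.0, 4.0, 1.0]  # intensities of the region's points

used = set(region)
good = accept_or_release(region, observed, [1.1, 3.9, 9.2, 4.1, 0.9], used, 0.7)

used_after_bad = set(region)
bad = accept_or_release(region, observed, [9.0, 1.0, 1.0, 1.0, 9.0], used_after_bad, 0.7)
```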

See the FeatureFinderAlgorithmSimplest Parameters page for a documented list of configuration options for this algorithm.

Algorithm: simple
Input data: raw data

This is another algorithm for feature detection in raw data. It is similar to the "simplest" algorithm, except for the modelling.

Seeding: see "simplest"

Extension: see "simplest"

Modelling:
As in "simplest", but an exponentially modified Gaussian (EMG) is used for the elution profile. The least-squares fit is computed with the Levenberg-Marquardt algorithm. (class EMGFitter1D, class LmaGaussFitter1D)
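For orientation, the shape of an exponentially modified Gaussian can be evaluated with the Python standard library. The sketch below uses the common area-normalized form (a Gaussian with mean mu and width sigma convolved with an exponential decay of time constant tau); the parameterization used by EMGFitter1D may differ, and the fitting step itself is not shown.

```python
import math


def emg(x, mu, sigma, tau):
    """Area-normalized exponentially modified Gaussian.

    mu, sigma: mean and width of the Gaussian part.
    tau: time constant of the exponential tail (tau > 0, tail to the right).
    """
    lam = 1.0 / tau
    exponent = 0.5 * lam * (2.0 * mu + lam * sigma ** 2 - 2.0 * x)
    z = (mu + lam * sigma ** 2 - x) / (math.sqrt(2.0) * sigma)
    return 0.5 * lam * math.exp(exponent) * math.erfc(z)


mu, sigma, tau = 0.0, 1.0, 2.0
step = 0.01
xs = [-10.0 + step * k for k in range(7001)]          # grid from -10 to 60

area = sum(emg(x, mu, sigma, tau) for x in xs) * step  # close to 1
mean = sum(x * emg(x, mu, sigma, tau) for x in xs) * step  # close to mu + tau
```

The profile is asymmetric: with these values `emg(2.0, ...)` is much larger than `emg(-2.0, ...)`, which is the tailing that a symmetric Gaussian cannot represent. A least-squares fit would adjust mu, sigma, tau and a height factor to minimize the squared residuals against the data.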

See the FeatureFinderAlgorithmSimple Parameters page for a documented list of configuration options for this algorithm.

Algorithm: picked_peak
Input data: peak data

This is an experimental algorithm for feature detection in peak data. In contrast to the other algorithms, it works on peak (stick) data, which makes it applicable even if no raw data is available. Another advantage is its speed, which results from the reduced amount of data after peak picking.

Seeding:
Interesting regions are identified by calculating a score for each peak based on:

  • the significance of the intensity in the local environment
  • RT dimension: the quality of the mass trace in a local RT window
  • m/z dimension: the quality of fit to an averagine isotope model

Extension:
The extension is based on heuristics: the average slope of the mass trace in the RT dimension and the best fit to the averagine model in the m/z dimension.

Modelling:
In model fitting, the retention time profile (Gaussian) of all mass traces is fitted to the data at the same time. After fitting, the data is truncated in the RT and m/z dimensions. The reported feature intensity is based on the fitted model rather than on the (noisy) data.
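The idea of fitting all mass traces at the same time can be sketched in Python as follows. This is only an illustration under assumptions: all traces share one Gaussian position and width, each trace has its own height, and a coarse grid search stands in for whatever optimizer the algorithm actually uses. The function names and the data are hypothetical.

```python
import math


def gauss(rt, mu, sigma):
    return math.exp(-0.5 * ((rt - mu) / sigma) ** 2)


def fit_traces(traces, mus, sigmas):
    """Fit one shared Gaussian RT profile to several mass traces.

    traces: list of mass traces, each a list of (rt, intensity) pairs.
    mus, sigmas: candidate values for the shared position and width.
    Returns (error, mu, sigma, heights) with the smallest squared error.
    """
    best = None
    for mu in mus:
        for sigma in sigmas:
            heights, error = [], 0.0
            for trace in traces:
                g = [gauss(rt, mu, sigma) for rt, _ in trace]
                # Closed-form least-squares height for this trace.
                num = sum(gi * y for gi, (_, y) in zip(g, trace))
                den = sum(gi * gi for gi in g)
                h = num / den if den > 0 else 0.0
                heights.append(h)
                error += sum((y - h * gi) ** 2 for gi, (_, y) in zip(g, trace))
            if best is None or error < best[0]:
                best = (error, mu, sigma, heights)
    return best


# Two noise-free mass traces with the same RT profile but different heights.
rts = [40.0 + k for k in range(21)]
traces = [
    [(rt, 100.0 * gauss(rt, 50.0, 3.0)) for rt in rts],
    [(rt, 55.0 * gauss(rt, 50.0, 3.0)) for rt in rts],
]
mus = [48.0 + 0.5 * k for k in range(9)]
sigmas = [2.0, 2.5, 3.0, 3.5, 4.0]
error, mu, sigma, heights = fit_traces(traces, mus, sigmas)
```

Reporting the intensity from the fitted heights instead of the summed data points is what makes the result less sensitive to noise in individual points.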

See the FeatureFinderAlgorithmPicked Parameters page for a documented list of configuration options for this algorithm.

Isotope-labeled quantitation

Goal: You want to differentially quantify the features of an isotope-labeled HPLC-MS map.

The first step in this pipeline is to find the features of the HPLC-MS map. The FeatureFinder application calculates the features from a raw/peak map.

In the second step, the labeled pairs (light/heavy) are determined by the LabeledMatcher. The LabeledMatcher first determines all possible pairs according to a given optimal shift and deviations in RT and m/z. Then it resolves ambiguous pairs using a greedy algorithm that prefers pairs with a higher score. The score of a pair is the product of:

TOPP_labeled_quant.png
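The two phases of the pairing step described above (candidate generation, then greedy resolution) can be sketched in Python. This is not the LabeledMatcher code. The feature layout, the parameter names and the scoring function are assumptions; in particular, the score is supplied by the caller, and the product of two quality values used in the example is purely illustrative.

```python
def find_pairs(features, optimal_shift, mz_dev, rt_dev, score):
    """Pair light and heavy features and resolve ambiguities greedily.

    features: list of dicts with "rt", "mz" and any fields the score needs.
    optimal_shift: expected m/z distance between light and heavy partner.
    mz_dev, rt_dev: allowed deviations from the optimal shift and in RT.
    score: function (light, heavy) -> float; higher is better.
    """
    candidates = []
    for i, light in enumerate(features):
        for j, heavy in enumerate(features):
            if i == j:
                continue
            mz_diff = heavy["mz"] - light["mz"]
            rt_diff = abs(heavy["rt"] - light["rt"])
            if abs(mz_diff - optimal_shift) <= mz_dev and rt_diff <= rt_dev:
                candidates.append((score(light, heavy), i, j))

    # Greedy resolution: best score first, each feature used at most once.
    candidates.sort(reverse=True)
    taken, pairs = set(), []
    for _, i, j in candidates:
        if i in taken or j in taken:
            continue
        taken.update((i, j))
        pairs.append((i, j))
    return pairs


features = [
    {"rt": 100.0, "mz": 500.00, "quality": 0.9},
    {"rt": 101.0, "mz": 504.00, "quality": 0.8},
    {"rt": 100.5, "mz": 504.05, "quality": 0.5},  # competing heavy partner
    {"rt": 300.0, "mz": 700.00, "quality": 0.7},  # no partner
]


def example_score(light, heavy):
    # Hypothetical score for this example only.
    return light["quality"] * heavy["quality"]


pairs = find_pairs(features, optimal_shift=4.0, mz_dev=0.1, rt_dev=5.0,
                   score=example_score)
```

Feature 0 has two possible heavy partners; the greedy step keeps the pair with the higher score and drops the other candidate.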

Label-free quantitation

Goal: You want to differentially quantify the features of two or more label-free HPLC-MS maps.

Mapping feature maps can be done with the MapAlignment tool. Please have a look at Example 4: Map alignment.

TOPP_labelfree_quant.png

References

Ole Schulz-Trieglaff, Rene Hussong, Clemens Gröpl, Andreas Leinenbach, Andreas Hildebrandt, Christian Huber, Knut Reinert "Computational Quantification of Peptides from LC-MS data". Journal of Computational Biology, 2008, to appear.

Ole Schulz-Trieglaff, Rene Hussong, Clemens Gröpl, Andreas Hildebrandt, Knut Reinert "A Fast and Accurate Algorithm for the Quantification of Peptides from Mass Spectrometry data". In "Proceedings of the Eleventh Annual International Conference on Research in Computational Molecular Biology (RECOMB 2007)", pages 473-487, 2007.

Bettina Mayr, Oliver Kohlbacher, Knut Reinert, Marc Sturm, Clemens Gröpl, Eva Lange, Christoph Klein, Christian Huber "Absolute Myoglobin Quantitation in Serum by Combining Two-Dimensional Liquid Chromatography-Electrospray Ionization Mass Spectrometry and Novel Data Analysis Algorithms". Journal of Proteome Research, volume 5, pages 414-421, 2006.

Clemens Gröpl, Eva Lange, Knut Reinert, Oliver Kohlbacher, Marc Sturm, Christian G. Huber, Bettina M. Mayr, Christoph L. Klein "Algorithms for the automated absolute quantification of diagnostic markers in complex proteomics samples". In "Proceedings of the 1st International Symposium on Computational Life Science (CompLife05)", pages 151-163, 2005.


Generated Tue Apr 1 15:36:40 2008 -- using doxygen 1.5.4 OpenMS / TOPP 1.1