Runs the protein inference engine Fido.
| pot. predecessor tools | pot. successor tools |
| --- | --- |
| PeptideIndexer (with annotate_proteins option) | ProteinQuantifier (via protein_groups parameter) |
| IDPosteriorErrorProbability (with prob_correct option) | |
This tool wraps the protein inference algorithm Fido (http://noble.gs.washington.edu/proj/fido/). Fido uses a Bayesian probabilistic model to group and score proteins based on peptide-spectrum matches. It was published in:
Serang et al.: Efficient marginalization to compute protein posterior probabilities from shotgun mass spectrometry data (J. Proteome Res., 2010).
By default, this adapter runs the Fido variant with parameter estimation (FidoChooseParameters), as recommended by the authors of Fido. However, it is also possible to run "pure" Fido by setting the prob:protein, prob:peptide and prob:spurious parameters, if appropriate values are known (e.g. from a previous Fido run). Other parameters, except for log2_states, are not applicable in this case.
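The choice between the two modes can be sketched as a small Python helper that assembles an argument list. This is only an illustration: the helper `build_fido_args` is hypothetical, the executable name `FidoAdapter` and the `-in`/`-out` flags are assumed from the usual TOPP conventions, and the rule that all three probabilities must be given together is an assumption made here for clarity, not something stated above.

```python
def build_fido_args(in_path, out_path, prob_protein=None, prob_peptide=None,
                    prob_spurious=None, log2_states=None):
    """Assemble a command line for the adapter (hypothetical helper).

    With no probabilities given, the argument list leaves the tool in its
    default mode (parameter estimation). With all three given, it requests
    "pure" Fido.
    """
    # Assumption: executable name and -in/-out follow TOPP conventions.
    args = ["FidoAdapter", "-in", in_path, "-out", out_path]

    probs = {
        "protein": prob_protein,
        "peptide": prob_peptide,
        "spurious": prob_spurious,
    }
    given = {name: value for name, value in probs.items() if value is not None}

    # Assumption: a partial set of probabilities is treated as a user error.
    if given and len(given) != len(probs):
        missing = sorted(set(probs) - set(given))
        raise ValueError("missing prob parameters: " + ", ".join(missing))

    for name, value in given.items():
        args += ["-prob:" + name, str(value)]

    # log2_states is the one other parameter that applies in both modes.
    if log2_states is not None:
        args += ["-log2_states", str(log2_states)]
    return args
```

Calling `build_fido_args("in.idXML", "out.idXML")` yields a command for the default mode, while passing all three probabilities adds the corresponding `-prob:` options.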
Depending on the separate_runs setting, data from input files containing multiple protein identification runs (e.g. several replicates or different search engines) will be merged (default) or annotated separately.
Input format:
Care has to be taken to provide suitable input data for this adapter. In the peptide/protein identification results (e.g. coming from a database search engine), the proteins have to be annotated with target/decoy meta data. To achieve this, run PeptideIndexer with the annotate_proteins option switched on.
In addition, the scores for peptide hits in the input data have to be posterior probabilities, as produced e.g. by PeptideProphet in the TPP or by IDPosteriorErrorProbability (with the prob_correct option switched on) in OpenMS. Inputs from IDPosteriorErrorProbability (without prob_correct) or from ConsensusID are treated as special cases: their posterior error probabilities (lower is better) are converted to posterior probabilities (higher is better) for processing.
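The conversion mentioned here is presumably the usual complement, i.e. posterior probability = 1 - posterior error probability. A minimal Python sketch under that assumption (the function name `to_posterior_probability` is made up for illustration and is not part of OpenMS):

```python
def to_posterior_probability(score, is_error_probability):
    """Return a "higher is better" posterior probability.

    Assumption: a posterior error probability (PEP) is converted by taking
    its complement, 1 - PEP. Scores that already are posterior
    probabilities are passed through unchanged.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1], got %r" % (score,))
    return 1.0 - score if is_error_probability else score
```

For example, a peptide hit with a posterior error probability of 0.05 would be processed with a posterior probability of about 0.95.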
Output format:
The output of this tool is an augmented version of the input: the protein groups and accompanying posterior probabilities inferred by Fido are stored as "indistinguishable protein groups", attached to the protein identification run(s) of the input data. Also attached are meta values recording the Fido parameters (Fido_prob_protein, Fido_prob_peptide, Fido_prob_spurious).
The result can be passed to ProteinQuantifier via its protein_groups parameter, to have the protein grouping taken into account during quantification.
Note that if the input contains multiple identification runs and separate_runs is not set (the default), the identification data from all runs will be pooled for the Fido analysis and the result will only contain one (merged) identification run. This is the desired behavior if the protein grouping should be used by ProteinQuantifier.
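The difference between pooled and separate processing can be pictured with a toy Python sketch. It does not reflect the adapter's internal data structures: the function `group_for_inference`, the dictionary layout and the key `"merged"` are all hypothetical and serve only to show which peptide hits would end up in the same analysis.

```python
def group_for_inference(runs, separate_runs=False):
    """Decide which peptide hits are analysed together (illustration only).

    runs: dict mapping a run identifier to a list of peptide hits.
    Returns a dict mapping an analysis label to the hits it would contain.
    """
    if separate_runs:
        # One analysis per identification run; runs stay separate.
        return {run_id: list(hits) for run_id, hits in runs.items()}
    # Default: pool the hits of all runs into a single merged analysis.
    merged = [hit for hits in runs.values() for hit in hits]
    return {"merged": merged}
```

With two replicates as input, the default call returns a single entry holding the hits of both, which mirrors the single merged identification run described above; with `separate_runs=True` each replicate keeps its own entry.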
The command line parameters of this tool are:
INI file documentation of this tool:
OpenMS / TOPP release 2.0.0 | Documentation generated on Wed Mar 30 2016 12:49:26 using doxygen 1.8.11 |