Filters protein identification engine results by different criteria.
This tool filters the identifications found by a peptide/protein identification tool such as Mascot.
To enable a filter, change its parameter from the default value. All active filters are applied in order. The following filters are available:
-
score:pep:
This parameter specifies the score a peptide hit must reach in order to be kept.
-
score:prot:
This parameter specifies the score a protein hit must reach in order to be kept.
-
thresh:pep:
This parameter specifies what fraction of the significance threshold a peptide hit must reach in order to be kept. For example, if a peptide has a score of 30 and the significance threshold is 40, the peptide is kept only if the significance threshold fraction is set to 0.75 or lower.
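The fraction rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tool's implementation: the function name `passes_threshold_fraction` is hypothetical, and the sketch assumes that higher scores are better.

```python
def passes_threshold_fraction(score: float,
                              significance_threshold: float,
                              fraction: float) -> bool:
    """Return True if the hit reaches the requested fraction of its threshold.

    Assumption of this sketch: higher scores are better.
    """
    return score >= fraction * significance_threshold


# Worked example from the text: score 30, significance threshold 40.
print(passes_threshold_fraction(30, 40, 0.75))  # True: 30 >= 0.75 * 40
print(passes_threshold_fraction(30, 40, 0.8))   # False: 30 < 0.8 * 40
```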
-
thresh:prot:
This parameter behaves in the same way as the peptide significance threshold fraction parameter. The only difference is that it is used to filter protein hits.
-
whitelist:proteins:
If you know which proteins are in the measured sample, you can specify a FASTA file containing the sequences of those proteins. By default, the filtering is based on the protein identifiers attached to the peptide hits: peptide hits that do not reference a protein in the FASTA file are filtered out. Protein hits not matching any FASTA protein are also removed.
If you want to filter by sequence alone, so that every peptide that is not a substring of a protein in the FASTA file is removed, use the flag whitelist:by_seq_only.
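The difference between the two whitelist modes can be illustrated with a small Python sketch. The `PeptideHit` class, the `whitelist_filter` function and the toy data are hypothetical names made up for this example; the FASTA file is represented as a plain dictionary from protein identifier to sequence, and the sketch does not reproduce how the tool reads or matches its input.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PeptideHit:
    sequence: str
    accessions: List[str] = field(default_factory=list)  # attached protein identifiers


def whitelist_filter(hits: List[PeptideHit],
                     fasta: Dict[str, str],
                     by_seq_only: bool = False) -> List[PeptideHit]:
    """Keep only hits that match a whitelisted protein.

    fasta maps protein identifier -> protein sequence.
    """
    kept = []
    for hit in hits:
        if by_seq_only:
            # Sequence mode: the peptide must be a substring of some protein.
            ok = any(hit.sequence in protein for protein in fasta.values())
        else:
            # Identifier mode: an attached identifier must occur in the FASTA file.
            ok = any(acc in fasta for acc in hit.accessions)
        if ok:
            kept.append(hit)
    return kept


fasta = {"P1": "MKTAYIAKQR"}
hits = [
    PeptideHit("TAYIAK", ["P1"]),  # matches by identifier and by sequence
    PeptideHit("TAYIAK", ["P9"]),  # matches by sequence only
    PeptideHit("GGGG", ["P1"]),    # matches by identifier only
]
by_id = whitelist_filter(hits, fasta)
by_seq = whitelist_filter(hits, fasta, by_seq_only=True)
```

In this toy data set, the identifier-based mode keeps the first and third hit, while the sequence-based mode keeps the first and second.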
-
blacklist:peptides:
For this option, specify an idXML file containing the peptides to exclude. All peptides that are present in both files (the input file and the exclusion peptides file) will be dropped. Protein hits are not affected.
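Conceptually, the blacklist is a set difference on peptides. The following Python sketch assumes that two peptides count as the same when their sequence strings are identical, which is an assumption of this example rather than a statement about the tool's matching rule; `blacklist_filter` is a hypothetical name.

```python
from typing import Iterable, List


def blacklist_filter(peptides: Iterable[str], excluded: Iterable[str]) -> List[str]:
    """Drop every peptide that also occurs in the exclusion list.

    Assumption of this sketch: peptides are compared by exact sequence string.
    """
    excluded_set = set(excluded)  # set lookup keeps the filter linear in the input size
    return [p for p in peptides if p not in excluded_set]


in_file = ["TAYIAK", "LVNELTEFAK", "GGGG"]
exclusion_file = ["GGGG", "AAAA"]
remaining = blacklist_filter(in_file, exclusion_file)
print(remaining)  # ['TAYIAK', 'LVNELTEFAK']
```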
-
rt:
To filter identifications according to their predicted retention times, you have to set 'rt:p_value' and/or 'rt:p_value_1st_dim' to a value larger than 0, depending on which RT dimension you want to filter. This filter can only be applied to idXML files produced by RTPredict.
-
best:n_peptide_hits:
Only the best n peptide hits of a spectrum are kept. If two hits have the same score, their order is random.
-
best:n_protein_hits:
Only the best n protein hits of a spectrum are kept. If two hits have the same score, their order is random.
-
best:strict:
Only the best hit of a spectrum is kept. This is similar to n_peptide_hits=1, except that if two or more hits of a spectrum share the maximum score, none of them is kept.
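The difference between keeping the best n hits and the strict best hit can be sketched as follows. The functions `best_n` and `best_strict` are hypothetical, hits are modelled as (sequence, score) tuples for a single spectrum, and the sketch assumes that higher scores are better. Note that Python's sort is stable, so tied hits keep their input order here, whereas the text above makes no such guarantee for the tool.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (peptide sequence, score) for one spectrum


def best_n(hits: List[Hit], n: int) -> List[Hit]:
    """Keep the n highest-scoring hits (assumes higher scores are better)."""
    return sorted(hits, key=lambda h: h[1], reverse=True)[:n]


def best_strict(hits: List[Hit]) -> List[Hit]:
    """Keep the single best hit, or nothing if the maximum score is tied."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    winners = [h for h in hits if h[1] == top]
    return winners if len(winners) == 1 else []


tied = [("PEPA", 10.0), ("PEPB", 20.0), ("PEPC", 20.0)]
unique = [("PEPA", 10.0), ("PEPB", 20.0)]

top_one_of_tied = best_n(tied, 1)    # one of the two tied hits is kept
strict_of_tied = best_strict(tied)   # tie at the top: nothing is kept
strict_of_unique = best_strict(unique)
```

For the tied spectrum, `best_n(tied, 1)` still returns one hit, while `best_strict(tied)` returns an empty list.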