SuffixArraySeqan Class Reference

Class that uses the SeqAn library for a suffix array. It can be used to find peptide candidates for an MS spectrum. More...

#include <OpenMS/DATASTRUCTURES/SuffixArraySeqan.h>

Inheritance diagram for SuffixArraySeqan:
Base classes: SuffixArray, WeightWrapper. Derived class: SuffixArrayTrypticSeqan.

Public Member Functions

 SuffixArraySeqan (const String &st, const String &filename, const WeightWrapper::WEIGHTMODE weight_mode=WeightWrapper::MONO)
 constructor More...
 
 SuffixArraySeqan (const SuffixArraySeqan &source)
 copy constructor More...
 
virtual ~SuffixArraySeqan ()
 destructor More...
 
String toString ()
 converts suffix array to a printable string More...
 
void findSpec (std::vector< std::vector< std::pair< std::pair< SignedSize, SignedSize >, double > > > &candidates, const std::vector< double > &spec)
 the function that will find all peptide candidates for a given spectrum More...
 
bool save (const String &filename)
 saves the suffix array to disc More...
 
bool open (const String &filename)
 opens the suffix array More...
 
void setTolerance (double t)
 setter for tolerance More...
 
double getTolerance () const
 getter for tolerance More...
 
bool isDigestingEnd (const char aa1, const char aa2) const
 returns whether an enzyme will cut after the first character More...
 
void setTags (const std::vector< OpenMS::String > &tags)
 setter for tags More...
 
const std::vector< OpenMS::String > & getTags ()
 getter for tags More...
 
void setUseTags (bool use_tags)
 setter for use_tags More...
 
bool getUseTags ()
 getter for use_tags More...
 
void setNumberOfModifications (Size number_of_mods)
 setter for number of modifications More...
 
Size getNumberOfModifications ()
 getter for number of modifications More...
 
void printStatistic ()
 output for statistic More...
 
- Public Member Functions inherited from SuffixArray
 SuffixArray (const String &st, const String &filename)
 constructor taking the string and the filename for writing or reading More...
 
 SuffixArray (const SuffixArray &sa)
 copy constructor More...
 
virtual ~SuffixArray ()=0
 destructor More...
 
 SuffixArray ()
 constructor More...
 
- Public Member Functions inherited from WeightWrapper
 WeightWrapper ()
 constructor More...
 
 WeightWrapper (const WEIGHTMODE weight_mode)
 constructor More...
 
virtual ~WeightWrapper ()
 destructor More...
 
 WeightWrapper (const WeightWrapper &source)
 copy constructor More...
 
void setWeightMode (const WEIGHTMODE mode)
 Sets the weight mode (MONO or AVERAGE) More...
 
WEIGHTMODE getWeightMode () const
 Gets the weight mode (MONO or AVERAGE) More...
 
double getWeight (const AASequence &aa) const
 returns the weight of either mono or average value More...
 
double getWeight (const EmpiricalFormula &ef) const
 returns the weight of either mono or average value More...
 
double getWeight (const Residue &r, Residue::ResidueType res_type=Residue::Full) const
 returns the weight of either mono or average value More...
 

Protected Member Functions

void goNextSubTree_ (TIter &it, double &m, std::stack< double > &allm, std::stack< std::map< double, SignedSize > > &mod_map)
 overwriting goNextSubTree_ from seqan index_esa_stree.h for mass update during suffix array traversal More...
 
void goNextSubTree_ (TIter &it)
 goes to the next sub tree More...
 
void goNext_ (TIter &it, double &m, std::stack< double > &allm, std::stack< std::map< double, SignedSize > > &mod_map)
 overwriting goNext from seqan index_esa_stree.h for mass update during suffix array traversal More...
 
void parseTree_ (TIter &it, std::vector< std::pair< SignedSize, SignedSize > > &out_number, std::vector< std::pair< SignedSize, SignedSize > > &edge_length, std::vector< SignedSize > &leafe_depth)
 
SignedSize findFirst_ (const std::vector< double > &spec, double &m)
 binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. More...
 
SignedSize findFirst_ (const std::vector< double > &spec, double &m, SignedSize start, SignedSize end)
 binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. It searches recursively. More...
 

Protected Attributes

TIndex index_
 seqan suffix array More...
 
TIter it_
 seqan suffix array iterator More...
 
const String & s_
 reference to the string for which the suffix array is built More...
 
double masse_ [255]
 amino acid masses More...
 
SignedSize number_of_modifications_
 number of allowed modifications More...
 
std::vector< String > tags_
 all tags More...
 
bool use_tags_
 if tags are used More...
 
double tol_
 tolerance More...
 

Private Types

typedef seqan::TopDown< seqan::ParentLinks<> > TIterSpec
 
typedef seqan::Index< seqan::String< char >, seqan::IndexEsa< TIterSpec > > TIndex
 
typedef seqan::Iter< TIndex, seqan::VSTree< TIterSpec > > TIter
 

Additional Inherited Members

- Public Types inherited from WeightWrapper
enum  WEIGHTMODE { AVERAGE = 0, MONO, SIZE_OF_WEIGHTMODE }
 

Detailed Description

Class that uses the SeqAn library for a suffix array. It can be used to find peptide candidates for an MS spectrum.

This class uses the SeqAn suffix array. It can only be used for finding peptide candidates for a given MS spectrum within a certain mass tolerance. The suffix array can be saved to disc for reuse, so it has to be built only once.

Member Typedef Documentation

typedef seqan::Index<seqan::String<char>, seqan::IndexEsa<TIterSpec> > TIndex
private
typedef seqan::Iter<TIndex, seqan::VSTree<TIterSpec> > TIter
private
typedef seqan::TopDown<seqan::ParentLinks<> > TIterSpec
private

Constructor & Destructor Documentation

SuffixArraySeqan ( const String &  st,
const String &  filename,
const WeightWrapper::WEIGHTMODE  weight_mode = WeightWrapper::MONO 
)

constructor

Parameters
st    const string reference with the string for which the suffix array should be built
filename    const string reference with the filename for opening or saving the suffix array
weight_mode    set this parameter to AVERAGE if average weights should be used instead of monoisotopic weights
Exceptions
FileNotFound    is thrown if the given file is not found
InvalidValue    is thrown if the given suffix array string is invalid
SuffixArraySeqan ( const SuffixArraySeqan &  source)

copy constructor

virtual ~SuffixArraySeqan ( )
virtual

destructor

Member Function Documentation

SignedSize findFirst_ ( const std::vector< double > &  spec,
double &  m 
)
protected

binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance.

Parameters
spec    const reference to the spectrum
m    mass
Returns
SignedSize with the index of the first occurrence
Note
requires that there is at least one occurrence
SignedSize findFirst_ ( const std::vector< double > &  spec,
double &  m,
SignedSize  start,
SignedSize  end 
)
protected

binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. It searches recursively.

Parameters
spec    const reference to the spectrum
m    mass
start    start index
end    end index
Returns
SignedSize with the index of the first occurrence
Note
requires that there is at least one occurrence
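
As an illustration of the search described for findFirst_, the following is a minimal sketch and not the OpenMS implementation. It assumes the spectrum is sorted in ascending order, uses an iterative lower-bound search instead of the recursive form documented above, and returns -1 when nothing matches, whereas the documented functions require at least one match. The name findFirstWithinTolerance and the explicit tol parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the documented helper (not the OpenMS code).
// Returns the index of the first entry of the ascending-sorted `spec` that
// lies within `tol` of `m`, or -1 if no entry matches.
long findFirstWithinTolerance(const std::vector<double>& spec, double m, double tol)
{
  assert(tol >= 0.0); // a negative tolerance makes no sense here

  long lo = 0;
  long hi = static_cast<long>(spec.size()); // search the half-open range [lo, hi)

  // Lower-bound search: find the first index whose value is >= m - tol.
  while (lo < hi)
  {
    const long mid = lo + (hi - lo) / 2;
    if (spec[static_cast<std::size_t>(mid)] < m - tol)
    {
      lo = mid + 1;
    }
    else
    {
      hi = mid;
    }
  }

  // The entry found only counts if it is also <= m + tol.
  if (lo < static_cast<long>(spec.size()) && spec[static_cast<std::size_t>(lo)] <= m + tol)
  {
    return lo;
  }
  return -1;
}
```

Because the search only looks for the lower edge of the tolerance window, every further match, if any, sits directly after the returned index.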
void findSpec ( std::vector< std::vector< std::pair< std::pair< SignedSize, SignedSize >, double > > > &  candidates,
const std::vector< double > &  spec 
)
virtual

the function that will find all peptide candidates for a given spectrum

Parameters
spec    const reference to a vector of doubles describing the spectrum
candidates    output parameter that holds, after the call, the candidates for the masses given in spec
Returns
nothing directly (the function is void); the results are written to candidates.

For every mass in the spectrum, all candidates are reported as pairs of integers. All masses are searched for at the same time in a single suffix array traversal. To accelerate the traversal, the skip and lcp tables are used. The mass is not recalculated for each entry; instead it is updated during the traversal using a stack data structure.

Implements SuffixArray.
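
The nested type of the candidates parameter can be hard to read, so here is a small sketch of how a caller might walk it. It is built on assumptions that this page does not confirm: that the inner pair of integers holds (start, length) of a substring of the indexed text, and that SignedSize is a signed integer type. The alias names and the helper candidateSequences are hypothetical and not part of OpenMS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Local aliases mirroring the documented parameter type.
// ASSUMPTION: SignedSize is a signed integer type.
using SignedSize = std::ptrdiff_t;
using Candidate = std::pair<std::pair<SignedSize, SignedSize>, double>;
using Candidates = std::vector<std::vector<Candidate>>;

// Hypothetical helper: collects the substrings reported for the mass at
// position `spec_index` of the spectrum.
// ASSUMPTION: each candidate stores (start, length) in the indexed text.
std::vector<std::string> candidateSequences(const std::string& text,
                                            const Candidates& candidates,
                                            std::size_t spec_index)
{
  std::vector<std::string> out;
  if (spec_index >= candidates.size())
  {
    return out; // no entry for this spectrum position
  }
  for (const Candidate& c : candidates[spec_index])
  {
    const SignedSize start = c.first.first;
    const SignedSize length = c.first.second;
    // Skip entries that do not describe a valid range of the text.
    if (start < 0 || length < 0 ||
        static_cast<std::size_t>(start) + static_cast<std::size_t>(length) > text.size())
    {
      continue;
    }
    out.push_back(text.substr(static_cast<std::size_t>(start),
                              static_cast<std::size_t>(length)));
  }
  return out;
}
```

The outer vector is indexed by spectrum position, so candidates[i] belongs to spec[i]; the double stored next to each integer pair is left uninterpreted here.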

Size getNumberOfModifications ( )
virtual

getter for number of modifications

Returns
number of modifications

Implements SuffixArray.

const std::vector<OpenMS::String>& getTags ( )
virtual

getter for tags

Returns
const reference to vector of strings

Implements SuffixArray.

double getTolerance ( ) const
virtual

getter for tolerance

Returns
double with tolerance

Implements SuffixArray.

bool getUseTags ( )
virtual

getter for use_tags

Returns
bool indicating whether tags are used or not

Implements SuffixArray.

void goNext_ ( TIter &  it,
double &  m,
std::stack< double > &  allm,
std::stack< std::map< double, SignedSize > > &  mod_map 
)
inline protected

overwriting goNext from seqan index_esa_stree.h for mass update during suffix array traversal

The suffix array is treated as a suffix tree. This function goes to the next node that has not been visited yet. During this traversal, the mass is updated using the stack of edge masses.

Parameters
it    reference to the suffix array iterator
m    reference to the current mass
allm    reference to the stack with the history of the traversal
mod_map    input parameter that specifies the modification masses allowed in the candidates
See also
goNextSubTree_
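
The idea of updating the mass during traversal rather than recomputing it can be sketched as follows. This is a simplified illustration and not the SeqAn or OpenMS code: the class name PathMass and its methods are hypothetical, and the modification map handled by mod_map is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>

// Hypothetical helper, not part of OpenMS or SeqAn: keeps the mass of the
// current root-to-node path while a suffix tree is walked depth-first.
class PathMass
{
public:
  // Going down an edge: remember its mass and add it to the running total.
  void descend(double edge_mass)
  {
    edge_masses_.push(edge_mass);
    mass_ += edge_mass;
  }

  // Going back up one edge: subtract and discard the most recent edge mass.
  void ascend()
  {
    assert(!edge_masses_.empty()); // cannot go above the root
    mass_ -= edge_masses_.top();
    edge_masses_.pop();
  }

  // Mass of the path from the root to the current node.
  double mass() const { return mass_; }

  // Number of edges on the current path.
  std::size_t depth() const { return edge_masses_.size(); }

private:
  std::stack<double> edge_masses_; // one entry per edge on the current path
  double mass_ = 0.0;              // sum of all entries in edge_masses_
};
```

Moving to a sibling node then costs one ascend() and one descend(), independent of how long the path to the root is. Skipping a whole subtree, as goNextSubTree_ is described to do, corresponds to ascending without ever descending into the children.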
void goNextSubTree_ ( TIter &  it,
double &  m,
std::stack< double > &  allm,
std::stack< std::map< double, SignedSize > > &  mod_map 
)
inline protected

overwriting goNextSubTree_ from seqan index_esa_stree.h for mass update during suffix array traversal

The suffix array is treated as a suffix tree. This function skips the subtree under the current node and goes directly to the next subtree that has not been visited yet. During this traversal, the mass is updated using the stack of edge masses.

Parameters
it    reference to the suffix array iterator
m    reference to the current mass
allm    reference to the stack with the history of the traversal
mod_map    input parameter that specifies the modification masses allowed in the candidates
See also
goNext_
void goNextSubTree_ ( TIter &  it)
inline protected

goes to the next sub tree

Parameters
it    reference to the suffix array iterator
See also
goNext_
bool isDigestingEnd ( const char  aa1,
const char  aa2 
) const
virtual

returns whether an enzyme will cut after the first character

Parameters
aa1    const char with the first amino acid
aa2    const char with the second amino acid
Returns
bool describing whether it is a digesting site

Implements SuffixArray.

Reimplemented in SuffixArrayTrypticSeqan.
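
For orientation, a digestion rule of this shape could look like the sketch below. It uses the commonly cited trypsin rule (cleave after K or R unless the next residue is P); this page does not state which rule SuffixArraySeqan or SuffixArrayTrypticSeqan actually apply, so treat the rule as an assumption. The function names isTrypticCleavageSite and digest are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical free function illustrating the common trypsin rule:
// the enzyme cuts after aa1 if aa1 is K or R and aa2 is not P.
bool isTrypticCleavageSite(char aa1, char aa2)
{
  return (aa1 == 'K' || aa1 == 'R') && aa2 != 'P';
}

// Hypothetical helper: splits a protein sequence at every cleavage site.
std::vector<std::string> digest(const std::string& protein)
{
  std::vector<std::string> peptides;
  std::size_t begin = 0;
  // Look at every adjacent pair of residues.
  for (std::size_t i = 0; i + 1 < protein.size(); ++i)
  {
    if (isTrypticCleavageSite(protein[i], protein[i + 1]))
    {
      peptides.push_back(protein.substr(begin, i + 1 - begin));
      begin = i + 1;
    }
  }
  // Whatever remains after the last cut is the final peptide.
  if (begin < protein.size())
  {
    peptides.push_back(protein.substr(begin));
  }
  return peptides;
}
```

A predicate over two adjacent characters, like the documented signature, is enough to express rules that depend on the residue following the cut.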

bool open ( const String &  filename)
virtual

opens the suffix array

Parameters
filename    const reference to a string with the filename
Returns
true if the operation was successful
Exceptions
FileNotFound    is thrown if the given file could not be found

Implements SuffixArray.

void parseTree_ ( TIter &  it,
std::vector< std::pair< SignedSize, SignedSize > > &  out_number,
std::vector< std::pair< SignedSize, SignedSize > > &  edge_length,
std::vector< SignedSize > &  leafe_depth 
)
inline protected
void printStatistic ( )
virtual

output for statistic

Implements SuffixArray.

bool save ( const String &  filename)
virtual

saves the suffix array to disc

Parameters
filename    const reference to a string with the filename
Returns
true if the operation was successful
Exceptions
UnableToCreateFile    is thrown if the output files could not be created

Implements SuffixArray.

void setNumberOfModifications ( Size  number_of_mods)
virtual

setter for number of modifications

Parameters
number_of_mods

Implements SuffixArray.

void setTags ( const std::vector< OpenMS::String > &  tags)
virtual

setter for tags

Parameters
tags    reference to a vector of strings with the tags
Note
sets use_tags = true

Implements SuffixArray.

void setTolerance ( double  t)
virtual

setter for tolerance

Parameters
t    double with the tolerance; only 0 or greater is allowed
Exceptions
InvalidValue    is thrown if the given tolerance is negative

Implements SuffixArray.
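
The documented contract (reject negative tolerances, accept 0 and above) can be sketched in isolation as below. This is not the OpenMS code: ToleranceHolder is a hypothetical stand-in class, std::invalid_argument replaces the OpenMS InvalidValue exception, and the initial tolerance of 0.0 is an assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in class showing only the tolerance setter and getter.
class ToleranceHolder
{
public:
  // Stores the tolerance; throws if it is negative and leaves the old
  // value untouched in that case.
  void setTolerance(double t)
  {
    if (t < 0.0)
    {
      // Stand-in for the InvalidValue exception named in the documentation.
      throw std::invalid_argument("tolerance must be 0 or greater");
    }
    tol_ = t;
  }

  double getTolerance() const { return tol_; }

private:
  double tol_ = 0.0; // ASSUMPTION: the real default is not stated on this page
};
```

Validating before assigning means a failed call cannot leave the object with an invalid tolerance.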

void setUseTags ( bool  use_tags)
virtual

setter for use_tags

Parameters
use_tags    indicates whether tags should be used or not

Implements SuffixArray.

String toString ( )
virtual

converts suffix array to a printable string

Implements SuffixArray.

Member Data Documentation

TIndex index_
protected

seqan suffix array

TIter it_
protected

seqan suffix array iterator

double masse_[255]
protected

amino acid masses

SignedSize number_of_modifications_
protected

number of allowed modifications

const String& s_
protected

reference to the string for which the suffix array is built

std::vector<String> tags_
protected

all tags

double tol_
protected

tolerance

bool use_tags_
protected

if tags are used


OpenMS / TOPP release 2.0.0 Documentation generated on Wed Mar 30 2016 12:49:30 using doxygen 1.8.11