SuffixArraySeqan Class Reference

#include <OpenMS/DATASTRUCTURES/SuffixArraySeqan.h>

Inheritance diagram for SuffixArraySeqan: SuffixArraySeqan inherits from SuffixArray and is the base class of SuffixArrayTrypticSeqan.


Detailed Description

Suffix array implementation based on the SeqAn library. It can be used to find peptide candidates for an MS spectrum.

This class wraps the SeqAn suffix array. Its only purpose is to find peptide candidates for a given MS spectrum within a certain mass tolerance. The suffix array can be saved to disk for reuse, so it has to be built only once.

Public Member Functions

 SuffixArraySeqan (const String &st, const String &sa_file_name) throw (Exception::InvalidValue,Exception::FileNotFound)
 constructor
 SuffixArraySeqan (const SuffixArraySeqan &source)
 copy constructor
virtual ~SuffixArraySeqan ()
 destructor
String toString ()
 converts suffix array to a printable string
void findSpec (std::vector< std::vector< std::pair< std::pair< int, int >, float > > > &candidates, const std::vector< double > &spec) throw (Exception::InvalidValue)
 the function that will find all peptide candidates for a given spectrum
bool save (const String &file_name) throw (Exception::UnableToCreateFile)
 saves the suffix array to disc
bool open (const String &file_name) throw (Exception::FileNotFound)
 opens the suffix array
void setTolerance (double t) throw (Exception::InvalidValue)
 setter for tolerance
double getTolerance () const
 getter for tolerance
bool isDigestingEnd (const char aa1, const char aa2) const
 returns if an enzyme will cut after first character
void setTags (const std::vector< OpenMS::String > &tags) throw (OpenMS::Exception::InvalidValue)
 setter for tags
const std::vector< OpenMS::String > & getTags ()
 getter for tags
void setUseTags (bool use_tags)
 setter for use_tags
bool getUseTags ()
 getter for use_tags
void setNumberOfModifications (unsigned int number_of_mods)
 setter for number of modifications
unsigned int getNumberOfModifications ()
 getter for number of modifications
template<typename TIndex, typename TSpec>
void goNextSubTree (seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &it, double &m, std::stack< double > &allm, std::stack< std::map< double, int > > &mod_map)
 variant of goNextSubTree from SeqAn's index_esa_stree.h that updates the mass during suffix array traversal
template<typename TIndex, typename TSpec>
void goNextSubTree (seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &it)
 goes to the next sub tree
template<typename TIndex, typename TSpec>
void goNext (seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &it, double &m, std::stack< double > &allm, std::stack< std::map< double, int > > &mod_map)
 variant of goNext from SeqAn's index_esa_stree.h that updates the mass during suffix array traversal
template<typename TIndex, typename TSpec>
void parseTree (seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &it, std::vector< std::pair< int, int > > &out_number, std::vector< std::pair< int, int > > &edge_length, std::vector< int > &leafe_depth)
void printStatistic ()
 output for statistic

Protected Member Functions

int findFirst_ (const std::vector< double > &spec, double &m)
 binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance.
int findFirst_ (const std::vector< double > &spec, double &m, int start, int end)
 binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. It searches recursively.

Protected Attributes

TIndex index_
 seqan suffix array
seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< seqan::Preorder > > > > * it_
 seqan suffix array iterator
const String & s_
 reference to the string for which the suffix array is built
double masse_ [255]
 amino acid masses
int number_of_modifications_
 number of allowed modifications
std::vector< String > tags_
 all tags
bool use_tags_
 if tags are used
double tol_
 tolerance

Private Types

typedef seqan::Index< seqan::String< char >, seqan::Index_ESA<> > TIndex


Member Typedef Documentation

typedef seqan::Index<seqan::String<char>, seqan::Index_ESA<> > TIndex [private]


Constructor & Destructor Documentation

SuffixArraySeqan ( const String &  st,
const String &  sa_file_name 
) throw (Exception::InvalidValue,Exception::FileNotFound)

constructor

Parameters:
st const string reference with the string for which the suffix array should be built
sa_file_name const string reference with the filename for opening or saving the suffix array

SuffixArraySeqan ( const SuffixArraySeqan &  source  ) 

copy constructor

virtual ~SuffixArraySeqan (  )  [virtual]

destructor


Member Function Documentation

String toString (  )  [virtual]

converts suffix array to a printable string

Implements SuffixArray.

void findSpec ( std::vector< std::vector< std::pair< std::pair< int, int >, float > > > &  candidates,
const std::vector< double > &  spec 
) throw (Exception::InvalidValue) [virtual]

the function that will find all peptide candidates for a given spectrum

Parameters:
spec const reference of double vector describing the spectrum
Returns:
for every mass in the spectrum, a vector of candidates, each described by a pair of ints. The function itself returns void; the results are delivered through the candidates parameter.
All masses are searched for at the same time in a single suffix array traversal. To accelerate the traversal, the skip and LCP tables are used. The mass is not recalculated for each entry; instead it is updated during the traversal using a stack data structure.

Implements SuffixArray.
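
To make the result layout concrete, here is a brute-force sketch of the kind of search findSpec performs. It is not the suffix array traversal used by this class: it simply enumerates all substrings while keeping a running mass. The names residueMass and findCandidatesNaive, the small residue mass table, and the reading of each int pair as (start index, length) are assumptions made for this illustration; the float stored with each candidate in the real signature is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lookup of monoisotopic residue masses (in Da) for a few amino acids.
// Any other character yields a negative value.
double residueMass(char aa)
{
  switch (aa)
  {
    case 'G': return 57.02146;
    case 'A': return 71.03711;
    case 'S': return 87.03203;
    case 'V': return 99.06841;
    case 'K': return 128.09496;
    case 'R': return 156.10111;
    default:  return -1.0;
  }
}

// Assumed meaning of a candidate for this sketch: (start index, length).
using Candidate = std::pair<int, int>;

// For every mass in spec, collect all substrings of text whose summed residue
// mass lies within tol of that mass. Runs in O(n^2 * |spec|) time.
std::vector<std::vector<Candidate>> findCandidatesNaive(
    const std::string& text, const std::vector<double>& spec, double tol)
{
  assert(tol >= 0.0);
  std::vector<std::vector<Candidate>> candidates(spec.size());
  for (std::size_t start = 0; start < text.size(); ++start)
  {
    double mass = 0.0;  // running mass of text[start, start + len)
    for (std::size_t len = 1; start + len <= text.size(); ++len)
    {
      const double m = residueMass(text[start + len - 1]);
      if (m < 0.0) break;  // stop at characters without a known mass
      mass += m;           // extend the substring by one residue
      for (std::size_t i = 0; i < spec.size(); ++i)
      {
        if (std::fabs(mass - spec[i]) <= tol)
        {
          candidates[i].emplace_back(static_cast<int>(start), static_cast<int>(len));
        }
      }
    }
  }
  return candidates;
}
```

The outer vector has one entry per spectrum mass, in the same order as spec, which mirrors the nesting of the candidates parameter above. The suffix array traversal avoids repeating the mass summation for shared prefixes, which this sketch does not attempt.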

bool save ( const String &  file_name  )  throw (Exception::UnableToCreateFile) [virtual]

saves the suffix array to disc

Parameters:
file_name const reference string describing the filename
Returns:
bool indicating whether the operation was successful

Implements SuffixArray.

bool open ( const String &  file_name  )  throw (Exception::FileNotFound) [virtual]

opens the suffix array

Parameters:
file_name const reference string describing the filename
Returns:
bool indicating whether the operation was successful

Implements SuffixArray.

void setTolerance ( double  t  )  throw (Exception::InvalidValue) [virtual]

setter for tolerance

Parameters:
t double with tolerance

Implements SuffixArray.

double getTolerance (  )  const [virtual]

getter for tolerance

Returns:
double with tolerance

Implements SuffixArray.

bool isDigestingEnd ( const char  aa1,
const char  aa2 
) const [virtual]

returns if an enzyme will cut after first character

Parameters:
aa1 const char as first amino acid
aa2 const char as second amino acid
Returns:
bool describing whether it is a digesting site

Implements SuffixArray.

Reimplemented in SuffixArrayTrypticSeqan.
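
As an illustration of what a digestion-site predicate can look like, the sketch below uses the commonly cited trypsin rule: cut after K or R unless the next residue is P. This page does not state which rule SuffixArraySeqan or SuffixArrayTrypticSeqan actually applies, so the rule and the names isTrypticSite and cleavageSites are assumptions for this example only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical predicate with the same shape as isDigestingEnd:
// true if an enzyme would cut between aa1 and aa2.
// Assumed rule: cut after K or R, but not when P follows.
bool isTrypticSite(char aa1, char aa2)
{
  return (aa1 == 'K' || aa1 == 'R') && aa2 != 'P';
}

// Collects every index i for which a cut falls between text[i] and text[i + 1].
std::vector<std::size_t> cleavageSites(const std::string& text)
{
  std::vector<std::size_t> sites;
  for (std::size_t i = 0; i + 1 < text.size(); ++i)
  {
    if (isTrypticSite(text[i], text[i + 1]))
    {
      sites.push_back(i);
    }
  }
  return sites;
}
```

Keeping the rule in a two-character predicate is what allows a derived class to swap in a different enzyme without touching the traversal code.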

void setTags ( const std::vector< OpenMS::String > &  tags  )  throw (OpenMS::Exception::InvalidValue) [virtual]

setter for tags

Parameters:
tags reference to vector of strings with tags
Note:
sets use_tags = true

Implements SuffixArray.

const std::vector<OpenMS::String>& getTags (  )  [virtual]

getter for tags

Returns:
const reference to vector of strings

Implements SuffixArray.

void setUseTags ( bool  use_tags  )  [virtual]

setter for use_tags

Parameters:
use_tags indicating whether tags should be used or not

Implements SuffixArray.

bool getUseTags (  )  [virtual]

getter for use_tags

Returns:
bool indicating whether tags are used or not

Implements SuffixArray.

void setNumberOfModifications ( unsigned int  number_of_mods  )  [virtual]

setter for number of modifications

Parameters:
number_of_mods number of allowed modifications

Implements SuffixArray.

unsigned int getNumberOfModifications (  )  [virtual]

getter for number of modifications

Returns:
number of modifications

Implements SuffixArray.

void goNextSubTree ( seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &  it,
double &  m,
std::stack< double > &  allm,
std::stack< std::map< double, int > > &  mod_map 
) [inline]

variant of goNextSubTree from SeqAn's index_esa_stree.h that updates the mass during suffix array traversal

The suffix array is treated as a suffix tree. This function skips the subtree under the current node and goes directly to the next subtree that has not been visited yet. During this traversal the mass is updated using the stack of edge masses.

Parameters:
it reference to the suffix array iterator
m reference to the current mass
allm reference to the stack with history of traversal
See also:
goNext

void goNextSubTree ( seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &  it  )  [inline]

goes to the next sub tree

Parameters:
it reference to the suffix array iterator
See also:
goNext

void goNext ( seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &  it,
double &  m,
std::stack< double > &  allm,
std::stack< std::map< double, int > > &  mod_map 
) [inline]

variant of goNext from SeqAn's index_esa_stree.h that updates the mass during suffix array traversal

The suffix array is treated as a suffix tree. This function goes to the next node that has not been visited yet. During this traversal the mass is updated using the stack of edge masses.

Parameters:
it reference to the suffix array iterator
m reference to the current mass
allm reference to the stack with history of traversal
See also:
goNextSubTree
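
The idea of keeping the mass current with a stack of edge masses can be sketched on a small explicit tree. This is not the SeqAn iterator code: Node, traverse, and the max_mass pruning criterion are hypothetical and only show the bookkeeping, with m and allm playing the roles of the parameters of the same names above.

```cpp
#include <stack>
#include <vector>

// Hypothetical tree node: edge_mass is the mass of the edge label leading into it.
struct Node
{
  double edge_mass;
  std::vector<Node> children;
};

// Preorder traversal that keeps the mass of the current path in m.
// Going down an edge adds its mass and pushes it onto allm; going back up
// pops it and subtracts it, so no path mass is ever recomputed from scratch.
// Assumed pruning rule for this sketch: once m exceeds max_mass, the subtree
// below the current node is skipped.
void traverse(const Node& node, double max_mass, double& m,
              std::stack<double>& allm, std::vector<double>& visited)
{
  m += node.edge_mass;
  allm.push(node.edge_mass);
  visited.push_back(m);  // record the path mass at this node

  if (m <= max_mass)
  {
    for (const Node& child : node.children)
    {
      traverse(child, max_mass, m, allm, visited);
    }
  }

  m -= allm.top();  // undo the edge when leaving the node
  allm.pop();
}
```

Skipping a subtree costs nothing extra here because the only state to restore is the top of the stack.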

void parseTree ( seqan::Iter< TIndex, seqan::VSTree< seqan::TopDown< seqan::ParentLinks< TSpec > > > > &  it,
std::vector< std::pair< int, int > > &  out_number,
std::vector< std::pair< int, int > > &  edge_length,
std::vector< int > &  leafe_depth 
) [inline]

void printStatistic (  )  [virtual]

output for statistic

Implements SuffixArray.

int findFirst_ ( const std::vector< double > &  spec,
double &  m 
) [protected]

binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance.

Parameters:
spec const reference to spectrum
m mass
Returns:
int with the index of the first occurrence
Note:
requires that there is at least one occurrence

int findFirst_ ( const std::vector< double > &  spec,
double &  m,
int  start,
int  end 
) [protected]

binary search for finding the index of the first element of the spectrum that matches the desired mass within the tolerance. It searches recursively.

Parameters:
spec const reference to spectrum
m mass
start start index
end end index
Returns:
int with the index of the first occurrence
Note:
requires that there is at least one occurrence
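
A recursive binary search of this kind could look like the following sketch. It assumes the spectrum is sorted in ascending order and, unlike the members above, takes the tolerance as an explicit parameter instead of reading tol_; the names findFirstRec and findFirst are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Searches the inclusive index range [start, end] of an ascending spectrum for
// the first element that is not smaller than m - tol. If at least one element
// lies within tol of m (the precondition noted above), that element is the
// first match.
int findFirstRec(const std::vector<double>& spec, double m, double tol,
                 int start, int end)
{
  if (start == end)
  {
    return start;
  }
  const int mid = start + (end - start) / 2;
  if (spec[mid] < m - tol)
  {
    // everything up to mid is too light, continue in the upper half
    return findFirstRec(spec, m, tol, mid + 1, end);
  }
  // spec[mid] may itself be the first match, so keep it in the range
  return findFirstRec(spec, m, tol, start, mid);
}

// Convenience wrapper that searches the whole spectrum.
int findFirst(const std::vector<double>& spec, double m, double tol)
{
  assert(!spec.empty());
  return findFirstRec(spec, m, tol, 0, static_cast<int>(spec.size()) - 1);
}
```

Because only the lower bound m - tol is tested, the result is meaningful only when a match exists; a caller that cannot guarantee this would need to check the returned element against m + tol as well.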


Member Data Documentation

TIndex index_ [protected]

seqan suffix array

seqan::Iter<TIndex, seqan::VSTree<seqan::TopDown<seqan::ParentLinks<seqan::Preorder> > > >* it_ [protected]

seqan suffix array iterator

const String& s_ [protected]

reference to the string for which the suffix array is built

double masse_[255] [protected]

amino acid masses

int number_of_modifications_ [protected]

number of allowed modifications

std::vector<String> tags_ [protected]

all tags

bool use_tags_ [protected]

if tags are used

double tol_ [protected]

tolerance


The documentation for this class was generated from the following file:
Generated Tue Apr 1 15:36:44 2008 -- using doxygen 1.5.4 OpenMS / TOPP 1.1