#include <OpenMS/ANALYSIS/SVM/SVMWrapper.h>
This class can be used for SVM predictions. It supports both classification and regression, with a choice of kernel functions and additional parameters. Models can be saved and loaded. The class also supports a new kernel function that was specially designed for learning with short sequences of different lengths.
Public Member Functions | |
SVMWrapper () | |
standard constructor | |
~SVMWrapper () | |
destructor | |
void | setParameter (SVM_parameter_type type, Int value) |
Sets the integer parameters of the SVM. | |
void | setParameter (SVM_parameter_type type, DoubleReal value) |
sets the double parameters of the svm | |
Int | train (struct svm_problem *problem) |
trains the svm | |
void | saveModel (std::string modelFilename) const throw (Exception::UnableToCreateFile) |
saves the svm model | |
void | loadModel (std::string modelFilename) |
loads the model | |
void | predict (struct svm_problem *predictProblem, std::vector< DoubleReal > &predicted_rts) |
predicts the labels using the trained model | |
Int | getIntParameter (SVM_parameter_type type) |
Returns the current integer parameters of the SVM. | |
DoubleReal | getDoubleParameter (SVM_parameter_type type) |
Returns the current double parameters of the SVM. | |
void | predict (const std::vector< svm_node * > &vectors, std::vector< DoubleReal > &predicted_rts) |
predicts the labels using the trained model | |
DoubleReal | performCrossValidation (svm_problem *problem, const std::map< SVM_parameter_type, DoubleReal > &start_values, const std::map< SVM_parameter_type, DoubleReal > &step_sizes, const std::map< SVM_parameter_type, DoubleReal > &end_values, UInt number_of_partitions, UInt number_of_runs, std::map< SVM_parameter_type, DoubleReal > &best_parameters, bool additive_step_size=true, bool output=false, String performances_file_name="performances.txt", bool mcc_as_performance_measure=false) |
Performs a cross-validation (CV) for the data given by 'problem'. | |
DoubleReal | getSVRProbability () |
Returns the probability parameter sigma of the fitted Laplace model. | |
void | getSignificanceBorders (svm_problem *data, std::pair< DoubleReal, DoubleReal > &borders, DoubleReal confidence=0.95, UInt number_of_runs=10, UInt number_of_partitions=5, DoubleReal step_size=0.01, UInt max_iterations=1000000) |
calculates the significance borders of the error model and stores them in 'borders' | |
DoubleReal | getPValue (DoubleReal sigma1, DoubleReal sigma2, std::pair< DoubleReal, DoubleReal > point) |
calculates a p-value for a given data point using the model parameters | |
void | getDecisionValues (svm_problem *data, std::vector< DoubleReal > &decision_values) |
stores the prediction values for the encoded data in 'decision_values' | |
void | scaleData (svm_problem *data, Int max_scale_value=-1) |
Scales the data such that every column is scaled to [-1, 1]. | |
svm_problem * | computeKernelMatrix (svm_problem *problem1, svm_problem *problem2) |
computes the kernel matrix using the current SVM parameters and the given data | |
void | setTrainingSample (svm_problem *training_sample) |
This is needed to perform predictions with kernels that are not standard libsvm kernels. | |
void | getSVCProbabilities (struct svm_problem *problem, std::vector< DoubleReal > &probabilities, std::vector< DoubleReal > &prediction_labels) |
This function fills probabilities with the probability estimates for the first class. | |
void | setWeights (const std::vector< Int > &weight_labels, const std::vector< DoubleReal > &weights) |
Sets weights for the classes in C_SVC (see libsvm documentation for further details). | |
Static Public Member Functions | |
static void | createRandomPartitions (svm_problem *problem, UInt number, std::vector< svm_problem * > &partitions) |
You can create 'number' equally sized random partitions. | |
static svm_problem * | mergePartitions (const std::vector< svm_problem * > &problems, UInt except) |
You can merge partitions excluding the partition with index 'except'. | |
static void | getLabels (svm_problem *problem, std::vector< DoubleReal > &labels) |
Stores the labels of the encoded SVM data in 'labels'. | |
static DoubleReal | kernelOligo (const svm_node *x, const svm_node *y, const std::vector< DoubleReal > &gauss_table, DoubleReal sigma_square=0, UInt max_distance=50) |
calculates the oligo kernel value for the encoded sequences 'x' and 'y' | |
static void | calculateGaussTable (UInt border_length, DoubleReal sigma, std::vector< DoubleReal > &gauss_table) |
Protected Member Functions | |
void | destroyProblem_ (svm_problem *problem) |
Private Member Functions | |
UInt | getNumberOfEnclosedPoints_ (DoubleReal m1, DoubleReal m2, const std::vector< std::pair< DoubleReal, DoubleReal > > &points) |
void | initParameters_ () |
Initializes the svm with standard parameters. | |
Private Attributes | |
svm_parameter * | param_ |
svm_model * | model_ |
DoubleReal | sigma_ |
std::vector< DoubleReal > | sigmas_ |
std::vector< DoubleReal > | gauss_table_ |
std::vector< std::vector < DoubleReal > > | gauss_tables_ |
UInt | kernel_type_ |
UInt | border_length_ |
svm_problem * | training_set_ |
svm_problem * | training_problem_ |
SVMWrapper | ( | ) |
standard constructor
~SVMWrapper | ( | ) |
destructor
void setParameter | ( | SVM_parameter_type | type, | |
Int | value | |||
) |
Sets the integer parameters of the SVM.
KERNEL_TYPE: can be LINEAR for the linear kernel, RBF for the RBF kernel, POLY for the polynomial kernel, or SIGMOID for the sigmoid kernel
DEGREE: the degree for the polynomial kernel and the locality-improved kernel
C: the C parameter of the SVM
void setParameter | ( | SVM_parameter_type | type, | |
DoubleReal | value | |||
) |
sets the double parameters of the svm
Int train | ( | struct svm_problem * | problem | ) |
trains the svm
The svm is trained with the data stored in the 'svm_problem' structure.
void saveModel | ( | std::string | modelFilename | ) | const throw (Exception::UnableToCreateFile) |
saves the svm model
The model of the trained svm is saved into 'modelFilename'. Throws an exception if the model cannot be saved.
void loadModel | ( | std::string | modelFilename | ) |
loads the model
The SVM model is loaded. After this, the SVM is ready for prediction.
void predict | ( | struct svm_problem * | predictProblem, | |
std::vector< DoubleReal > & | predicted_rts | |||
) |
predicts the labels using the trained model
The prediction process is started and the results are stored in 'predicted_rts'.
Int getIntParameter | ( | SVM_parameter_type | type | ) |
Returns the current integer parameters of the SVM.
KERNEL_TYPE: can be LINEAR for the linear kernel, RBF for the RBF kernel, POLY for the polynomial kernel, or SIGMOID for the sigmoid kernel
DEGREE: the degree for the polynomial kernel and the locality-improved kernel
SVM_TYPE: the type of the SVM: can be NU_SVR or EPSILON_SVR
DoubleReal getDoubleParameter | ( | SVM_parameter_type | type | ) |
Returns the current double parameters of the SVM.
C: the C parameter of the SVM
P: the P parameter of the SVM (sets the epsilon in epsilon-SVR)
NU: the nu parameter in nu-SVR
GAMMA: the gamma parameter for the POLY, RBF and SIGMOID kernels
static void createRandomPartitions | ( | svm_problem * | problem, | |
UInt | number, | |||
std::vector< svm_problem * > & | partitions | |||
) | [static] |
You can create 'number' equally sized random partitions.
This function creates 'number' equally sized random partitions and stores them in 'partitions'.
static svm_problem* mergePartitions | ( | const std::vector< svm_problem * > & | problems, | |
UInt | except | |||
) | [static] |
You can merge partitions excluding the partition with index 'except'.
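The partition and merge steps can be illustrated without libsvm types. The following is a minimal sketch, not the OpenMS implementation: it works on sample indices instead of svm_problem structures, and the helper names randomPartitions and mergeExcept are hypothetical. It assumes that "equally sized" means sizes differing by at most one when the sample count is not divisible by 'number'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: split the sample indices 0..n-1 into `number` random
// partitions whose sizes differ by at most one.
std::vector<std::vector<std::size_t>> randomPartitions(std::size_t n,
                                                       std::size_t number,
                                                       unsigned seed)
{
  assert(number > 0);
  std::vector<std::size_t> indices(n);
  std::iota(indices.begin(), indices.end(), std::size_t{0});

  // Shuffle once, then deal the indices out round-robin.
  std::mt19937 rng(seed);
  std::shuffle(indices.begin(), indices.end(), rng);

  std::vector<std::vector<std::size_t>> partitions(number);
  for (std::size_t i = 0; i < n; ++i)
  {
    partitions[i % number].push_back(indices[i]);
  }
  return partitions;
}

// Hypothetical helper: concatenate all partitions except the one at index
// `except`, which would serve as the held-out set.
std::vector<std::size_t> mergeExcept(
    const std::vector<std::vector<std::size_t>>& partitions, std::size_t except)
{
  std::vector<std::size_t> merged;
  for (std::size_t p = 0; p < partitions.size(); ++p)
  {
    if (p == except) continue;
    merged.insert(merged.end(), partitions[p].begin(), partitions[p].end());
  }
  return merged;
}
```

In a cross-validation loop, each partition would take a turn as the held-out set while the merged remainder is used for training.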
void predict | ( | const std::vector< svm_node * > & | vectors, | |
std::vector< DoubleReal > & | predicted_rts | |||
) |
predicts the labels using the trained model
The prediction process is started and the results are stored in 'predicted_rts'.
static void getLabels | ( | svm_problem * | problem, | |
std::vector< DoubleReal > & | labels | |||
) | [static] |
Stores the labels of the encoded SVM data in 'labels'.
DoubleReal performCrossValidation | ( | svm_problem * | problem, | |
const std::map< SVM_parameter_type, DoubleReal > & | start_values, | |||
const std::map< SVM_parameter_type, DoubleReal > & | step_sizes, | |||
const std::map< SVM_parameter_type, DoubleReal > & | end_values, | |||
UInt | number_of_partitions, | |||
UInt | number_of_runs, | |||
std::map< SVM_parameter_type, DoubleReal > & | best_parameters, | |||
bool | additive_step_size = true, | |||
bool | output = false, | |||
String | performances_file_name = "performances.txt", | |||
bool | mcc_as_performance_measure = false | |||
) |
Performs a cross-validation (CV) for the data given by 'problem'.
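The signature suggests a grid search: each parameter runs from its start value to its end value in steps, and the best combination is reported in 'best_parameters'. The sketch below shows this idea for a single parameter. It is an assumption, not the documented behavior, that 'additive_step_size = false' means the step is applied multiplicatively. The helpers gridValues and bestCandidate are hypothetical, and the score function stands in for the cross-validation performance.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: enumerate the candidate values of one parameter.
// With additive steps the value grows by `step`; otherwise it is multiplied
// by `step` (assumed meaning of additive_step_size == false).
std::vector<double> gridValues(double start, double step, double end,
                               bool additive_step_size)
{
  // Guard against step sizes that would never reach `end`
  // (only checked when assertions are enabled).
  assert(additive_step_size ? step > 0.0 : (step > 1.0 && start > 0.0));
  std::vector<double> values;
  for (double v = start; v <= end; v = additive_step_size ? v + step : v * step)
  {
    values.push_back(v);
  }
  return values;
}

// Hypothetical helper: return the candidate with the highest score. `score`
// is a placeholder for the performance averaged over partitions and runs.
template <typename ScoreFunction>
double bestCandidate(const std::vector<double>& candidates, ScoreFunction score)
{
  assert(!candidates.empty());
  double best = candidates.front();
  double best_score = score(best);
  for (double c : candidates)
  {
    const double s = score(c);
    if (s > best_score)
    {
      best_score = s;
      best = c;
    }
  }
  return best;
}
```

A multiplicative step is a common choice for parameters such as C, where useful values span several orders of magnitude.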
DoubleReal getSVRProbability | ( | ) |
Returns the probability parameter sigma of the fitted Laplace model.
libsvm fits a Laplace model to the prediction values by performing an internal cross-validation on the training set, provided that setParameter(PROBABILITY, 1) was invoked before calling train. See the libsvm documentation for more details. This method returns the model parameter sigma. If no model was fitted during training, zero is returned.
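To show how sigma can be interpreted: for a Laplace error model with density exp(-|z|/sigma) / (2 sigma), the probability that the absolute error stays below t is 1 - exp(-t/sigma). The sketch below applies that formula. The function names are hypothetical and not part of SVMWrapper.

```cpp
#include <cassert>
#include <cmath>

// Half-width t such that P(|z| <= t) == confidence for Laplace-distributed
// errors z with scale parameter sigma.
double laplaceHalfWidth(double sigma, double confidence)
{
  assert(sigma > 0.0);
  assert(confidence >= 0.0 && confidence < 1.0);
  return -sigma * std::log(1.0 - confidence);
}

// Probability that the absolute error is at most t under the same model.
double laplaceCoverage(double sigma, double t)
{
  assert(sigma > 0.0 && t >= 0.0);
  return 1.0 - std::exp(-t / sigma);
}
```

With sigma taken from getSVRProbability(), laplaceHalfWidth(sigma, 0.95) would give the distance around a prediction that is expected to contain 95% of the errors, under the assumption that the Laplace model fits.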
static DoubleReal kernelOligo | ( | const svm_node * | x, | |
const svm_node * | y, | |||
const std::vector< DoubleReal > & | gauss_table, | |||
DoubleReal | sigma_square = 0, | |||
UInt | max_distance = 50 | |||
) | [static] |
calculates the oligo kernel value for the encoded sequences 'x' and 'y'
This kernel function calculates the oligo kernel value [Meinicke 04] for the sequences 'x' and 'y' that have been encoded by the encodeOligoBorder... function of the LibSVMEncoder class.
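The idea behind the oligo kernel can be sketched as follows: every pair of identical k-mers in the two sequences contributes a Gaussian term that decays with the distance between their positions, and those terms can be looked up in a precomputed table. This is a simplified illustration under stated assumptions, not the OpenMS code. The (position, k-mer id) encoding and the names gaussTable, Occurrence and oligoKernelSketch are hypothetical, the term exp(-d^2 / (4 sigma^2)) follows the published oligo kernel formula, and the constant prefactor is omitted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Precompute exp(-d^2 / (4 sigma^2)) for the distances d = 0 .. border_length - 1.
std::vector<double> gaussTable(std::size_t border_length, double sigma)
{
  assert(sigma > 0.0);
  std::vector<double> table(border_length);
  for (std::size_t d = 0; d < border_length; ++d)
  {
    const double dist = static_cast<double>(d);
    table[d] = std::exp(-(dist * dist) / (4.0 * sigma * sigma));
  }
  return table;
}

// Hypothetical encoding of one k-mer occurrence: (position, k-mer id).
using Occurrence = std::pair<int, int>;

// Sum the table entries over all pairs of occurrences that share a k-mer id.
// Pairs farther apart than the table covers are ignored.
double oligoKernelSketch(const std::vector<Occurrence>& x,
                         const std::vector<Occurrence>& y,
                         const std::vector<double>& table)
{
  double sum = 0.0;
  for (const Occurrence& a : x)
  {
    for (const Occurrence& b : y)
    {
      if (a.second != b.second) continue; // different k-mers do not interact
      const std::size_t d = static_cast<std::size_t>(std::abs(a.first - b.first));
      if (d < table.size()) sum += table[d];
    }
  }
  return sum;
}
```

The table lookup replaces a call to exp for every matching pair, which is presumably why calculateGaussTable exists as a separate step.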
void getSignificanceBorders | ( | svm_problem * | data, | |
std::pair< DoubleReal, DoubleReal > & | borders, | |||
DoubleReal | confidence = 0.95, | |||
UInt | number_of_runs = 10, | |||
UInt | number_of_partitions = 5, | |||
DoubleReal | step_size = 0.01, | |||
UInt | max_iterations = 1000000 | |||
) |
calculates the significance borders of the error model and stores them in 'borders'
DoubleReal getPValue | ( | DoubleReal | sigma1, | |
DoubleReal | sigma2, | |||
std::pair< DoubleReal, DoubleReal > | point | |||
) |
calculates a p-value for a given data point using the model parameters
Uses the model parameters to calculate the p-value for 'point' which has the data entries: measured, predicted retention time.
void getDecisionValues | ( | svm_problem * | data, | |
std::vector< DoubleReal > & | decision_values | |||
) |
stores the prediction values for the encoded data in 'decision_values'
This function can be used to get the prediction values of the data if a model is already trained by the train() method. For regression the result is the same as for the method predict. For classification this function returns the distance from the separating hyperplane. For multiclass classification the decision_values vector will be empty.
void scaleData | ( | svm_problem * | data, | |
Int | max_scale_value = -1 | |||
) |
Scales the data such that every column is scaled to [-1, 1].
Scales the x[][].value values of the svm_problem* structure. If the second parameter is omitted, the data is scaled to [-1, 1]. Otherwise the data is scaled to [0, max_scale_value].
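The column-wise scaling described here is a min-max transformation. A minimal sketch on a dense matrix follows. It is not the OpenMS implementation, which operates on the sparse svm_node layout. The name scaleColumns is hypothetical, and mapping a constant column to the lower bound is an assumption made only to avoid a division by zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: scale every column of a dense row-major matrix to
// [-1, 1] (max_scale_value == -1) or to [0, max_scale_value] otherwise.
void scaleColumns(std::vector<std::vector<double>>& rows, int max_scale_value = -1)
{
  if (rows.empty()) return;
  const double lower = (max_scale_value == -1) ? -1.0 : 0.0;
  const double upper = (max_scale_value == -1) ? 1.0 : static_cast<double>(max_scale_value);
  const std::size_t columns = rows.front().size();

  for (std::size_t c = 0; c < columns; ++c)
  {
    // Find the value range of this column.
    double lo = rows.front()[c];
    double hi = rows.front()[c];
    for (const auto& row : rows)
    {
      assert(row.size() == columns);
      lo = std::min(lo, row[c]);
      hi = std::max(hi, row[c]);
    }
    // Map [lo, hi] linearly onto [lower, upper].
    for (auto& row : rows)
    {
      row[c] = (hi == lo) ? lower
                          : lower + (upper - lower) * (row[c] - lo) / (hi - lo);
    }
  }
}
```

When scaling is applied before training, data used for prediction would need the same column ranges for the model to see comparable values.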
static void calculateGaussTable | ( | UInt | border_length, | |
DoubleReal | sigma, | |||
std::vector< DoubleReal > & | gauss_table | |||
) | [static] |
svm_problem* computeKernelMatrix | ( | svm_problem * | problem1, | |
svm_problem * | problem2 | |||
) |
computes the kernel matrix using the actual svm parameters and the given data
This function can be used to compute a kernel matrix. 'problem1' and 'problem2' are used together with the oligo kernel function (this could be extended if you want to use your own kernel functions).
void setTrainingSample | ( | svm_problem * | training_sample | ) |
This is needed to perform predictions with kernels that are not standard libsvm kernels.
void getSVCProbabilities | ( | struct svm_problem * | problem, | |
std::vector< DoubleReal > & | probabilities, | |||
std::vector< DoubleReal > & | prediction_labels | |||
) |
This function fills probabilities with the probability estimates for the first class.
The libSVM function svm_predict_probability is called to get probability estimates for the positive class. Since this is only used for binary classification it is sufficient for every test example to report the probability of the test example belonging to the positive class. Probability estimates have to be turned on during training (svm.setParameter(PROBABILITY, 1)), otherwise this method will fill the 'probabilities' vector with -1s.
void setWeights | ( | const std::vector< Int > & | weight_labels, | |
const std::vector< DoubleReal > & | weights | |||
) |
Sets weights for the classes in C_SVC (see libsvm documentation for further details).
void destroyProblem_ | ( | svm_problem * | problem | ) | [protected] |
UInt getNumberOfEnclosedPoints_ | ( | DoubleReal | m1, | |
DoubleReal | m2, | |||
const std::vector< std::pair< DoubleReal, DoubleReal > > & | points | |||
) | [private] |
void initParameters_ | ( | ) | [private] |
Initializes the svm with standard parameters.
svm_parameter* param_ [private] |
svm_model* model_ [private] |
DoubleReal sigma_ [private] |
std::vector<DoubleReal> sigmas_ [private] |
std::vector<DoubleReal> gauss_table_ [private] |
std::vector<std::vector<DoubleReal> > gauss_tables_ [private] |
UInt kernel_type_ [private] |
UInt border_length_ [private] |
svm_problem* training_set_ [private] |
svm_problem* training_problem_ [private] |
Generated Tue Apr 1 15:36:42 2008 -- using doxygen 1.5.4 | OpenMS / TOPP 1.1 |