mlpack
1.0.12
Public Member Functions | |
CosineTree (const arma::mat &dataset) | |
CosineTree constructor for the root node of the tree. More... | |
CosineTree (CosineTree &parentNode, const std::vector< size_t > &subIndices) | |
CosineTree constructor for nodes other than the root node of the tree. More... | |
CosineTree (const arma::mat &dataset, const double epsilon, const double delta) | |
Construct the CosineTree and the basis for the given matrix, and passed 'epsilon' and 'delta' parameters. More... | |
~CosineTree () | |
Destroy the cosine tree and all of its children (take care of the memory allocations too). More... | |
void | BasisVector (arma::vec &bVector) |
Set the basis vector of the node. More... | |
arma::vec & | BasisVector () |
Get the basis vector of the node. More... | |
size_t | BinarySearch (arma::vec &cDistribution, double value, size_t start, size_t end) |
Sample a column based on the cumulative Length-Squared distribution of the cosine node, and a randomly generated value in the range [0, 1]. More... | |
void | CalculateCentroid () |
Calculate centroid of the columns present in the node. More... | |
void | CalculateCosines (arma::vec &cosines) |
Calculate cosines of the columns present in the node, with respect to the sampled splitting point. More... | |
arma::vec & | Centroid () |
Get a reference to the centroid vector. More... | |
size_t | ColumnSampleLS () |
Sample a point from the Length-Squared distribution of the cosine node. More... | |
void | ColumnSamplesLS (std::vector< size_t > &sampledIndices, arma::vec &probabilities, size_t numSamples) |
Sample 'numSamples' points from the Length-Squared distribution of the cosine node. More... | |
void | ConstructBasis (CosineNodeQueue &treeQueue) |
Constructs the final basis matrix, after the cosine tree construction. More... | |
void | CosineNodeSplit () |
This function splits the cosine node into two children based on the cosines of the columns contained in the node, with respect to the sampled splitting point. More... | |
double | FrobNormSquared () const |
Get the Frobenius norm squared of columns in the node. More... | |
const arma::mat & | GetDataset () const |
Get a reference to the dataset matrix. More... | |
void | GetFinalBasis (arma::mat &finalBasis) |
Returns the basis of the constructed subspace. More... | |
void | L2Error (const double error) |
Set the Monte Carlo error. More... | |
double | L2Error () const |
Get the Monte Carlo error. More... | |
CosineTree * | Left () |
Get pointer to the left child of the node. More... | |
void | ModifiedGramSchmidt (CosineNodeQueue &treeQueue, arma::vec &centroid, arma::vec &newBasisVector, arma::vec *addBasisVector=NULL) |
Calculates the orthonormalization of the passed centroid, with respect to the current vector subspace. More... | |
double | MonteCarloError (CosineTree *node, CosineNodeQueue &treeQueue, arma::vec *addBasisVector1=NULL, arma::vec *addBasisVector2=NULL) |
Estimates the squared error of the projection of the input node's matrix onto the current vector subspace. More... | |
size_t | NumColumns () const |
Get number of columns of input matrix in the node. More... | |
CosineTree * | Right () |
Get pointer to the right child of the node. More... | |
size_t | SplitPointIndex () const |
Get the column index of split point of the node. More... | |
std::vector< size_t > & | VectorIndices () |
Get the indices of columns in the node. More... | |
Private Attributes | |
arma::mat | basis |
Subspace basis of the input dataset. More... | |
arma::vec | basisVector |
Orthonormalized basis vector of the node. More... | |
arma::vec | centroid |
Centroid of columns of input matrix in the node. More... | |
const arma::mat & | dataset |
Matrix for which cosine tree is constructed. More... | |
double | delta |
Cumulative probability for Monte Carlo error lower bound. More... | |
double | epsilon |
Error tolerance fraction for calculated subspace. More... | |
double | frobNormSquared |
Frobenius norm squared of columns in the node. More... | |
std::vector< size_t > | indices |
Indices of columns of input matrix in the node. More... | |
double | l2Error |
Monte Carlo error for this node. More... | |
arma::vec | l2NormsSquared |
L2-norm squared of columns in the node. More... | |
CosineTree * | left |
Left child of the node. More... | |
size_t | numColumns |
Number of columns of input matrix in the node. More... | |
CosineTree * | parent |
Parent of the node. More... | |
CosineTree * | right |
Right child of the node. More... | |
size_t | splitPointIndex |
Index of split point of cosine node. More... | |
Definition at line 32 of file cosine_tree.hpp.
mlpack::tree::CosineTree::CosineTree | ( | const arma::mat & | dataset | ) |
CosineTree constructor for the root node of the tree.
It initializes the variables needed to split the node and to build the tree further. It takes a reference to the input matrix and calculates the relevant variables from it.
dataset | Matrix for which cosine tree is constructed. |
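This page does not list the "relevant variables", but the private attributes l2NormsSquared and frobNormSquared suggest that per-column squared norms and their sum are among them. A minimal sketch of those two quantities, assuming a hypothetical column-major `Columns` type in place of arma::mat (the helper names are illustrative, not part of mlpack):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: columns[j][i] is row i of column j.
using Columns = std::vector<std::vector<double>>;

// Squared L2 norm of every column.
std::vector<double> SquaredColumnNorms(const Columns& columns)
{
  std::vector<double> norms;
  norms.reserve(columns.size());
  for (const std::vector<double>& column : columns)
  {
    double sum = 0.0;
    for (double x : column)
      sum += x * x;
    norms.push_back(sum);
  }
  return norms;
}

// The squared Frobenius norm equals the sum of the squared column norms.
double FrobeniusNormSquared(const std::vector<double>& squaredNorms)
{
  double total = 0.0;
  for (double n : squaredNorms)
    total += n;
  return total;
}
```

For the columns (3, 4) and (1, 0) the squared norms are 25 and 1, so the squared Frobenius norm is 26.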
mlpack::tree::CosineTree::CosineTree | ( | CosineTree & | parentNode, |
const std::vector< size_t > & | subIndices | ||
) |
CosineTree constructor for nodes other than the root node of the tree.
It takes a reference to the parent node and a list of column indices specifying the columns to include in the node. The function calculates the relevant variables just like the constructor above.
parentNode | Reference to the parent cosine node. |
subIndices | Vector of column indices to be included. |
mlpack::tree::CosineTree::CosineTree | ( | const arma::mat & | dataset, |
const double | epsilon, | ||
const double | delta | ||
) |
Construct the CosineTree and the basis for the given matrix, and passed 'epsilon' and 'delta' parameters.
The CosineTree is constructed by splitting nodes in the direction of maximum error, with the nodes stored in a priority queue. Basis vectors are added from the left and right children of the split node. The basis vector from a node is the orthonormalized centroid of its columns. Splitting continues until the Monte Carlo estimate of the error of projecting the input matrix onto the obtained subspace is less than a fraction (epsilon) of the norm of the input matrix.
dataset | Matrix for which the CosineTree is constructed. |
epsilon | Error tolerance fraction for calculated subspace. |
delta | Cumulative probability for Monte Carlo error lower bound. |
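The "direction of maximum error" can be served by a max-heap keyed on each node's error. A minimal sketch, assuming a hypothetical `Node` struct and comparator (this page references CompareCosineNode and CosineNodeQueue but does not show their definitions, so the names below are illustrative only):

```cpp
#include <queue>
#include <vector>

// Hypothetical stand-in for a cosine node; only the error matters here.
struct Node
{
  int id;
  double l2Error;
};

// Orders nodes so that the one with the largest error is on top.
struct CompareByError
{
  bool operator()(const Node* a, const Node* b) const
  {
    return a->l2Error < b->l2Error;
  }
};

using NodeQueue =
    std::priority_queue<Node*, std::vector<Node*>, CompareByError>;

// Pops and returns the id of the node that would be split next.
// The queue must not be empty.
int NextToSplit(NodeQueue& queue)
{
  Node* top = queue.top();
  queue.pop();
  return top->id;
}
```

Storing pointers keeps the queue cheap to reorder; the nodes themselves stay owned by the tree.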
mlpack::tree::CosineTree::~CosineTree | ( | ) |
Destroy the cosine tree and all of its children (take care of the memory allocations too).
void mlpack::tree::CosineTree::BasisVector | ( | arma::vec & | bVector | ) | inline |
Set the basis vector of the node.
Definition at line 192 of file cosine_tree.hpp.
References basisVector.
arma::vec & mlpack::tree::CosineTree::BasisVector | ( | ) | inline |
Get the basis vector of the node.
Definition at line 195 of file cosine_tree.hpp.
References basisVector.
size_t mlpack::tree::CosineTree::BinarySearch | ( | arma::vec & | cDistribution, |
double | value, | ||
size_t | start, | ||
size_t | end | ||
) |
Sample a column based on the cumulative Length-Squared distribution of the cosine node, and a randomly generated value in the range [0, 1].
Binary search is more efficient than a linear search. This gives a significant speedup when there is a large number of columns to choose from and when many samples are to be drawn from the distribution.
cDistribution | Cumulative LS distribution of columns in the node. |
value | Randomly generated value in the range [0, 1]. |
start | Starting index of the distribution interval to search in. |
end | Ending index of the distribution interval to search in. |
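A minimal sketch of a binary search over a cumulative distribution, assuming `cDistribution` is non-decreasing and that the wanted column is the first index whose cumulative value is at least `value`. The function name and the exact boundary convention are assumptions for illustration; this is not mlpack's implementation.

```cpp
#include <cstddef>
#include <vector>

// Returns the smallest index i in [start, end] with cDistribution[i] >= value.
// Assumes cDistribution is sorted ascending and cDistribution[end] >= value.
std::size_t SearchCumulative(const std::vector<double>& cDistribution,
                             double value,
                             std::size_t start,
                             std::size_t end)
{
  while (start < end)
  {
    const std::size_t mid = start + (end - start) / 2;
    if (cDistribution[mid] < value)
      start = mid + 1;  // The answer lies strictly to the right of mid.
    else
      end = mid;        // mid itself may be the answer.
  }
  return start;
}
```

Each iteration halves the interval, so a draw costs O(log n) comparisons instead of the O(n) of a linear scan.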
void mlpack::tree::CosineTree::CalculateCentroid | ( | ) |
Calculate centroid of the columns present in the node.
The calculated centroid is used as a basis vector for the cosine tree being constructed.
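The centroid is the element-wise mean of the node's columns. A minimal sketch, assuming a hypothetical column-major `Columns` type in place of arma::mat and an index list like the one the node stores (the function name is illustrative):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: columns[j][i] is row i of column j.
using Columns = std::vector<std::vector<double>>;

// Element-wise mean of the columns selected by 'indices'.
// Returns an empty vector when no columns are selected.
std::vector<double> CentroidOf(const Columns& columns,
                               const std::vector<std::size_t>& indices)
{
  if (indices.empty())
    return {};

  std::vector<double> centroid(columns[indices.front()].size(), 0.0);
  for (std::size_t j : indices)
    for (std::size_t i = 0; i < centroid.size(); ++i)
      centroid[i] += columns[j][i];

  for (double& x : centroid)
    x /= static_cast<double>(indices.size());
  return centroid;
}
```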
void mlpack::tree::CosineTree::CalculateCosines | ( | arma::vec & | cosines | ) |
Calculate cosines of the columns present in the node, with respect to the sampled splitting point.
The calculated cosine values are useful for splitting the node into its children.
cosines | Vector to store the cosine values in. |
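The cosine between a column and the splitting point is their dot product divided by the product of their norms. A minimal sketch under the same hypothetical `Columns` type; treating a zero-norm column as having cosine 0 is an assumption made here, not something this page specifies.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: columns[j][i] is row i of column j.
using Columns = std::vector<std::vector<double>>;

double Dot(const std::vector<double>& a, const std::vector<double>& b)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += a[i] * b[i];
  return sum;
}

// Cosine of the angle between every column and the column at 'splitIndex'.
std::vector<double> CosinesTo(const Columns& columns, std::size_t splitIndex)
{
  const std::vector<double>& split = columns[splitIndex];
  const double splitNorm = std::sqrt(Dot(split, split));

  std::vector<double> cosines;
  cosines.reserve(columns.size());
  for (const std::vector<double>& column : columns)
  {
    const double denom = splitNorm * std::sqrt(Dot(column, column));
    // Assumption: a zero-norm column gets cosine 0.
    cosines.push_back(denom > 0.0 ? Dot(column, split) / denom : 0.0);
  }
  return cosines;
}
```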
arma::vec & mlpack::tree::CosineTree::Centroid | ( | ) | inline |
Get a reference to the centroid vector.
Definition at line 189 of file cosine_tree.hpp.
References centroid.
size_t mlpack::tree::CosineTree::ColumnSampleLS | ( | ) |
Sample a point from the Length-Squared distribution of the cosine node.
The function uses 'l2NormsSquared' to calculate the cumulative probability distribution of the column vectors. The sampling is based on a randomly generated value in the range [0, 1].
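In length-squared sampling, column j is picked with probability proportional to its squared norm. A minimal sketch of building the cumulative distribution and drawing from it, assuming a non-empty input with a positive total; the function names are illustrative, and std::lower_bound stands in for the class's own BinarySearch.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// cumulative[i] = (sum of squaredNorms[0..i]) / (sum of all squaredNorms).
// Assumes squaredNorms is non-empty and its total is positive.
std::vector<double> CumulativeLS(const std::vector<double>& squaredNorms)
{
  double total = 0.0;
  for (double n : squaredNorms)
    total += n;

  std::vector<double> cumulative;
  cumulative.reserve(squaredNorms.size());
  double running = 0.0;
  for (double n : squaredNorms)
  {
    running += n;
    cumulative.push_back(running / total);
  }
  return cumulative;
}

// Maps a value u in [0, 1] to the first index with cumulative[i] >= u.
std::size_t PickColumn(const std::vector<double>& cumulative, double u)
{
  auto it = std::lower_bound(cumulative.begin(), cumulative.end(), u);
  if (it == cumulative.end())
    --it;  // Guard against rounding pushing u past the last entry.
  return static_cast<std::size_t>(it - cumulative.begin());
}

// Draws one column index using a caller-supplied random engine.
std::size_t SampleColumnLS(const std::vector<double>& cumulative,
                           std::mt19937& rng)
{
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  return PickColumn(cumulative, uniform(rng));
}
```

With squared norms 1 and 3, the cumulative distribution is (0.25, 1.0), so the second column is drawn about three times as often as the first.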
void mlpack::tree::CosineTree::ColumnSamplesLS | ( | std::vector< size_t > & | sampledIndices, |
arma::vec & | probabilities, | ||
size_t | numSamples | ||
) |
Sample 'numSamples' points from the Length-Squared distribution of the cosine node.
The function uses 'l2NormsSquared' to calculate the cumulative probability distribution of the column vectors. The sampling is based on randomly generated values in the range [0, 1].
void mlpack::tree::CosineTree::ConstructBasis | ( | CosineNodeQueue & | treeQueue | ) |
Constructs the final basis matrix, after the cosine tree construction.
treeQueue | Priority queue of cosine nodes. |
void mlpack::tree::CosineTree::CosineNodeSplit | ( | ) |
This function splits the cosine node into two children based on the cosines of the columns contained in the node, with respect to the sampled splitting point.
The function also calls the CosineTree constructor for the children.
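This page does not state the exact splitting criterion. One plausible rule, shown here purely as an assumption, assigns each column to whichever extreme cosine it is closer to: columns nearer the maximum cosine go to one child, the rest to the other. The names below are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using IndexList = std::vector<std::size_t>;

// Partitions positions 0..n-1 by closeness to the largest or smallest cosine.
// first  = positions at least as close to the maximum as to the minimum,
// second = all remaining positions.
std::pair<IndexList, IndexList> SplitByCosine(
    const std::vector<double>& cosines)
{
  std::pair<IndexList, IndexList> children;
  if (cosines.empty())
    return children;

  const auto range = std::minmax_element(cosines.begin(), cosines.end());
  const double cosineMin = *range.first;
  const double cosineMax = *range.second;

  for (std::size_t i = 0; i < cosines.size(); ++i)
  {
    if (cosineMax - cosines[i] <= cosines[i] - cosineMin)
      children.first.push_back(i);
    else
      children.second.push_back(i);
  }
  return children;
}
```

Under this rule, a node whose cosines are all equal puts every column in the first child, so a caller would need to treat an empty second child as "cannot split".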
double mlpack::tree::CosineTree::FrobNormSquared | ( | ) const | inline |
Get the Frobenius norm squared of columns in the node.
Definition at line 207 of file cosine_tree.hpp.
References frobNormSquared.
const arma::mat & mlpack::tree::CosineTree::GetDataset | ( | ) const | inline |
Get a reference to the dataset matrix.
Definition at line 177 of file cosine_tree.hpp.
References dataset.
void mlpack::tree::CosineTree::GetFinalBasis | ( | arma::mat & | finalBasis | ) | inline |
Returns the basis of the constructed subspace.
Definition at line 174 of file cosine_tree.hpp.
References basis.
void mlpack::tree::CosineTree::L2Error | ( | const double | error | ) | inline |
Set the Monte Carlo error.
Definition at line 183 of file cosine_tree.hpp.
References l2Error.
Referenced by mlpack::tree::CompareCosineNode::operator()().
double mlpack::tree::CosineTree::L2Error | ( | ) const | inline |
Get the Monte Carlo error.
CosineTree * mlpack::tree::CosineTree::Left | ( | ) | inline |
Get pointer to the left child of the node.
Definition at line 198 of file cosine_tree.hpp.
References left.
void mlpack::tree::CosineTree::ModifiedGramSchmidt | ( | CosineNodeQueue & | treeQueue, |
arma::vec & | centroid, | ||
arma::vec & | newBasisVector, | ||
arma::vec * | addBasisVector = NULL | ||
) |
Calculates the orthonormalization of the passed centroid, with respect to the current vector subspace.
treeQueue | Priority queue of cosine nodes. |
centroid | Centroid of the node being added to the basis. |
newBasisVector | Orthonormalized centroid of the node. |
addBasisVector | Address to additional basis vector. |
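Modified Gram-Schmidt removes the component along each existing basis vector one at a time, updating the working vector after every projection, and then normalizes what is left. A minimal sketch, assuming the current subspace is given as a plain list of orthonormal vectors rather than read from the priority queue (the function names are illustrative):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

double Dot(const std::vector<double>& a, const std::vector<double>& b)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += a[i] * b[i];
  return sum;
}

// Orthonormalizes 'v' against an orthonormal 'basis'.
// Returns the zero vector if 'v' already lies in the span of the basis.
std::vector<double> Orthonormalize(
    const std::vector<std::vector<double>>& basis,
    std::vector<double> v)
{
  for (const std::vector<double>& b : basis)
  {
    // Project the current (already reduced) vector, not the original one.
    const double projection = Dot(b, v);
    for (std::size_t i = 0; i < v.size(); ++i)
      v[i] -= projection * b[i];
  }

  const double norm = std::sqrt(Dot(v, v));
  if (norm > 0.0)
    for (double& x : v)
      x /= norm;
  return v;
}
```

Projecting the running remainder rather than the original vector is what distinguishes the modified variant from classical Gram-Schmidt and makes it more stable in floating point.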
double mlpack::tree::CosineTree::MonteCarloError | ( | CosineTree * | node, |
CosineNodeQueue & | treeQueue, | ||
arma::vec * | addBasisVector1 = NULL, | ||
arma::vec * | addBasisVector2 = NULL | ||
) |
Estimates the squared error of the projection of the input node's matrix onto the current vector subspace.
A normal distribution is fit using weighted norms of projections of samples drawn from the input node's matrix columns. The error is calculated as the difference between the Frobenius norm of the input node's matrix and the lower bound of the normal distribution.
node | Node for which Monte Carlo estimate is calculated. |
treeQueue | Priority queue of cosine nodes. |
addBasisVector1 | Address to first additional basis vector. |
addBasisVector2 | Address to second additional basis vector. |
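The final step of the estimate can be sketched as follows. Everything here is an assumption for illustration: the weighted projection magnitudes are taken as given input, the "lower bound" is read as the `delta` quantile of the fitted normal distribution, and the squared Frobenius norm is the reference value. The quantile is found by bisection on the normal CDF using only the standard library; none of these function names come from mlpack.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// CDF of a normal distribution with mean mu and standard deviation sigma.
double NormalCdf(double x, double mu, double sigma)
{
  return 0.5 * std::erfc(-(x - mu) / (sigma * std::sqrt(2.0)));
}

// Quantile by bisection; searches within ten standard deviations of the mean.
double NormalQuantile(double p, double mu, double sigma)
{
  double lo = mu - 10.0 * sigma;
  double hi = mu + 10.0 * sigma;
  for (int i = 0; i < 200; ++i)
  {
    const double mid = 0.5 * (lo + hi);
    if (NormalCdf(mid, mu, sigma) < p)
      lo = mid;
    else
      hi = mid;
  }
  return 0.5 * (lo + hi);
}

// Upper bound on the error: reference norm minus the delta-quantile of a
// normal distribution fitted to the sampled magnitudes.
double ErrorUpperBound(const std::vector<double>& magnitudes,
                       double frobNormSquared,
                       double delta)
{
  const std::size_t n = magnitudes.size();
  if (n == 0)
    return frobNormSquared;

  double mu = 0.0;
  for (double m : magnitudes)
    mu += m;
  mu /= static_cast<double>(n);

  double variance = 0.0;
  for (double m : magnitudes)
    variance += (m - mu) * (m - mu);
  const double sigma =
      n > 1 ? std::sqrt(variance / static_cast<double>(n - 1)) : 0.0;

  // With no spread there is nothing to fit; use the mean directly.
  if (sigma == 0.0)
    return frobNormSquared - mu;
  return frobNormSquared - NormalQuantile(delta, mu, sigma);
}
```

A smaller `delta` pushes the quantile further below the mean, which makes the reported error bound more conservative.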
size_t mlpack::tree::CosineTree::NumColumns | ( | ) const | inline |
Get number of columns of input matrix in the node.
Definition at line 204 of file cosine_tree.hpp.
References numColumns.
CosineTree * mlpack::tree::CosineTree::Right | ( | ) | inline |
Get pointer to the right child of the node.
Definition at line 201 of file cosine_tree.hpp.
References right.
size_t mlpack::tree::CosineTree::SplitPointIndex | ( | ) const | inline |
Get the column index of split point of the node.
Definition at line 210 of file cosine_tree.hpp.
References indices, and splitPointIndex.
std::vector< size_t > & mlpack::tree::CosineTree::VectorIndices | ( | ) | inline |
Get the indices of columns in the node.
Definition at line 180 of file cosine_tree.hpp.
References indices.
arma::mat mlpack::tree::CosineTree::basis | private |
Subspace basis of the input dataset.
Definition at line 220 of file cosine_tree.hpp.
Referenced by GetFinalBasis().
arma::vec mlpack::tree::CosineTree::basisVector | private |
Orthonormalized basis vector of the node.
Definition at line 234 of file cosine_tree.hpp.
Referenced by BasisVector().
arma::vec mlpack::tree::CosineTree::centroid | private |
Centroid of columns of input matrix in the node.
Definition at line 232 of file cosine_tree.hpp.
Referenced by Centroid().
const arma::mat & mlpack::tree::CosineTree::dataset | private |
Matrix for which cosine tree is constructed.
Definition at line 214 of file cosine_tree.hpp.
Referenced by GetDataset().
double mlpack::tree::CosineTree::delta | private |
Cumulative probability for Monte Carlo error lower bound.
Definition at line 218 of file cosine_tree.hpp.
double mlpack::tree::CosineTree::epsilon | private |
Error tolerance fraction for calculated subspace.
Definition at line 216 of file cosine_tree.hpp.
double mlpack::tree::CosineTree::frobNormSquared | private |
Frobenius norm squared of columns in the node.
Definition at line 242 of file cosine_tree.hpp.
Referenced by FrobNormSquared().
std::vector< size_t > mlpack::tree::CosineTree::indices | private |
Indices of columns of input matrix in the node.
Definition at line 228 of file cosine_tree.hpp.
Referenced by SplitPointIndex(), and VectorIndices().
double mlpack::tree::CosineTree::l2Error | private |
Monte Carlo error for this node.
Definition at line 240 of file cosine_tree.hpp.
Referenced by L2Error().
arma::vec mlpack::tree::CosineTree::l2NormsSquared | private |
L2-norm squared of columns in the node.
Definition at line 230 of file cosine_tree.hpp.
CosineTree * mlpack::tree::CosineTree::left | private |
Left child of the node.
size_t mlpack::tree::CosineTree::numColumns | private |
Number of columns of input matrix in the node.
Definition at line 238 of file cosine_tree.hpp.
Referenced by NumColumns().
CosineTree * mlpack::tree::CosineTree::parent | private |
Parent of the node.
Definition at line 222 of file cosine_tree.hpp.
CosineTree * mlpack::tree::CosineTree::right | private |
Right child of the node.
size_t mlpack::tree::CosineTree::splitPointIndex | private |
Index of split point of cosine node.
Definition at line 236 of file cosine_tree.hpp.
Referenced by SplitPointIndex().