CMLL : C++'s Machine Learning Library
A machine learning library built from scratch using only standard C++. CMLL relies on the Standard Template Library and targets the C++17 standard. The numerical routines are also written from scratch, and file handling and data manipulation tools are included, so the library has no external dependencies.
The precompiled binary for x86 and x64 systems can be found here.
The includes are available here.
The source can be found here. Alternatively, the repository can be cloned to load the Visual Studio project for building and debugging.
For function-level changes, refer to the source.
Current (version 0.1.0) :
- Added new algorithms
  - Gaussian Naive Bayes Classifier
  - Multinomial Naive Bayes Classifier
  - Bernoulli Naive Bayes Classifier
- Ridge Classifier now supports multi-class classification using the one-vs-all technique
- Added utility checks on matrices for error handling
- Faster algorithms: objects are passed by reference instead of relying on return value optimization
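The one-vs-all technique mentioned above trains one binary model per class and predicts the class whose model gives the highest score. The following is a minimal sketch of the two bookkeeping steps in plain C++, not CMLL's implementation: `oneVsAllLabels` and `predictClass` are hypothetical names, and the per-class scores are assumed to come from whatever binary model is used. The output vector is passed by reference, in the same style as the calls in the examples further down.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CMLL): relabel a multi-class target as a
// binary one, 1.0 for `positiveClass` and 0.0 for every other class.
// The result is written into `binary`, which is passed by reference.
void oneVsAllLabels(const std::vector<int>& y, int positiveClass,
                    std::vector<double>& binary)
{
    binary.assign(y.size(), 0.0);
    for (std::size_t i = 0; i < y.size(); ++i)
        if (y[i] == positiveClass)
            binary[i] = 1.0;
}

// Hypothetical helper: given one score per class for a single sample,
// return the index of the class with the highest score.
std::size_t predictClass(const std::vector<double>& scores)
{
    assert(!scores.empty());
    std::size_t best = 0;
    for (std::size_t k = 1; k < scores.size(); ++k)
        if (scores[k] > scores[best])
            best = k;
    return best;
}
```

For a problem with K classes, `oneVsAllLabels` would be called K times, once per class, to build the targets for K binary models; `predictClass` then combines their scores for each sample.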
Version 0.0.3
- Added new algorithms
  - Ridge Regression
  - KMeans clustering
- Logistic Regression now converges faster using the Newton-Raphson method
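To illustrate why the Newton-Raphson method converges quickly for logistic regression, here is a minimal sketch for a single feature plus an intercept. It is not CMLL's code: `LogisticParams` and `fitLogisticNewton` are hypothetical names, and the 2x2 Hessian is inverted by hand to keep the example free of any matrix library. Each iteration applies the update w = w - H^{-1} g, where g is the gradient and H the Hessian of the negative log-likelihood.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type: parameters of p(y = 1 | x) = sigmoid(intercept + slope * x).
struct LogisticParams { double intercept; double slope; };

// Hypothetical helper (not part of CMLL): fit a one-feature logistic
// regression with Newton-Raphson updates. Assumes the classes are not
// perfectly separable, otherwise the parameters grow without bound.
LogisticParams fitLogisticNewton(const std::vector<double>& x,
                                 const std::vector<double>& y,
                                 int iterations = 25)
{
    assert(x.size() == y.size() && !x.empty());
    double b0 = 0.0, b1 = 0.0;
    for (int it = 0; it < iterations; ++it)
    {
        double g0 = 0.0, g1 = 0.0;              // gradient
        double h00 = 0.0, h01 = 0.0, h11 = 0.0; // symmetric 2x2 Hessian
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            const double p = 1.0 / (1.0 + std::exp(-(b0 + b1 * x[i])));
            const double r = p - y[i];       // residual
            const double w = p * (1.0 - p);  // curvature weight
            g0 += r;
            g1 += r * x[i];
            h00 += w;
            h01 += w * x[i];
            h11 += w * x[i] * x[i];
        }
        const double det = h00 * h11 - h01 * h01;
        if (std::fabs(det) < 1e-12)
            break; // Hessian is numerically singular; stop updating
        // Solve H * d = g for the Newton step d using the 2x2 inverse.
        const double d0 = (h11 * g0 - h01 * g1) / det;
        const double d1 = (h00 * g1 - h01 * g0) / det;
        b0 -= d0;
        b1 -= d1;
        if (std::fabs(d0) + std::fabs(d1) < 1e-10)
            break; // step is negligible; treat as converged
    }
    return {b0, b1};
}
```

Because each step uses curvature information from the Hessian, the number of iterations needed near the optimum is typically much smaller than with fixed-step gradient descent, at the cost of solving a linear system per iteration.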
Version 0.0.2
- File Handler can now read files with mixed column types such as strings and numbers. Strings are automatically label-encoded
- Robust exception handling
- Added new algorithms
  - K Nearest Neighbors Regressor
  - K Nearest Neighbors Classifier
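Label encoding, as mentioned for the File Handler above, replaces each distinct string with a number. The sketch below shows one common way to do it and is only an assumption about the general idea: `labelEncode` is a hypothetical name, and the choice to number labels in order of first appearance may differ from what CMLL actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not part of CMLL): map each distinct string in
// `column` to a numeric code, assigned in order of first appearance
// (0, 1, 2, ...). The codes are written into `encoded` by reference.
void labelEncode(const std::vector<std::string>& column,
                 std::vector<double>& encoded)
{
    std::unordered_map<std::string, double> codes;
    encoded.clear();
    encoded.reserve(column.size());
    for (const std::string& value : column)
    {
        // emplace only inserts when the label is new; otherwise it returns
        // an iterator to the existing entry and its previously assigned code.
        const auto result = codes.emplace(value, static_cast<double>(codes.size()));
        encoded.push_back(result.first->second);
    }
    assert(encoded.size() == column.size());
}
```

With the input `{"red", "blue", "red", "green"}`, this sketch produces the codes 0, 1, 0, 2.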
/*
 * Creating a Ridge Classifier model in CMLL
 */
#include <iostream>
#include <vector>
#include <Linear/Linear.h> // for the ridge classifier
int main()
{
    // Sample dataset: five samples with four features each
    std::vector<std::vector<double>> X = { {10, 12, 23, 123},
                                           {13, 15, 43, 223},
                                           {2, 12, 72, 321},
                                           {1, 2, 13, 402},
                                           {110, 112, 8, 553} };
    // One binary label per sample
    std::vector<std::vector<double>> y = { {0}, {1}, {0}, {1}, {0} };
    // Building the model
    cmll::linear::RidgeClassifier clf;
    clf.model(X, y);
    // Predicting; yPred is preallocated to store the predicted values
    std::vector<std::vector<double>> yPred(X.size(), std::vector<double>(1));
    clf.predict(X, yPred);
    // Evaluating
    std::cout << clf.score(y, yPred) << '\n';
    return 0;
}
/*
 * Data loading and preprocessing
 */
#include <iostream>
#include <stdexcept> // for std::length_error
#include <vector>
#include <Data/handler.h> // for file reading
#include <utils/Preprocessing.h> // for preprocessing
#include <utils/Utils.h> // for utility checks
int main()
{
    // Reading the CSV files
    cmll::data::Handler Features, Labels;
    cmll::data::read(Features, "Salary_Features.csv");
    cmll::data::read(Labels, "Salary_Labels.csv");
    // Matrices to store their values
    std::vector<std::vector<double>> X, y;
    Features.values(X);
    Labels.values(y);
    // Clearing Features and Labels as they are no longer required
    Features.clear();
    Labels.clear();
    // Using utility checks to make sure the data is safe to use
    // Checking whether the matrices contain NaN values
    if (cmll::utils::check::hasNaN(X) || cmll::utils::check::hasNaN(y))
    {
        std::cerr << "Dataset has NaN values!\n";
        return 1;
    }
    // Checking that X and y have compatible shapes
    try
    {
        cmll::utils::check::Xy(X, y); // throws std::length_error if the lengths do not match
    }
    catch (const std::length_error& e)
    {
        std::cerr << e.what() << '\n';
        return 1;
    }
    // Splitting X and y into train and test sets in an 80% / 20% ratio
    std::vector<std::vector<double>> XTrain, XTest, yTrain, yTest;
    cmll::preprocessing::split(X, XTrain, XTest, 0.8);
    cmll::preprocessing::split(y, yTrain, yTest, 0.8);
    return 0;
}
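To make the splitting step above concrete, here is a minimal sketch of a row-wise split that takes its arguments in the same order as the `split` call in the example. It is a stand-in written for illustration, not CMLL's implementation: `Matrix` and `splitRows` are hypothetical names, and the sketch assumes no shuffling, which may not match the library's behaviour.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical stand-in (not part of CMLL): copy the first `ratio` fraction
// of rows into `train` and the remaining rows into `test`, without shuffling.
// Both outputs are passed by reference and overwritten.
void splitRows(const Matrix& data, Matrix& train, Matrix& test, double ratio)
{
    assert(ratio >= 0.0 && ratio <= 1.0);
    // Round to the nearest row count to avoid floating-point truncation.
    const auto cut = static_cast<std::ptrdiff_t>(
        std::lround(static_cast<double>(data.size()) * ratio));
    train.assign(data.begin(), data.begin() + cut);
    test.assign(data.begin() + cut, data.end());
}
```

Because features and labels are split in two separate calls, a deterministic split like this one keeps row i of the training features paired with row i of the training labels. A shuffling split would need to apply the same permutation to both.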