This project aims to provide researchers and developers with basic tools for manipulating datasets and for implementing and testing ML algorithms,
along with some already implemented methods.
It is not intended to be just a collection of algorithms, but also to support and standardize future ML algorithm implementations
through a set of interconnected modules that can be used in most ML projects.
The documentation is available at the project page: UFJF-MLTK.
For examples and other information, see the wiki.
To make the project cross-platform and available to most users, it supports both CMake and Meson, two of the most widely used build systems. There are therefore two installation methods, shown below.
Requirements
- meson or cmake
- g++ >= 8
- C++17 or later
- gnuplot >= 5 (only for visualization module)
CMake
mkdir build
cd build
cmake ..
make
sudo make install
Meson
meson setup build
meson compile -C build
meson install -C build
After that, the library will be available system-wide and can be used like any other library.
The framework is intended to make machine learning algorithms easier to use in C++. The following example prints the 10-fold cross-validation accuracy of the kNN algorithm with 3 neighbors, which takes only a few lines of code.
main.cpp
#include <iostream>
#include <ufjfmltk/Core.hpp>
#include <ufjfmltk/Validation.hpp>
#include <ufjfmltk/Classifier.hpp>

int main(){
    // Load the dataset from file
    mltk::Data<double> data("iris.data");
    // kNN classifier with 3 neighbors
    mltk::classifier::KNNClassifier<double> knn(data, 3);

    std::cout << "Dataset size: " << data.size() << std::endl;
    std::cout << "Dataset dimension: " << data.dim() << std::endl;
    std::cout << "KNN accuracy: ";
    // 10-fold cross-validation accuracy
    std::cout << mltk::validation::kfold(data, knn, 10, 42, 0).accuracy
              << "%" << std::endl;
}
Compiling:
g++ -std=c++17 main.cpp -o main -lufjfmltk
This program outputs the following:
Dataset size: 150
Dataset dimension: 4
KNN accuracy: 100%
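For readers who want to see what the example above measures, the following is a minimal, library-independent sketch of k-fold cross-validation around a k-nearest-neighbors classifier. It is not the UFJF-MLTK implementation: the names `Sample`, `sq_dist`, `knn_predict` and `kfold_accuracy` are hypothetical, and details such as the squared Euclidean distance, round-robin fold assignment after a seeded shuffle, and breaking vote ties in favor of the smallest label are assumptions made only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical sample type for this sketch; it is not mltk::Data.
struct Sample {
    std::vector<double> x;  // feature vector
    int label;              // class label
};

// Squared Euclidean distance between two feature vectors of equal length.
double sq_dist(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += (a[i] - b[i]) * (a[i] - b[i]);
    return s;
}

// Majority vote among the k training samples nearest to `query`.
// Ties go to the smallest label (an assumption of this sketch).
int knn_predict(const std::vector<Sample>& train,
                const std::vector<double>& query, std::size_t k) {
    assert(k > 0 && !train.empty());
    std::vector<std::pair<double, int>> d;  // (distance, label)
    for (const Sample& s : train) d.emplace_back(sq_dist(s.x, query), s.label);
    k = std::min(k, d.size());
    std::partial_sort(d.begin(), d.begin() + k, d.end());
    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[d[i].second];
    int best = d[0].second;
    int best_votes = 0;
    for (const auto& [label, n] : votes)
        if (n > best_votes) { best = label; best_votes = n; }
    return best;
}

// Accuracy (in percent) of k-NN under `folds`-fold cross-validation.
// Every sample is used for testing exactly once.
double kfold_accuracy(const std::vector<Sample>& data, std::size_t k,
                      std::size_t folds, unsigned seed) {
    assert(folds >= 2 && folds <= data.size());
    std::vector<std::size_t> idx(data.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);

    std::size_t correct = 0;
    for (std::size_t f = 0; f < folds; ++f) {
        std::vector<Sample> train;
        std::vector<Sample> test;
        // Shuffled position i belongs to fold i % folds.
        for (std::size_t i = 0; i < idx.size(); ++i)
            (i % folds == f ? test : train).push_back(data[idx[i]]);
        for (const Sample& s : test)
            if (knn_predict(train, s.x, k) == s.label) ++correct;
    }
    return 100.0 * static_cast<double>(correct) / static_cast<double>(data.size());
}
```

With samples loaded into a `std::vector<Sample>`, a call such as `kfold_accuracy(samples, 3, 10, 42)` would correspond roughly to the 3-neighbor, 10-fold setup above, with 42 used here as the shuffle seed. The library's own fold construction and tie-breaking may differ, so the numbers are not expected to match exactly.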
- Data manipulation
- Artificial datasets
- Data visualization
- Classifiers (Primal and Dual)
- Ensemble
- Regression
- Validation (K-Fold Cross-Validation)
- Feature Selection
- Documentation
Mateus Coutinho Marim ([email protected])
Saulo Moraes Villela ([email protected])
Alessandreia Marta de Oliveira Julio ([email protected])
Universidade Federal de Juiz de Fora
Departamento de Ciência da Computação