This software was made to study and test several machine learning algorithms for data mining tasks.
The dataset used is the SMS Spam Collection Data Set.
Some of the algorithms provided by WEKA were used for the pre-processing, classification and evaluation of the data.
The ARFFBuilder class converts the original SMS Spam Collection Data Set into an ARFF file, the format used by WEKA. Both files are provided.
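The conversion can be pictured with a small, dependency-free sketch. It assumes the raw dataset stores one message per line as a label (ham or spam), a tab, and the message text, and it writes a two-attribute ARFF layout. The class name ArffSketch, the relation name, and the attribute names and order are illustrative assumptions, not necessarily what ARFFBuilder produces.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the project's ARFFBuilder.
class ArffSketch {
    // Escape backslashes and single quotes so the text can sit inside
    // a single-quoted ARFF string value.
    static String quote(String text) {
        return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Convert "label<TAB>message" lines into ARFF text.
    static String toArff(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation sms_spam\n\n");          // assumed relation name
        sb.append("@attribute text string\n");        // assumed attribute names
        sb.append("@attribute class {ham,spam}\n\n");
        sb.append("@data\n");
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip lines without a label/message separator
            }
            String label = line.substring(0, tab);
            if (!label.equals("ham") && !label.equals("spam")) {
                continue; // skip labels outside the declared nominal values
            }
            String message = line.substring(tab + 1);
            sb.append(quote(message)).append(',').append(label).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>();
        sample.add("ham\tI'll call you later");
        sample.add("spam\tWIN a free prize now");
        System.out.print(toArff(sample));
    }
}
```

Declaring the message as a string attribute keeps the raw text intact, so any tokenization can be left to a later filter step.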
The SpamClassifier class classifies SMS text messages and handles the training and evaluation of a classifier.
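The source does not say which WEKA algorithm SpamClassifier wraps, so the following is only a dependency-free sketch of the general train, classify and evaluate cycle. It swaps in a hand-written multinomial naive Bayes with add-one smoothing as a stand-in. The class TinySpamSketch and all of its methods are hypothetical and are not part of the project or of WEKA.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: a tiny multinomial naive Bayes, not WEKA code.
class TinySpamSketch {
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();

    // Pre-processing: lower-case the text and split it into word tokens.
    static String[] tokenize(String text) {
        return text.toLowerCase().split("[^a-z0-9]+");
    }

    // Training: count documents per label and word occurrences per label.
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) continue;
            counts.merge(token, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    // Classification: pick the label with the highest log-probability,
    // using add-one (Laplace) smoothing for unseen words.
    String classify(String text) {
        int totalDocs = 0;
        for (int n : docCounts.values()) totalDocs += n;
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            double score = Math.log((double) docCounts.get(label) / totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = totalWords.getOrDefault(label, 0);
            for (String token : tokenize(text)) {
                if (token.isEmpty()) continue;
                int count = counts.getOrDefault(token, 0);
                score += Math.log((count + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    // Evaluation: fraction of "label<TAB>message" lines classified correctly.
    // Assumes every line contains a tab and the array is not empty.
    double accuracy(String[] labelledLines) {
        int correct = 0;
        for (String line : labelledLines) {
            String[] parts = line.split("\t", 2);
            if (parts[0].equals(classify(parts[1]))) correct++;
        }
        return (double) correct / labelledLines.length;
    }

    public static void main(String[] args) {
        TinySpamSketch model = new TinySpamSketch();
        model.train("ham", "are we still meeting for lunch");
        model.train("ham", "call me when you get home");
        model.train("spam", "win a free prize now");
        model.train("spam", "free entry claim your prize");
        System.out.println(model.classify("claim your free prize"));
    }
}
```

In a real experiment, accuracy should be measured on messages that were not used for training, for example through a held-out split or cross-validation.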
The PDF file contains the results of the study and an explanation of the software (in Spanish only).
Each .dat file contains a trained FilteredClassifier. When you train a classifier on the SMSSpamCollection dataset, the software saves the trained model to a .dat file.
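The source does not say how the .dat files are written. Assuming they rely on standard Java object serialization (WEKA classifiers implement java.io.Serializable), saving and restoring a model could look roughly like the sketch below. ToyModel is a hypothetical stand-in for a trained classifier, and the helper names save and load are illustrative only.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of writing a trained model to a .dat file.
class ModelFileSketch {
    // Stand-in for a trained classifier; any Serializable object works the same way.
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final double threshold;

        ToyModel(String name, double threshold) {
            this.name = name;
            this.threshold = threshold;
        }
    }

    // Write the object graph to the given file.
    static void save(Object model, Path file) throws IOException {
        try (ObjectOutputStream out =
                     new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(model);
        }
    }

    // Read the object graph back; the caller casts it to the expected type.
    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                     new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("toy-model", ".dat");
        save(new ToyModel("demo", 0.5), file);
        ToyModel restored = (ToyModel) load(file);
        System.out.println(restored.name + " " + restored.threshold);
        Files.delete(file);
    }
}
```

With WEKA on the classpath, weka.core.SerializationHelper offers write and read helpers that serve the same purpose. Serialized files should only be loaded from trusted sources, because Java deserialization can execute code from the stream's classes.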
You can download the latest release. The zip file contains the required files to run the application and some trained models.