A spam filter for detecting spam sms messages using ML

nata1y/SMS-Spam-Detection

SMS Spam Detection Using Machine Learning

This project is used as a starting point for the course Release Engineering for Machine Learning Applications (REMLA), taught at the Delft University of Technology by Prof. Luís Cruz and Prof. Sebastian Proksch.

The codebase was originally adapted from: https://github.com/rohan8594/SMS-Spam-Detection

Instructions for Compiling

a) Clone repo.

$ git clone https://gitlab.com/nata1y/SMS-Spam-Detection
$ cd SMS-Spam-Detection
$ mkdir output
$ mkdir dataset

The easiest way to run the project is to follow the docker-compose instructions in step b3.

b) Install dependencies.

$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

b2) Alternatively, use Docker for dependencies and volumes.

$ docker build --progress plain . -t docker-sms
$ docker run -it --rm -v ${PWD}:/root/project -p "8080:8080" docker-sms
$ cd project   # run this inside the container shell opened by the previous command

c) Run the training scripts.

$ python train_model/get_data.py
$ python train_model/read_data.py
$ python train_model/text_preprocessing.py
$ python train_model/text_classification.py
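
The four commands above can also be chained from a single Python script. The sketch below is not part of the repository; it assumes the scripts must run in the order listed and are started from the repository root. The helper names `build_commands` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys

# Script paths as listed above, in the order they are run.
TRAINING_SCRIPTS = [
    "train_model/get_data.py",
    "train_model/read_data.py",
    "train_model/text_preprocessing.py",
    "train_model/text_classification.py",
]


def build_commands(scripts=TRAINING_SCRIPTS):
    """Return one command per script, using the current Python interpreter."""
    return [[sys.executable, script] for script in scripts]


def run_pipeline(scripts=TRAINING_SCRIPTS):
    """Run each script in turn; check=True stops at the first failing step."""
    for command in build_commands(scripts):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repository root, with the virtual environment active, should then behave like typing the four commands by hand.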

d) Serve the model as a REST API

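NOTE: when running inside Docker, add the host="0.0.0.0" parameter to the app.run call in deploy_model/serve_model.py. The default, 127.0.0.1, accepts connections only from inside the container, so the published port cannot reach the server.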

$ python deploy_model/serve_model.py
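
To illustrate why the bind address matters, here is a minimal sketch that uses only Python's standard library. It is a stand-in, not the project's serve_model.py: the `HealthHandler` class, its fixed reply, and the `make_server` helper are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a fixed JSON body."""

    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host="0.0.0.0", port=8080):
    # "0.0.0.0" listens on every network interface, so a port published with
    # `docker run -p 8080:8080` can reach the process inside the container.
    # "127.0.0.1" would accept connections only from inside the container.
    return HTTPServer((host, port), HealthHandler)
```

Running `make_server().serve_forever()` inside the container would make the stand-in reachable from the host on port 8080; binding to 127.0.0.1 instead would not.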

b3) Or use docker-compose to train and host the model automatically.

Be aware that the regression model is trained in this step, which takes a while.

For Linux:

$ docker-compose -f docker-compose.train.yml build
$ docker-compose -f docker-compose.train.yml up -d && ./get_training_data.sh && docker-compose -f docker-compose.train.yml down

For Windows:

$ docker-compose -f docker-compose.train.yml build
$ docker-compose -f docker-compose.train.yml up -d && ./get_training_data.bat && docker-compose -f docker-compose.train.yml down

From now on, use this command to run the system without retraining everything.

$ docker-compose up --build

e) Production endpoint

The scripts in this step do three things. The first retrieves the dataset and splits it at the first 1000 labels, the portion on which the model is trained. The second generates drifts from the incoming data for experimentation. The third gets predictions from the model via HTTP requests, as in an actual deployment setup.

NOTE: to get predictions from inside another Docker container, use docker run -it --rm -v "$(pwd)":/root/project --net=host docker-sms. The server's port is already published, so the new container has to share the host's network to reach it. Alternatively, if you use docker-compose, run docker exec -it <container_id> bash to open a shell in the running container and run the deploy script from there.

$ python production_endpoint/get_data.py
$ python production_endpoint/generate_drifts.py
$ python production_endpoint/get_predictions.py
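
The split performed in the first step can be pictured as follows. This is a sketch under assumptions, not the code in production_endpoint/get_data.py: it assumes the dataset is a text file with one tab-separated label and message per line (the layout of the public SMS Spam Collection), and the function name `split_train_and_incoming` is hypothetical.

```python
def split_train_and_incoming(lines, n_train=1000):
    """Split labelled SMS rows into a training part and an 'incoming' part.

    The first n_train rows stand for the data the model was trained on;
    the remaining rows play the role of production traffic.
    """
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        label, message = line.split("\t", 1)
        rows.append((label, message))
    return rows[:n_train], rows[n_train:]


# Tiny illustrative sample; the real dataset is much larger.
sample = ["ham\tSee you at five\n", "spam\tWIN a free prize now\n", "ham\tOk\n"]
train, incoming = split_train_and_incoming(sample, n_train=2)
print(len(train), len(incoming))  # prints: 2 1
```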

You can test the API using the following:

$ curl -X POST "http://127.0.0.1:8080/predict" -H "accept: application/json" -H "Content-Type: application/json" -d '{"sms": "hello world!"}'
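
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call: a POST to /predict with a JSON body containing an "sms" field. It assumes the server replies with JSON; the shape of that reply is not documented here, so the parsed object is returned as-is. The helper names `build_predict_request` and `predict` are hypothetical.

```python
import json
import urllib.request

PREDICT_URL = "http://127.0.0.1:8080/predict"


def build_predict_request(sms, url=PREDICT_URL):
    """Build the POST request; json.dumps guarantees a valid JSON body."""
    body = json.dumps({"sms": sms}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def predict(sms, url=PREDICT_URL, timeout=10):
    """Send the message to the running server and return the parsed JSON reply."""
    request = build_predict_request(sms, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from step d) running, `print(predict("hello world!"))` shows whatever the endpoint returns for that message.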

Alternatively, you can access the UI in your browser at http://127.0.0.1:8080/apidocs.

To view Prometheus, navigate to http://127.0.0.1:9090.

To view Grafana, navigate to http://127.0.0.1:3000/. You then need to add Prometheus as a data source, setting its URL to http://prometheus:9090/, and add the relevant metrics to a new dashboard. These settings are then saved for future use.
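
To see which metrics are available before building a dashboard, you can ask Prometheus for its list of metric names through its HTTP API (the /api/v1/label/__name__/values endpoint). The sketch below assumes Prometheus is reachable at the address given above; the helper names `metric_names_url` and `list_metric_names` are hypothetical, and the metric names this project exports are not listed here.

```python
import json
import urllib.request

PROMETHEUS_URL = "http://127.0.0.1:9090"  # address given above


def metric_names_url(base_url=PROMETHEUS_URL):
    """URL of the Prometheus HTTP API endpoint that lists all metric names."""
    return base_url.rstrip("/") + "/api/v1/label/__name__/values"


def list_metric_names(base_url=PROMETHEUS_URL, timeout=10):
    """Return the metric names Prometheus currently knows about."""
    with urllib.request.urlopen(metric_names_url(base_url), timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["data"]
```

With the stack running, `print(list_metric_names())` prints the names you can then pick from in Grafana.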
