Identification Of Disaster Tweets

A natural language processing (NLP) model that classifies whether a given tweet is related to a disaster. The goal is a system that automatically distinguishes tweets discussing real disasters (earthquakes, wildfires, floods) from those that do not. After training, the model classified the test data with 82% accuracy.

Overview

  • Data Collection: The dataset consists of tweets split into a training set and a test set. The training set contains a target column indicating whether each tweet pertains to a real disaster or not. Source: https://www.kaggle.com/competitions/nlp-getting-started/data

  • Exploratory Data Analysis (EDA): Visually and statistically explore the dataset to understand its structure, patterns, relationships, and potential issues (for example, missing values, class balance, and trends).

  • Data Preprocessing: Clean and preprocess the text data. This typically involves tokenization, removing special characters, converting text to lowercase, and removing stopwords.

  • Feature Extraction: Transform the processed text into numerical features that an ML model can use. In this project, we applied the bag-of-words (BOW) and TF-IDF methods to the train and test datasets.

  • Model Selection: We evaluated three models (Logistic Regression, Naive Bayes, and SVM) on both BOW and TF-IDF features to find out which combination works better on our dataset.

  • Model Training: According to the results, BOW features worked better than TF-IDF on this dataset, so we trained the models on the vectors obtained from BOW.

  • Model Evaluation: After training, we evaluated the models on the test data. Logistic regression performed better than the other models on this dataset, with 82% accuracy.
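The steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration, not the project's actual code: the `clean_tweet` helper, the toy tweets and labels, and the default hyperparameters are all assumptions made for the example, and the accuracies it prints say nothing about the 82% reported on the Kaggle data.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def clean_tweet(text):
    """Hypothetical cleaning step: lowercase, drop URLs and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits, punctuation, symbols
    return " ".join(text.split())               # collapse repeated whitespace


# Toy data invented for this sketch (1 = disaster, 0 = not a disaster).
train_texts = [
    "Forest fire near La Ronge!! http://t.co/abc #wildfire",
    "Earthquake shakes the city, buildings collapsed",
    "Flood waters rising fast, residents evacuated",
    "Wildfire spreads across the hills tonight",
    "I love this new song, it is fire",
    "What a lovely sunny day at the beach",
    "My exam was a total disaster lol",
    "Watching a movie with friends tonight",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = [
    "Massive flood hits the town, people evacuated",
    "Enjoying a sunny day with friends",
]
test_labels = [1, 0]

train_clean = [clean_tweet(t) for t in train_texts]
test_clean = [clean_tweet(t) for t in test_texts]

# Two feature extractors; stopwords are removed by the vectorizer itself.
vectorizers = {
    "BOW": lambda: CountVectorizer(stop_words="english"),
    "TF-IDF": lambda: TfidfVectorizer(stop_words="english"),
}
# The three model families named above, with default-ish settings (an assumption).
models = {
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    "Naive Bayes": lambda: MultinomialNB(),
    "SVM": lambda: LinearSVC(),
}

# Fit every (features, model) combination and record its test accuracy.
results = {}
for vec_name, make_vec in vectorizers.items():
    for model_name, make_model in models.items():
        pipeline = make_pipeline(make_vec(), make_model())
        pipeline.fit(train_clean, train_labels)
        predictions = pipeline.predict(test_clean)
        results[(vec_name, model_name)] = accuracy_score(test_labels, predictions)

for (vec_name, model_name), accuracy in results.items():
    print(f"{vec_name:7s} {model_name:20s} accuracy={accuracy:.2f}")
```

Wrapping each vectorizer and classifier in a pipeline means the vocabulary is learned from the training tweets only and then reused on the test tweets, which keeps the comparison between BOW and TF-IDF fair.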

🔗 Contact

Gmail linkedin