This project predicts website ad clicks using data mining techniques. It was created by Amitesh Tripathi.
The goal is to build a model that predicts whether a website ad will be clicked. The work covers data preprocessing, feature engineering, exploratory data analysis (EDA), and modeling, and concludes with an evaluation of the model's performance.
- DM_Project_Report_Group16_CTR.docx: The project report provides a detailed explanation of the problem setting, data sources, data description, data collection, data processing, data exploration, data visualization, data mining model selection, evaluation of data mining models, and results.
- Group_16_Project_CTR_Prediction.ipynb: The Jupyter notebook includes all the code for the project. It is divided into several sections:
  - Importing Libraries: This section includes all the necessary libraries required for data analysis and modeling.
  - Importing and Exploring the Dataset: In this section, the dataset is loaded and a preliminary exploration is performed to understand the structure of the data.
  - Data Pre-processing and Feature Engineering: This section involves cleaning the data, handling missing values, and transforming features to make them suitable for modeling.
  - Exploratory Data Analysis (EDA): In this section, the data is analyzed in depth to identify patterns, trends, and outliers.
  - Feature Selection & Feature Scaling: This section involves selecting relevant features for modeling and scaling the features to ensure they contribute equally to the model.
  - Data Modeling: This section involves building the predictive model using appropriate algorithms.
  - Model Evaluation: In this section, the performance of the model is evaluated using appropriate metrics.
  - Conclusion: This section provides a summary of the project and the results obtained.
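As a rough illustration of the feature scaling and model evaluation steps listed above, here is a minimal standard-library sketch. It is not taken from the notebook: this README does not say which scaling method or metrics the notebook uses, so standardization (z-scores) and accuracy, precision, recall, and F1 are assumptions. The function names `standardize` and `evaluate` and all data values are hypothetical, and the notebook itself may well rely on a library for these steps.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def evaluate(y_true, y_pred):
    """Compute binary classification metrics, where 1 means "ad clicked"."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy values, not taken from the project's dataset.
feature_values = [40.0, 60.0, 80.0, 100.0]
scaled = standardize(feature_values)

y_true = [1, 0, 1, 1, 0, 0]  # hypothetical actual click labels
y_pred = [1, 0, 0, 1, 1, 0]  # hypothetical model predictions
metrics = evaluate(y_true, y_pred)
```

Scaling puts features measured in different units on a comparable range, and reporting precision and recall alongside accuracy helps when clicks are much rarer than non-clicks.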
- Clone the repository.
- Open the Group_16_Project_CTR_Prediction.ipynb file in Jupyter Notebook.
- Run all cells in the notebook to perform the data analysis and generate the model.
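If you prefer to run the notebook without opening it, the steps above can be approximated from a script. This is a sketch, not part of the project: it assumes Jupyter and nbconvert are installed and that the script runs from the repository root. The helper names `build_run_command` and `run_notebook` are hypothetical.

```python
import subprocess

NOTEBOOK = "Group_16_Project_CTR_Prediction.ipynb"


def build_run_command(notebook=NOTEBOOK):
    """Return the nbconvert command that executes every cell in `notebook`."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",   # run all cells top to bottom
        "--inplace",   # write the outputs back into the same file
        notebook,
    ]


def run_notebook(notebook=NOTEBOOK):
    """Execute the notebook; raises CalledProcessError if any cell fails."""
    subprocess.run(build_run_command(notebook), check=True)
```

Calling `run_notebook()` executes the whole notebook in one pass. Any datasets the notebook loads must already be available at the paths it expects.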
The project uses Python for data analysis and modeling, as indicated by the .ipynb Jupyter notebook.