
Predict probability of a potential sale using classifier models on a website analytics data set.


RohanKarthikeyan/ODSCEurope-2021


ODSC Europe 2021 Data Science Challenge

Aim: Predict the probability of a potential sale on a website analytics data set.

☑️ What did I do?

  • First, I performed some simple exploratory data analysis (EDA) tasks: computing descriptive statistics using describe() and counting the unique values in each column using nunique().
    • This helped me find some outliers in the data, which I removed.
  • Secondly, I performed visual EDA tasks, e.g., a correlation heatmap, that helped me gain 5 key insights into the nature of the given data set.
  • Thirdly, I preprocessed the data by performing:
    • One-hot encoding on the categorical variables; and
    • Missing value imputation by creating 'missing value' indicators.
  • Fourthly, as part of feature engineering, I computed mutual information scores to rank each feature's relative potential as a predictor of the target when considered by itself.
  • Fifthly, I created new features and performed scaling (though it was not required, ouch!)
  • Lastly, I trained a logistic regression model, alongside 2 other models, and achieved an ROC-AUC of 0.92.
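
The steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the code from the challenge notebook: the column names (`pages_viewed`, `session_minutes`, `channel`, `sale`), the median fill that accompanies the missing-value indicator, and all model settings are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the website analytics data set.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "pages_viewed": rng.poisson(5, n).astype(float),
    "session_minutes": rng.exponential(3.0, n),
    "channel": rng.choice(["ads", "organic", "email"], n),
})
# Synthetic binary target loosely driven by two of the columns.
logit = 0.4 * df["pages_viewed"] - 2.5 + (df["channel"] == "email") * 1.0
df["sale"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)
# Knock out roughly 10% of one column to imitate missing values.
df.loc[rng.random(n) < 0.1, "session_minutes"] = np.nan

# Simple EDA: descriptive statistics and unique-value counts.
summary = df.describe()
unique_counts = df.nunique()

def preprocess(frame: pd.DataFrame) -> pd.DataFrame:
    """Add a missing-value indicator, fill the gap, and one-hot encode."""
    out = frame.copy()
    out["session_minutes_missing"] = out["session_minutes"].isna().astype(int)
    # Assumption: fill with the median so the model receives no NaNs.
    out["session_minutes"] = out["session_minutes"].fillna(
        out["session_minutes"].median()
    )
    return pd.get_dummies(out, columns=["channel"], dtype=int)

features = preprocess(df.drop(columns="sale"))
target = df["sale"]

# Mutual information: each feature scored against the target by itself.
mi = pd.Series(
    mutual_info_classif(features, target, random_state=0),
    index=features.columns,
).sort_values(ascending=False)

# Hold out a test split, scale, fit logistic regression, score with ROC-AUC.
X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.25, stratify=target, random_state=0
)
scaler = StandardScaler().fit(X_train)
model = LogisticRegression(max_iter=1000).fit(scaler.transform(X_train), y_train)
proba = model.predict_proba(scaler.transform(X_test))[:, 1]
auc = roc_auc_score(y_test, proba)

print(mi)
print(f"ROC-AUC on the held-out split: {auc:.3f}")
```

The score printed here reflects only the synthetic data and says nothing about the 0.92 reported above.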

S.D.G.
