The dataset for this project was taken from Kaggle and consists of a survey questionnaire answered by individuals across the globe. It includes around one million rows, each recording a user's answers to the questions on a scale of 1 to 5, where 1 means strongly disagree and 5 means strongly agree. These answers were used to determine each person's personality type. The dataset was cleaned and rows with null values were removed. For better manageability, the negatively worded questions (those containing "don't" or "not") were converted to positive questions. Exploratory data analysis was then performed to gain insights into the dataset.
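The cleaning step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes that "converting" a negative question means reverse-scoring it on the 1 to 5 scale (so 1 becomes 5, 2 becomes 4, and so on), and the column names `EXT1`, `EXT2`, `EXT4` and the helper `clean_responses` are hypothetical.

```python
import pandas as pd

# Hypothetical names of negatively worded items; the real dataset's
# column headers may differ.
NEGATIVE_ITEMS = ["EXT2", "EXT4"]


def clean_responses(df, negative_items, scale_min=1, scale_max=5):
    """Drop rows with missing answers and reverse-score negative items."""
    cleaned = df.dropna().copy()
    # Assumed conversion: reverse-scoring maps 1<->5 and 2<->4; 3 stays 3.
    cleaned[negative_items] = (scale_min + scale_max) - cleaned[negative_items]
    return cleaned


# Tiny made-up sample, only to show the shape of the transformation.
raw = pd.DataFrame({
    "EXT1": [5, 4, None, 1],   # positively worded item with one missing answer
    "EXT2": [1, 2, 3, 5],      # negatively worded item
    "EXT4": [2, 3, 4, 4],      # negatively worded item
})
cleaned = clean_responses(raw, NEGATIVE_ITEMS)
print(cleaned)
```

After this step, a high score means agreement with the positive form of every question, which keeps all items pointing in the same direction for the later analysis.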
K-Means clustering was used to determine each individual's personality type. First, Factor Analysis was used to reduce the 50 question variables to 5 factors, since five factors accounted for the most variance among the 50 variables. A new data frame holding the roughly one million responses across these 5 factors was then constructed, and clustering was applied to the factors.
Five clusters were formed in order to determine which cluster represented which personality trait.
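The factor-then-cluster pipeline described above might look something like the sketch below. It uses scikit-learn's `FactorAnalysis` and `KMeans` as stand-ins, since the library actually used in the project is not stated here, and it runs on a small synthetic array of random answers rather than the real survey data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey: 500 respondents x 50 items on a 1-5 scale.
# (The real dataset has around one million rows.)
responses = rng.integers(1, 6, size=(500, 50)).astype(float)

# Reduce the 50 question variables to 5 factor scores per respondent.
fa = FactorAnalysis(n_components=5, random_state=0)
factor_scores = fa.fit_transform(responses)  # shape: (500, 5)

# Cluster respondents in the 5-dimensional factor space.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(factor_scores)

# Each row of cluster_centers_ is a cluster's average score on the 5 factors;
# inspecting it is one way to see which factor dominates each cluster.
print(kmeans.cluster_centers_.round(2))
```

Because the input here is random noise, the resulting clusters carry no meaning; the sketch only shows how the data flows from responses to factor scores to cluster labels.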
Research paper I studied to understand why and when to use Factor Analysis, along with some tests to confirm that it is appropriate.
## Future Work

Deployment with a beautiful UI.