Clustering-ML

Problem Statement

Scaler is an online tech-versity offering intensive computer science and data science courses through live classes delivered by tech leaders and subject matter experts. Its structured program builds the skills of software professionals through a modern curriculum with exposure to the latest technologies. Scaler is a product of InterviewBit.

You are working as a data scientist with the analytics vertical of Scaler, focused on profiling the best companies and job positions to work for from the Scaler database. You are provided with information on a segment of learners and tasked with clustering them on the basis of their job profile, company, and other features. Ideally, the learners within each cluster should share similar characteristics.

Dataset:

Dataset Link: scaler_kmeans.csv

Data Dictionary:

‘Unnamed 0’ - Index of the dataset

Email_hash - Anonymised Personally Identifiable Information (PII)

Company_hash - This represents an anonymized identifier for the company, which is the current employer of the learner.

orgyear - Employment start year

CTC - Current CTC

Job_position - Job profile in the company

CTC_updated_year - Year in which CTC got updated (Yearly increments, Promotions)
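The data dictionary above suggests a simple preparation step before any clustering: parse the rows, handle missing values, and derive a tenure feature from orgyear. The following is a minimal sketch of that idea, not the notebook's actual code. It uses only the Python standard library so that it runs anywhere. The sample rows are made up for illustration, the header names are assumed to match the data dictionary (the real file may spell them differently), and `load_learners`, `reference_year`, and `years_of_experience` are hypothetical names.

```python
import csv
import io

# Made-up sample rows, for illustration only. The column names follow the
# data dictionary above and may differ from the headers in the real file.
SAMPLE_CSV = """Email_hash,Company_hash,orgyear,CTC,Job_position,CTC_updated_year
e1,c1,2016,1100000,Backend Engineer,2020
e2,c1,2018,700000,Backend Engineer,2021
e3,c2,,900000,Data Scientist,2020
e4,c2,2015,2500000,,2019
"""


def load_learners(handle, reference_year=2023):
    """Parse rows, skip those without a start year, and derive tenure.

    reference_year is an assumed cut-off used to turn orgyear into
    years of experience; pick whatever year suits the analysis.
    """
    learners = []
    for row in csv.DictReader(handle):
        if not row["orgyear"]:
            continue  # start year missing, so tenure cannot be derived
        learners.append({
            "email_hash": row["Email_hash"],
            "company_hash": row["Company_hash"],
            # Keep rows with a missing job position under a placeholder label.
            "job_position": row["Job_position"] or "unknown",
            "ctc": float(row["CTC"]),
            "ctc_updated_year": int(row["CTC_updated_year"]),
            "years_of_experience": reference_year - int(row["orgyear"]),
        })
    return learners


learners = load_learners(io.StringIO(SAMPLE_CSV))
```

To read the real dataset, pass an open file handle instead of the in-memory string. Dropping rows with a missing start year is only one option; imputing the value would be another reasonable choice.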

Concept Used:

Manual Clustering

Unsupervised Clustering - K-means, Hierarchical Clustering
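The two concepts listed above can be sketched on a toy example. This is an illustration under stated assumptions, not the notebook's method: "manual clustering" is interpreted here as flagging each learner's CTC against the average for the same company and job position, and K-means is written out as plain Lloyd's algorithm using only the standard library. The learner tuples are made up, and `min_max_scale`, `k_means`, and `manual_flag` are hypothetical names. In practice a library such as scikit-learn, which provides `KMeans` and `AgglomerativeClustering`, would be the more usual choice; hierarchical clustering is not shown here.

```python
import math
import random
from collections import defaultdict

# Made-up learners: (company_hash, job_position, ctc, years_of_experience)
learners = [
    ("c1", "Backend Engineer", 600000.0, 2),
    ("c1", "Backend Engineer", 800000.0, 3),
    ("c1", "Backend Engineer", 1600000.0, 8),
    ("c2", "Data Scientist", 900000.0, 3),
    ("c2", "Data Scientist", 2500000.0, 9),
    ("c2", "Data Scientist", 2300000.0, 10),
]

# Manual clustering (assumed interpretation): compare each learner's CTC
# with the mean CTC of the same (company, job position) group.
groups = defaultdict(list)
for company, position, ctc, _ in learners:
    groups[(company, position)].append(ctc)
group_mean = {key: sum(vals) / len(vals) for key, vals in groups.items()}
manual_flag = []
for company, position, ctc, _ in learners:
    above = ctc > group_mean[(company, position)]
    manual_flag.append("above_average" if above else "at_or_below_average")


def min_max_scale(values):
    """Rescale values to [0, 1] so that CTC does not dominate distances."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


def k_means(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid updates."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct points as initial centroids
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels, centroids


ctc_scaled = min_max_scale([row[2] for row in learners])
exp_scaled = min_max_scale([row[3] for row in learners])
points = list(zip(ctc_scaled, exp_scaled))
labels, centroids = k_means(points, k=2)
```

On this toy data the K-means labels separate the lower-paid, less experienced learners from the higher-paid, more experienced ones. The cluster numbers themselves are arbitrary, so compare group membership rather than label values. The value k=2 is chosen only to fit the toy data; on the real dataset, k would need to be selected, for example with an elbow plot.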

Niteshchawla/Clustering-ML
