
collegevine/ds-hiring


Take Home Problem

We've supplied some data on students and their admissions results at CV University, which can be found in train.csv and test.csv. The data contains the following columns:

  • gpa: The student's GPA
  • sat: The student's combined (out of 1600) SAT score
  • ethnicity: The student's ethnicity
  • ap_scores: A semicolon-separated list of the student's AP test scores
  • essay_strength: A value representing the strength of the student's essays on a 1-5 scale
  • family_income: The family income of the student
  • accepted: Whether or not the student was accepted to CV University (1 = Accepted, 0 = Rejected)
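The ap_scores column packs several values into one field, so it usually needs to be split before modeling. Below is a minimal sketch in Python. It assumes the scores are integers separated by semicolons and that an empty field means no AP tests were taken; neither is confirmed by the column description above. The helper names `parse_ap_scores` and `ap_features`, and the derived features themselves, are illustrative only.

```python
def parse_ap_scores(raw):
    """Split a semicolon-separated ap_scores field into a list of ints."""
    if raw is None:
        return []
    # Assumption: blank entries (e.g. "" or "5;;4") are skipped, not errors.
    return [int(part) for part in str(raw).split(";") if part.strip()]


def ap_features(raw):
    """Derive simple numeric features from one ap_scores field."""
    scores = parse_ap_scores(raw)
    count = len(scores)
    return {
        "ap_count": count,
        # Assumption: students with no AP scores get 0 for mean and max.
        "ap_mean": sum(scores) / count if count else 0.0,
        "ap_max": max(scores) if scores else 0,
    }


example = ap_features("5;4;3")
print(example)
```

Whether a missing AP record should be encoded as zero or treated as a separate category is a modeling choice worth checking against the data.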

Apply your normal data science modeling process to this data to build a model that predicts the chance (probability) that each student in the test set is accepted. For CV users, knowing their chances is more useful than knowing whether or not we think they will get into a certain school, so our chancing model predicts probabilities instead of outcomes.
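One way to see the difference between probabilities and hard outcomes is to score both with a metric that rewards calibrated probabilities, such as log loss. The problem statement does not name an evaluation metric, so log loss is an assumption here, and the labels and predictions below are made up for illustration.

```python
import math


def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Made-up labels in the format of the `accepted` column (1 = accepted, 0 = rejected).
accepted = [1, 0, 1, 0]

# Hypothetical model outputs; the last student is rated 0.6 but was rejected.
probabilities = [0.8, 0.3, 0.6, 0.6]

# The same predictions collapsed to hard outcomes at a 0.5 threshold.
hard_calls = [1.0 if p >= 0.5 else 0.0 for p in probabilities]

print("log loss, probabilities:", log_loss(accepted, probabilities))
print("log loss, hard outcomes:", log_loss(accepted, hard_calls))
```

In this toy example the hard outcomes are penalized far more heavily for the one wrong call, because a 0/1 prediction expresses full confidence, whereas the probability of 0.6 expresses only a mild lean.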

Feel free to use any methods, models, etc. that you wish, but please don't spend more than two hours on this problem (and ideally less). Don't worry about writing up lengthy notes, spending time making graphics visually stunning, or doing anything extremely time- or computation-intensive. We're just trying to get an idea of how you do data science -- we're not expecting perfection in a couple of hours.

Please put together a notebook (Rmarkdown, Jupyter, etc.) showing your analysis and a csv file with your model's predictions, and plan on talking through it in your interview. Finally, please, please, please let us know if you have any questions.

Have fun, and happy modeling!

About

CollegeVine resources for data science hiring
