This project predicts whether passengers on the Titanic survived, using machine learning models. It involves analyzing the Titanic dataset, which records each passenger's demographics, ticket details, and whether they survived.
The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. During its maiden voyage, the Titanic struck an iceberg late on April 14, 1912, and sank in the early hours of April 15. Many lives were lost, but some passengers and crew members survived. This project uses data science and machine learning techniques to explore the factors that influenced survival and to build predictive models.
The dataset used in this project is sourced from Kaggle and contains the following columns:
- PassengerId: Unique identifier for each passenger
- Survived: 0 if the passenger did not survive, 1 if they survived (target variable)
- Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd)
- Name: Passenger's name
- Sex: Passenger's gender
- Age: Passenger's age
- SibSp: Number of siblings or spouses aboard
- Parch: Number of parents or children aboard
- Ticket: Ticket number
- Fare: Fare paid for the ticket
- Cabin: Cabin number
- Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
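As a quick sanity check on the columns listed above, the following is a minimal sketch (not the project's actual loading code) that reads a CSV with the standard library, verifies that the documented columns are present, and computes a survival rate per group. The function names `read_passengers` and `survival_rate_by` are hypothetical, and the four sample rows are made up for illustration; they are not taken from the Kaggle data.

```python
import csv
import io

# Column names as documented in the list above.
EXPECTED_COLUMNS = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
]


def read_passengers(handle):
    """Read rows from an open CSV file and check the expected columns exist."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in EXPECTED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def survival_rate_by(rows, column):
    """Return the fraction of survivors for each distinct value of `column`."""
    totals = {}
    for row in rows:
        key = row[column]
        survived, count = totals.get(key, (0, 0))
        totals[key] = (survived + int(row["Survived"]), count + 1)
    return {key: survived / count for key, (survived, count) in totals.items()}


# Made-up rows for illustration only (NOT real records from the dataset).
SAMPLE = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Passenger, Mr. A",male,22,1,0,T1,7.25,,S
2,1,1,"Passenger, Mrs. B",female,38,1,0,T2,71.28,C85,C
3,1,3,"Passenger, Miss. C",female,26,0,0,T3,7.92,,S
4,0,3,"Passenger, Mr. D",male,,0,0,T4,8.05,,S
"""

rows = read_passengers(io.StringIO(SAMPLE))
rates_by_sex = survival_rate_by(rows, "Sex")
print(rates_by_sex)
```

To run the same check against the real data, pass an open file handle instead of the in-memory sample, for example `open("data/train.csv", newline="")`. That path is an assumption about where the Kaggle file is stored in this repository.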
The project is organized as follows:
- data/: Contains the dataset files.
- notebooks/: Google Colab notebooks for data exploration, preprocessing, and modeling.
- src/: Python source code for utility functions and data preprocessing.
- models/: Saved machine learning model.
- requirements.txt: List of Python packages required to run the code.