Diabetes Prediction Project

Overview

This project uses machine learning to predict whether a person is diabetic from a set of health metrics. It trains a Support Vector Machine (SVM) classifier with a linear kernel.

Dataset

The dataset is diabetes.csv, which contains health measurements for individuals. The target variable is Outcome, where:

- 0: Non-diabetic
- 1: Diabetic

Features

- Pregnancies: Number of times pregnant
- Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- BloodPressure: Diastolic blood pressure (mm Hg)
- SkinThickness: Triceps skin fold thickness (mm)
- Insulin: 2-hour serum insulin (mu U/ml)
- BMI: Body mass index (weight in kg / (height in m)^2)
- DiabetesPedigreeFunction: Diabetes pedigree function, a score of diabetes likelihood based on family history
- Age: Age in years
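
As a quick sanity check on the columns described above, here is a minimal sketch that verifies a loaded DataFrame has the eight features plus Outcome, and that Outcome holds only 0 and 1. The names `check_schema`, `FEATURES`, and `TARGET` are illustrative and are not part of the repository.

```python
import pandas as pd

# Column names as listed in this README.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]
TARGET = "Outcome"


def check_schema(df: pd.DataFrame) -> list:
    """Return a list of problems; an empty list means the frame matches the README."""
    problems = []

    # Every feature column and the target column should be present.
    missing = [col for col in FEATURES + [TARGET] if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")

    # The target should only contain 0 (non-diabetic) and 1 (diabetic).
    if TARGET in df.columns:
        seen = set(df[TARGET].dropna().unique().tolist())
        unexpected = sorted(seen - {0, 1})
        if unexpected:
            problems.append(f"unexpected {TARGET} values: {unexpected}")

    return problems
```

Calling `check_schema(pd.read_csv("diabetes.csv"))` should return an empty list if the file matches the description in this README.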

Files

- diabetes.csv: Dataset file containing health metrics
- diabetes_prediction.ipynb: Jupyter notebook with detailed analysis and code
- requirements.txt: List of Python dependencies

Installation

To run this project locally, follow these steps:

Clone the repository:
```
git clone https://github.com/your-username/diabetes-prediction.git
cd diabetes-prediction
```

Install the dependencies:
```
pip install -r requirements.txt
```
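
After installing, it can help to confirm that the libraries are importable before opening the notebook. The sketch below assumes the project needs numpy, pandas, and scikit-learn (import name `sklearn`); the actual contents of requirements.txt are not shown in this README, so adjust the list accordingly. The helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Assumed import names; check requirements.txt for the real list.
ASSUMED_MODULES = ("numpy", "pandas", "sklearn")


def missing_modules(names=ASSUMED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_modules()` suggests the environment is ready; any names returned still need to be installed.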

Workflow

Data Loading and Preprocessing:
- Load the dataset (diabetes.csv).
- Separate features and target variable (Outcome).
Feature Standardization:
- Standardize the features with StandardScaler (zero mean, unit variance per feature) so that features measured on larger scales do not dominate the model.
Model Training:
- Split the dataset into training and testing sets (80%-20%).
- Train an SVM model with a linear kernel on the training data.
Model Evaluation:
- Evaluate the model on both training and testing data using the accuracy score.
Prediction:
- Predict whether a new input sample is diabetic or non-diabetic.
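
The steps above can be sketched as a single function. This is a minimal illustration rather than the notebook's exact code: the function name `train_and_evaluate`, the `random_state` value, and the use of `stratify` are assumptions added for repeatability. It follows the order listed above (standardize, then split).

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def train_and_evaluate(df, target="Outcome", test_size=0.2, random_state=2):
    """Run the workflow on a DataFrame shaped like diabetes.csv."""
    # Separate features and target variable.
    X = df.drop(columns=target)
    y = df[target]

    # Standardize each feature to zero mean and unit variance.
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    # Split 80%-20%. stratify and random_state are assumptions, not from the README.
    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=test_size, stratify=y, random_state=random_state
    )

    # Train an SVM with a linear kernel on the training data.
    clf = SVC(kernel="linear")
    clf.fit(X_train, y_train)

    # Accuracy on both the training and the testing data.
    train_acc = accuracy_score(y_train, clf.predict(X_train))
    test_acc = accuracy_score(y_test, clf.predict(X_test))
    return clf, scaler, train_acc, test_acc
```

With the repository's data this could be called as `train_and_evaluate(pd.read_csv("diabetes.csv"))`. A common variation fits the scaler on the training split only, which keeps test-set statistics out of preprocessing.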

Example Prediction

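```
import numpy as np
import pandas as pd

# Example prediction for new input data.
# X (feature DataFrame), scaler (fitted StandardScaler), and clf (trained SVM)
# are assumed to exist from the workflow steps above.
# Values are assumed to follow the order of the Features list:
# Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, Age.
input_sample = np.array([5, 166, 72, 19, 175, 22.7, 0.6, 51]).reshape(1, -1)
input_df = pd.DataFrame(input_sample, columns=X.columns)  # reuse the training feature names
input_standardized = scaler.transform(input_df)
prediction = clf.predict(input_standardized)

if prediction[0] == 0:
    print("Prediction: Person is not diabetic")
else:
    print("Prediction: Person is diabetic")
```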
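
For repeated predictions, the same steps could be wrapped in a small helper. The sketch below is hypothetical (`predict_outcome` and `LABELS` are not part of the repository) and assumes `scaler` exposes `transform` and `clf` exposes `predict`, as scikit-learn's StandardScaler and SVC do.

```python
import pandas as pd

# Mapping of the Outcome codes described in the Dataset section.
LABELS = {0: "not diabetic", 1: "diabetic"}


def predict_outcome(clf, scaler, feature_names, values):
    """Return the predicted label text for one person's measurements."""
    feature_names = list(feature_names)
    if len(values) != len(feature_names):
        raise ValueError(
            f"expected {len(feature_names)} values, got {len(values)}"
        )

    # Build a one-row frame so the scaler sees the training feature names.
    row = pd.DataFrame([values], columns=feature_names)
    standardized = scaler.transform(row)

    # predict returns one label per row; take the single result.
    outcome = int(clf.predict(standardized)[0])
    return LABELS[outcome]
```

With the objects from the example above, `predict_outcome(clf, scaler, X.columns, [5, 166, 72, 19, 175, 22.7, 0.6, 51])` would return the label as text instead of printing it.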

Tusharb331/Diabetes-Prediction
