Added Parkinson's Disease Detection #1

Open
wants to merge 1 commit into
base: master
1 change: 1 addition & 0 deletions Detecting_parkinsons_disease.ipynb
@@ -0,0 +1 @@
{"metadata":{"kernelspec":{"language":"python","display_name":"Python 3","name":"python3"},"language_info":{"pygments_lexer":"ipython3","nbconvert_exporter":"python","version":"3.6.4","file_extension":".py","codemirror_mode":{"name":"ipython","version":3},"name":"python","mimetype":"text/x-python"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"code","source":"# Install the necessary packages (first run only)\n# !pip install numpy pandas scikit-learn xgboost --upgrade","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# This Python 3 environment comes with many helpful analytics libraries installed\n# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python\n# For example, here are several helpful packages to load\n\nimport numpy as np # linear algebra\nimport pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)\n\n# Input data files are available in the read-only \"../input/\" directory\n# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory\n\nimport os\nfor dirname, _, filenames in os.walk('/kaggle/input'):\n    for filename in filenames:\n        print(os.path.join(dirname, filename))\n\n# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using \"Save & Run All\" \n# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"* Import the necessary packages here","metadata":{}},{"cell_type":"code","source":"# OS packages\nimport os, sys","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Data Collection","metadata":{}},{"cell_type":"code","source":"# Read the data into a DataFrame\n\ndf = pd.read_csv('/kaggle/input/parkinsons.data')\ndf.tail() # 
shows the last 5 rows\n\n# Use df.head() for the first 5 rows","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# Describe the data (summary statistics)\n\ndf.describe()","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# Show the number of rows and columns, dtypes, and non-null counts\n\ndf.info()","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"- We can see that this dataset has 195 records and 24 columns","metadata":{}},{"cell_type":"code","source":"# Shape of the dataset as (rows, columns)\n\ndf.shape","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Feature Engineering\n","metadata":{}},{"cell_type":"code","source":"# Get all feature columns except \"status\"; [:, 1:] also drops the first column\n\nfeatures = df.loc[:, df.columns != 'status'].values[:, 1:] # .values returns a NumPy array\n\n\n\n# Get the status labels as a NumPy array\n\nlabels = df.loc[:, 'status'].values\n\n","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# Count how many records are labeled status 1 and how many status 0\n\ndf['status'].value_counts()","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"\n# Import the MinMaxScaler class from sklearn.preprocessing\n\nfrom sklearn.preprocessing import MinMaxScaler","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"\n# Initialize a MinMaxScaler that maps each feature to the range -1 to 1\n\nscaler = MinMaxScaler((-1, 1))\n\n# fit_transform() fits the scaler to the data and\n# then transforms it.\n\nX = scaler.fit_transform(features)\ny = labels\n\n# Show X and y here\n# print(X, y)","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# import train_test_split from sklearn. 
\n\nfrom sklearn.model_selection import train_test_split","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"code","source":"# Split the dataset into training and testing sets, holding out 15% for testing\n\nx_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.15)\n","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Model Training\n","metadata":{}},{"cell_type":"code","source":"# Import XGBClassifier and the accuracy metric\n\nfrom xgboost import XGBClassifier\nfrom sklearn.metrics import accuracy_score","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"\n\n\n* To learn more about the **[\"eXtreme Gradient Boosting algorithm\"](https://data-flair.training/blogs/gradient-boosting-algorithm/)**\n","metadata":{}},{"cell_type":"code","source":"# Create an instance and fit the model\n\nmodel = XGBClassifier()\nmodel.fit(x_train, y_train) # fit on the training features and labels\n","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Model Prediction\n","metadata":{}},{"cell_type":"code","source":"# Finally, predict on the test set\n\ny_prediction = model.predict(x_test)\n\nprint(\"Accuracy Score is\", accuracy_score(y_test, y_prediction) * 100)","metadata":{"trusted":true},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":"# Summary","metadata":{}},{"cell_type":"markdown","source":"<p>\nIn this Python machine learning project, we learned to detect the presence of Parkinson’s Disease in individuals using various factors. We used an XGBClassifier for this and made use of the sklearn library to prepare the dataset. In one run this gave an accuracy of <b>96.66%</b> on the held-out test set, which is a strong result for so few lines of code. Because the train/test split is not seeded, the exact figure will vary between runs.\n</p>","metadata":{}}]}
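A note on two steps in the notebook above. `MinMaxScaler((-1, 1))` rescales each feature column linearly so that its minimum maps to -1 and its maximum to 1, and `accuracy_score` is the fraction of test labels predicted correctly. The following is a minimal pure-Python sketch of both ideas, not scikit-learn's implementation. The helper names `min_max_scale` and `accuracy`, the toy values, and the handling of constant columns are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def min_max_scale(
    rows: Sequence[Sequence[float]],
    feature_range: Tuple[float, float] = (-1.0, 1.0),
) -> List[List[float]]:
    """Scale each column of `rows` linearly into `feature_range`.

    Hypothetical helper illustrating the formula
        x_scaled = lo + (x - col_min) * (hi - lo) / (col_max - col_min)
    Constant columns are mapped to `lo` to avoid dividing by zero
    (an assumption of this sketch).
    """
    lo, hi = feature_range
    columns = list(zip(*rows))          # transpose rows into columns
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]

    scaled = []
    for row in rows:
        new_row = []
        for x, col_min, col_max in zip(row, mins, maxs):
            span = col_max - col_min
            if span == 0:
                new_row.append(lo)
            else:
                new_row.append(lo + (x - col_min) * (hi - lo) / span)
        scaled.append(new_row)
    return scaled


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy data (made up): two features on very different scales.
toy = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
print(min_max_scale(toy))
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]) * 100)
```

The notebook multiplies the accuracy by 100 to report a percentage, as the last line of the sketch does.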