This deep learning project aims to predict whether a patient will miss a medical appointment. It was developed for the Deep Learning course at the University of Brasília, Gama campus.
No-show medical appointments are an issue that affects every public health system in the world. One study found that a hospital reported 62 no-show appointments per day, resulting in an estimated cost of $3 million. Another study shows that the average cost of a no-show per patient was $196 in 2008.
So, although this topic is widely debated, our main goal is to build a model that can help clinics and public hospitals reduce no-shows.
Assuming that adding new features to the dataset could improve the model's accuracy, we merged in a weather time series dataset to capture the weather on each appointment day. We got this idea from reading articles that discuss the no-show issue (citation needed).
- We used these tools and libraries:
- Python 3.7;
- Jupyter Notebook;
- Pandas;
- Keras and Tensorflow.
- Get a historical weather series - more
- Get the dataset - more
- Add day average temperature feature - more
- Develop training model - more
- Exploratory data analysis - more
- Resampling tests on the model - more (which didn't work)
- Generate model version - more
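The "add day average temperature feature" step in the list above could look roughly like the following pandas sketch. It is only an illustration, not the project's actual code: the column names (`timestamp`, `temperature`, `appointment_day`, `no_show`) and the sample values are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical hourly weather readings (column names are assumptions).
weather = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2016-05-02 06:00", "2016-05-02 18:00",
        "2016-05-03 06:00", "2016-05-03 18:00",
    ]),
    "temperature": [20.0, 30.0, 18.0, 22.0],
})

# Hypothetical appointments table (column names are assumptions).
appointments = pd.DataFrame({
    "appointment_day": pd.to_datetime(
        ["2016-05-02", "2016-05-03", "2016-05-04"]
    ),
    "no_show": [0, 1, 0],
})

# Truncate each reading to its calendar day, then average per day.
daily_avg = (
    weather
    .assign(appointment_day=weather["timestamp"].dt.normalize())
    .groupby("appointment_day", as_index=False)["temperature"]
    .mean()
    .rename(columns={"temperature": "avg_temperature"})
)

# A left join keeps every appointment, even on days without weather data
# (those rows get NaN in avg_temperature).
merged = appointments.merge(daily_avg, on="appointment_day", how="left")
print(merged)
```

A left join is used here so that no appointment is silently dropped; days without weather readings show up as missing values that can be handled explicitly later.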
We also used a dataset found on Kaggle and scraped some weather data from INMET.
This work is divided into three main notebooks.
- Dataset Treatment - Here we merge our datasets, fix some columns and filter some data.
- Exploratory Data Analysis - Here we show our study, check some correlations and try some more advanced techniques such as Self-Organizing Maps.
- Modeling - Here we develop and train our model.
- At first, we had some issues trying to improve the RNN's preprocessing.
- We used a dataset from www.kaggle.com that gave us useful information about each patient, such as their medical conditions and neighborhood.
- We joined two datasets into one: the "No show appointment" dataset and a "weather forecast" dataset, which provides the weather for each appointment day in our data. The scripts we wrote to join them worked well, but the temperature information did not affect the results as much as we had expected.
- Our model reached an accuracy of 70%. With more information about these appointments, we could probably improve the accuracy further.
- We also tried to resample the dataset to get better results, but the accuracy dropped by 20%.
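The README does not say which resampling technique was tried. As one common possibility, the sketch below shows random oversampling of the minority class with pandas. The `no_show` label name, the `age` feature and the sample rows are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical training set; `no_show` is an assumed label (1 = missed).
train = pd.DataFrame({
    "age": [25, 40, 31, 60, 52, 19, 47, 33],
    "no_show": [0, 0, 0, 0, 0, 0, 1, 1],
})

# Find which label is the majority and which is the minority.
counts = train["no_show"].value_counts()
majority = train[train["no_show"] == counts.idxmax()]
minority = train[train["no_show"] == counts.idxmin()]

# Draw minority rows with replacement until both classes have equal size.
minority_upsampled = minority.sample(
    n=len(majority), replace=True, random_state=42
)

# Combine both classes and shuffle the rows.
balanced = (
    pd.concat([majority, minority_upsampled])
    .sample(frac=1, random_state=42)
    .reset_index(drop=True)
)
print(balanced["no_show"].value_counts())
```

If an approach like this is used, it is usually applied to the training split only. Accuracy measured on an imbalanced test set and accuracy measured on a balanced one are not directly comparable, which may be part of why a drop appears after resampling.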
Before you start, do the following:
- Install the requirements:
  pip install -r requirements.txt
- To run the Jupyter notebooks, open them from inside the \nootebook folder; otherwise they may not run.
See our CONTRIBUTING.md file.
Icaro Oliveira @icarooliv
Gustavo Carvalho @gustavocarvalho1002
Rodrigo Dadamos @rdadamos