DiagnoAI

DiagnoAI is a tool to detect a disease from a text description of the patient's symptoms and daily condition. It is based on a transformer model called BERT, fine-tuned for 24 common diseases.

1. Dataset

We created a dataset containing 24 diseases and 50 manually written descriptions of the symptoms (in English) for each disease. The disease names, symptoms and precautions were chosen from the Disease Symptom Prediction dataset [1] on Kaggle.

Hence, a total of 1200 descriptions were created, of which 80% were used for model training and the remaining 20% for validation and testing. An example of a data instance:

Description: There are small red spots all over my body that I can't explain. It's worrying me. I feel extremely tired and experience a mild fever every night.

Disease: Chicken Pox
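The 80/20 split described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes the split is stratified per disease and that the held-out 20% is divided evenly between validation and testing, neither of which is stated here. The function name `stratified_split` and the placeholder corpus are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Split (text, label) pairs so every label keeps the same proportions."""
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_train = round(len(items) * train_frac)
        n_val = round(len(items) * val_frac)
        train.extend(items[:n_train])
        val.extend(items[n_train:n_train + n_val])
        test.extend(items[n_train + n_val:])  # whatever is left goes to test
    return train, val, test


# Placeholder corpus with the same shape as the dataset:
# 24 diseases x 50 descriptions = 1200 samples.
samples = [(f"description {i} of disease {d}", d) for d in range(24) for i in range(50)]
train, val, test = stratified_split(samples)
print(len(train), len(val), len(test))  # 960 120 120
```

Splitting per disease keeps 40 training descriptions for every class, which matters when each class has only 50 examples.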

2. Model Training

Because of the limited data, we decided to fine-tune a pretrained language model. We chose the pretrained BERT model from Hugging Face and its corresponding tokenizer for tokenizing the sentences. TensorFlow was used as the base framework for loading and training the model.

Upon experimentation, we found that unfreezing the BERT layers helped achieve better training and validation accuracy. Hence, we decided to keep the complete model trainable.

The model was trained with the following parameters:

Loss function: SparseCategoricalCrossentropy

Optimizer: Adam

Learning rate: 0.00003

Epochs: 5
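A setup along these lines might look like the sketch below. It is an illustration under assumptions, not the project's training script: the checkpoint name `bert-base-uncased`, `max_length`, and `batch_size` are not given in this README, and the helper names `TrainConfig`, `build_model`, and `encode` are hypothetical. It also assumes a `transformers` release that still ships the TensorFlow model classes, together with a compatible Keras.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    checkpoint: str = "bert-base-uncased"  # assumption: checkpoint is not named in the README
    num_labels: int = 24                   # one class per disease
    learning_rate: float = 3e-5            # 0.00003, as listed above
    epochs: int = 5
    max_length: int = 128                  # assumption
    batch_size: int = 16                   # assumption


def build_model(config: TrainConfig):
    """Load BERT with a classification head and compile it with the listed settings."""
    # Heavy imports stay inside the function so the config can be inspected
    # without TensorFlow installed.
    import tensorflow as tf
    from transformers import AutoTokenizer, TFBertForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(config.checkpoint)
    model = TFBertForSequenceClassification.from_pretrained(
        config.checkpoint, num_labels=config.num_labels
    )
    model.bert.trainable = True  # keep the encoder unfrozen, so the whole model trains
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=config.learning_rate),
        # The classification head returns raw logits, hence from_logits=True.
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return tokenizer, model


def encode(tokenizer, texts, config: TrainConfig):
    """Tokenize descriptions into the tensor dict that the model consumes."""
    encoded = tokenizer(
        list(texts),
        padding=True,
        truncation=True,
        max_length=config.max_length,
        return_tensors="tf",
    )
    return dict(encoded)
```

With integer class labels in a NumPy array, training would then be a call such as `model.fit(encode(tokenizer, train_texts, config), train_labels, epochs=config.epochs, batch_size=config.batch_size)`, where `train_texts` and `train_labels` are placeholders for the training split.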

3. Testing

After training, we achieved a training accuracy of 100.00% and a validation accuracy of 98.33%. Although the misclassification rate is quite low, we can't be completely sure of the model's predictions, as it was trained on a relatively small corpus.

We plan to expand the dataset in the future so that the model can generalize better and not suffer from overconfidence.
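To put the reported validation accuracy in perspective, the short calculation below finds how many misclassified samples correspond to 98.33%. The size of the validation split is not stated here, so 120 and 240 samples are assumed as two possible readings of the held-out 20% of 1200 descriptions; the function name `errors_matching` is hypothetical.

```python
def errors_matching(total: int, reported: str = "98.33") -> list[int]:
    """Return every error count whose accuracy rounds to the reported percentage."""
    return [
        errors
        for errors in range(total + 1)
        if f"{100 * (total - errors) / total:.2f}" == reported
    ]


for total in (120, 240):  # assumed validation set sizes
    print(total, errors_matching(total))
# 120 [2]
# 240 [4]
```

Under either assumption the figure rests on only a handful of errors, so one or two additional mistakes would shift the accuracy noticeably, which is consistent with the caution about the small corpus.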

FaizalKarim280280/DiagnoAI