07. Model for Stain5 #10

Open
EchteRobert opened this issue Jun 6, 2022 · 2 comments

EchteRobert commented Jun 6, 2022

In this issue I will post all results on Stain5 for various models and evaluation metrics. I first trained a model on 5 plates each from Stain2, Stain3, and Stain4, for a total of 15 training plates. I then used this model to run inference directly on Stain5. After that, I fine-tuned the model (transfer learning) using 3 plates from Stain5 and 10, 20, 40, and 80% of the available compounds for training/fine-tuning. I also tested this with 1 plate and the same fractions, but multiple plates are required to generalize to the feature patterns of Stain5.
I fine-tune the model by training for 100 epochs and then taking the model with the best validation mAP. There are perhaps better ways to do this, but this is a first proof-of-concept experiment.
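The fine-tuning setup described above can be sketched as follows. This is a minimal illustration and not the actual training code: `sample_compounds`, `best_epoch`, the seed, the compound names, and the validation mAP values are all hypothetical, and drawing the compound fractions by uniform random sampling is an assumption.

```python
import random


def sample_compounds(compounds, fraction, seed=0):
    """Return a reproducible random subset holding `fraction` of the unique compounds."""
    unique = sorted(set(compounds))
    n_keep = max(1, round(fraction * len(unique)))
    rng = random.Random(seed)  # fixed seed so the subset is the same on every call
    return sorted(rng.sample(unique, n_keep))


def best_epoch(val_map_per_epoch):
    """Index of the epoch with the highest validation mAP (the first one on ties)."""
    return max(range(len(val_map_per_epoch)), key=val_map_per_epoch.__getitem__)


# Hypothetical compound identifiers, for illustration only.
compounds = [f"compound_{i}" for i in range(50)]
for fraction in (0.1, 0.2, 0.4, 0.8):
    subset = sample_compounds(compounds, fraction)
    print(f"{fraction:.0%} of compounds -> {len(subset)} used for fine-tuning")

# Made-up validation mAP per epoch; the checkpoint from the best epoch would be kept.
val_map = [0.21, 0.30, 0.34, 0.33]
print("best epoch:", best_epoch(val_map))
```

Keeping only the best-validation checkpoint means the reported validation mAP is itself selected on the validation compounds, so it is likely somewhat optimistic.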

Main takeaways

  • As expected, the model does not directly translate to Stain5. This is not due to plate/batch effects, as those are much smaller. Instead, it is due to different experimental conditions, which cause a shift in the feature distributions, so that the model is no longer able to correctly aggregate the single-cell data. See General data analysis #7 (comment) for the hierarchical cluster map showing that Stain2, Stain3, and Stain4 fall in a different cluster than Stain5 altogether.
  • Secondly, fine-tuning the model does increase performance on plates that are similar to the training data (i.e. using confocal plates will increase performance on confocal plates and using widefield plates will increase performance on widefield plates, not both at the same time). However, to generalize to unseen compounds I need to use more than 3 plates. This is the same issue I had before when training the models: using 1 training plate does not generalize to unseen plates, and using 3 training plates does not generalize to unseen compounds. So I probably need to use 5 or more plates to generalize to this type of data.
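One simple way to look at a shift in feature distributions like the one described in the first takeaway is to express the per-feature mean of the new data in units of the reference data's standard deviation. The sketch below is only an illustration of that idea on synthetic data; `standardized_mean_shift` and both arrays are hypothetical and are not taken from the analysis in this issue.

```python
import numpy as np


def standardized_mean_shift(reference, target):
    """Per-feature mean shift of `target`, in standard deviations of `reference`.

    Both inputs are (n_cells, n_features) arrays of single-cell features.
    """
    reference = np.asarray(reference, dtype=float)
    target = np.asarray(target, dtype=float)
    mu = reference.mean(axis=0)
    sigma = reference.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # avoid dividing by zero for constant features
    return (target.mean(axis=0) - mu) / sigma


# Synthetic stand-ins: features 2 and 3 are deliberately shifted in the second set.
rng = np.random.default_rng(0)
reference_cells = rng.normal(0.0, 1.0, size=(5000, 4))
shifted_cells = rng.normal([0.0, 0.0, 2.0, -1.5], 1.0, size=(5000, 4))

shift = standardized_mean_shift(reference_cells, shifted_cells)
print(np.round(shift, 2))
```

Features with a large absolute shift are the ones a model trained on the reference data has never seen in that range.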

Results
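
Most columns in the tables below are mAP values. As a reminder of what the metric measures, here is a minimal sketch of mean average precision for replicate retrieval. It assumes cosine similarity between well-level profiles with non-zero norm and treats wells of the same compound as the relevant items. The function name and these choices are assumptions for illustration and may differ from the evaluation code that produced the numbers below.

```python
import numpy as np


def mean_average_precision(profiles, labels):
    """mAP for replicate retrieval using cosine similarity.

    profiles: (n_wells, n_features) array, labels: compound id per well.
    Each well is used as a query against all other wells.
    """
    X = np.asarray(profiles, dtype=float)
    labels = np.asarray(labels)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # unit-normalise rows
    sim = X @ X.T                                     # pairwise cosine similarity
    aps = []
    for i in range(len(X)):
        others = np.delete(np.arange(len(X)), i)      # exclude the query itself
        order = others[np.argsort(-sim[i, others], kind="stable")]
        hits = labels[order] == labels[i]
        if not hits.any():
            continue                                  # query has no replicate to retrieve
        ranks = np.flatnonzero(hits) + 1              # 1-based ranks of the replicates
        precision_at_hits = np.arange(1, len(ranks) + 1) / ranks
        aps.append(precision_at_hits.mean())
    return float(np.mean(aps))


# Toy profiles: wells 0/1 point in one direction, wells 2/3 in another.
toy_profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(mean_average_precision(toy_profiles, ["a", "a", "b", "b"]))
print(mean_average_precision(toy_profiles, ["a", "b", "a", "b"]))
```

With labels that match the geometry every replicate is ranked first, so the score is at its maximum; with mismatched labels the replicates are ranked behind other compounds and the score drops.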

Out of distribution model

| Plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | Batch |
|---|---|---|---|---|---|---|---|
| BR00120530 | 0.26 | 0.28 | 0.23 | 0.39 | 58.9 | 58.9 | CondA PE |
| BR00120530confocal | 0.03 | 0.29 | 0.03 | 0.4 | 5.6 | 56.7 | CondA PE |
| BR00120526confocal | 0.03 | 0.29 | 0.02 | 0.36 | 3.3 | 58.9 | CondA Thermo |
| BR00120526 | 0.35 | 0.28 | 0.38 | 0.37 | 72.2 | 56.7 | CondA Thermo |
| BR00120536confocal | 0.06 | 0.25 | 0.05 | 0.37 | 17.8 | 55.6 | CondB PE |
| BR00120536 | 0.02 | 0.25 | 0.03 | 0.35 | 4.4 | 54.4 | CondB PE |
| BR00120532confocal | 0.05 | 0.24 | 0.06 | 0.35 | 8.9 | 50 | CondB Thermo |
| BR00120532 | 0.15 | 0.23 | 0.18 | 0.38 | 36.7 | 50 | CondB Thermo |
| BR00120274 | 0.17 | 0.23 | 0.21 | 0.34 | 31.1 | 54.4 | CondC PE |
| BR00120274confocal | 0.03 | 0.21 | 0.03 | 0.36 | 4.4 | 52.2 | CondC PE |
| BR00120270 | 0.25 | 0.26 | 0.29 | 0.38 | 55.6 | 48.9 | CondC Thermo |
| BR00120270confocal | 0.02 | 0.26 | 0.03 | 0.39 | 2.2 | 52.2 | CondC Thermo |

Fine-tuned model 10%

| Plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | Batch |
|---|---|---|---|---|---|---|---|
| BR00120530 | 0.22 | 0.28 | 0.2 | 0.39 | 43.3 | 58.9 | CondA PE |
| BR00120530confocal | 0.03 | 0.29 | 0.03 | 0.4 | 5.6 | 56.7 | CondA PE |
| BR00120526confocal | 0.03 | 0.29 | 0.02 | 0.36 | 3.3 | 58.9 | CondA Thermo |
| BR00120526 | 0.3 | 0.28 | 0.36 | 0.37 | 53.3 | 56.7 | CondA Thermo |
| BR00120536confocal | 0.06 | 0.25 | 0.05 | 0.37 | 10 | 55.6 | CondB PE |
| BR00120536 | 0.03 | 0.25 | 0.04 | 0.35 | 5.6 | 54.4 | CondB PE |
| BR00120532confocal | 0.05 | 0.24 | 0.07 | 0.35 | 11.1 | 50 | CondB Thermo |
| BR00120532 | 0.13 | 0.23 | 0.18 | 0.38 | 27.8 | 50 | CondB Thermo |
| BR00120274 | 0.16 | 0.23 | 0.2 | 0.34 | 27.8 | 54.4 | CondC PE |
| BR00120274confocal | 0.03 | 0.21 | 0.04 | 0.36 | 5.6 | 52.2 | CondC PE |
| BR00120270 | 0.23 | 0.26 | 0.28 | 0.38 | 42.2 | 48.9 | CondC Thermo |
| BR00120270confocal | 0.03 | 0.26 | 0.03 | 0.39 | 4.4 | 52.2 | CondC Thermo |

Fine-tuned model 20%

| Plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | Batch |
|---|---|---|---|---|---|---|---|
| BR00120530 | 0.22 | 0.28 | 0.23 | 0.39 | 57.8 | 58.9 | CondA PE |
| BR00120530confocal | 0.03 | 0.29 | 0.02 | 0.4 | 5.6 | 56.7 | CondA PE |
| BR00120526confocal | 0.02 | 0.29 | 0.02 | 0.36 | 2.2 | 58.9 | CondA Thermo |
| BR00120526 | 0.32 | 0.28 | 0.31 | 0.37 | 75.6 | 56.7 | CondA Thermo |
| BR00120536confocal | 0.05 | 0.25 | 0.05 | 0.37 | 17.8 | 55.6 | CondB PE |
| BR00120536 | 0.09 | 0.25 | 0.03 | 0.35 | 23.3 | 54.4 | CondB PE |
| BR00120532confocal | 0.05 | 0.24 | 0.06 | 0.35 | 10 | 50 | CondB Thermo |
| BR00120532 | 0.27 | 0.23 | 0.21 | 0.38 | 61.1 | 50 | CondB Thermo |
| BR00120274 | 0.22 | 0.23 | 0.21 | 0.34 | 53.3 | 54.4 | CondC PE |
| BR00120274confocal | 0.03 | 0.21 | 0.04 | 0.36 | 8.9 | 52.2 | CondC PE |
| BR00120270 | 0.33 | 0.26 | 0.34 | 0.38 | 75.6 | 48.9 | CondC Thermo |
| BR00120270confocal | 0.03 | 0.26 | 0.04 | 0.39 | 3.3 | 52.2 | CondC Thermo |

Fine-tuned model 40%

| Plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | Batch |
|---|---|---|---|---|---|---|---|
| Fine-tune plates | | | | | | | |
| BR00120526 | 0.35 | 0.28 | 0.34 | 0.37 | 78.9 | 56.7 | CondA Thermo |
| BR00120532 | 0.27 | 0.23 | 0.21 | 0.38 | 57.8 | 50 | CondB Thermo |
| BR00120270 | 0.36 | 0.26 | 0.33 | 0.38 | 78.9 | 48.9 | CondC Thermo |
| Hold-out plates | | | | | | | |
| BR00120530 | 0.26 | 0.28 | 0.22 | 0.39 | 65.6 | 58.9 | CondA PE |
| BR00120530confocal | 0.03 | 0.29 | 0.02 | 0.4 | 3.3 | 56.7 | CondA PE |
| BR00120526confocal | 0.02 | 0.29 | 0.03 | 0.36 | 3.3 | 58.9 | CondA Thermo |
| BR00120536confocal | 0.05 | 0.25 | 0.04 | 0.37 | 17.8 | 55.6 | CondB PE |
| BR00120536 | 0.06 | 0.25 | 0.04 | 0.35 | 14.4 | 54.4 | CondB PE |
| BR00120532confocal | 0.05 | 0.24 | 0.06 | 0.35 | 15.6 | 50 | CondB Thermo |
| BR00120274 | 0.25 | 0.23 | 0.22 | 0.34 | 54.4 | 54.4 | CondC PE |
| BR00120274confocal | 0.03 | 0.21 | 0.03 | 0.36 | 4.4 | 52.2 | CondC PE |
| BR00120270confocal | 0.03 | 0.26 | 0.03 | 0.39 | 6.7 | 52.2 | CondC Thermo |

Fine-tuned model 80%

| Plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | Batch |
|---|---|---|---|---|---|---|---|
| Fine-tune plates | | | | | | | |
| BR00120526 | 0.37 | 0.28 | 0.39 | 0.37 | 85.6 | 56.7 | CondA Thermo |
| BR00120532 | 0.3 | 0.23 | 0.26 | 0.38 | 78.9 | 50 | CondB Thermo |
| BR00120270 | 0.39 | 0.26 | 0.36 | 0.38 | 87.8 | 48.9 | CondC Thermo |
| Hold-out plates | | | | | | | |
| BR00120530 | 0.25 | 0.28 | 0.23 | 0.39 | 63.3 | 58.9 | CondA PE |
| BR00120530confocal | 0.03 | 0.29 | 0.03 | 0.4 | 8.9 | 56.7 | CondA PE |
| BR00120526confocal | 0.03 | 0.29 | 0.03 | 0.36 | 2.2 | 58.9 | CondA Thermo |
| BR00120536confocal | 0.06 | 0.25 | 0.06 | 0.37 | 23.3 | 55.6 | CondB PE |
| BR00120536 | 0.05 | 0.25 | 0.03 | 0.35 | 26.7 | 54.4 | CondB PE |
| BR00120532confocal | 0.05 | 0.24 | 0.08 | 0.35 | 17.8 | 50 | CondB Thermo |
| BR00120274 | 0.26 | 0.23 | 0.26 | 0.34 | 60 | 54.4 | CondC PE |
| BR00120274confocal | 0.03 | 0.21 | 0.04 | 0.36 | 3.3 | 52.2 | CondC PE |
| BR00120270confocal | 0.03 | 0.26 | 0.03 | 0.39 | 8.9 | 52.2 | CondC Thermo |
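
To make the confocal versus widefield difference in the 80% results easier to see, the hold-out rows can be grouped by imaging modality. The sketch below copies the "Validation mAP model" values of the hold-out plates from the 80% table. It assumes that plates without the `confocal` suffix are widefield, and `mean_by_modality` is a hypothetical helper.

```python
from statistics import mean

# "Validation mAP model" of the hold-out plates, copied from the 80% fine-tuning table.
holdout_val_map = {
    "BR00120530": 0.23,
    "BR00120530confocal": 0.03,
    "BR00120526confocal": 0.03,
    "BR00120536confocal": 0.06,
    "BR00120536": 0.03,
    "BR00120532confocal": 0.08,
    "BR00120274": 0.26,
    "BR00120274confocal": 0.04,
    "BR00120270confocal": 0.03,
}


def mean_by_modality(values):
    """Average the values per modality, inferred from the plate name suffix."""
    groups = {"confocal": [], "widefield": []}
    for plate, value in values.items():
        # Assumption: no "confocal" suffix means the plate was imaged in widefield.
        key = "confocal" if plate.endswith("confocal") else "widefield"
        groups[key].append(value)
    return {key: mean(vals) for key, vals in groups.items()}


summary = mean_by_modality(holdout_val_map)
for modality, value in summary.items():
    print(f"{modality}: mean validation mAP = {value:.3f}")
```

After fine-tuning on the three Thermo widefield plates, the widefield hold-out plates average roughly 0.17 validation mAP while the confocal hold-out plates stay near 0.05, and the widefield average is pulled down by BR00120536.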
EchteRobert self-assigned this Jun 6, 2022

Number of cells per well per plate histogram

[Image: Stain5_cells]


EchteRobert commented Jun 13, 2022

Model trained on Stain5

I trained a model using 5 plates from Stain5, leaving only one non-confocal plate out (the remaining plates are confocal). The goal is to see whether the same training approach as for Stain234 can be used to generalize to Stain5 plates.

Main takeaways

  • The model is not generalizing to Stain5 as easily as it did to the other Stain experiments. Previously, models trained on 3 plates were already able to generalize to most validation compounds. Here, this is not the case even with 5 training plates. There are two main differences which may be making it harder for the model to identify the profiles:
    • The compound concentration is reduced by more than a third (3um --> 1.875um)
    • The cell seeding is at least halved (2/2.5k cells/well --> 1k cells/well)
  • There is something weird going on with plate BR00120536: it is a training plate, but the model is not able to learn its representations. I will not investigate this issue further as it is currently not a priority.
Stain5 results

| Plate | Training mAP model | Training mAP BM | Validation mAP model | Validation mAP BM | PR model | PR BM | Batch |
|---|---|---|---|---|---|---|---|
| Training plates | | | | | | | |
| BR00120526 | 0.58 | 0.28 | 0.39 | 0.37 | 100 | 56.7 | CondA Thermo |
| BR00120536 | 0.17 | 0.25 | 0.03 | 0.35 | 56.7 | 54.4 | CondB PE |
| BR00120532 | 0.45 | 0.23 | 0.25 | 0.38 | 90 | 50 | CondB Thermo |
| BR00120274 | 0.49 | 0.23 | 0.24 | 0.34 | 87.8 | 54.4 | CondC PE |
| BR00120270 | 0.53 | 0.26 | 0.35 | 0.38 | 97.8 | 48.9 | CondC Thermo |
| Validation plates | | | | | | | |
| BR00120530 | 0.32 | 0.28 | 0.22 | 0.39 | 78.9 | 58.9 | CondA PE |
| BR00120530confocal | 0.03 | 0.29 | 0.02 | 0.4 | 3.3 | 56.7 | CondA PE |
| BR00120526confocal | 0.03 | 0.29 | 0.02 | 0.36 | 3.3 | 58.9 | CondA Thermo |
| BR00120536confocal | 0.05 | 0.25 | 0.06 | 0.37 | 17.8 | 55.6 | CondB PE |
| BR00120532confocal | 0.05 | 0.24 | 0.08 | 0.35 | 14.4 | 50 | CondB Thermo |
| BR00120274confocal | 0.03 | 0.21 | 0.04 | 0.36 | 11.1 | 52.2 | CondC PE |
| BR00120270confocal | 0.03 | 0.26 | 0.03 | 0.39 | 10 | 52.2 | CondC Thermo |
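
The lack of generalization to validation compounds can be read off the training plates directly. The sketch below copies three columns of the training-plate rows from the Stain5 results and computes, per plate, the gap between training and validation mAP and whether the model beats the BM on validation compounds. The variable names are hypothetical, and reading "Training mAP" and "Validation mAP" as scores on training and validation compounds is my interpretation of the column headers.

```python
# (training mAP model, validation mAP model, validation mAP BM) for the training plates,
# copied from the Stain5 results table.
training_plates = {
    "BR00120526": (0.58, 0.39, 0.37),
    "BR00120536": (0.17, 0.03, 0.35),
    "BR00120532": (0.45, 0.25, 0.38),
    "BR00120274": (0.49, 0.24, 0.34),
    "BR00120270": (0.53, 0.35, 0.38),
}

# Gap between training-compound and validation-compound mAP on each plate.
gaps = {
    plate: round(train - val, 2)
    for plate, (train, val, _bm) in training_plates.items()
}

# Plates where the model's validation mAP exceeds the BM's validation mAP.
beats_bm = [
    plate
    for plate, (_train, val, bm) in training_plates.items()
    if val > bm
]

for plate, gap in gaps.items():
    print(f"{plate}: train - validation mAP = {gap:.2f}")
print("validation mAP above BM on:", beats_bm)
```

Only BR00120526 ends up above the BM on validation compounds, which matches the takeaway that 5 training plates are not yet enough here.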
