Hi, I have a corpus of about 500,000 protein sequences and would like to apply them to existing models like this one for predicting the evolution of monoclonal antibody binding to an epitope.
How could I add my sequences to the models referred to in this repo, and then use the modified model for this task? Thanks.
Hi, this very much depends on whether you have functional binding data for these 500k sequences. If you do, you can format them in a CSV file containing each sequence and its measured binding value, and use that as the first-round data for the model. If you only have the sequences without any functional binding data, then you can only fine-tune the base PLM (ESM2 in this case); please refer to the fine-tuning instructions for ESM2 on its official GitHub and follow the advice there. After you complete the fine-tuning, you can use your model as the base-layer PLM to generate embeddings for EVOLVEpro.
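To make the two cases concrete, here is a minimal sketch of how the corpus could be split: sequences with a measured binding value go into a CSV, and sequences without one go into a FASTA file that could feed PLM fine-tuning. The column names (`sequence`, `binding`), the helper names, and the toy sequences and values are all placeholders chosen for illustration; the exact layout EVOLVEpro expects should be checked against the repo's documentation.

```python
import csv
import io

# Hypothetical column names; confirm the layout EVOLVEpro expects in its docs.
FIELDNAMES = ["sequence", "binding"]


def split_by_label(records):
    """Split (sequence, binding) pairs into labeled and unlabeled sets.

    A binding value of None means no functional measurement is available.
    """
    labeled, unlabeled = [], []
    for seq, binding in records:
        seq = seq.strip().upper()  # normalize whitespace and case
        if binding is None:
            unlabeled.append(seq)
        else:
            labeled.append((seq, float(binding)))
    return labeled, unlabeled


def write_labeled_csv(labeled, handle):
    """Write labeled sequences as CSV rows with a header line."""
    writer = csv.writer(handle)
    writer.writerow(FIELDNAMES)
    writer.writerows(labeled)


def write_fasta(sequences, handle):
    """Write unlabeled sequences as FASTA records with generated IDs."""
    for i, seq in enumerate(sequences):
        handle.write(f">seq_{i}\n{seq}\n")


# Toy placeholder records, not real measurements.
records = [("EVQLVESGG", 0.82), ("qvqlqesgp", None), ("EVQLLESGG", 1.37)]
labeled, unlabeled = split_by_label(records)

csv_buf = io.StringIO()
write_labeled_csv(labeled, csv_buf)

fasta_buf = io.StringIO()
write_fasta(unlabeled, fasta_buf)
```

In practice the `io.StringIO` buffers would be replaced by files opened with `open(path, "w", newline="")` for the CSV and `open(path, "w")` for the FASTA file.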