Using SIFTS for the alignment of SEQRES to UniProt sequence #188

Closed
lafita opened this issue May 11, 2017 · 2 comments

@lafita
Member

lafita commented May 11, 2017

I have been using SIFTS recently and realised that we could use the SEQRES-to-UniProt residue mapping it provides as the alignment, rather than computing it ourselves.

This matters most for special cases such as artificially designed proteins. One example is 4DOU, a fusion of three chains. Our alignment is only partly correct: it maps one of the three chains to the UniProt sequence properly, but the other two are mapped incorrectly. If we used the SIFTS alignment (ftp://ftp.ebi.ac.uk/pub/databases/msd/sifts/xml/4dou.xml.gz), every SEQRES residue could be matched to a UniProt residue, and the evolutionary score could be computed more reliably.
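
For illustration, here is a rough Python sketch of how the per-residue mapping could be read out of a SIFTS XML file. It is not EPPIC code. The element and attribute names (`entity`, `residue`, `crossRefDb`, `dbSource`, `dbResNum`, `dbAccessionId`) are what I recall from the SIFTS XML layout and should be checked against a real file such as the 4DOU one; the function names `seqres_to_uniprot` and `load_sifts` are made up for this example.

```python
import gzip
import xml.etree.ElementTree as ET
from collections import defaultdict


def local(tag):
    """Strip the XML namespace prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def seqres_to_uniprot(xml_source):
    """Collect SEQRES -> UniProt residue pairs per entity.

    xml_source is a path or binary file object holding SIFTS XML.
    Returns {entity_id: [(seqres_pos, uniprot_accession, uniprot_pos), ...]}.
    Attribute names are assumptions about the SIFTS schema.
    """
    mapping = defaultdict(list)
    root = ET.parse(xml_source).getroot()
    for entity in root.iter():
        if local(entity.tag) != "entity":
            continue
        entity_id = entity.get("entityId")
        for residue in entity.iter():
            if local(residue.tag) != "residue":
                continue
            # The residue's own dbResNum is taken to be the SEQRES position.
            seqres_pos = int(residue.get("dbResNum"))
            for xref in residue:
                if local(xref.tag) == "crossRefDb" and xref.get("dbSource") == "UniProt":
                    mapping[entity_id].append(
                        (seqres_pos, xref.get("dbAccessionId"), int(xref.get("dbResNum")))
                    )
    return dict(mapping)


def load_sifts(path):
    """Open a plain or gzipped SIFTS XML file and return its mapping."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as handle:
        return seqres_to_uniprot(handle)
```

Because each pair carries its own UniProt accession, a fusion construct whose segments map to different UniProt entries would keep all of its segments, rather than only the one that a single pairwise alignment happens to pick up.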

@lafita lafita added the minor label May 11, 2017
@lafita lafita added this to the 3.1 milestone May 11, 2017
@josemduarte
Contributor

We do use SIFTS to get the mapped UniProt ID and region, but then we compute our own alignment.

I can see the alignment is not so good for that 4DOU case. That's partly a problem introduced in 3, related to some issues in biojava.

In principle I agree that we could just use the SIFTS alignment as given. The problem is that for user input (non-deposited files) we still have to do the alignment ourselves, so SIFTS only solves part of the problem.
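
To make the two code paths concrete, here is a minimal Python sketch of the fallback idea: use the SIFTS pairs when they exist, otherwise compute a mapping ourselves. This is not what EPPIC does. The function names are hypothetical, and the fallback uses `difflib.SequenceMatcher` exact-match blocks purely as a stand-in for a real pairwise alignment.

```python
from difflib import SequenceMatcher


def align_by_matching_blocks(seqres, uniprot):
    """Stand-in aligner: pair residues that lie in identical matching blocks.

    Returns {seqres_pos: uniprot_pos} with 1-based positions. A real
    pipeline would use a proper pairwise alignment instead.
    """
    matcher = SequenceMatcher(None, seqres, uniprot, autojunk=False)
    pairs = {}
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            pairs[block.a + offset + 1] = block.b + offset + 1
    return pairs


def residue_mapping(seqres, uniprot, sifts_pairs=None):
    """Prefer a precomputed SIFTS mapping; fall back to our own alignment.

    sifts_pairs is a {seqres_pos: uniprot_pos} dict for deposited entries,
    or None for user-supplied (non-deposited) structures.
    """
    if sifts_pairs:
        return dict(sifts_pairs), "sifts"
    return align_by_matching_blocks(seqres, uniprot), "computed"
```

Either way the downstream code would receive the same kind of position map, so only the source of the mapping differs between deposited and user-supplied structures.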

@lafita
Member Author

lafita commented May 11, 2017

OK, I see. Well, this only happens in a very small number of cases, and engineered proteins are not that interesting for EPPIC. I just submitted the issue because it occurred to me.

@lafita lafita modified the milestones: 4.0, 3.1 May 11, 2017
@josemduarte josemduarte modified the milestones: 4.0, 3.2 Feb 3, 2019