-
Notifications
You must be signed in to change notification settings - Fork 647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
deprecate sequence_alignment() and replace deprecated Bio.pairwise2 with Bio.Align.PairwiseAligner #3951
deprecate sequence_alignment() and replace deprecated Bio.pairwise2 with Bio.Align.PairwiseAligner #3951
Changes from 1 commit
1564039
efed979
cd03bb8
163c20a
cd55591
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -187,13 +187,14 @@ | |
import os.path | ||
import warnings | ||
import logging | ||
import collections | ||
|
||
import numpy as np | ||
|
||
import Bio.SeqIO | ||
import Bio.AlignIO | ||
import Bio.Align | ||
import Bio.Align.Applications | ||
import Bio.pairwise2 | ||
|
||
import MDAnalysis as mda | ||
import MDAnalysis.lib.qcprot as qcp | ||
|
@@ -973,7 +974,7 @@ def sequence_alignment(mobile, reference, match_score=2, mismatch_penalty=-1, | |
|
||
The residues in `reference` and `mobile` will be globally aligned. | ||
The global alignment uses the Needleman-Wunsch algorithm as | ||
implemented in :mod:`Bio.pairwise2`. The parameters of the dynamic | ||
implemented in :mod:`Bio.Align.PairwiseAligner`. The parameters of the dynamic | ||
programming algorithm can be tuned with the keywords. The defaults | ||
should be suitable for two similar sequences. For sequences with | ||
low sequence identity, more specialized tools such as clustalw, | ||
|
@@ -1004,22 +1005,46 @@ def sequence_alignment(mobile, reference, match_score=2, mismatch_penalty=-1, | |
|
||
See Also | ||
-------- | ||
BioPython documentation for `pairwise2`_. Alternatively, use | ||
BioPython documentation for `PairwiseAligner`_. Alternatively, use | ||
:func:`fasta2select` with :program:`clustalw2` and the option | ||
``is_aligned=False``. | ||
|
||
.. _`pairwise2`: http://biopython.org/DIST/docs/api/Bio.pairwise2-module.html | ||
|
||
.. _`PairwiseAligner`: | ||
https://biopython.org/docs/latest/api/Bio.Align.html#Bio.Align.PairwiseAligner | ||
|
||
|
||
.. versionadded:: 0.10.0 | ||
|
||
""" | ||
aln = Bio.pairwise2.align.globalms( | ||
reference.residues.sequence(format="string"), | ||
mobile.residues.sequence(format="string"), | ||
match_score, mismatch_penalty, gap_penalty, gapextension_penalty) | ||
# choose top alignment | ||
return aln[0] | ||
.. versionchanged:: 2.4.0 | ||
Replace use of deprecated :func:`Bio.pairwise2.align.globalms` with | ||
:class:`Bio.Align.PairwiseAligner`. | ||
|
||
""" | ||
aligner = Bio.Align.PairwiseAligner( | ||
mode="global", | ||
match_score=match_score, | ||
mismatch_score=mismatch_penalty, | ||
open_gap_score=gap_penalty, | ||
extend_gap_score=gapextension_penalty) | ||
aln = aligner.align(reference.residues.sequence(format="Seq"), | ||
mobile.residues.sequence(format="Seq")) | ||
# choose top alignment with highest score | ||
topalignment = aln[0] | ||
|
||
# reconstruct the results tuple that used to be of type Bio.pairwise2.Alignment | ||
AlignmentTuple = collections.namedtuple( | ||
"Alignment", | ||
["seqA", "seqB", "score", "start", "end"]) | ||
# extract sequences (there's no obvious way to get the character | ||
# representation with gaps by other means from the new | ||
# Bio.Align.PairwiseAlignment instance) | ||
seqA, _, seqB, _ = topalignment.format().split("\n") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This only worked until Bio 1.79. Then they changed the format. I knew that parsing str was iffy but luckily our tests showed me right away how iffy that was. I am now trying to find another way to get our old output. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Turns out, it's trivial seqA = topalignment[0]
seqB = topalignment[1] ... testing now. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Needs biopython >= 1.80. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I guess we'll need to offer both options and just do a version check? |
||
|
||
# start/stop are not particularly meaningful and there's no obvious way to | ||
# get the old pairwise2 start/stop from the new PairwiseAligner output. | ||
return AlignmentTuple(seqA, seqB, topalignment.score, | ||
0, max(reference.n_residues, mobile.n_residues)) | ||
|
||
orbeckst marked this conversation as resolved.
Show resolved
Hide resolved
|
||
def fasta2select(fastafilename, is_aligned=False, | ||
ref_resids=None, target_resids=None, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main argument for pairwisealigner is
scoring
do we want/need to pass that in somehow?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are deprecating it then no, let's not add features. At this stage I'd just want to avoid the code suddenly breaking because pairwise2 is gone (and getting rid of DeprecationWarning).