distance_to_locus is inf for loci on opposite strands, feature or a bug? #280

Open
rraadd88 opened this issue Feb 15, 2023 · 2 comments

@rraadd88
Contributor

The Locus object's distance_to_locus method returns inf as the distance between loci on opposite strands.

Demo code:

from pyensembl.locus import Locus

# Two loci on the same contig but on opposite strands
locus1 = Locus(contig="1", start=1, end=2, strand="+")
locus2 = Locus(contig="1", start=4, end=5, strand="-")

# Distance across opposite strands
print(locus1.distance_to_locus(locus2))

# Change the strands to be the same
locus2.strand = "+"

# Distance on the same strand
print(locus1.distance_to_locus(locus2))

# Output:
# inf
# 2

Because putting both loci on the same strand gives the expected distance, I wonder whether this is a bug or an intended feature, for example to discourage calculating distances between loci on opposite strands.

@iskandr
Contributor

iskandr commented Feb 15, 2023

This was intentional, since "distance" on opposite strands wasn't useful for my use cases when I started working on PyEnsembl. Since making it strand-invariant would change the semantics, would you be OK with a different helper function?

I'm not sure what to call it: distance_to_locus_on_either_strand, strand_invariant_distance_to_locus, ...?

@rraadd88
Contributor Author

Ok, thank you for your answer.

As for the name of the new helper function, I would personally prefer that it keep the distance_to_locus prefix, as in distance_to_locus_on_either_strand. That way, the variant can be found easily once one knows the name of the original function.

A new helper function would work. But how about adding a parameter to the existing function instead, called ignore_strands for example? Could that be simpler? I ask because when I came across this issue, the first thing I did was look for a parameter that would let me calculate the distance in a strand-invariant way.
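
To make the two options concrete, here is a minimal sketch. It does not use pyensembl itself: SimpleLocus is a hypothetical stand-in class, ignore_strands is only the proposed parameter (it does not exist today), and I am assuming the current method returns inf whenever the contigs or strands differ and the gap between the intervals otherwise.

```python
import math
from dataclasses import dataclass


@dataclass
class SimpleLocus:
    """Hypothetical stand-in for pyensembl's Locus, for illustration only."""

    contig: str
    start: int
    end: int
    strand: str

    def distance_to_locus(self, other, ignore_strands=False):
        # Assumption: loci on different contigs are never comparable.
        if self.contig != other.contig:
            return math.inf
        # Behaviour reported in this issue: opposite strands give inf,
        # unless the (proposed, not existing) ignore_strands flag is set.
        if not ignore_strands and self.strand != other.strand:
            return math.inf
        # Gap between the two intervals; 0 if they overlap.
        if self.start > other.end:
            return self.start - other.end
        if self.end < other.start:
            return other.start - self.end
        return 0

    def distance_to_locus_on_either_strand(self, other):
        # The separate-helper option, written as a thin wrapper.
        return self.distance_to_locus(other, ignore_strands=True)


locus1 = SimpleLocus(contig="1", start=1, end=2, strand="+")
locus2 = SimpleLocus(contig="1", start=4, end=5, strand="-")

print(locus1.distance_to_locus(locus2))                       # inf
print(locus1.distance_to_locus(locus2, ignore_strands=True))  # 2
print(locus1.distance_to_locus_on_either_strand(locus2))      # 2
```

With ignore_strands defaulting to False, existing callers would keep the current semantics, and the helper and the parameter could coexist if both are wanted.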
