Goci 2607 tw remove bkgd trait links #3

Open
wants to merge 6 commits into
base: master
from

Conversation

twhetzel
Collaborator

This script removes rows from the STUDY_EFO_TRAIT table for traits that the curators have marked as "background traits" in the provided data file (study_background_traits-ALL.txt).

This addresses: https://www.ebi.ac.uk/panda/jira/browse/GOCI-2607

for line in file_data:
    formatted_line = line.split('\t')

    study_accession = formatted_line[2].strip()

This line assumes the script will only be run on already published studies, which have accession IDs. If that assumption is correct, there's nothing to do.
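
If the assumption might not always hold, a guard along the following lines would skip rows that have no accession. This is only a minimal sketch: the helper name `iter_accession_rows` and the sample rows are hypothetical, and only the column index 2 is taken from the script.

```python
def iter_accession_rows(lines, accession_index=2):
    """Yield (accession, columns) for rows whose accession column is non-empty.

    Hypothetical helper; only the column index (2) comes from the script.
    """
    for line in lines:
        columns = line.rstrip('\n').split('\t')
        # Skip rows too short to contain the accession column.
        if len(columns) <= accession_index:
            continue
        accession = columns[accession_index].strip()
        # Skip rows where the accession is blank (e.g. unpublished studies).
        if not accession:
            continue
        yield accession, columns


# Made-up sample rows, for illustration only.
sample = [
    "a\tb\tGCST000001\ttrait one\n",  # row with an accession
    "a\tb\t\ttrait two\n",            # row with a blank accession
    "a\tb\n",                         # truncated row
]
accessions = [accession for accession, _ in iter_accession_rows(sample)]
```

Only the first sample row survives the filter; the other two are skipped instead of producing an empty accession or an IndexError.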

if delimiter in background_column:
    background_column = background_column.split(delimiter)

for background_trait in background_column:

I think this is unnecessary code duplication. You don't need to check whether the delimiter is in the string: split() returns a single-element list when the delimiter is absent.

string = 'acute graft vs. host disease || donor genotype effect measurement' 

delimiter = '||'
for trait in string.split(delimiter):
    print("Trait: {}".format(trait.strip()))
# Trait: acute graft vs. host disease
# Trait: donor genotype effect measurement
 
delimiter = '&&'
for trait in string.split(delimiter):
    print("Trait: {}".format(trait.strip()))
# Trait: acute graft vs. host disease || donor genotype effect measurement    

This works in both cases.
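
The same idea could be wrapped in a small helper so the loop always receives a list. This is a sketch under assumptions: the name `split_traits` is hypothetical, and dropping empty entries (for example after a trailing delimiter) is an extra choice not discussed in the PR.

```python
def split_traits(column, delimiter='||'):
    """Split a trait column on the delimiter, strip whitespace, drop empties.

    Hypothetical helper; works whether or not the delimiter is present.
    """
    return [part.strip() for part in column.split(delimiter) if part.strip()]


single = split_traits('acute graft vs. host disease')
multiple = split_traits(
    'acute graft vs. host disease || donor genotype effect measurement'
)
trailing = split_traits('acute graft vs. host disease || ')
```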


for background_trait in background_column:
    # Get the EFO_ID
    background_trait_id = efo_map[background_trait.strip().lower()]

Just a suggestion: since the file is (more or less) manually created, I would add a check that the trait name is actually a key in efo_map before the lookup. It's not particularly important, though.
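
Something like the following would do it. It is a rough sketch, not the script's actual code: the function name `resolve_trait_ids`, the sample efo_map contents, and the ID value are all made up for illustration; only the `.strip().lower()` normalisation mirrors the line above.

```python
def resolve_trait_ids(trait_names, efo_map):
    """Return (found_ids, missing_names) for the given trait names.

    Hypothetical helper: names are normalised with strip().lower(), as in
    the script, and names absent from efo_map are collected, not raised.
    """
    found, missing = [], []
    for name in trait_names:
        key = name.strip().lower()
        if key in efo_map:
            found.append(efo_map[key])
        else:
            missing.append(name.strip())
    return found, missing


# Made-up mapping and ID, for illustration only.
efo_map = {"acute graft vs. host disease": 101}
traits = ["Acute graft vs. host disease ", " donor genotype effect measurement"]
found, missing = resolve_trait_ids(traits, efo_map)
```

The missing names can then be logged or reported back to the curators, so a typo in the data file does not abort the run with a KeyError.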

2 participants