Goci 2607 tw remove bkgd trait links #3
base: master
for line in file_data:
    formatted_line = line.split('\t')

    study_accession = formatted_line[2].strip()
This line assumes the script will only be run on already published studies, which have accession IDs. If that assumption is correct, there's nothing to do.
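If the assumption ever turns out not to hold, a guard along these lines could skip rows that have no accession. This is only a sketch: the helper name `extract_accession` and the sample rows are hypothetical, and the only detail taken from the diff is that the accession sits in column index 2 of a tab-separated line.

```python
def extract_accession(line, column_index=2):
    """Return the stripped accession from a tab-separated line, or None if absent.

    Hypothetical helper: not part of the script under review.
    """
    fields = line.rstrip('\n').split('\t')
    if len(fields) <= column_index:
        # Row is too short to contain an accession column at all.
        return None
    accession = fields[column_index].strip()
    # An empty or whitespace-only cell counts as "no accession".
    return accession or None


# Made-up sample rows, for illustration only.
rows = [
    "a\tb\tGCST000001\ttrait\n",  # accession present
    "a\tb\t\ttrait\n",            # accession cell empty
    "a\tb\n",                     # accession column missing
]

for row in rows:
    accession = extract_accession(row)
    if accession is None:
        print("Skipping row without accession")
        continue
    print("Accession: {}".format(accession))
```

Returning `None` rather than raising keeps the decision (skip, log, or abort) with the caller.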
if delimiter in background_column:
    background_column = background_column.split(delimiter)

    for background_trait in background_column:
I think this is unnecessary code duplication. You don't need to check whether the delimiter is in the string: when the delimiter is absent, split() returns a one-element list containing the whole string.
string = 'acute graft vs. host disease || donor genotype effect measurement'

delimiter = '||'
for trait in string.split(delimiter):
    print("Trait: {}".format(trait.strip()))
# Trait: acute graft vs. host disease
# Trait: donor genotype effect measurement

delimiter = '&&'
for trait in string.split(delimiter):
    print("Trait: {}".format(trait.strip()))
# Trait: acute graft vs. host disease || donor genotype effect measurement
Works in both cases.
for background_trait in background_column:
    # Get the EFO_ID
    background_trait_id = efo_map[background_trait.strip().lower()]
Just a suggestion: given the file is (more or less) manually created, I would add a check that the trait name is in the hash before looking it up. But that's not particularly important.
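One possible shape for that check, as a rough sketch only: the function name `resolve_trait_ids`, the placeholder ID, and the sample input are all hypothetical, and `efo_map` is assumed to be a plain dict keyed by lower-cased trait name, as the lookup in the diff suggests.

```python
def resolve_trait_ids(background_column, efo_map, delimiter='||'):
    """Split a background-trait cell and look up each trait in efo_map.

    Hypothetical helper: returns (found_ids, missing_names) instead of
    raising KeyError on a trait name that is not in the map.
    """
    found, missing = [], []
    for background_trait in background_column.split(delimiter):
        name = background_trait.strip()
        key = name.lower()
        if key in efo_map:
            found.append(efo_map[key])
        else:
            # Collect unknown names so they can be reported together.
            missing.append(name)
    return found, missing


# Placeholder mapping; the ID below is made up for illustration.
efo_map = {'acute graft vs. host disease': 'EFO_PLACEHOLDER_1'}

found, missing = resolve_trait_ids(
    'Acute graft vs. host disease || donor genotype effect measurement',
    efo_map,
)
print("Found: {}".format(found))
print("Missing: {}".format(missing))
```

Collecting the missing names rather than failing on the first one would let a curator fix every typo in the file in a single pass.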
This script removes rows from the STUDY_EFO_TRAIT table for traits that the curators have marked as "background traits" in the provided data file (study_background_traits-ALL.txt).
This is to address: https://www.ebi.ac.uk/panda/jira/browse/GOCI-2607