README updated for independent samples module #51

Merged: 2 commits, Jul 13, 2021


@runjin326 commented Jul 13, 2021

Purpose/implementation Section

Update the README.md file to describe the new "each-cohort" and "all-cohorts" input arguments that we implemented.

What was your approach?

Paragraphs describing these two params are included.

What GitHub issue does your pull request address?

d3b-center/ticket-tracker-OPC#89

Directions for reviewers. Tell potential reviewers what kind of feedback you are soliciting.

Please read the updated README and check whether the description is clear enough.

Is there anything that you want to discuss further?

No

Is the analysis in a mature enough form that the resulting figure(s) and/or table(s) are ready for review?

Yes

Documentation Checklist

  • This analysis module has a README and it is up to date.
  • This analysis is recorded in the table in analyses/README.md and the entry is up to date.
  • The analytical code is documented and contains comments.


@logstar logstar left a comment

Thank you for updating the README! I only have one question about the revised analysis description, which is listed below.

@@ -6,23 +6,50 @@ Many analyses that involve mutation frequencies or co-occurence require that all
 However, the PBTA+GMKF data set includes many cases where multiple speciments were taken from a single individual.
 This analysis creates lists of samples such that there are no cases where more than one specimen is included from each individual.

-As different analyses may require different sets of data, we actually generate a few different sets, stored in the `results` subdirectory:
+As different analyses may require different sets of data, we actually generate a few different sets, stored in the `results` subdirectory. We also run the analyses based on different 'independent_level', either 'each-cohort' or 'all-cohorts'. When running with 'each-cohort', we call independent samples for each cohort+cancer_type - and same samples in different cohorts are called "independent". When running with 'all-cohorts', we call independent samples regardless of cohort or cancer_type - and same samples in different cohorts are considered the same.

Could you clarify "same samples in different cohorts"? I thought all Kids_First_Biospecimen_IDs are unique, and each Kids_First_Biospecimen_ID has only one cohort, so there cannot be any sample in different cohorts.

Author

@logstar, thanks Yuanchao! Sorry, by "samples" I meant "participants". I have clarified this in the text above.
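
To make the distinction concrete, here is a minimal Python sketch of how the two 'independent_level' values could differ. It is an illustration only, not the module's actual code: the `participant`, `cohort`, and `cancer_type` field names, the example records, and the seeded random choice used to pick one specimen per group are all assumptions made for this example.

```python
import random
from collections import defaultdict

# Hypothetical records. Only Kids_First_Biospecimen_ID is named in this
# thread; the other field names and all values are assumptions.
SPECIMENS = [
    {"Kids_First_Biospecimen_ID": "BS_1", "participant": "PT_A",
     "cohort": "PBTA", "cancer_type": "LGG"},
    {"Kids_First_Biospecimen_ID": "BS_2", "participant": "PT_A",
     "cohort": "PBTA", "cancer_type": "LGG"},
    {"Kids_First_Biospecimen_ID": "BS_3", "participant": "PT_A",
     "cohort": "GMKF", "cancer_type": "LGG"},
    {"Kids_First_Biospecimen_ID": "BS_4", "participant": "PT_B",
     "cohort": "GMKF", "cancer_type": "NBL"},
]


def independent_specimens(specimens, independent_level, seed=0):
    """Return one biospecimen ID per group, sorted for stable output.

    'each-cohort': one specimen per participant within each
                   cohort + cancer_type combination.
    'all-cohorts': one specimen per participant overall.
    """
    if independent_level == "each-cohort":
        def group_key(s):
            return (s["participant"], s["cohort"], s["cancer_type"])
    elif independent_level == "all-cohorts":
        def group_key(s):
            return s["participant"]
    else:
        raise ValueError(f"unknown independent_level: {independent_level!r}")

    groups = defaultdict(list)
    for s in specimens:
        groups[group_key(s)].append(s)

    # Assumption: a seeded random pick stands in for whatever selection
    # rule the real module applies within each group.
    rng = random.Random(seed)
    picked = [rng.choice(group) for group in groups.values()]
    return sorted(s["Kids_First_Biospecimen_ID"] for s in picked)


each_cohort = independent_specimens(SPECIMENS, "each-cohort")
all_cohorts = independent_specimens(SPECIMENS, "all-cohorts")
```

With these example records, "each-cohort" keeps three specimens, because participant PT_A is counted once in each of the two cohorts, while "all-cohorts" keeps two, one per participant.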

@runjin326 runjin326 requested a review from logstar July 13, 2021 16:09

@logstar logstar left a comment

Thank you for the clarifications, @runjin326.

This PR looks good to me.

@logstar logstar merged commit baf1f20 into dev Jul 13, 2021
@logstar logstar deleted the independent-samples-readme branch July 13, 2021 16:17