K-mer association with mixed effects model #265

Open
komaltilwani53 opened this issue May 1, 2024 · 6 comments

Comments

@komaltilwani53

Hi Team,

I would appreciate it if someone could advise me on this issue:
Previous steps:

  1. SNP and COG association with fixed effects model

Step where pyseer is having an issue: K-mer association with mixed effects model

Command used:
pyseer --lmm --phenotypes phenotypes.txt --kmers fsm_kmers.txt.gz --similarity phylogeny_K.tsv --output-patterns kmer_patterns.txt --cpu 12 > cdi_kmers.txt

Standard output: None

Standard Error file:
Read 602 phenotypes
Detected binary phenotype
Setting up LMM
Similarity matrix has dimension (602, 602)
Analysing 602 samples found in both phenotype and similarity matrix
h^2 = 0.00
No observations of TTTNNNNNN in selected samples
No observations of TTTNNNNNNN in selected samples
No observations of TTTNNNNNNNN in selected samples
No observations of TTTNNNNNNNNN in selected samples
No observations of TTTNNNNNNNNNN in selected samples
No observations of TTTNNNNNNNNNNN in selected samples
No observations of TTTNNNNNNNNNNNN in selected samples

The environment was verified and the test cases from the tutorial were executed.

Please let me know if any further input is required to investigate this issue.

@johnlees
Collaborator

johnlees commented May 1, 2024

Sorry, it's not clear to me from the above what issue you are having?

@komaltilwani53
Author

komaltilwani53 commented Jun 5, 2024

Hi Team,

I would appreciate it if someone could advise me on this:

The result files are included below. The heritability estimate is 0, but the Q-Q plot shows significant associations. Does this contradict itself, or should these results not be taken into account for further analysis?
Why is the heritability estimate 0?

  1. SNP and COG association with fixed effects model
    Read 602 phenotypes
    Detected binary phenotype
    Structure matrix has dimension (602, 602)
    Analysing 602 samples found in both phenotype and structure matrix
    4701 loaded variants
    2902 pre-filtered variants
    1799 tested variants
    1799 printed variants

  2. K-mer association with mixed effects model
    Read 602 phenotypes
    Detected binary phenotype
    Setting up LMM
    Similarity matrix has dimension (602, 602)
    Analysing 602 samples found in both phenotype and similarity matrix
    h^2 = 0.00
    91704889 loaded variants
    5125486 pre-filtered variants
    86579403 tested variants
    86579403 printed variants
    (Patterns: 83884146
    Threshold: 5.96E-10)
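
The reported threshold is consistent with a Bonferroni correction over the number of unique patterns at a significance level of 0.05. A minimal sketch of that arithmetic follows; the function name and the default alpha of 0.05 are assumptions made for illustration and are not taken from the log above.

```python
def bonferroni_threshold(n_patterns: int, alpha: float = 0.05) -> float:
    """Return the per-test p-value threshold for n_patterns tests.

    alpha=0.05 is an assumed family-wise error rate, not a value
    stated anywhere in the thread.
    """
    if n_patterns <= 0:
        raise ValueError("n_patterns must be positive")
    return alpha / n_patterns


# Number of unique patterns reported for the k-mer run above.
threshold = bonferroni_threshold(83884146)
print(f"{threshold:.2E}")  # prints 5.96E-10
```

With roughly 84 million patterns the threshold is very strict, so a Q-Q plot can deviate from the diagonal without any single variant crossing it.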

Q-Q plot

image

@mgalardini
Owner

If I understand correctly, you are asking why you are seeing a heritability estimate of 0 but a number of significant unitigs? Depending on the distribution of your phenotype, that is entirely possible (also because it's binary, I think). I don't know your dataset, so this is little more than guessing.

@komaltilwani53
Author

Please find attached the phenotype file I have been using. I would be grateful if you could review it and let me know what you think. Do I need to take the findings from this file into account for my analysis? Is there a substantial cause for the heritability estimate of 0, given that the Q-Q plot is significant?

I've been attempting to create a Manhattan plot from the snp.plot GWAS output of pyseer, but I haven't been able to locate any noteworthy peaks.

In addition, I would appreciate it if you could provide any techniques or approaches that would enhance my analysis and enable me to get favorable outcomes.

GWAS_Analysis.zip
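
Before plotting, one way to check whether any peak could appear is to filter the association output against the significance threshold. The sketch below assumes the output is tab-delimited with a header row containing `variant` and `lrt-pvalue` columns; the function name and the example rows are hypothetical and only illustrate the filtering step.

```python
import csv


def significant_variants(lines, threshold, pvalue_column="lrt-pvalue"):
    """Return (variant, p-value) pairs with p < threshold, smallest first.

    `lines` is any iterable of text lines (an open file works).
    The column names are assumptions about the output layout.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    hits = []
    for row in reader:
        try:
            p = float(row[pvalue_column])
        except (KeyError, ValueError, TypeError):
            continue  # skip rows with a missing or malformed p-value
        if p < threshold:  # NaN compares False, so it is skipped too
            hits.append((row.get("variant", ""), p))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical rows, for illustration only.
example = [
    "variant\taf\tlrt-pvalue",
    "ACGT\t0.10\t1.0E-12",
    "TTGA\t0.30\t0.02",
    "GGCA\t0.05\tnan",
]
print(significant_variants(example, 5.96e-10))
```

On a real run the same function could be called on an open file handle, for example `with open("cdi_kmers.txt") as fh: significant_variants(fh, 5.96e-10)`. An empty result would mean no variant passes the threshold, so no peaks would be expected on the Manhattan plot.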

@mgalardini
Owner

It seems to me that the ratio between positive (1) and negative (0) phenotypes is quite unbalanced (~30 / ~600), which might be a problem. Also, from the Manhattan plot that you sent I don't see any variant passing the 1E-10 threshold, so maybe you could pick a reference to which the threshold-passing variants map?

Other than that I don't have any particular suggestion
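
The class imbalance mentioned above can be checked directly from the phenotype file. This is a minimal sketch that assumes a tab-delimited file with a header row, sample names in the first column and the binary phenotype in the second; the function name, the column position and the example rows are assumptions for illustration.

```python
import csv
from collections import Counter


def phenotype_counts(lines, column=1, delimiter="\t"):
    """Count phenotype values in a delimited file with a header row.

    `lines` is any iterable of text lines (an open file works).
    Empty and "NA" values are ignored.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    next(reader, None)  # skip the header row
    counts = Counter()
    for row in reader:
        if len(row) > column and row[column] not in ("", "NA"):
            counts[row[column]] += 1
    return counts


# Hypothetical rows, for illustration only.
example = ["samples\tphenotype", "s1\t0", "s2\t1", "s3\t0", "s4\tNA"]
counts = phenotype_counts(example)
print(counts)
```

For the real data, `with open("phenotypes.txt") as fh: phenotype_counts(fh)` would give the number of cases and controls, which makes the imbalance explicit.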

@johnlees
Collaborator

johnlees commented Jun 6, 2024

I wouldn't read too much into h^2 from pyseer, especially with the phenotype as described; it may be heavily biased. You should use another tool if you want to estimate it more accurately.


3 participants