Tutorial on artifacts #88

kbroman · 2018-11-22T15:04:09Z

Add a tutorial on identifying artifacts, particularly in multiparent populations.

- single outlier
- missing genotype
- especially narrow LOD peaks
- crazy coefficient estimates at a SNP that is monomorphic in the population, but where there is still some variation in the genotype probabilities

HannahVMeyer · 2023-06-22T15:04:09Z

Hi - thanks for a great package and providing genotype data for the CC, this has really helped our analyses.

I have come across some very strange effect size estimates and found this open issue. Do you have any advice on how to debug this/what the cause might be?

Below are visualizations of the output from scan1 and scan1coef, run on 42 CC strains with 2-4 replicates each. The phenotype is continuous but bounded between 0 and 1 (it is derived from a proportion), and I used LOCO kinship with batch as a covariate. The genotype information is from https://raw.githubusercontent.com/rqtl/qtl2data/main/CC/cc.zip. For simplicity I show only chr18 (weird effect coefficients) and chr19 (coefficients as might be expected), but the weird pattern also appears on other chromosomes.

As suggested in this issue, I also tried the clean_genotypes function. This changed the coefficient estimates (see plot below), but they still don't look right:

I am using qtl2_0.32 on R 4.2.1.

Many thanks!

kbroman · 2023-06-22T15:34:36Z

I don't much like scan1coef(). The artifacts are happening at positions without particularly large LOD scores; to me, there's not much reason to be interested in the estimated effects at a position that is not really the QTL.

I would pull out the genotype probabilities at the estimated QTL position and use fit1() to get the estimated effects at that position.
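A minimal sketch of that workflow, closely following the example in the qtl2 documentation for fit1(). It uses the small iron intercross dataset that ships with qtl2 rather than the CC data discussed in this thread, so the genotype labels and the choice of chromosome 7 are illustrative assumptions only.

```r
library(qtl2)

# Example intercross data bundled with qtl2 (not the CC data from this thread)
iron <- read_cross2(system.file("extdata", "iron.zip", package = "qtl2"))
map <- insert_pseudomarkers(iron$gmap, step = 1)
probs <- calc_genoprob(iron, map, error_prob = 0.002)
pheno <- iron$pheno[, 1]

# Genome scan on one chromosome, then locate the position with the largest LOD
out <- scan1(probs[, "7"], pheno)
peak <- max(out, map)

# Genotype probabilities at the peak position only
pr_peak <- pull_genoprobpos(probs, map, as.character(peak$chr), peak$pos)

# Fit the single-QTL model at that one position.
# fit1() also accepts kinship= and addcovar=; with LOCO kinship one would
# presumably pass the matrix for the peak's chromosome.
fit <- fit1(pr_peak, pheno)

fit$lod    # LOD score at the peak
fit$coef   # estimated coefficients
fit$SE     # their standard errors
```

Because the model is fit at a single position, the estimates are not affected by the unstable behaviour that scan1coef() can show at positions away from the QTL.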

HannahVMeyer · 2023-06-22T16:16:38Z

Many thanks, Karl! I only showed the plots above because they were convenient to screenshot; the actual phenotype has a LOD > 6, which is significant by permutation testing.

Thanks for pointing me to fit1; I hadn't come across that function in the tutorials I followed. What I liked about the output of scan1coef was the ability to plot the estimated coefficients underneath the LOD plot. Is there a similarly nice visualisation for fit1?

As for interpreting the fit1 results (and apologies if this is obvious): can I interpret the coef as the effect size per genotype? And by then checking the genotypes of the individual strains, can I infer which strain(s) were driving the association?

kbroman · 2023-06-22T16:51:05Z

I use ciplot() from my broman package.

The estimated effects are the additive-effect coefficients in a linear regression, shifted so that they sum to 0.
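To illustrate what "shifted so that they sum to 0" means, here is a base-R sketch on simulated data. The eight founder labels, the probabilities, and the effect sizes are all made up for the illustration, and the code is not qtl2's implementation; it only shows the arithmetic of centering the per-founder coefficients and keeping their mean as an intercept, which appears to be what the zerosum argument of fit1() refers to.

```r
set.seed(1)
n <- 200
founders <- LETTERS[1:8]

# Hypothetical genotype probabilities: one dominant founder per individual,
# each row summing to 1 (0.86 + 7 * 0.02)
g <- sample(8, n, replace = TRUE)
P <- matrix(0.02, n, 8, dimnames = list(NULL, founders))
P[cbind(seq_len(n), g)] <- 0.86

# Simulated phenotype: only founder H carries an effect
y <- 10 + c(0, 0, 0, 0, 0, 0, 0, 2)[g] + rnorm(n)

# Regression without an intercept: one coefficient per founder
raw <- coef(lm(y ~ 0 + P))

# Shift the coefficients so they sum to 0; their mean plays the role of the intercept
intercept <- mean(raw)
shifted <- raw - intercept

round(shifted, 2)
sum(shifted)   # zero up to floating-point error
```

Read this way, each shifted coefficient is a founder allele's effect relative to the average across founders, so a founder whose coefficient stands apart from the rest is a candidate for driving the association.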

HannahVMeyer · 2023-06-22T20:14:26Z

Thanks very much for the pointers and the speedy reply!

kbroman added the enhancement label Nov 22, 2018

kbroman added documentation and removed enhancement labels Apr 10, 2020
