Statistical analyses #16

Open · mathieuboudreau opened this issue Nov 23, 2020 · 8 comments

@mathieuboudreau (Collaborator)

This issue originated from a meeting between me, @agahkarakuzu, @matteomancini, and @stikov, which we're moving here so that anyone can get involved with the discussion.

The idea is to identify whether the data collected by the challenge lends itself to some potentially interesting statistical analyses, and to discuss how best to implement these (in particular, using open-source tools). We could also identify statistical analyses that would be interesting but for which we don't have sufficient data, and leave those as an open challenge for people to collect more data.

I think we should start by describing the datasets we have at hand and the remaining corrections or post-processing steps that should be done, and then explore some statistical analysis ideas that are well suited to this dataset and don't overlap with other similar studies (such as Bane et al. 2018, which used the NIST phantom at multiple sites but with much stricter protocol implementation rules than our current challenge, whose aim was to investigate the differences or robustness across cross-site implementations). We also have some human datasets to compare with, which could also be explored (human<->human and/or NIST<->human).

@agahkarakuzu changed the title from "Statistical analysises" to "Statistical analyses" on Nov 23, 2020
@mathieuboudreau (Collaborator, Author)

@agahkarakuzu until @matteomancini accepts the invitation to the repo, would you mind writing down some of your thoughts (and those that were discussed at our initial meeting)? Anything that you remember or that comes to mind would be fine.

@matteomancini

I think that the structure of the data (T1 values estimated across sites and scanners, with several additional details available) would be well suited to a mixed/fixed (depending on the hypothesis) effects linear model.
Depending on how many aspects we want to take into account, we would formulate the fundamental model as:

`measuredT1 ~ groundtruthT1 + scannerModel + (1|researchSite)`

In this case (which is one of the possible implementations, in an R-esque syntax), scannerModel is a fixed factor (i.e. we expect it to influence the measuredT1 outcome) and researchSite is a random one (i.e. a grouping factor we want to account for). This kind of framework would allow several kinds of analysis: seeing how much the goodness of fit changes when going from e.g. taking into account just the measured and ground-truth T1 values to the full model; studying interactions; etc.
A not-so-serious example to see how it works out in the wild (it's R, but there are ready-to-use tools in MATLAB and Python too):
https://ourcodingclub.github.io/tutorials/mixed-models/
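
A minimal sketch of what this could look like with lme4, assuming a data frame `df` with one row per sphere measurement and the column names from the formula above (illustrative names, not the actual database fields):

```r
# Minimal sketch, assuming a data frame `df` with columns measuredT1,
# groundtruthT1, scannerModel (factor), and researchSite (factor).
library(lme4)

# Full model: fixed effects for ground-truth T1 and scanner model,
# plus a random intercept per research site.
fit <- lmer(measuredT1 ~ groundtruthT1 + scannerModel + (1 | researchSite),
            data = df)
summary(fit)

# Reduced model without scannerModel, to gauge how much that factor
# contributes (anova() refits with ML and runs a likelihood-ratio test).
fit0 <- lmer(measuredT1 ~ groundtruthT1 + (1 | researchSite), data = df)
anova(fit0, fit)
```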

@mathieuboudreau (Collaborator, Author)

mathieuboudreau commented Jan 6, 2021

Summary of what we have so far:

NIST

  • Registered & labelled ROIs.
  • Database with site info, temp, acquisition info, ROI voxels, etc
  • Many of the datasets in the database are duplicated (magnitude + complex)
  • Different scanners
  • 2 phantoms
  • Not temperature corrected (yet)
  • Some of the datasets are the same phantom scanned at different sites
  • Some of the datasets were acquired at different sites with different protocols
  • Some of the datasets were acquired at different sites but with the same protocol
  • Some sites acquired multiple acquisitions of the same phantom, but with different protocols/times (e.g. see below: scan-rescan, 4-point vs 14-point, short TR vs long TR, different scanners, etc.)
  • A few outliers that may need to be either cleaned (label ROIs), corrected (Philips), or removed.

Human

  • Labelled manual ROIs (not registered)
  • Database with site info, temp, acquisition info, ROI voxels, etc
  • Many of the datasets in the database are duplicated (magnitude + complex)
  • Different scanners
  • Some sites acquired multiple acquisitions of the same subjects, but with different protocols/times (e.g. see below: large multi-subject GE vs Philips, 20-channel vs 64-channel, more?)

Interesting dataset combinations, but maybe not enough time to analyse them yet

  • Intra- and inter-vendor analyses
  • (NIST) Philips' large multi-site NIST dataset
  • (NIST) mrel_usc
    • Day1/Day2 scan-rescan
    • Same day, two MR systems
    • Short TR long TR
  • (NIST) niloufar_hfmc
    • 4 point vs 14 point
  • (NIST) wang_MDanderson
    • Day1/Day2 scan-rescan
  • (NIST) Ngmaforo_ucla
    • Prisma vs Skyra
  • (Human) mrel_usc
    • 6 subjects
  • (Human) jorgejovicich_cimec
    • 20-channel vs 64-channel
  • (Human) luisconcha_UNAM
    • Large multi-subject GE vs Philips datasets

@mathieuboudreau (Collaborator, Author)

Language to use

Very likely R + RShiny for visualisations

@mathieuboudreau (Collaborator, Author)

mathieuboudreau commented Jan 6, 2021

Statistical analyses proposals

  • Question 1: Compare all the scans from Philips Germany (same phantom, same scanner, copied protocols) with all the scans from the Montreal sites (same phantom, variable protocol implementations & scanners).
  • Question 2: Take 1 scan from each submission (or even each site, maybe) and compare them to determine if the T1 values for each sphere agree reasonably well with the reference.
    • And/or compare the worst scans from each site, for fairness
  • Question 3: Dependency of systematic deviations on reference T1 values (i.e. is there a common distribution pattern across all the combinations when we plot them per scan?); see the sketch after this list.
    • Investigate if simulations based on the protocol used can explain the shared variance observed across measured T1s vs the ground truth.
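
For Question 3, a rough sketch of the per-scan deviation plot (the data frame `t1_data` and its columns are assumptions, not the actual database schema):

```r
# Rough sketch, assuming a long-format data frame `t1_data` with one row per
# sphere per scan and hypothetical columns measuredT1, groundtruthT1, scanID.
library(dplyr)
library(ggplot2)

deviations <- t1_data %>%
  mutate(deviation = 100 * (measuredT1 - groundtruthT1) / groundtruthT1)

# One line per scan: a shared shape across scans would suggest a common
# dependency of the systematic deviation on the reference T1.
ggplot(deviations, aes(x = groundtruthT1, y = deviation, group = scanID)) +
  geom_line(alpha = 0.3) +
  labs(x = "Reference T1 (ms)", y = "Deviation from reference (%)")
```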

To do

  • Reformulate above questions into well-written statistical hypotheses.
  • Propose how to analyse them in R (what tool, method, etc)

@mathieuboudreau (Collaborator, Author)

> I think that the structure of the data (T1 values estimated across sites and scanners, with several additional details available) would be well suited to a mixed/fixed (depending on the hypothesis) effects linear model. [...]

@agahkarakuzu an idea for you: maybe replace Figure 7 with this, or supplement Figure 7 with it. This is something that neither Juan nor I felt comfortable doing, but since you have more stats experience it could be easy to do with the pre-saved pandas ROI & config-details databases I have in this repo.

@agahkarakuzu (Collaborator)

Beyond the comfort in implementing this, I think the main issue is whether we need this or what it adds to our analysis.

Ground-truth T1 is a direct determinant of the T1 measured using the gold-standard IR (the two are strongly correlated), so they should not be on opposite sides of a linear model. `t1_deviation ~ scanner + site` would be an alternative. Is there any data that can extend this beyond scanner and site to tell us something grouped plots cannot, given the limited number of samples?
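
A sketch of that alternative, again with assumed object and column names:

```r
# Sketch of the suggested alternative: scanner and site as fixed factors
# on the per-sphere T1 deviation (t1_data and its columns are assumptions).
fit_alt <- lm(t1_deviation ~ scanner + site, data = t1_data)
anova(fit_alt)  # which factors explain variance in the deviations?
```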

@mathieuboudreau (Collaborator, Author)

> Beyond the comfort in implementing this, I think the main issue is whether we need this or what it adds to our analysis.

So, when considering this question, I think there are 4 main things to consider that I'm aware of currently:

  1. Nikola had it in mind that it was something he'd like to see done with this dataset when we initially met about doing statistical analyses a few years ago with you and Matteo.
  2. A co-author asked if we had something similar to this during the review of the ISMRM abstract.

[Screenshot of the co-author's comment, 2023-02-16]

  3. [Bane et al. 2018](https://onlinelibrary.wiley.com/doi/10.1002/mrm.26903)'s multicenter standard phantom study on T1 mapping used a general linear mixed model to assess factors influencing the accuracy of the measurements (looking at scanner/vendor/protocols as potential factors). [Keenan et al.'s 2021 multi-center phantom study](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0252966) also looked into whether manufacturer was a predictor, using ANOVAs.
  4. We can easily anticipate that reviewers will ask for this during the first round of reviews.

> Is there any data that can extend this beyond scanner and site to tell us something grouped plots cannot, given the limited number of samples?

Phantom version could be another (hypothesizing that there might actually be a difference between phantoms; the other two studies mentioned above used a single phantom). Maybe also the "submitter"/"implementer" of the protocol (since some phantoms were shared between submissions). One thing I didn't collect in the JSON, but that may be present in the DICOMs, was the pre-scan settings, which, as you know, would likely be a significant factor if the wrong settings were used.
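
If those fields can be recovered, the extended model might look something like this sketch (phantomVersion and submitter are hypothetical column names):

```r
# Hypothetical extension of the deviation model with the extra candidate
# factors discussed above; all object and column names are assumptions.
library(lme4)

fit_ext <- lmer(t1_deviation ~ scanner + phantomVersion +
                  (1 | site) + (1 | submitter),
                data = t1_data)
summary(fit_ext)
```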
