MRG: Introduce ICA.get_explained_variance_ratio() to easily retrieve relative explained variances after a fit #11141

hoechenberger · 2022-09-07T13:00:23Z

It's quite common that users want to know the variance explained by all (or individual) ICA components. Such a question just recently showed up on the forum.

The PR implements ICA.get_explained_variance_ratio() to allow retrieval of the proportion of variance of the original data explained by ICA components.

MWE:

import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / 'MEG' / 'sample' / 'sample_audvis_raw.fif'

raw = mne.io.read_raw_fif(sample_fname, preload=True)
raw.crop(tmax=60)
raw.pick_types(eeg=True, stim=True)
raw.filter(l_freq=0.1, h_freq=None)

events = mne.find_events(raw)
epochs = mne.Epochs(raw=raw, events=events, baseline=None, preload=True)
evoked = epochs.average()

ica = mne.preprocessing.ICA(n_components=15, method='picard')
ica.fit(raw)

for inst in (raw, epochs, evoked):
    print(f'\nProcessing {type(inst)} data…')
    for components in (0, [0, 1], None):
        explained_var = get_explained_var_ratio(
            ica=ica, inst=inst, components=components
        )
        explained_var_percent = round(100 * explained_var, 1)
        print(
            f'ICA component(s) {components} explain(s): '
            f'{explained_var_percent}% of variance in original data'
        )

Produces:

Processing <class 'mne.io.fiff.raw.Raw'> data…
ICA component(s) 0 explain(s): 26.2% of variance in original data
ICA component(s) [0, 1] explain(s): 29.6% of variance in original data
ICA component(s) None explain(s): 86.1% of variance in original data

Processing <class 'mne.epochs.Epochs'> data…
ICA component(s) 0 explain(s): 46.2% of variance in original data
ICA component(s) [0, 1] explain(s): 61.0% of variance in original data
ICA component(s) None explain(s): 95.2% of variance in original data

Processing <class 'mne.evoked.EvokedArray'> data…
ICA component(s) 0 explain(s): 31.3% of variance in original data
ICA component(s) [0, 1] explain(s): 35.7% of variance in original data
ICA component(s) None explain(s): 93.9% of variance in original data

mne/preprocessing/ica.py

agramfort

Wait a minute, this is the explained variance of the PCA components, not of the ICA components!

Also, ICA components are not orthogonal, so the notion of explained variance for a group of components is fishy. At best you can see how much variance you keep by including only one component, but since the components are not orthogonal to each other, the sum will not be 1.

Danger zone...
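
To make the concern concrete, here is a minimal NumPy sketch (a toy example, not MNE code). It assumes two made-up components whose maps overlap (non-orthogonal mixing columns) and whose time courses are correlated. All names and numbers are illustrative assumptions. The per-component ratios then do not add up to 1, even though both components together reconstruct the data exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_times = 10_000

# Two toy component time courses that are correlated with each other
# (correlation of roughly 0.5). All numbers here are made up.
z = rng.standard_normal((2, n_times))
sources = np.vstack([z[0], 0.5 * z[0] + np.sqrt(0.75) * z[1]])

# Mixing matrix whose columns (the component maps) are not orthogonal.
mixing = np.array([[1.0, 0.8],
                   [0.2, 1.0]])
data = mixing @ sources  # shape: (n_channels, n_times)


def explained_ratio(component_indices):
    """Variance share kept when back-projecting only the given components."""
    recon = mixing[:, component_indices] @ sources[component_indices]
    residual = data - recon
    return 1 - residual.var(axis=1).sum() / data.var(axis=1).sum()


single = [explained_ratio([i]) for i in range(2)]
both = explained_ratio([0, 1])
print(f'individual ratios: {single}, their sum: {sum(single):.3f}')
print(f'both components jointly: {both:.3f}')
```

In this toy setup the cross term between the two components is what pushes the sum away from 1. With a sufficiently negative correlation between the time courses, a single-component ratio can even drop below zero.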

hoechenberger · 2022-09-10T10:52:25Z

I implemented the algorithm used in EEGLAB in this MWE:

# %%
from numpy.typing import ArrayLike
import mne


def get_explained_var(
    *,
    ica: mne.preprocessing.ICA,
    inst: mne.io.BaseRaw | mne.Epochs | mne.Evoked,
    components: ArrayLike,
) -> float:
    # Algorithm follows:
    # https://sccn.ucsd.edu/pipermail/eeglablist/2014/009134.html
    inst_recon = ica.apply(
        inst=raw.copy(),
        include=[components],
        exclude=[],
        n_pca_components=0,  # XXX check this
        verbose=False
    )
    diff = inst_recon.get_data() - raw.get_data()
    mean_var_diff = diff.var(axis=1, ddof=0).mean()
    mean_var_orig = inst.get_data().var(axis=1, ddof=0).mean()

    var_explained_ratio = 1 - mean_var_diff / mean_var_orig
    return var_explained_ratio


sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / 'MEG' / 'sample' / 'sample_audvis_raw.fif'

raw = mne.io.read_raw_fif(sample_fname, preload=True)
raw.crop(tmax=60)
raw.pick_types(eeg=True)

ica = mne.preprocessing.ICA(n_components=30, method='picard')
ica.fit(raw)

for components in (0, [0, 1], range(0, 10)):
    explained_var = get_explained_var(ica=ica, inst=raw, components=components)
    explained_var_percent = round(100 * explained_var, 1)
    print(
        f'ICA component(s) {components} explain(s): '
        f'{explained_var_percent}% of variance in original data'
    )

ICA component(s) 0 explain(s): 45.0% of variance in original data
ICA component(s) [0, 1] explain(s): 50.3% of variance in original data
ICA component(s) range(0, 10) explain(s): 77.1% of variance in original data

agramfort · 2022-09-10T11:38:51Z

OK, this looks correct. Basically it does more than the current private function, as it also works with more than one component. If you pass only one component, does it match our private function?

drammock · 2022-09-10T12:16:04Z

Nice!

inst=raw.copy(),

Should be inst=inst.copy() right? Also should use BaseEpochs in the type hint.

hoechenberger · 2022-09-11T15:57:17Z

Thanks for the feedback, everybody!

New draft, which I will turn into a method of the ICA class soon:

from numpy.typing import ArrayLike
import mne


def get_explained_var_ratio(
    *,
    ica: mne.preprocessing.ICA,
    inst: mne.io.BaseRaw | mne.BaseEpochs | mne.Evoked,
    components: ArrayLike | int | None,
) -> float:
    """Retrieve the proportion of variance explained by independent components.

    A value similar (but not equivalent) to EEGLAB's ``pvaf`` (percent variance
    accounted for) will be calculated for the specified component(s).

    Parameters
    ----------
    ica : mne.preprocessing.ICA
        A fitted ICA instance.
    inst : mne.io.BaseRaw | mne.BaseEpochs | mne.Evoked
        The uncleaned data.
    components : ArrayLike | int | None
        The component(s) for which to do the calculation. If more than one
        component is specified, explained variance will be calculated jointly
        across all supplied components. If ``None``, uses all available
        components.

    Returns
    -------
    float
        The fraction of variance in ``inst`` that can be explained by the
        ICA components.

    Notes
    -----
    Since ICA components cannot be assumed to be aligned orthogonally, the sum
    of the proportion of variance explained by all components may not be equal
    to 1. In certain edge cases, the proportion of variance explained by a
    component may even be negative.

    .. versionadded:: 1.1
    """
    # The algorithm implemented here should be equivalent to
    # https://sccn.ucsd.edu/pipermail/eeglablist/2014/009134.html

    # Reconstruct ("back-project") the data using only the specified ICA
    # components. Don't make use of potential "spare" PCA components in this
    # process – we're only interested in the contribution of the ICA
    # components!
    if components is None:
        components = range(ica.n_components_)

    inst_recon = ica.apply(
        inst=inst.copy(),
        include=[components],
        exclude=[],
        n_pca_components=0,
        verbose=False,
    )
    data_recon = inst_recon.get_data(picks=ica.ch_names)
    data_orig = inst.get_data(picks=ica.ch_names)
    data_diff = data_orig - data_recon

    # To estimate the data variance, we first compute the variance across
    # channels at each time point, and then we average these variances.
    mean_var_diff = data_diff.var(axis=0).mean()
    mean_var_orig = data_orig.var(axis=0).mean()

    var_explained_ratio = 1 - mean_var_diff / mean_var_orig
    return var_explained_ratio
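
The draft above takes the variance along `axis=0` (across channels at each time point), whereas the earlier MWE used `axis=1` (per channel, over time). The following toy sketch, using made-up 2D arrays rather than real MNE data, suggests that the two conventions can give quite different ratios. The helper `explained_ratio` and all values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)
n_channels, n_times = 4, 1_000

# Made-up data: unit-variance noise per channel plus a constant
# per-channel offset.
offsets = np.array([0.0, 5.0, -5.0, 10.0])[:, np.newaxis]
data_orig = rng.standard_normal((n_channels, n_times)) + offsets

# Made-up "reconstruction" that misses a small noise term.
data_recon = data_orig - 0.5 * rng.standard_normal((n_channels, n_times))


def explained_ratio(orig, recon, axis):
    """1 - mean(var(residual)) / mean(var(orig)), variance along ``axis``."""
    diff = orig - recon
    return 1 - diff.var(axis=axis).mean() / orig.var(axis=axis).mean()


# Variance over time for each channel, then averaged over channels.
ratio_over_time = explained_ratio(data_orig, data_recon, axis=1)
# Variance across channels at each time point, then averaged over time.
ratio_over_channels = explained_ratio(data_orig, data_recon, axis=0)

print(f'axis=1 (over time):     {ratio_over_time:.3f}')
print(f'axis=0 (over channels): {ratio_over_channels:.3f}')
```

Constant offsets between channels inflate the across-channel variance but not the over-time variance, which is why the two numbers diverge here. The sketch only covers 2D arrays of shape (n_channels, n_times); for a 3D epochs array of shape (n_epochs, n_channels, n_times), `axis=0` would run over epochs instead.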

MWE:

import mne

sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / 'MEG' / 'sample' / 'sample_audvis_raw.fif'

raw = mne.io.read_raw_fif(sample_fname, preload=True)
raw.crop(tmax=60)
raw.pick_types(eeg=True, stim=True)
raw.filter(l_freq=0.1, h_freq=None)

events = mne.find_events(raw)
epochs = mne.Epochs(raw=raw, events=events, baseline=None, preload=True)
evoked = epochs.average()

ica = mne.preprocessing.ICA(n_components=15, method='picard')
ica.fit(raw)

for inst in (raw, epochs, evoked):
    print(f'\nProcessing {type(inst)} data…')
    for components in (0, [0, 1], None):
        explained_var = get_explained_var_ratio(
            ica=ica, inst=inst, components=components
        )
        explained_var_percent = round(100 * explained_var, 1)
        print(
            f'ICA component(s) {components} explain(s): '
            f'{explained_var_percent}% of variance in original data'
        )

Output:

Processing <class 'mne.io.fiff.raw.Raw'> data…
ICA component(s) 0 explain(s): 26.2% of variance in original data
ICA component(s) [0, 1] explain(s): 29.6% of variance in original data
ICA component(s) None explain(s): 86.1% of variance in original data

Processing <class 'mne.epochs.Epochs'> data…
ICA component(s) 0 explain(s): 46.2% of variance in original data
ICA component(s) [0, 1] explain(s): 61.0% of variance in original data
ICA component(s) None explain(s): 95.2% of variance in original data

Processing <class 'mne.evoked.EvokedArray'> data…
ICA component(s) 0 explain(s): 31.3% of variance in original data
ICA component(s) [0, 1] explain(s): 35.7% of variance in original data
ICA component(s) None explain(s): 93.9% of variance in original data

drammock · 2022-09-11T16:42:36Z

Nice! If the implementation ends up having a logging line, then I suggest adding .__name__ to this:

 print(f'\nProcessing {type(inst)} data…')

hoechenberger · 2022-09-11T16:44:13Z

Nice! If the implementation ends up having a logging line, then I suggest adding .__name__ to this:
 print(f'\nProcessing {type(inst)} data…')

Nice idea!
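
A small sketch of what the suggested `.__name__` change does to the message. The `Raw` class below is a stand-in defined only for this illustration, not the MNE class.

```python
class Raw:
    """Stand-in class for illustration only; not mne.io.Raw."""


inst = Raw()

# Formatting the type object itself yields the full "<class '...'>" repr.
verbose_label = f'Processing {type(inst)} data'
# Using __name__ yields just the bare class name.
short_label = f'Processing {type(inst).__name__} data'

print(verbose_label)
print(short_label)
```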

agramfort · 2022-09-11T19:39:07Z

@hoechenberger let me know when you push this so I can comment on the lines. Will save me time. 🙏

hoechenberger · 2022-09-13T14:48:52Z

hoechenberger · 2022-09-13T14:56:53Z

@agramfort This is ready for review.

agramfort · 2022-09-13T18:32:32Z

mne/preprocessing/ica.py

+            The fraction of variance in ``inst`` that can be explained by the
+            ICA components. If only a single ``ch_type`` was given, a float
+            will be returned. Otherwise, a dictionary with channel types as
+            keys and explained variance ratios as values.


Honestly, I would always return a dict; otherwise we'd have to branch on the return type, for example in the pipeline code. WDYT?

I was a bit hesitant to do that because if you have for example only EEG data, you'd need to do:

var_explained = ica.get_explained_variance_ratio(inst)['eeg']

which I felt was kind of odd … but you're of course right, always returning a dict would make things more consistent, so that's probably better.

@drammock @cbrnr WDYT?

@agramfort I've implemented your suggestion, and we're always returning a dict now
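
A rough sketch of why an always-dict return keeps calling code simple. `format_ratios` is a hypothetical helper, and the ratio values are placeholders, not results from this PR.

```python
def format_ratios(ratios):
    """Hypothetical consumer: format one line per channel type.

    Because ``ratios`` is always a dict mapping channel type to ratio,
    the same loop handles one channel type or several, with no
    isinstance() branching on float versus dict.
    """
    return [
        f'{ch_type}: {100 * ratio:.1f}%'
        for ch_type, ratio in ratios.items()
    ]


# Placeholder values for illustration only.
eeg_only = format_ratios({'eeg': 0.5})
eeg_and_mag = format_ratios({'eeg': 0.5, 'mag': 0.25})

print(eeg_only)
print(eeg_and_mag)
```

The cost is the extra `['eeg']` lookup in the single-type case mentioned above; the benefit is that downstream code never needs to check the return type.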

hoechenberger · 2022-09-13T20:34:20Z

agramfort

just a last question. Otherwise +1 for MRG

thx @hoechenberger

agramfort · 2022-09-13T21:06:49Z

mne/preprocessing/ica.py

+        # this process – we're only interested in the contribution of the ICA
+        # components!
+        kwargs = dict(
+            inst=inst.copy(),


is the copy really needed?

Unfortunately, yes, because we use ica.apply() below and this works in place. We could do without the copy if instead of using ica.apply(), we'd "manually" do the matrix multiplication. But ica.apply() does quite a few additional things and I'm worried I'd forget something important 😅

I trust that if we discover the copy() here causes issues, we can optimize things later on.
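
To illustrate why the copy matters when a function modifies its input in place, here is a toy sketch. `ToyData` and `apply_in_place` are hypothetical stand-ins and do not reproduce what `ica.apply()` actually computes.

```python
import numpy as np


class ToyData:
    """Hypothetical minimal container with a copy() method."""

    def __init__(self, values):
        self.values = np.asarray(values, dtype=float)

    def copy(self):
        return ToyData(self.values.copy())


def apply_in_place(inst):
    """Stand-in for an in-place operation: halves the data, returns inst."""
    inst.values *= 0.5
    return inst


# With a copy, the original survives and the difference is meaningful.
orig = ToyData([2.0, 4.0])
recon = apply_in_place(orig.copy())
diff_with_copy = orig.values - recon.values

# Without a copy, "original" and "reconstruction" are the same object,
# so the difference is trivially zero.
orig2 = ToyData([2.0, 4.0])
recon2 = apply_in_place(orig2)
diff_without_copy = orig2.values - recon2.values

print(diff_with_copy, diff_without_copy)
```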

agramfort · 2022-09-14T09:26:16Z

+1 for MRG when CIs complete. thx @hoechenberger

hoechenberger · 2022-09-14T10:31:49Z

GH Action status is not reported somehow, but all CI runs for this PR passed, according to https://github.com/mne-tools/mne-python/actions/workflows

@larsoner @agramfort Could you please manually merge?

agramfort reviewed Sep 7, 2022

hoechenberger changed the title ~~Introduce ICA.explained_variance_ to easily retrieve relative explained variances after a fit~~ Introduce ICA.explained_variance_ratio_ to easily retrieve relative explained variances after a fit Sep 7, 2022

drammock reviewed Sep 7, 2022

hoechenberger marked this pull request as ready for review September 7, 2022 15:13

agramfort reviewed Sep 7, 2022

hoechenberger force-pushed the ica-explained-variance branch from 1d8d840 to f11ccf5 Compare September 7, 2022 15:41

drammock approved these changes Sep 7, 2022

larsoner reviewed Sep 7, 2022

hoechenberger changed the title ~~Introduce ICA.explained_variance_ratio_ to easily retrieve relative explained variances after a fit~~ MRG: Introduce ICA.explained_variance_ratio_ to easily retrieve relative explained variances after a fit Sep 7, 2022

agramfort requested changes Sep 7, 2022

hoechenberger marked this pull request as draft September 7, 2022 19:54

hoechenberger changed the title ~~MRG: Introduce ICA.explained_variance_ratio_ to easily retrieve relative explained variances after a fit~~ Introduce ICA.explained_variance_ratio_ to easily retrieve relative explained variances after a fit Sep 7, 2022

hoechenberger force-pushed the ica-explained-variance branch from 19e88c7 to e22750a Compare September 12, 2022 16:54

hoechenberger added 5 commits September 13, 2022 15:10

Restructure

bbcd19d

Docstring

b499dcf

Merge branch 'main' into ica-explained-variance

dd49c6c

Update tutorial

eda143a

Update tests

5cb3cfd

hoechenberger added this to the 1.2 milestone Sep 13, 2022

hoechenberger changed the title ~~Introduce ICA.get_explained_variance_ratio() to easily retrieve relative explained variances after a fit~~ MRG: Introduce ICA.get_explained_variance_ratio() to easily retrieve relative explained variances after a fit Sep 13, 2022

agramfort reviewed Sep 13, 2022

hoechenberger force-pushed the ica-explained-variance branch from 051f83c to 8a20f31 Compare September 13, 2022 19:37

Always return a dict

bd2ccc2

hoechenberger force-pushed the ica-explained-variance branch from 8a20f31 to bd2ccc2 Compare September 13, 2022 19:39

More test coverage and better type validation messages

33e3a28

Be more explicit [skip azp][skip actions]

0857cee

agramfort reviewed Sep 13, 2022

agramfort approved these changes Sep 14, 2022

agramfort enabled auto-merge (squash) September 14, 2022 09:25

hoechenberger closed this Sep 14, 2022

auto-merge was automatically disabled September 14, 2022 10:29
Pull request was closed

hoechenberger reopened this Sep 14, 2022

hoechenberger enabled auto-merge (squash) September 14, 2022 10:32

Merge remote-tracking branch 'upstream/main' into ica-explained-variance

b4ec360

hoechenberger merged commit 333fd05 into mne-tools:main Sep 14, 2022

hoechenberger deleted the ica-explained-variance branch September 14, 2022 14:32

hoechenberger mentioned this pull request Oct 31, 2023

BUG: Fix bug with Report.add_ica component number #12156

Merged

rcassani mentioned this pull request Feb 19, 2024

Update projector structure. And add ICA explained variance brainstorm-tools/brainstorm3#685

Merged
