[ENH] Add support for ragged connections with multivariate methods with padding #142

tsbinns · 2023-07-24T10:14:29Z

Following discussions in #125.

I have added support for handling 'ragged' connections (i.e. where the number of seeds and targets differs across and within connections) with the multivariate methods MIC, MIM, and state-space GC introduced in #138. This affects: the indices attribute; and the patterns attribute (associated with MIC, with values returned for each channel in the seeds and targets). The connectivity results themselves are unaffected due to the built-in dimensionality reduction, and so return a single connectivity spectrum regardless of the number of seeds and targets. I have written an example which explains the problem and the workaround in more detail as @adam2392 suggested:

mne-connectivity/examples/handling_ragged_arrays.py

Lines 2 to 7 in 73792c3

    
           ========================================================= 
        
           Working with ragged indices for multivariate connectivity 
        
           ========================================================= 
        
           This example demonstrates how multivariate connectivity involving different 
        
           numbers of seeds and targets can be handled in MNE-Connectivity.

In short: ragged arrays are difficult to work with and cannot be saved using the h5netcdf engine. To get around this problem we can pad the arrays for the missing entries with some known value to make them 'full'.

Padding `indices`

For padding indices, I chose the value -1. Because the entries need to be integers to be interpretable for indexing, we cannot represent values such as np.nan or np.inf in the arrays, so I picked -1 as an invalid channel index. E.g. say we want to look at two connections, the first between 2 seeds and 3 targets, and the second between 4 seeds and 1 target; indices could have the form:

ragged_seeds = [[0, 1 ], [5, 6, 7, 8]]
ragged_targets = [[2, 3, 4], [9 ]]
ragged_indices = (ragged_seeds, ragged_targets)

The ragged seeds/targets thus have shape [cons, **variable**]. This can be passed to the connectivity functions, and will internally be padded to:

padded_seeds = [[0, 1, -1, -1], [5, 6, 7, 8]]
padded_targets = [[2, 3, 4, -1], [9, -1, -1, -1]]
padded_indices = (padded_seeds, padded_targets)

The padded seeds/targets now have shape [cons, max_channels], which can be stored in the connectivity objects. The consistent shape makes them easier to handle and also compatible with saving.

Padding `patterns`

For padding patterns, I chose the invalid value np.nan. patterns have shape [2, cons, channels, freqs (, times)] (where the 2 represents entries for the seeds and targets, respectively). Accordingly, if the number of channels differs across the seed and target within a connection, or across connections, patterns will be ragged with shape [2, cons, **variable**, freqs (, times)].

By padding the missing entries with np.nan along the channel dimension, we can make the array shape consistent with [2, cons, max_channels, freqs (, times)] (e.g. [2, 2, 4, freqs (, times)] using the indices in the above example). Extracting only those valid entries from the patterns for a given connection is very simple, as I show in the example linked above.

Other changes

In line with these changes, I updated the relevant unit tests in test_spectral.py, and also edited the multivariate connectivity examples to take advantage of the new support for ragged connections. I also included two new helper functions check_multivariate_indices and seed_target_multivariate_indices (analgous to the existing functions but just for the new multivariate indices format; also with unit tests).

Overall I think this approach is quite clean: end-users can give ragged indices to a helper function which will do the padding, or pass them directly to the connectivity methods which will handle everything. Extracting the relevant information from padded indices and patterns is also trivial. Finally, the internal implementation is simple, so maintenance should not be problematic.

Please let me know your thoughts!
Thomas

adam2392 · 2023-08-09T15:30:58Z

mne_connectivity/utils/utils.py

+    Returns
+    -------
+    indices : tuple of array of array of int
+        The indices padded with the invalid channel index ``-1``.


Is -1 ever going to be used?

E.g. -1 stands for usually the last index -1 position in Python. But in our context, is this fine? Or is np.nan better?

I'm jw. If I'm wrong, then that's fine.

I agree that -1 is technically a valid index and np.nan would be best, however np.nan is a float and cannot be represented in an array of ints.

If we switched to an array of floats, np.nan could be represented, however numpy does not like floats as indices, so we would have to convert to ints every time we want to index channels.

One alternative is numpy masked arrays (https://numpy.org/doc/stable/reference/maskedarray.html) however I'm not sure how this would be handled when saving using h5netcdf. Perhaps another option could be to choose a very unlikely integer like -99,999.

Ultimately I would be happy to go with whatever you feel is most appropriate.

I'll defer to you. I think it's something we can consider changing in the future if it's an issue for users.

I also find the padding with -1 to be unexpected. Without knowing the ramifications offhand, I would think that a masked array, a scipy sparse array, or possibly the pandas nullable integer dtype would be preferable. @tsbinns mentioned saving to netcdf as a possible constraint/limitation... will the tuple of seed/target indices themselves need to be saved out in that format? Or just the resultant connectivity objects (which presumably keeps the indices in a class attribute)? Asking because how we represent the indices could conceivably be different between what is given to the user by this function & passed by the user into the connectivity funcs, versus how they are stored internally inside the resulting connectivity objects.

mne_connectivity/utils/utils.py

adam2392

I will try to review this in-depth next week

tsbinns · 2023-08-14T15:29:23Z

@adam2392 I updated the documentation for the util functions: check_indices; check_multivariate_indices; seed_target_indices; and seed_target_multivariate_indices. Is this the sort of thing you had in mind?

mne_connectivity/spectral/epochs.py

mne_connectivity/spectral/tests/test_spectral.py

adam2392 · 2023-08-24T15:09:12Z

mne_connectivity/spectral/tests/test_spectral.py

+    indices = (np.array([[0, 1]]), np.array([[2]]))
+    conn_mvc = spectral_connectivity_epochs(
+        epochs, method="mic", indices=indices, sfreq=sfreq, fmin=10, fmax=40)
+    conn_mvc.save(tmp_path / 'foo_mvc.nc')


Should we move this to a separate test and also test roundtrip IO? I.e. reading in the file after saving results in everything being the same as what you would expect?

Yes, good idea. I have added a new test for saving the multivariate results and checking their integrity:

mne-connectivity/mne_connectivity/spectral/tests/test_spectral.py

Lines 1286 to 1315 in 409c2c6

def test_multivar_save_load(tmp_path):

"""Test saving and loading results of multivariate connectivity."""

rng = np.random.RandomState(0)

n_epochs, n_chs, n_times, sfreq, f = 10, 4, 2000, 1000., 20.

data = rng.randn(n_epochs, n_chs, n_times)

sig = np.sin(2 * np.pi * f * np.arange(1000) / sfreq) * np.hanning(1000)

data[:, :, 500:1500] += sig

info = create_info(n_chs, sfreq, 'eeg')

tmin = -1

epochs = EpochsArray(data, info, tmin=tmin)

tmp_file = os.path.join(tmp_path, 'foo_mvc.nc')

non_ragged_indices = (np.array([[0, 1]]), np.array([[2, 3]]))

ragged_indices = (np.array([[0, 1]]), np.array([[2]]))

for indices in [non_ragged_indices, ragged_indices]:

con = spectral_connectivity_epochs(

epochs, method=['mic', 'mim', 'gc', 'gc_tr'], indices=indices,

sfreq=sfreq, fmin=10, fmax=30)

for this_con in con:

this_con.save(tmp_file)

read_con = read_connectivity(tmp_file)

assert_array_almost_equal(this_con.get_data(),

read_con.get_data('raveled'))

if this_con.attrs['patterns'] is not None:

assert_array_almost_equal(np.array(this_con.attrs['patterns']),

np.array(read_con.attrs['patterns']))

# split `repr` before the file size (`~23 kB` for example)

a = repr(this_con).split('~')[0]

b = repr(read_con).split('~')[0]

assert a == b

When adding this test, I realised that read_connectivity returns an object whose indices attribute is an array, whereas it is a tuple of arrays/lists in the original object.

Because indices is an array, a later check in _check_data_consistency that indices is a tuple fails (L538), and so it tries to check if the array is a string (L553), which leads to an elementwise comparison that fails. With my NumPy version (1.24.3) I get the following warning which causes pytest to fail: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison, so this should cause a straight-up error in newer NumPy versions.

mne-connectivity/mne_connectivity/base.py

Lines 538 to 561 in 409c2c6

if isinstance(indices, tuple):

# check that the indices passed in are of the same length

if len(indices[0]) != len(indices[1]):

raise ValueError(f'If indices are passed in '

f'then they must be the same '

f'length. They are right now '

f'{len(indices[0])} and '

f'{len(indices[1])}.')

# indices length should match the data length

if len(indices[0]) != data_len:

raise ValueError(

f'The number of indices, {len(indices[0])} '

f'should match the raveled data length passed '

f'in of {data_len}.')

elif indices == 'symmetric':

expected_len = ((n_nodes + 1) * n_nodes) // 2

if data_len != expected_len:

raise ValueError(f'If "indices" is "symmetric", then '

f'connectivity data should be the '

f'upper-triangular part of the matrix. There '

f'are {data_len} estimated connections. '

f'But there should be {expected_len} '

f'estimated connections.')

The same issue also occurs for the pre-existing bivariate connectivity measures, however since reading saved connectivity objects is only tested where indices=None, this warning is never seen.

A solution is to convert indices from an array to a tuple of arrays/lists when reading connectivity objects in _xarray_to_conn (i.e. return it to the same format as it was in the original connectivity object).

mne-connectivity/mne_connectivity/io.py

Lines 56 to 58 in 409c2c6

# convert indices numpy arrays to a tuple of arrays

if isinstance(array.attrs['indices'], np.ndarray):

array.attrs['indices'] = tuple(array.attrs['indices'])

I'm inclined for the soln. that changes existing code as little as possible. I think the soln. you posed of converting indices to an array of tuple of arrays/lists sounds good.

Can you also add a separate unit-test for this when indices != None for existing methods and the new multivariate methods to demonstrate how this works?

Thanks for investigating!

Added a test for checking that the type and values of indices is retained after saving for: bivariate and multivariate connectivity; with spectral_connectivity_epochs and spectral_connectivity_times; with indices = None and indices != None.

mne-connectivity/mne_connectivity/spectral/tests/test_spectral.py

Lines 1318 to 1360 in b7fcf12

def test_spectral_connectivity_indices_maintained(tmp_path):

"""Test that indices values and type is maintained after saving.

If `indices` is None, `indices` in the returned connectivity object should

be None, otherwise, `indices` should be a tuple. The type of `indices` and

its values should be retained after saving and reloading.

"""

rng = np.random.RandomState(0)

n_epochs, n_chs, n_times, sfreq, f = 5, 4, 2000, 1000., 20.

data = rng.randn(n_epochs, n_chs, n_times)

sig = np.sin(2 * np.pi * f * np.arange(1000) / sfreq) * np.hanning(1000)

data[:, :, 500:1500] += sig

info = create_info(n_chs, sfreq, 'eeg')

tmin = -1

epochs = EpochsArray(data, info, tmin=tmin)

freqs = np.arange(10, 31)

tmp_file = os.path.join(tmp_path, 'foo_mvc.nc')

bivar_indices = (np.array([0, 1]), np.array([2, 3]))

multivar_indices = (np.array([[0, 1]]), np.array([[2, 3]]))

indices = [None, bivar_indices, None, multivar_indices]

methods = ['coh', 'coh', 'mic', 'mic']

for this_indices, this_method in zip(indices, methods):

con_epochs = spectral_connectivity_epochs(

epochs, method=this_method, indices=this_indices, sfreq=sfreq,

fmin=10, fmax=30)

con_time = spectral_connectivity_time(

epochs, freqs, method=this_method, indices=this_indices,

sfreq=sfreq)

for con in [con_epochs, con_time]:

con.save(tmp_file)

read_con = read_connectivity(tmp_file)

if this_indices is not None:

# check indices of same type (tuples)

assert (isinstance(con.indices, tuple) and

isinstance(read_con.indices, tuple))

# check indices have same values

assert np.all(np.array(con.indices) ==

np.array(read_con.indices))

else:

assert con.indices is None and read_con.indices is None

adam2392

I left a few questions on areas that seemed like potential code issues. Overall, I don't see any major issues and think this should be mergable soon.

FYI for making the reviewing process easier in general, is limiting the code diff wrt code that isn't relevant to this change.

Also, after this is merged, as we discussed in #125 are you interested in joining the maintenance team and getting core-dev rights to mne-connectivity?

tsbinns · 2023-08-28T14:32:29Z

@adam2392 Thanks very much for the detailed feedback; I have made some updates accordingly.

Also, after this is merged, as we discussed in #125 are you interested in joining the maintenance team and getting core-dev rights to mne-connectivity?

Yes please, I am very happy to help out and make sure this can be maintained!

Signed-off-by: Adam Li <[email protected]>

adam2392 · 2023-09-01T20:18:55Z

I think this mostly LGTM, but I have a question regarding indices that I didn't spot earlier. The unit-test you added in test_spectral.py should be refactored to maybe use pytest.mark.parametrize:

@pytest.mark.parametrize("method", ["coh", "plv", "pli", "wpli", "ciplv", "mic", "mim"])
@pytest.mark.parametrize("indices", [None, 
                                     (np.array([0, 1]), np.array([2, 3])),
                                     (np.array([[0, 1]]), np.array([[2, 3]]))
                                     ])
def test_spectral_connectivity_indices_roundtrip_io(tmp_path, method, indices):
    """Test that indices values and type is maintained after saving.

    If `indices` is None, `indices` in the returned connectivity object should
    be None, otherwise, `indices` should be a tuple. The type of `indices` and
    its values should be retained after saving and reloading.
    """
    rng = np.random.RandomState(0)
    n_epochs, n_chs, n_times, sfreq, f = 5, 4, 2000, 1000.0, 20.0
    data = rng.randn(n_epochs, n_chs, n_times)
    sig = np.sin(2 * np.pi * f * np.arange(1000) / sfreq) * np.hanning(1000)
    data[:, :, 500:1500] += sig
    info = create_info(n_chs, sfreq, "eeg")
    tmin = -1
    epochs = EpochsArray(data, info, tmin=tmin)
    freqs = np.arange(10, 31)
    tmp_file = os.path.join(tmp_path, "foo_mvc.nc")

    # for this_indices, this_method in zip(indices, methods):
    con_epochs = spectral_connectivity_epochs(
        epochs, method=method, indices=indices, sfreq=sfreq, fmin=10, fmax=30
    )
    con_time = spectral_connectivity_time(
        epochs, freqs, method=method, indices=indices, sfreq=sfreq
    )

    for con in [con_epochs, con_time]:
        con.save(tmp_file)
        read_con = read_connectivity(tmp_file)
        if indices is not None:
            # check indices of same type (tuples)
            assert isinstance(con.indices, tuple) and isinstance(
                read_con.indices, tuple
            )
            # check indices have same values
            assert np.all(np.array(con.indices) == np.array(read_con.indices))
        else:
            assert con.indices is None and read_con.indices is None

However, this test fails.

Discussion on why unit-test fails

We have indices defined in two ways, which to me is not easy to remember:

bivar_indices = (np.array([0, 1]), np.array([2, 3]))
multivar_indices = (np.array([[0, 1]]), np.array([[2, 3]]))

I would expect that if someone forgets to pass in a 2D array, they accidentally perform bivariate indices. I'm jw if it's possible to infer this behavior in anyway? If not, I think the docstring can maybe be a bit more clear on what indices to expect. I.e. perhaps a tuple of 1D arrays, tuple of 2D arrays | None? WDYT?

indices : tuple of array | None
        Two arrays with indices of connections for which to compute
        connectivity. If a multivariate method is called, each array for the
        seeds and targets should contain a nested array of channel indices for
        the individual connections. If None, connections between all channels
        are computed, unless a Granger causality method is called, in which
        case an error is raised.

So IIUC, mic/mim will not work when bivariate indices are passed. And bivariate methods will not work when multivariate indices are passed. However, the current error messages are rather uninformative for bivariate methods:

____________________ test_spectral_connectivity_indices_roundtrip_io[indices2-wpli] ____________________
mne_connectivity/spectral/tests/test_spectral.py:1724: in test_spectral_connectivity_indices_roundtrip_io
    con_epochs = spectral_connectivity_epochs(
<decorator-gen-623>:12: in spectral_connectivity_epochs
    ???
mne_connectivity/spectral/epochs.py:1996: in spectral_connectivity_epochs
    _epoch_spectral_connectivity(data=this_epoch, **call_params)
mne_connectivity/spectral/epochs.py:1418: in _epoch_spectral_connectivity
    method.accumulate(con_idx, csd)
mne_connectivity/spectral/epochs.py:818: in accumulate
    self._acc[0, con_idx] += im_csd
E   ValueError: non-broadcastable output operand with shape (1,41) doesn't match the broadcast shape (1,2,41)

The multivariate one looks good:

    con_epochs = spectral_connectivity_epochs(
<decorator-gen-623>:12: in spectral_connectivity_epochs
    ???
mne_connectivity/spectral/epochs.py:1891: in spectral_connectivity_epochs
    n_signals, indices_use, warn_times) = _prepare_connectivity(
mne_connectivity/spectral/epochs.py:116: in _prepare_connectivity
    indices_use = check_multivariate_indices(indices)  # pad with -1
mne_connectivity/utils/utils.py:134: in check_multivariate_indices
    raise TypeError(
E   TypeError: multivariate indices must contain array-likes of channel indices for each seed and target

Lmk if there's anything I can clarify

This reverts commit 6fe682f.

This reverts commit 5bad5fb.

This reverts commit 921cf9d.

This reverts commit 6fe682f.

This reverts commit 5bad5fb.

tsbinns · 2023-11-02T15:40:11Z

Sorry that the commit history is such a mess, but I have reverted all the refactoring changes. Everything now relates only to the support for multiple connections with the multivariate methods, with the use of masked arrays for handling ragged indices.

I have also made check_multivariate_indices private following the above discussion.

mne-connectivity/mne_connectivity/utils/utils.py

Lines 86 to 87 in d6ba398

    
           def _check_multivariate_indices(indices, n_chans): 
        
               """Check indices parameter for multivariate connectivity and mask it.

I believe all points relating to how multivariate indices are handled have now been addressed.

larsoner

Very minor stuff from me. @drammock @adam2392 do either of you want to look again?

larsoner · 2023-11-02T16:17:45Z

examples/granger_causality.py

 try:
    spectral_connectivity_epochs(
        epochs, method=['gc'], indices=indices_ab, fmin=5, fmax=30, rank=None,
-        gc_n_lags=20)  # A => B
+        gc_n_lags=20, verbose=False)  # A => B
    print('Success!')
 except RuntimeError as error:
    print('\nCaught the following error:\n' + repr(error))


If you want, you could add this to expected_failing_examples then you don't have to try/except at all here. Just let it actually fail and sphinx-gallery will print a nicely formatted traceback for you.

https://sphinx-gallery.github.io/stable/configuration.html#dont-fail-exit

The traceback in the file works really nicely, but is it possible to overwrite the "BROKEN" thumbnail?

E.g. setting # sphinx_gallery_thumbnail_path = '_static/granger_causality_gallery_thumbnail.png' within the example did not have an effect.

Hmm not at the moment. Feel free to open an issue at https://github.com/sphinx-gallery/sphinx-gallery about adding this possibility

Opened! sphinx-gallery/sphinx-gallery#1220

If the others agree, I would stick with the try-except approach until this behaviour in sphinx gallery is changed, so as not to give the impression from the thumbnail that the entire example is about failing code.

I think this is fine as is already, no need to wait to merge for this. It's an easy follow up PR after this is merged and SG has the machinery it needs

ignore_words.txt

mne_connectivity/base.py

mne_connectivity/spectral/epochs.py

adam2392 · 2023-11-02T17:56:41Z

mne_connectivity/spectral/epochs.py

+        connectivity. If a bivariate method is called, each array for the seeds
+        and targets should contain the channel indices for each bivariate
+        connection. If a multivariate method is called, each array for the
+        seeds and targets should consist of nested arrays containing
+        the channel indices for each multivariate connection. If ``None``,
+        connections between all channels are computed, unless a Granger
+        causality method is called, in which case an error is raised.


Do we need to discuss anything about the masking behavior? Or would we say that's an internal implementation detail?

If it's an internal implementation detail, I wonder if we can perhaps add inline comments where it first comes up about how the indices variable changes into a mask.

I'm okay being vetoed here. Just a thought

In terms of what is passed into the function, the masking is an internal feature. At present there is only a small inline comment documenting this:

mne-connectivity/mne_connectivity/spectral/epochs.py

Lines 118 to 119 in 883e816

# mask indices

indices_use = _check_multivariate_indices(indices, n_signals)

... although the function where the masking happens is documented:

mne-connectivity/mne_connectivity/utils/utils.py

Lines 86 to 144 in 883e816

def _check_multivariate_indices(indices, n_chans):

"""Check indices parameter for multivariate connectivity and mask it.

Parameters

----------

indices : tuple of array of array of int, shape (2, n_cons, variable)

Tuple containing index sets.

n_chans : int

The number of channels in the data. Used when converting negative

indices to positive indices.

Returns

-------

indices : tuple of array of array of int, shape of (2, n_cons, max_n_chans)

The indices as a masked array.

Notes

-----

Indices for multivariate connectivity should be a tuple of length 2

containing the channel indices for the seed and target channel sets,

respectively. Seed and target indices should be equal-length array-likes

representing the indices of the channel sets in the data for each

connection. The indices for each connection should be an array-like of

integers representing the individual channels in the data. The length of

indices for each connection do not need to be equal. All indices within a

connection must be unique.

If the seed and target indices are given as lists or tuples, they will be

converted to numpy arrays. Because the number of channels can differ across

connections or between the seeds and targets for a given connection (i.e.

ragged/jagged indices), the returned array will be padded out to a 'full'

array with an invalid index (``-1``) according to the maximum number of

channels in the seed or target of any one connection. These invalid

entries are then masked and returned as numpy masked arrays. E.g. the

ragged indices of shape ``(2, n_cons, variable)``::

indices = ([[0, 1], [0, 1 ]], # seeds

[[2, 3], [4, 5, 6]]) # targets

would be padded to full arrays::

indices = ([[0, 1, -1], [0, 1, -1]], # seeds

[[2, 3, -1], [4, 5, 6]]) # targets

to have shape ``(2, n_cons, max_n_chans)``, where ``max_n_chans = 3``. The

invalid entries are then masked::

indices = ([[0, 1, --], [0, 1, --]], # seeds

[[2, 3, --], [4, 5, 6]]) # targets

In case "indices" contains negative values to index channels, these will be

converted to the corresponding positive-valued index before any masking is

applied.

More information on working with multivariate indices and handling

connections where the number of seeds and targets are not equal can be

found in the :doc:`../auto_examples/handling_ragged_arrays` example.

"""

I'm happy to add more comments directly in epochs.py and time.py.

Of course, as soon as the connectivity object is returned, the masking is no longer just an internal implementation. The goal of this example was to provide some more context on this:

mne-connectivity/examples/handling_ragged_arrays.py

Lines 1 to 8 in 883e816

"""

=========================================================

Working with ragged indices for multivariate connectivity

=========================================================

This example demonstrates how multivariate connectivity involving different

numbers of seeds and targets can be handled in MNE-Connectivity.

"""

However I realise adding some documentation within the connectivity classes themselves also makes sense, e.g. here:

mne-connectivity/mne_connectivity/utils/docs.py

Lines 36 to 43 in 883e816

docdict['indices'] = """

indices : tuple of arrays | str | None

The indices of relevant connectivity data. If ``'all'`` (default),

then data is connectivity between all nodes. If ``'symmetric'``,

then data is symmetric connectivity between all nodes. If a tuple,

then the first list represents the "in nodes", and the second list

represents the "out nodes". See "Notes" for more information.

"""

Happy to add this as well.

f5c1c3e:

Expanded upon the inline comments per the suggestion.

Was also able to clean up some code now that _check_multivariate_indices is private.

adam2392

Minor comment, but otw LGTM upon a quick glance! The implementation is much cleaner :)

Co-authored-by: Eric Larson <[email protected]>

…ctivity into pr-mvc_padding

drammock · 2023-11-03T21:34:20Z

In it goes! Thanks a million @tsbinns !!

adam2392 · 2023-11-03T21:41:33Z

Amazing work @tsbinns !

[ENH] Add support for ragged connections with multivariate methods with padding

tsbinns and others added 8 commits July 24, 2023 11:19

added support for ragged connections

73792c3

added author

a3354cd

bug fix ragged indices comparison

63f2c39

bug fix ragged indices comparison

d2f3de8

bug fix ragged indices comparison

ec36983

Merge branch 'main' into pr-mvc_padding

f70ed11

added extra multivariate indices unit test

02f5ad0

Merge branch 'main' into pr-mvc_padding

684e35f

adam2392 self-requested a review August 5, 2023 02:33

adam2392 reviewed Aug 9, 2023

View reviewed changes

mne_connectivity/utils/utils.py Show resolved Hide resolved

adam2392 reviewed Aug 9, 2023

View reviewed changes

mne_connectivity/utils/utils.py Outdated Show resolved Hide resolved

adam2392 reviewed Aug 9, 2023

View reviewed changes

tsbinns added 3 commits August 14, 2023 16:44

updated utils tests and docs

c229d4e

bug fix utils doc update

ffca93f

bug fix utils doc update

c9855b7

bug fix utils doc update

8c35917

adam2392 reviewed Aug 24, 2023

View reviewed changes

mne_connectivity/spectral/epochs.py Outdated Show resolved Hide resolved

adam2392 reviewed Aug 24, 2023

View reviewed changes

mne_connectivity/spectral/tests/test_spectral.py Show resolved Hide resolved

adam2392 reviewed Aug 24, 2023

View reviewed changes

adam2392 mentioned this pull request Aug 24, 2023

Slate v0.6 for release #146

Closed

tsbinns added 2 commits August 28, 2023 11:59

Merge branch 'main' into pr-mvc_padding

9bd2513

updated spectral tests

409c2c6

tsbinns and others added 3 commits August 31, 2023 11:21

added note for refactoring

fe0fe68

updated spectral tests

b7fcf12

Update ignore words

96a0dcf

Signed-off-by: Adam Li <[email protected]>

tsbinns added 10 commits November 2, 2023 10:30

Revert "bug fix missing refactoring for example"

f54cecc

This reverts commit 6fe682f.

Revert "Squashed commit of the following:"

610ab7d

This reverts commit 5bad5fb.

Revert "Squashed commit of the following:"

f106d07

This reverts commit 5bad5fb.

switched to masked indices for multivariate conn

c985886

Revert "switch to masked arrays for indices"

a6a58e4

This reverts commit 921cf9d.

Revert "bug fix missing refactoring for example"

034dafe

This reverts commit 6fe682f.

Revert "Squashed commit of the following:"

4907106

This reverts commit 5bad5fb.

Merge branch 'pr-mvc_padding_revert' into pr-mvc_padding

736e642

updated time

cd69d65

switched to masked indices for multivariate conn

d6ba398

larsoner reviewed Nov 2, 2023

View reviewed changes

adam2392 reviewed Nov 2, 2023

View reviewed changes

tsbinns and others added 6 commits November 2, 2023 19:12

removed redundant ignored word

0732d22

Co-authored-by: Eric Larson <[email protected]>

removed redundant list creation

cd0b90f

Co-authored-by: Eric Larson <[email protected]>

updated default non-zero rank tolerance

883e816

Merge branch 'pr-mvc_padding' of https://github.com/tsbinns/mne-conne…

ff3b9e9

…ctivity into pr-mvc_padding

switched to array indices & added inline comments

f5c1c3e

fixed grammar mistake

bac5000

drammock merged commit 5e3724a into mne-tools:main Nov 3, 2023
15 checks passed

This was referenced Nov 6, 2023

[MAINT] Refactor and move computation functions for epochs and time in spectral submodule to a sep file #127

Closed

Add support for additional multivariate methods #155

Open

tsbinns mentioned this pull request Nov 18, 2023

[MAINT] Refactor bivariate and multivariate methods into separate files #156

Merged

tsbinns deleted the pr-mvc_padding branch November 25, 2023 12:11

tsbinns mentioned this pull request Nov 28, 2023

[ENH, WIP] Multivariate connectivity methods (imaginary coherence and Granger causality) #125

Closed

tsbinns pushed a commit to tsbinns/mne-connectivity that referenced this pull request Dec 15, 2023

Merge pull request mne-tools#142 from tsbinns/pr-mvc_padding

8bebc62

[ENH] Add support for ragged connections with multivariate methods with padding

ruuskas mentioned this pull request Mar 6, 2024

Memory issues introduced in spectral_connectivity_time by #138 #174

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ENH] Add support for ragged connections with multivariate methods with padding #142

[ENH] Add support for ragged connections with multivariate methods with padding #142

tsbinns commented Jul 24, 2023 •

edited

Loading

adam2392 Aug 9, 2023

tsbinns Aug 9, 2023

adam2392 Aug 24, 2023

drammock Sep 8, 2023

adam2392 left a comment

tsbinns commented Aug 14, 2023

adam2392 Aug 24, 2023

tsbinns Aug 28, 2023

tsbinns Aug 28, 2023

adam2392 Aug 31, 2023

tsbinns Sep 1, 2023

adam2392 left a comment

tsbinns commented Aug 28, 2023

adam2392 commented Sep 1, 2023

tsbinns commented Nov 2, 2023

larsoner left a comment

larsoner Nov 2, 2023

tsbinns Nov 2, 2023

larsoner Nov 2, 2023

tsbinns Nov 2, 2023

larsoner Nov 2, 2023

adam2392 Nov 2, 2023

tsbinns Nov 2, 2023

tsbinns Nov 2, 2023

tsbinns Nov 3, 2023

adam2392 left a comment

drammock commented Nov 3, 2023

adam2392 commented Nov 3, 2023

	=========================================================
	Working with ragged indices for multivariate connectivity
	=========================================================

	This example demonstrates how multivariate connectivity involving different
	numbers of seeds and targets can be handled in MNE-Connectivity.

	def test_multivar_save_load(tmp_path):
	"""Test saving and loading results of multivariate connectivity."""
	rng = np.random.RandomState(0)
	n_epochs, n_chs, n_times, sfreq, f = 10, 4, 2000, 1000., 20.
	data = rng.randn(n_epochs, n_chs, n_times)
	sig = np.sin(2 * np.pi * f * np.arange(1000) / sfreq) * np.hanning(1000)
	data[:, :, 500:1500] += sig
	info = create_info(n_chs, sfreq, 'eeg')
	tmin = -1
	epochs = EpochsArray(data, info, tmin=tmin)
	tmp_file = os.path.join(tmp_path, 'foo_mvc.nc')

	non_ragged_indices = (np.array([[0, 1]]), np.array([[2, 3]]))
	ragged_indices = (np.array([[0, 1]]), np.array([[2]]))
	for indices in [non_ragged_indices, ragged_indices]:
	con = spectral_connectivity_epochs(
	epochs, method=['mic', 'mim', 'gc', 'gc_tr'], indices=indices,
	sfreq=sfreq, fmin=10, fmax=30)
	for this_con in con:
	this_con.save(tmp_file)
	read_con = read_connectivity(tmp_file)
	assert_array_almost_equal(this_con.get_data(),
	read_con.get_data('raveled'))
	if this_con.attrs['patterns'] is not None:
	assert_array_almost_equal(np.array(this_con.attrs['patterns']),
	np.array(read_con.attrs['patterns']))
	# split `repr` before the file size (`~23 kB` for example)
	a = repr(this_con).split('~')[0]
	b = repr(read_con).split('~')[0]
	assert a == b

	if isinstance(indices, tuple):
	# check that the indices passed in are of the same length
	if len(indices[0]) != len(indices[1]):
	raise ValueError(f'If indices are passed in '
	f'then they must be the same '
	f'length. They are right now '
	f'{len(indices[0])} and '
	f'{len(indices[1])}.')
	# indices length should match the data length
	if len(indices[0]) != data_len:
	raise ValueError(
	f'The number of indices, {len(indices[0])} '
	f'should match the raveled data length passed '
	f'in of {data_len}.')

	elif indices == 'symmetric':
	expected_len = ((n_nodes + 1) * n_nodes) // 2
	if data_len != expected_len:
	raise ValueError(f'If "indices" is "symmetric", then '
	f'connectivity data should be the '
	f'upper-triangular part of the matrix. There '
	f'are {data_len} estimated connections. '
	f'But there should be {expected_len} '
	f'estimated connections.')

	# convert indices numpy arrays to a tuple of arrays
	if isinstance(array.attrs['indices'], np.ndarray):
	array.attrs['indices'] = tuple(array.attrs['indices'])

	def test_spectral_connectivity_indices_maintained(tmp_path):
	"""Test that indices values and type is maintained after saving.

	If `indices` is None, `indices` in the returned connectivity object should
	be None, otherwise, `indices` should be a tuple. The type of `indices` and
	its values should be retained after saving and reloading.
	"""
	rng = np.random.RandomState(0)
	n_epochs, n_chs, n_times, sfreq, f = 5, 4, 2000, 1000., 20.
	data = rng.randn(n_epochs, n_chs, n_times)
	sig = np.sin(2 * np.pi * f * np.arange(1000) / sfreq) * np.hanning(1000)
	data[:, :, 500:1500] += sig
	info = create_info(n_chs, sfreq, 'eeg')
	tmin = -1
	epochs = EpochsArray(data, info, tmin=tmin)
	freqs = np.arange(10, 31)
	tmp_file = os.path.join(tmp_path, 'foo_mvc.nc')

	bivar_indices = (np.array([0, 1]), np.array([2, 3]))
	multivar_indices = (np.array([[0, 1]]), np.array([[2, 3]]))
	indices = [None, bivar_indices, None, multivar_indices]
	methods = ['coh', 'coh', 'mic', 'mic']

	for this_indices, this_method in zip(indices, methods):
	con_epochs = spectral_connectivity_epochs(
	epochs, method=this_method, indices=this_indices, sfreq=sfreq,
	fmin=10, fmax=30)
	con_time = spectral_connectivity_time(
	epochs, freqs, method=this_method, indices=this_indices,
	sfreq=sfreq)

	for con in [con_epochs, con_time]:
	con.save(tmp_file)
	read_con = read_connectivity(tmp_file)
	if this_indices is not None:
	# check indices of same type (tuples)
	assert (isinstance(con.indices, tuple) and
	isinstance(read_con.indices, tuple))
	# check indices have same values
	assert np.all(np.array(con.indices) ==
	np.array(read_con.indices))
	else:
	assert con.indices is None and read_con.indices is None

	# mask indices
	indices_use = _check_multivariate_indices(indices, n_signals)

	def _check_multivariate_indices(indices, n_chans):
	"""Check indices parameter for multivariate connectivity and mask it.

	Parameters
	----------
	indices : tuple of array of array of int, shape (2, n_cons, variable)
	Tuple containing index sets.

	n_chans : int
	The number of channels in the data. Used when converting negative
	indices to positive indices.

	Returns
	-------
	indices : tuple of array of array of int, shape of (2, n_cons, max_n_chans)
	The indices as a masked array.

	Notes
	-----
	Indices for multivariate connectivity should be a tuple of length 2
	containing the channel indices for the seed and target channel sets,
	respectively. Seed and target indices should be equal-length array-likes
	representing the indices of the channel sets in the data for each
	connection. The indices for each connection should be an array-like of
	integers representing the individual channels in the data. The length of
	indices for each connection do not need to be equal. All indices within a
	connection must be unique.

	If the seed and target indices are given as lists or tuples, they will be
	converted to numpy arrays. Because the number of channels can differ across
	connections or between the seeds and targets for a given connection (i.e.
	ragged/jagged indices), the returned array will be padded out to a 'full'
	array with an invalid index (``-1``) according to the maximum number of
	channels in the seed or target of any one connection. These invalid
	entries are then masked and returned as numpy masked arrays. E.g. the
	ragged indices of shape ``(2, n_cons, variable)``::

	indices = ([[0, 1], [0, 1 ]], # seeds
	[[2, 3], [4, 5, 6]]) # targets

	would be padded to full arrays::

	indices = ([[0, 1, -1], [0, 1, -1]], # seeds
	[[2, 3, -1], [4, 5, 6]]) # targets

	to have shape ``(2, n_cons, max_n_chans)``, where ``max_n_chans = 3``. The
	invalid entries are then masked::

	indices = ([[0, 1, --], [0, 1, --]], # seeds
	[[2, 3, --], [4, 5, 6]]) # targets

	In case "indices" contains negative values to index channels, these will be
	converted to the corresponding positive-valued index before any masking is
	applied.

	More information on working with multivariate indices and handling
	connections where the number of seeds and targets are not equal can be
	found in the :doc:`../auto_examples/handling_ragged_arrays` example.
	"""

	docdict['indices'] = """
	indices : tuple of arrays \| str \| None
	The indices of relevant connectivity data. If ``'all'`` (default),
	then data is connectivity between all nodes. If ``'symmetric'``,
	then data is symmetric connectivity between all nodes. If a tuple,
	then the first list represents the "in nodes", and the second list
	represents the "out nodes". See "Notes" for more information.
	"""

[ENH] Add support for ragged connections with multivariate methods with padding #142

[ENH] Add support for ragged connections with multivariate methods with padding #142

Conversation

tsbinns commented Jul 24, 2023 • edited Loading

Padding indices

Padding patterns

Other changes

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adam2392 left a comment

Choose a reason for hiding this comment

tsbinns commented Aug 14, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adam2392 left a comment

Choose a reason for hiding this comment

tsbinns commented Aug 28, 2023

adam2392 commented Sep 1, 2023

Discussion on why unit-test fails

tsbinns commented Nov 2, 2023

larsoner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

adam2392 left a comment

Choose a reason for hiding this comment

drammock commented Nov 3, 2023

adam2392 commented Nov 3, 2023

tsbinns commented Jul 24, 2023 •

edited

Loading

Padding `indices`

Padding `patterns`