Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH, MRG] Add EpochsSpectrumArray and SpectrumArray classes #11803

Merged
merged 62 commits into from
Sep 2, 2023

Conversation

alexrockhill
Copy link
Contributor

Makes API a lot nicer and consistent with MNE related to discussion here: https://mne.discourse.group/t/instantiating-epochsspectrum-and-spectrum-class-objects/7127.

Without the docstrings of the array, you actually have to look in __set_state__ to figure out what is required in the state argument which is pretty confusing for me. The use case of ragged/unequal epochs is something I ran into too.

Okay with you @drammock ?

mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
@alexrockhill
Copy link
Contributor Author

Looks like the failures are unrelated, good to go by me

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit with a couple changes too far outside the diff to do as web-interface suggestions. See my remaining comments below.

doc/changes/latest.inc Outdated Show resolved Hide resolved
mne/time_frequency/spectrum.py Show resolved Hide resolved
mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
Comment on lines 1225 to 1226
inst_type_str="Raw",
data_type="Average Power Spectrum",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the data comes from an array we don't know if it was computed on Raw or averaged (Evoked) data. Also we don't know if it's power/amplitude, real/complex, etc.

Suggested change
inst_type_str="Raw",
data_type="Average Power Spectrum",
inst_type_str="Array",
data_type="Unknown",

This will probably break some tests (assuming we're testing thoroughly) so hopefully those test failures can guide what needs to change elsewhere

Copy link
Contributor Author

@alexrockhill alexrockhill Jul 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So real/complex can be inferred from the data type. We ask specifically for power in the constructor so maybe we should require that and let the amplitude kwarg in Spectrum.plot handle that? That'd be my 2 cents but not highly opinionated, whatever is simplest

mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
tutorials/simulation/10_array_objs.py Outdated Show resolved Hide resolved
@alexrockhill
Copy link
Contributor Author

alexrockhill commented Jul 14, 2023

Okay, I think the outstanding things here are:

  • Decide on fmin/fmax or freqs constructor
  • Think about if we want to support passing amplitude with a kwarg (psd_array_welch/multitaper only supports output power or complex so I would vote just supporting those which can be inferred from data type)
  • Decide about what Epochs-level data to put in the EpochsSpectrumArray API (I think it's a good compromise now but happy to hear from all the opinions @drammock requested to weigh in)

Otherwise looks good, thanks for the review @drammock.

Oh one more thing, I set Spectrum._inst_type to np.ndarray for the arrays, it's not used anywhere yet so this won't break anything but I'm not sure what the plan for that attribute was so just wanted to make sure that's the right move for future compatibility.

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK looking pretty good, thanks for tackling this! Left a couple more comments/questions, let's wait and see what other devs say now.

mne/time_frequency/spectrum.py Outdated Show resolved Hide resolved
%(drop_log)s
%(events_epochs)s
%(event_id)s
%(metadata_epochs)s
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did metadata get missed here or did you intentionally keep it? To me that one is the clearest case of "can / should be set afterward, not in constructor"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left it in, I could see a use case for it here but that sounds reasonable to modularize if we prefer they set metadata later, I'll take it out

mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
tutorials/simulation/10_array_objs.py Outdated Show resolved Hide resolved
@alexrockhill
Copy link
Contributor Author

Okay to merge? The EpochsSpectrumArray constructor is minimal so we can always follow up with a PR adding more things to the constructor if necessary. This seems like a good start.

mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
mne/time_frequency/tests/test_spectrum.py Outdated Show resolved Hide resolved
@agramfort
Copy link
Member

agramfort commented Jul 18, 2023 via email

@drammock
Copy link
Member

Okay to merge?

Your checklist above has only 1 of 3 items checked off (plus there's a fourth item that doesn't have a checkbox --- the question about _inst_type --- which is also not resolved yet). So no: not okay to merge. Is there some rush on your end?

@alexrockhill
Copy link
Contributor Author

alexrockhill commented Jul 18, 2023

Not a huge rush but I'm sharing a script that uses it with someone else so it's a bit of a pain to describe how to install a feature branch so it would be nice not to take too long.

I was going to let whoever merged the PR check those boxes to make sure they're okay with them, it's fine by me how it is right now (after I implement the latest review).

@alexrockhill
Copy link
Contributor Author

alexrockhill commented Jul 18, 2023

I just went ahead and added the fixtures so everyone wins 😄

Oh man that's a creepy smile, maybe 🙂 that's better

@alexrockhill
Copy link
Contributor Author

Looks like the failure is unrelated.

@drammock
Copy link
Member

sorry, I got captured by a grant deadline, it was on my list to come back to this yesterday but I ran out of time. Will prioritize for today.

@alexrockhill
Copy link
Contributor Author

No worries, next week is great too, just didn't want it to get forgotten.

@drammock
Copy link
Member

drammock commented Aug 25, 2023

we chatted about this PR in the dev meeting today, to gauge opinions on whether supporting complex data as input to SpectrumArray and EpochsSpectrumArray is important or YAGNI. General feeling was:

  • PRO: it would be nice to include it for symmetry reasons, since in some cases our .compute_psd() methods return complex data.

  • CON 1: it makes it hard to validate the input data (specifically the dimensionality). Currently we output:

    • 2D real-valued Spectrum (welch, multitaper on Raw/Evoked)
    • 3D real-valued Spectrum (unaggregated welch on Raw/Evoked)
    • 3D complex-valued Spectrum (unaggregated multitaper on Raw/Evoked)
    • 3D real-valued EpochsSpectrum (welch, multitaper on Epochs)
    • 4D real-valued EpochsSpectrum (unaggregated welch on Epochs)
    • 4D complex-valued EpochsSpectrum (unaggregated multitaper on Epochs)

    If we only accept real-valued input, we can safely only support 2 use cases: 2D real data -> Spectrum, 3D real data -> EpochsSpectrum. In contrast, if we allow complex input, for example how do we decide if a 3D complex input should be treated as unaggregated estimates from continuous data versus normal estimates from epoched data?

  • CON 2: supporting complex data means thinking a lot about downstream processing effects (i.e., does it make sense to plot? if so, is abs() a sufficient approach for converting to real-valued numbers? etc) which means significant maintenance burden. Now, we already kinda have to do this for the unaggregated multitaper output, but importantly that is the only case where we allow complex data so (1) we know what is the sensible way to convert to real-valued when needed (i.e. aggregating with precomputed multitaper weights) and (2) we can rely on knowing the dimension of the data (because we created the object).

All this has made me realize something that I previously overlooked / didn't discuss in the meeting, which is that when we originally added complex support, we were outputting numpy arrays so the ramifications didn't need to be traced through the rest of our API to make sure the complexness of the data didn't break things. When I created the Spectrum classes, support for complex unaggregated multitaper output was included for legacy reasons, but it probably would have been better to not support it in the classes (and to keep open a separate array-yielding code path for users who want the unaggregated multitaper output).

Which leads me to ask @mmagnuski (who originally requested the complex unaggregated multitaper output) a few questions:

  1. what was your original use case that led to that request?
  2. do you know others who use MNE's unaggregated multitaper estimates in their work?
  3. do you foresee a use case for computing complex-valued spectral estimates outside of MNE and then putting them into a Spectrum or EpochsSpectrum object?

@alexrockhill
Copy link
Contributor Author

Those are pretty reasonable concerns: you have to pass the method if it's complex and I added checks that the dimensions matched the size of the frequencies passed so I think it the checks for the wrong data make it very safe (i.e. even if frequencies and n_segments/n_tapers are same size, the method is required so that would disambiguate the dimensions. I really can't think of a way you could pass the wrong data).

@mmagnuski
Copy link
Member

@drammock
Hi, I am fine with NOT supporting complex per-taper output in Spectrum classes. The reason for returning complex per-taper output is mostly for calculating multitaper coherence (and other similar measures) - this is done per taper and then averaged. This is an example paper that showcases something similar.
But in general I don't think it would be used frequently and it does not make much sense to support it in the Spectrum object - that might be more time spent implementing and testing than actual user-time spent using the code. :)

@agramfort
Copy link
Member

what's the status here? I see some remaining comments by @drammock. It's a matter of agreeing if we support complex or not?

@alexrockhill
Copy link
Contributor Author

alexrockhill commented Aug 29, 2023

what's the status here? I see some remaining comments by @drammock. It's a matter of agreeing if we support complex or not?

Yes, that's my understanding we should agree whether to support complex psds.

One last point: I think from a maintenance perspective having complex spectrumarrays was actually very helpful because in testing complete cases, there were several bugs in how plotting was handled.

Comment on lines 777 to 789
# handle unaggregated multitaper
if hasattr(self, "_mt_weights"):
logger.info("Aggregating multitaper estimates before plotting...")
psds = _psd_from_mt(psds, weights=self._mt_weights)
# handle unaggregated Welch
elif "segment" in self._dims:
logger.info("Aggregating Welch estimates (median) before plotting...")
seg_axis = self._dims.index("segment")
psds = np.nanmedian(psds, axis=seg_axis)
if np.iscomplexobj(psds): # convert to power for plotting
psds = (psds * psds.conj()).real
if "epoch" in self._dims:
psds = np.mean(psds, axis=self._dims.index("epoch"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. here

@drammock
Copy link
Member

I'll restate her the pros and cons from my comment above:

  • PRO: it would be nice to include it for symmetry reasons, since in some cases our .compute_psd() methods return complex data.
  • CON 1: it makes it hard to validate the input data (specifically the dimensionality).
  • CON 2: supporting complex data means thinking a lot about downstream processing effects.

@mmagnuski's comment suggests to me that we should probably deprecate complex output support in the (Epochs)Spectrum classes, and provide an alternate code path that outputs NumPy arrays, for folks like @mmagnuski who really want access to the individual taper estimates. If we do that, then the main "pro" of allowing complex input to the (Epochs)SpectrumArray classes goes away. I'll gladly acknowledge that some issues with plotting were revealed by the effort of adding that support, but IMO that does not constitute a "maintenance win" --- we can easily keep the plotting bugfixes while eliminating the complex support, and keeping complex support is still more lines of code and more branching code paths and more downstream effects to keep track of, all for the sake of (as far as I've heard) no known real-world use case for why it's preferable to have the complex data in a Spectrum-like container.

My vote here is:

  1. a PR that deprecates complex output in the Spectrum classes
  2. a PR that adds a spectrum pytest fixture and adds it to all the existing tests that use Spectrum objects (which was not done here, and is why I initially requested a separate PR for that in the first place)
  3. a PR that adds the new (Epochs)SpectrumArray classes with only real-valued input allowed (and no support for unaggregated-welch-style input, i.e., 2D is always SpectrumArray, 3D is always EpochsSpectrumArray), and associated tests

@alexrockhill
Copy link
Contributor Author

That's fine with me, I'll remove complex support, you spent a lot of time and effort making spectrum classes @drammock so I'm happy to differ to your judgment and I agree there are not a lot of apparent use cases that come to mind--phase doesn't make any sense, it's not time-frequency, thanks for the direct communication, that helps move forward.

@alexrockhill
Copy link
Contributor Author

Ok I removed complex support and this is ready for review. I changed the data_type to Power Spectrum from Real/Complex Spectrum because of the change not to support complex input. It's just for the display, I really don't care that much so just let me know if anyone wants it back to Real or whatever else.

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's a self-review as commentary on the changes I've just pushed.

@larsoner @agramfort or @mmagnuski maybe best for one of you to push the green button on this one, it's got a fair amount of code from both me and @alexrockhill

"""Get raw with power spectral density computed from mne.io.tests.data."""
return [raw.compute_psd(method=method) for method in ("welch", "multitaper")]
return raw.compute_psd()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have a good reason to test both Welch and multitaper in the fixture; the differences aren't relevant to most tests (and should be covered adequately by the tests of the welch/mutitaper array methods used under the hood)

@@ -298,9 +298,9 @@ def raw_ctf():


@pytest.fixture(scope="function")
def raw_psds(raw):
def raw_spectrum(raw):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for naming consistency

Comment on lines 52 to 62
from .ar import fit_iir_model_raw
from .multitaper import dpss_windows, psd_array_multitaper, tfr_array_multitaper
from .spectrum import (
EpochsSpectrum,
EpochsSpectrumArray,
Spectrum,
SpectrumArray,
read_spectrum,
)
from ._stft import stft, istft, stftfreq
from ._stockwell import tfr_stockwell, tfr_array_stockwell
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you merged in main, this wasn't properly integrated into the new lazy loading scheme

@@ -429,7 +429,7 @@ def _check_values(self):
assert len(self._dims) == self._data.ndim, (self._dims, self._data.ndim)
assert self._data.shape == self._shape
# negative values OK if the spectrum is really fourier coefficients
if np.iscomplexobj(self._data):
if "taper" in self._dims:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverting to reduce churn, esp. given that we now plan to deprecate support for complex data anyway.

Comment on lines 777 to 787
# handle unaggregated multitaper
if hasattr(self, "_mt_weights"):
logger.info("Aggregating multitaper estimates before plotting...")
psds = _psd_from_mt(psds, weights=self._mt_weights)
# handle unaggregated Welch
elif "segment" in self._dims:
logger.info("Aggregating Welch estimates (median) before plotting...")
seg_axis = self._dims.index("segment")
psds = np.nanmedian(psds, axis=seg_axis)
if np.iscomplexobj(psds): # convert to power for plotting
psds = (psds * psds.conj()).real
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverting; will not be needed.

Comment on lines 904 to 914
# handle unaggregated multitaper
if hasattr(self, "_mt_weights"):
logger.info("Aggregating multitaper estimates before plotting...")
psds = _psd_from_mt(psds, weights=self._mt_weights)
# handle unaggregated Welch
elif "segment" in self._dims:
logger.info("Aggregating Welch estimates (median) before plotting...")
seg_axis = self._dims.index("segment")
psds = np.nanmedian(psds, axis=seg_axis)
if np.iscomplexobj(psds): # convert to power for plotting
psds = (psds * psds.conj()).real
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverting, will not be needed.

@@ -1226,11 +1197,21 @@ def __getitem__(self, item):
return BaseRaw._getitem(self, item, return_times=False)


def _check_data_shape(param, dim, data, expected):
if data.shape[dim] != expected:
def _check_data_shape(data, freqs, info, ndim):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworked this to check everything at once rather than needing multiple calls to the check function. Seemed simpler this way once we knew we only had to handle 2D and 3D cases.

@@ -141,18 +142,21 @@ def test_spectrum_io(inst, tmp_path, request, evoked):
assert orig == loaded


def test_spectrum_copy(raw):
def test_spectrum_copy(raw_spectrum):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I'd have preferred adding the fixture in its own PR, I decided it was expedient to just go with it, so I've used it in the other spectrum tests too.

@drammock
Copy link
Member

drammock commented Sep 2, 2023

All green, this one is ready for review/merge

Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty short diff in the end, thanks @alexrockhill @drammock !

@larsoner larsoner merged commit db07d55 into mne-tools:main Sep 2, 2023
26 checks passed
@alexrockhill alexrockhill deleted the defaults branch September 5, 2023 16:19
snwnde pushed a commit to snwnde/mne-python that referenced this pull request Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants