Add BaseRaw.duration property and fix duration calculation for BaseRaw "repr/html" display #12955

Merged
merged 6 commits into from
Nov 14, 2024

leorochael
Contributor

@leorochael leorochael commented Nov 11, 2024

Reference issue (if any)

Fixes: #12954

What does this implement/fix?

Now the duration calculation takes into account the length of the last sample, by calculating the duration in seconds based on the number of samples divided by the sampling frequency.
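
As a rough illustration of why the two calculations differ, here is a minimal sketch in plain Python (no MNE objects involved). The 256 Hz sampling frequency and the one-hour sample count are assumed values chosen for the example, and the variable names are hypothetical.

```python
sfreq = 256.0  # assumed sampling frequency in Hz (illustrative)
n_times = 3600 * 256  # assumed sample count: one hour at 256 Hz

# Timestamps start at 0, so the last sample sits at (n_times - 1) / sfreq.
last_timestamp = (n_times - 1) / sfreq

# Counting samples also covers the interval spanned by the last sample.
duration = n_times / sfreq

print(last_timestamp)  # 3599.99609375
print(duration)  # 3600.0
print(duration - last_timestamp == 1 / sfreq)  # True
```

The gap between the two values is exactly one sample period, which is the "length of the last sample" referred to above.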

Additional information

Instead of using len(self.times) / self.info['sfreq'] as suggested in the issue, I used self.n_times / self.info['sfreq'], since self.times is calculated from the same info as self.n_times, but self.n_times seems to be "cheaper".

I humbly suggest reviewing commit by commit.

welcome bot commented Nov 11, 2024

Hello! 👋 Thanks for opening your first pull request here! ❤️ We will try to get back to you soon. 🚴

@cbrnr
Contributor

cbrnr commented Nov 11, 2024

Quick question, would this not also be relevant for the normal __repr__?

@leorochael
Contributor Author

Quick question, would this not also be relevant for the normal __repr__?

@cbrnr, good point! For a 500 Hz EDF, the repr actually reports something like:

<RawEDF | filename.edf, 30 x 1800000 (3600.0 s), ~34 kB, data not loaded>

So I didn't realize it was also using the wrong self.times[-1] duration. I'll try to reproduce a failure in a test.

@cbrnr
Contributor

cbrnr commented Nov 11, 2024

Isn't 3600.0 s correct?

@leorochael
Contributor Author

Isn't 3600.0 s correct?

It is! But only because:

>>> f"{3599.99609375:0.1f}"
'3600.0'
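
To make the masking effect concrete, here is a small sketch assuming two hypothetical one-hour recordings, one at 500 Hz and one at 10 Hz. The helper name `last_timestamp` is made up for this example; it only mimics what `self.times[-1]` would hold for timestamps starting at zero.

```python
def last_timestamp(n_times, sfreq):
    """Timestamp of the final sample when the first sample is at t = 0."""
    return (n_times - 1) / sfreq


# Assumed one-hour recordings at two sampling frequencies.
high_rate = last_timestamp(1_800_000, 500.0)  # 3599.998
low_rate = last_timestamp(36_000, 10.0)  # 3599.9

# One decimal place hides the missing sample at 500 Hz...
print(f"{high_rate:0.1f}")  # 3600.0
# ...but not at 10 Hz, where one sample lasts 0.1 s.
print(f"{low_rate:0.1f}")  # 3599.9
```

A low sampling frequency is therefore a convenient choice for a test that should expose the discrepancy in the textual repr.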

@cbrnr
Contributor

cbrnr commented Nov 11, 2024

OK, this makes sense.

A single sample can be interpreted as the measured value sampled at a particular time instant, so technically it has no duration. But I agree that in general, it makes more sense to take the number of samples and divide it by the sampling frequency to get the duration in seconds 😄.

There is a related issue when creating epochs. If you specify tmin=-0.25 and tmax=0.75, and assuming you have a sampling frequency of 100 Hz, which duration would you expect? Currently, the epoch is 101 samples long (1.01 s), because it includes time point 0.

@leorochael
Contributor Author

leorochael commented Nov 11, 2024

A single sample can be interpreted as the measured value sampled at a particular time instant, so technically it has no duration.

I get that logic, but when you treat the samples as a collection, each individual sample is representative of its interval, at least in the sense that, until the next "beat" of the sampling frequency, no other sample will be present...

There is a related issue when creating epochs. If you specify tmin=-0.25 and tmax=0.75, and assuming you have a sampling frequency of 100 Hz, which duration would you expect? Currently, the epoch is 101 samples long, so 1.01 s, because it includes time point 0.

That depends on whether tmax represents an open or closed end of the interval... In my experience, treating the ends of intervals as being open (i.e. not included), with the starts being closed, makes it easier to stitch together contiguous time intervals (collections of contiguous epochs?).

@cbrnr
Contributor

cbrnr commented Nov 11, 2024

Yes, I totally agree with you (and these are two separate issues, currently the epochs interval is inclusive on both ends, and we should probably add an option to exclude the end time).
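
The difference between the two interval conventions can be sketched with a few lines of plain Python. This is not how MNE computes epoch lengths internally; the two helper functions are hypothetical and only count samples on a regular grid under each convention.

```python
def n_samples_closed(tmin, tmax, sfreq):
    """Sample count for [tmin, tmax], with both ends included."""
    return round((tmax - tmin) * sfreq) + 1


def n_samples_half_open(tmin, tmax, sfreq):
    """Sample count for [tmin, tmax), with the end excluded."""
    return round((tmax - tmin) * sfreq)


sfreq = 100.0
print(n_samples_closed(-0.25, 0.75, sfreq))  # 101
print(n_samples_half_open(-0.25, 0.75, sfreq))  # 100

# Half-open pieces stitch together without counting a sample twice.
stitched = n_samples_half_open(-0.25, 0.25, sfreq) + n_samples_half_open(
    0.25, 0.75, sfreq
)
print(stitched)  # 100
```

With closed intervals, the same two pieces would contribute 51 samples each, so the shared boundary sample at 0.25 s would be counted twice.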

@leorochael
Contributor Author

In order to avoid repeating the self.n_times / self.info["sfreq"] calculation, can I create a BaseRaw.duration property?

I suppose this would mean introducing a new feature, rather than being a bugfix. I could introduce a BaseRaw._duration property instead...

@leorochael
Contributor Author

leorochael commented Nov 11, 2024

Another question: the Towncrier action failed, but I don't understand why. I added a new file under doc/changes/devel as required...

mne/io/base.py Outdated
Comment on lines 2136 to 2149
def _get_duration_timedelta(self):
    seconds = self.n_times / self.info["sfreq"]
    return timedelta(seconds=seconds)

def _get_duration_string(self, duration):
    # https://stackoverflow.com/a/10981895
    hours, remainder = divmod(duration.seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    seconds += duration.microseconds / 1e6
    seconds = np.ceil(seconds)  # always take full seconds

    return f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"
Member

I'm in favor of the change to using self.n_times / self.info["sfreq"] instead of self.times[-1]. But the rest of this seems a bit overcomplicated, I'm not sure why timedelta is even needed? Why not this, either inside _repr_html_ or as a single helper method?

duration = self.n_times / self.info["sfreq"]
hours, remainder = divmod(duration, 3600)
minutes, seconds = divmod(remainder, 60)
duration = f"{hours:02.0f}:{minutes:02.0f}:{np.ceil(seconds):02.0f}"

Contributor Author

Hi @drammock, I moved the timedelta back into the ._get_duration_string() method.

On the other hand, I've factored out the self.n_times / self.info["sfreq"] calculation into a new .duration property.

The timedelta usage was originally added to the string representation to use in the rounding-up of the seconds amount, so indeed there is no sense in having a method just to return it.

On the other hand, having a separate ._get_duration_string method helps testing it in isolation from the ._repr_html_() method which is barely tested otherwise.
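
For readers unfamiliar with the pattern, a property that centralizes the calculation might look roughly like the sketch below. `ToyRaw` is a hypothetical stand-in class written for this example, not MNE's `BaseRaw`; only the `n_times / info["sfreq"]` expression comes from the discussion above.

```python
class ToyRaw:
    """Hypothetical stand-in for a raw recording (not the MNE class)."""

    def __init__(self, n_times, sfreq):
        self.n_times = n_times
        self.info = {"sfreq": sfreq}

    @property
    def duration(self):
        """Duration of the recording in seconds."""
        return self.n_times / self.info["sfreq"]


raw = ToyRaw(n_times=1_800_000, sfreq=500.0)
print(raw.duration)  # 3600.0
```

Because the value is exposed in one place, both the textual and the HTML representation can read it instead of repeating the division.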

Contributor Author

And I kept the timedelta on the refactored method, because I didn't want to mess with the logic for calculating _repr_html_.

But if you want it removed, I'm ok with that.

@cbrnr
Contributor

cbrnr commented Nov 12, 2024

I like these changes, but I'll let @drammock decide (1) if it's OK to add a BaseRaw.duration property and (2) whether or not the duration calculation should be in a private method or just in _repr_html_.

Also, I have no idea why Towncrier is failing, both changelog entries look correct to me. Maybe because there are two and you are a new contributor? Also something for @drammock or @larsoner.

Contributor

@wmvanvliet wmvanvliet left a comment

I'm +1 on adding a .duration property.

@@ -0,0 +1 @@
Add :meth:`mne.io.base.BaseRAW.duration` property to centralize duration calculation for :meth:`mne.io.base.BaseRAW.__repr__` and :meth:`mne.io.base.BaseRAW._repr_html_`, by :newcontrib:`Leonardo Rochael Almeida`.
Contributor

@wmvanvliet wmvanvliet Nov 12, 2024

This file should be named 12955.newfeature.rst which is why towncrier is complaining.

Member

Also this will fail because:

  1. the classname is BaseRaw not BaseRAW
  2. duration is an :attr: not a :meth:
  3. I think(?) that you can't cross-reference the methods __repr__ and _repr_html_ (you can only cross-ref things that are in doc/python_reference.rst, or the various files in doc/api/*.rst that get included into it). I suggest rewording this to just say "there's a new convenience property Raw.duration" and don't mention the repr stuff.

Member

@drammock drammock left a comment

LGTM just one (repeated) comment

The `|` in the regex meant that any string containing `<RawArray `
would match, so the data in the repr was not being checked.
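
The pitfall described in this commit message can be reproduced with the standard `re` module. The repr strings below are made up for illustration and do not reproduce MNE's exact output; the point is only that an unescaped `|` acts as regex alternation.

```python
import re

# Hypothetical repr strings, invented for this example.
expected = "<RawArray | 1 x 1000 (1.0 s), data loaded>"
wrong = "<RawArray | 1 x 9999 (99.0 s), data loaded>"

# Used directly as a pattern, "|" splits it into two alternatives,
# so anything containing "<RawArray " is accepted.
print(re.search(expected, wrong) is not None)  # True

# Escaping the expected text makes the whole string significant.
print(re.search(re.escape(expected), wrong) is None)  # True
print(re.search(re.escape(expected), expected) is not None)  # True
```

Escaping the expected string (or comparing with plain equality) is one way to make sure the numbers inside the repr are actually checked.
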

The new duration property represents the duration in seconds of the acquisition.

Centralize duration calculation for the textual (__repr__) and html
(_repr_html_, used by e.g. Jupyter) display of mne.io.BaseRaw instances
using the new property.

Extract the method to get the human-readable string from the duration
property for use by the html representation, so it's easier to test later.

The duration calculation, used in `BaseRaw._repr_html_()` and
`BaseRaw.__repr__()`, was taking the timestamp of the last sample as the
duration of the acquisition, but was not accounting for the length of
the last sample.

Also, added tests for the refactored `BaseRaw.duration` property and
`BaseRaw._get_duration()` method, and used a sfreq value that revealed
the discrepancy in the duration calculation in the `BaseRaw.__repr__()`
method.

Finally, simplified the duration string calculation for the html display
by rounding up all the duration seconds, not just the remainder after
hour and minute calculations, thereby avoiding "00:01:60" calculations,
which should have been "00:02:00", when there are fractions of a second
remaining.

Fixes: mne-tools#12954
@leorochael leorochael changed the title Fix duration calculation for BaseRAW._html_repr_ Fix duration calculation for BaseRAW "repr/html" display Nov 14, 2024
@leorochael leorochael changed the title Fix duration calculation for BaseRAW "repr/html" display Add BaseRaw.property and fix duration calculation for BaseRAW "repr/html" display Nov 14, 2024
@leorochael
Contributor Author

I made the suggested changes, both to the towncrier files and by eliminating the timedelta from the calculation.

Moving the np.ceil() call earlier also resolved some other nonsensical time rendering ("00:01:60" rather than "00:02:00" when rounding up fractions of a second).

I also enhanced the repr/duration-string tests with parametrize to test different situations.
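
The rounding problem mentioned above can be shown with two small formatting helpers. Both functions are hypothetical sketches written for this example (using `math.ceil` rather than `np.ceil` to stay in the standard library); they are not the code that was merged.

```python
import math


def format_ceil_last(duration):
    """Round up only the leftover seconds (can produce ':60')."""
    hours, remainder = divmod(duration, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02.0f}:{minutes:02.0f}:{math.ceil(seconds):02.0f}"


def format_ceil_first(duration):
    """Round the total up to whole seconds, then split it."""
    total = math.ceil(duration)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(format_ceil_last(119.5))  # 00:01:60
print(format_ceil_first(119.5))  # 00:02:00
```

Rounding before splitting lets the carry propagate into the minutes and hours fields, which is why the order of operations matters here.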

Still need to understand the CI failures that are happening...

@leorochael
Contributor Author

leorochael commented Nov 14, 2024

According to this, it seems the new duration property reference cannot be found, but I don't understand where else I should declare it...

Member

@larsoner larsoner left a comment

Will push a commit once local build confirms it should fix things then mark for merge-when-green since @drammock's comments have been addressed. Thanks in advance @leorochael !

@leorochael leorochael changed the title Add BaseRaw.property and fix duration calculation for BaseRAW "repr/html" display Add BaseRaw.property and fix duration calculation for BaseRaw "repr/html" display Nov 14, 2024
@leorochael leorochael changed the title Add BaseRaw.property and fix duration calculation for BaseRaw "repr/html" display Add BaseRaw.duration property and fix duration calculation for BaseRaw "repr/html" display Nov 14, 2024
@leorochael
Contributor Author

All checks have passed. Should I rebase?

@drammock
Member

All checks have passed. Should I rebase?

no need. GitHub will squash-merge anyway. Thanks @leorochael!

@drammock drammock merged commit 060b600 into mne-tools:main Nov 14, 2024
28 checks passed
welcome bot commented Nov 14, 2024

🎉 Congrats on merging your first pull request! 🥳 Looking forward to seeing more from you in the future! 💪

Development

Successfully merging this pull request may close these issues.

The duration calculation for display in BaseRAW._repr_html_ doesn't account for the length of the last sample.
5 participants