Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fatal Python error: Aborted when deleting span from SpanGroup #13638

Open
NixBiks opened this issue Sep 27, 2024 · 5 comments
Open

Fatal Python error: Aborted when deleting span from SpanGroup #13638

NixBiks opened this issue Sep 27, 2024 · 5 comments

Comments

@NixBiks
Copy link
Contributor

NixBiks commented Sep 27, 2024

How to reproduce the behaviour

Unfortunately I haven't been able to make a minimal reproducible example. The error happens sometimes when I run my tests - the more tests I run the higher probability of running into the following critical error

Python(69256,0x1fa11cf40) malloc: Region cookie corrupted for region 0x12f800000 (value is c)[0x12f80407c]
Python(69256,0x1fa11cf40) malloc: *** set a breakpoint in malloc_error_break to debug
Fatal Python error: Aborted

I've found out that I don't encounter the error if I remove the following line.

del span.doc.spans["my_spans"][index]

The documentation states this is the way to do it though. Note there is a typo in the code in the docs though.

I realise you most likely can't help with this little info but maybe you can give me some ways to investigate further.

Info about spaCy

  • spaCy version: 3.7.4
  • Platform: macOS-14.6.1-arm64-arm-64bit
  • Python version: 3.12.5
@NixBiks
Copy link
Contributor Author

NixBiks commented Sep 30, 2024

My current workaround is to create a custom span attribute and filter on that.

Span.set_extension("is_removed", default=False)

Instead of removing the object with del I simply change that custom attribute span.doc.spans["my_spans"][index]._.is_removed = True

@honnibal
Copy link
Member

honnibal commented Oct 2, 2024

Thanks for the report. I'll have a look at this.

@honnibal
Copy link
Member

honnibal commented Oct 2, 2024

Is it possible you're removing an element during iteration? From the implementation this wouldn't be expected to work: https://github.com/explosion/spaCy/blob/master/spacy/tokens/span_group.pyx#L113

@NixBiks
Copy link
Contributor Author

NixBiks commented Oct 2, 2024

It's inside some iteration but I'm not iterating over the spans in the SpanGroup no. I've iterated earlier to find the index that needs to be removed though. I hope it makes sense.

@honnibal
Copy link
Member

honnibal commented Oct 2, 2024

I think it makes sense yeah. The pattern you want to avoid is something like:

for span in span_group:
    if check_remove(span):
        del span_group[span]

Deleting during iteration like that won't work, and of course precalculating an index and not adjusting it after you delete will cause problems too. But the latter one shouldn't cause an out-of-bounds access. Unless there's a bug in how we check the indexing, it should only be possible to get the out-of-bounds access from within the class's C methods.

These are the tests we have for the __del__ method, which look okay to me: https://github.com/explosion/spaCy/blob/master/spacy/tests/doc/test_span_group.py#L191

Usually if we have a memory access problem in the tests we do get a crash, because we've got a lot of tests that run in various orders. So it must be some sort of access pattern that's not working. Are you able to share any code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants