Fix crash when opening a large one-line file (500MB) on 32-bit, and improve loading time #329

Open
wants to merge 1 commit into
base: master


@jofon jofon commented Sep 5, 2023

Problem:
32-bit Notepad++ crashes when opening large one-line files.
Example issue: notepad-plus-plus/notepad-plus-plus#11427

Steps to reproduce problem:
Take the file from notepad-plus-plus/notepad-plus-plus#10407 (comment)
Duplicate its contents on the same line until you have a 500 MB file.
Try to open it on 32-bit N++. It should crash.

Problem in code:
Calling get_mapped_wstring_range for the entire 500 MB line places all 500 MB in a single buffer. On 32-bit, this crashes. On 64-bit, it is very slow.

Proposed fix:
Merged and modified get_visible_text and underline_misspelled_words
Doesn't get the entire line at once; instead, it now works in blocks of 4096 characters
Takes into account visible lines, horizontal scroll, and the end of the visible text in a line
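
To illustrate the block-wise idea, here is a minimal sketch, not the code from this PR. It assumes the text is available as a `std::wstring`; in the plugin each block would be fetched from the editor on demand instead. The names `Block` and `split_into_blocks` are hypothetical. Each block is extended to the end of the word it would otherwise cut in half, so a tokenizer never sees a split word.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical type: a half-open range [start, start + length) in the text.
struct Block {
  std::size_t start;
  std::size_t length;
};

// Hypothetical helper: splits [begin, end) of `text` into blocks of about
// `block_size` characters. A block that would end inside a word is extended
// to the end of that word.
std::vector<Block> split_into_blocks(const std::wstring& text,
                                     std::size_t begin, std::size_t end,
                                     std::size_t block_size = 4096) {
  assert(block_size > 0);
  if (end > text.size()) end = text.size();
  std::vector<Block> blocks;
  std::size_t pos = begin;
  while (pos < end) {
    std::size_t stop = (end - pos > block_size) ? pos + block_size : end;
    // Move the block end forward until it no longer sits on a letter.
    while (stop < end && std::iswalpha(text[stop])) ++stop;
    blocks.push_back(Block{pos, stop - pos});
    pos = stop;
  }
  return blocks;
}
```

Note that a single "word" with no separators at all would still end up in one block, so a real implementation would probably want an upper bound on how far a block may be extended.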

Changed is_word_under_cursor_correct to check prev token and next token from the current position instead of using the entire line
Also added a protection for when the document isn't loaded by N++
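
The cursor check can be sketched in the same spirit: scan outward from the cursor position until a non-letter is hit on each side, instead of tokenizing the whole line. This is an illustrative sketch only; `word_around` and the `max_lookaround` cap are assumptions, not names from the plugin, and the real code works with tokens obtained from the editor rather than a `std::wstring`.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>

// Hypothetical helper: returns the word that contains `pos`, looking at most
// `max_lookaround` characters to either side. Returns an empty string when
// `pos` is out of range or not on a letter.
std::wstring word_around(const std::wstring& text, std::size_t pos,
                         std::size_t max_lookaround = 256) {
  if (pos >= text.size() || !std::iswalpha(text[pos])) return std::wstring();

  // Scan left for the start of the word.
  std::size_t lower = pos > max_lookaround ? pos - max_lookaround : 0;
  std::size_t first = pos;
  while (first > lower && std::iswalpha(text[first - 1])) --first;

  // Scan right for the end of the word (one past the last letter).
  std::size_t upper = text.size() - pos > max_lookaround
                          ? pos + max_lookaround
                          : text.size();
  std::size_t last = pos + 1;
  while (last < upper && std::iswalpha(text[last])) ++last;

  return text.substr(first, last - first);
}
```

The work done is bounded by `max_lookaround` regardless of line length, which is the property that matters for a 500 MB single-line file.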

Result of fix:
No crash, and a loading time of around 30 seconds in Debug. N++ might not render the file, but at least it's not crashing.

I worked on this a month ago and ended up leaving it aside when I started looking at other functions that load too much into memory and also crash (or throw a "bad allocation" exception):
erase_all_misspellings
get_all_misspellings_as_string
mark_lines_with_misspelling

Don't think I'll have time to work on them, though.

@Predelnik
Owner

Thank you very much for all the work. Unfortunately, I might not be able to look at it in detail right now, but I will try to get to it in a reasonable time.

It bothers me slightly that such issues were not reported to the plugin's issue list; unfortunately, I don't look at Notepad++'s own issues that much.

@Predelnik
Owner

Strangely enough, I see somewhat different results:

  • Even with a 100 MB file, 32-bit Notepad++ without spell checking enabled throws quite a few std::bad_alloc exceptions; I don't think it would work very well after that. This happens with or without word wrap.
  • Without word wrap, DSpellCheck seems to work fine on the original 32 MB file.
  • With word wrap, I do observe the bad slowdown on the original file with spell checking enabled.
  • However, I tried your fix and it does not seem to work well for me: it calls GetMappedWstring first with the parameters 0, 4096 and then with 4097, 32146539.
  • Maybe I'm doing something differently; I can't be sure.

Anyway, I will try to look into how to fix it, but the fact that you successfully opened a 500 MB file with 32-bit N++ seems surprising to me.
Maybe the other thing we can do is disable spell-checking on big or poorly structured files by default and enable it only via a special option.
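
As a rough sketch of what such a guard could look like: the struct, the function name, and both threshold values below are hypothetical and chosen only for illustration. They are not existing DSpellCheck settings.

```cpp
#include <cstddef>

// Hypothetical limits; the defaults are placeholders, not measured values.
struct SpellCheckLimits {
  std::size_t max_document_bytes = 50u * 1024 * 1024;  // whole file
  std::size_t max_line_bytes = 1u * 1024 * 1024;       // longest single line
};

// Hypothetical helper: decides whether spell-checking should run.
// `user_forced` stands for the "special option" that overrides the limits.
bool should_spell_check(std::size_t document_bytes,
                        std::size_t longest_line_bytes,
                        bool user_forced,
                        const SpellCheckLimits& limits = SpellCheckLimits()) {
  if (user_forced) return true;
  return document_bytes <= limits.max_document_bytes &&
         longest_line_bytes <= limits.max_line_bytes;
}
```

Checking the longest line separately from the total size is meant to cover the "poorly structured" case, where a moderately sized file still consists of one enormous line.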
