api.thread_churn test failing on Windows due to reattach problems #4349

Open

derekbruening (Contributor) opened this issue Jul 1, 2020 · 0 comments
My new test api.thread_churn for #4334 is failing on Windows due to some re-attach issues:

  • .data is protected causing a crash on init
  • failing to take over threads: 80 such messages on the re-attach:
```
<Detaching from process, entering final cleanup>
<Starting application D:\derek\dr\git\build_x64_dbg_tests\suite\tests\bin\api.thread_churn.exe (7644)>
<Running on newer-than-this-build "Microsoft Windows 10-1909 x64">
<Early threads found>
<Initial options = -no_dynamic_options -loglevel 2 -code_api -probe_api -msgbox_mask 0 -dumpcore_mask 125 -stderr_mask 15 -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -no_inline_ignored_syscalls -staged -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<CURIOSITY : (0) && "failed to take over a thread!" in file D:\derek\dr\git\src\core\win32\os.c line 2715
version 8.0.18444, custom build
-no_dynamic_options -loglevel 2 -code_api -probe_api -msgbox_mask 0 -dumpcore_mask 125 -stderr_mask 15 -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -no_inline_ignored_syscalls -staged -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<CURIOSITY : (0) && "failed to take over a thread!" in file D:\derek\dr\git\src\core\win32\os.c line 2715
version 8.0.18444, custom build
-no_dynamic_options -loglevel 2 -code_api -probe_api -msgbox_mask 0 -dumpcore_mask 125 -stderr_mask 15 -stack_size 56K -max_elide_jmp 0 -max_elide_call 0 -no_inline_ignored_syscalls -staged -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
<...>
```

I tried to use logging to diagnose but hit yet another problem there.

It's not clear what's going on. The test makes 10 threads in the first burst. Is some state left over? Fixing logging would make it much easier to diagnose. For now I'm disabling the test on Windows.
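
For reference, the test's churn pattern boils down to attach/detach cycles wrapped around short bursts of worker threads. Below is a minimal sketch using DR's start/stop API; the thread count matches the description above, but the helper and overall structure are illustrative, not the actual test source:

```c
#include <windows.h>
#include "dr_api.h"
#include "dr_app.h" /* dr_app_setup_and_start(), dr_app_stop_and_cleanup() */

#define BURST_THREADS 10 /* "10 threads in the first burst" per above */

static DWORD WINAPI
worker(LPVOID arg)
{
    return 0; /* short-lived thread: exits immediately */
}

static void
thread_burst(int count)
{
    HANDLE threads[BURST_THREADS];
    for (int i = 0; i < count; i++)
        threads[i] = CreateThread(NULL, 0, worker, NULL, 0, NULL);
    WaitForMultipleObjects(count, threads, TRUE, INFINITE);
    for (int i = 0; i < count; i++)
        CloseHandle(threads[i]);
}

int
main(void)
{
    /* First attach/detach cycle. */
    dr_app_setup_and_start();
    thread_burst(BURST_THREADS);
    dr_app_stop_and_cleanup();
    /* Re-attach: this is where the .data crash and the
     * "failed to take over a thread" curiosities appear. */
    dr_app_setup_and_start();
    thread_burst(BURST_THREADS);
    dr_app_stop_and_cleanup();
    return 0;
}
```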

derekbruening added a commit that referenced this issue Jul 2, 2020
These leaks are all cleaned up at process exit, which is why the
existing exit-time leak checks don't notice anything.  The global
block list lets us clean them up later, and hence a classic
reachability-based leak detector would not catch them either: so
perhaps they should be called "reachable accumulations" instead of
"leaks".

Fixes a memory leak on thread exit in all builds:
+ The private reachable_heap units were not being freed

Fixes a number of memory leaks on thread exit in release build:
+ Adds individual private fragment deletion on thread exit, to free
  stubs if -separate_private_stubs is on.
  Turns off -separate_private_stubs by default, to avoid this for
  shared caches where there are few benefits to separating private stubs.
  -thread_private turns it back on where the benefits probably outweigh
  the thread exit costs.
+ Moves the privload_tls_exit call to release build too to properly
  unmap the TLS.
+ Moves the sigaltstack free to release build to properly free it.
+ Moves freeing of "local unprotected" heap to the release-build path,
  since it is actually global!  This is done for kstats, stats, and
  clone_tls.
+ Moves client_data_t, client_todo_list_t, and thread-private
  client_flush_req_t to be PROTECTED to make them thread-local and thus
  not need freeing (the free is avoided for i#271), keeping their
  freeing DEBUG-only.  (The DEBUG-only-versus-all-builds pattern behind
  several of these items is sketched just after this list.)
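
Several of these items share one shape: a cleanup that used to be compiled only under DEBUG (where exit-time leak checks need the heap fully torn down) now runs in release builds as well, so repeated attach/detach cycles stop accumulating per-thread leftovers. A self-contained, purely hypothetical illustration of that shape (none of these names are DR's actual internals):

```c
#include <stdlib.h>

typedef struct {
    void *tls_segment;   /* hypothetical stand-in for the privload TLS mapping */
    void *sig_alt_stack; /* hypothetical stand-in for the sigaltstack allocation */
} per_thread_t;

static void
thread_exit_cleanup(per_thread_t *pt)
{
    /* Before the fix, frees like these sat inside #ifdef DEBUG, so
     * release builds leaked them on every thread exit.  After the fix
     * they run unconditionally: */
    free(pt->tls_segment);
    free(pt->sig_alt_stack);
}
```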

Adds an -rstats_to_stderr dump point when DR terminates for OOM to
make it much easier to diagnose what the problem is.

Adds a new api.thread_churn test which attaches twice, once with a few
threads and once with many threads, and confirms memory usage has not
gone up.  The 5 rstats on peak vmm block sizes are added to dr_stats_t
to facilitate this.
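
The end-to-end check the test performs can then be a simple before/after comparison of those stats. A sketch, assuming dr_get_stats() plus one of the new peak-vmm fields; the exact field names added to dr_stats_t are not quoted in this commit message, so the one below is an assumption:

```c
#include "dr_api.h" /* dr_stats_t, dr_get_stats() */

/* Returns one peak-vmm-blocks rstat, or 0 on failure.  The field name
 * is assumed; the commit only says 5 peak-vmm-block rstats were added. */
static uint64
peak_vmm_blocks(void)
{
    dr_stats_t stats = { sizeof(dr_stats_t) }; /* .size must be set before the call */
    if (!dr_get_stats(&stats))
        return 0;
    return stats.peak_vmm_blocks_unreach_heap; /* assumed field name */
}
```

The test would record this after the small burst and assert it has not grown after the large one.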

I tried to get api.thread_churn to work on Windows but hit a number of
issues with reattach.  I solved two of them before bailing and making
the test UNIX-only for now and filed the rest as #4349.
+ Unprotect .data on reattach (sketched after this list)
+ Clear the logfile flag on detach
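
The .data item boils down to restoring writability before init writes to self-protected globals. A hypothetical Windows sketch of just that step (the symbols are illustrative; the real change lives in DR's self-protection code):

```c
#include <windows.h>

/* On re-attach, the .data section can still carry the read-only
 * protection applied during the previous run's self-protection, so the
 * first write during init crashes.  Restore PAGE_READWRITE first: */
static BOOL
unprotect_data_section(void *data_start, SIZE_T data_size)
{
    DWORD old_prot;
    return VirtualProtect(data_start, data_size, PAGE_READWRITE, &old_prot);
}
```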

Issue: #4334, #271, #4349
Fixes #4334