You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The latest DR TLS changes from #2089 break the DrM nudge test:
Trying to update the DR used by DrM results in the nudge test failing (I
had to add custom output to see the nudge-out file contents):
62: <received nudge mask=0x40000 id=0x00000000 arg=0x0000000000000000>
62: <ERROR: master_signal_handler with no siginfo (i#26?): tid=16408, sig=4>
62: |
62:
62: CMake Error at runtest.cmake:249 (message):
62: Timed out waiting for summary output
62:
master_signal_handler: sig=4, retaddr=0xf776bbd0
handle_nudge_signal: sig=4 code=-1 errno=268697600
received nudge version=1 flags=0x0 mask=0x40000 id=0x00000000 arg=0x0000000000000000
SYSLOG_INFORMATION: received nudge mask=0x40000 id=0x00000000 arg=0x0000000000000000
master_signal_handler 4 returning now
Exit from F1647(0x08048901).0xe79cbcc0 (shared)
(UNKNOWN DIRECT EXIT F1647.0xe79cbcc0->F1647)
dispatch: target = 0x08048901
synch with all threads my id = 17188 Giving 4 permission and seeking 4 state
Skipping synch with thread 17188
Finished synch with all threads: result=1
returning holding initexit_lock and all_threads_synch_lock
os_switch_seg_to_base: switching to app, setting gs to 0x63
os_switch_seg_to_base to app: set_thread_area successful for thread 17188 base 0x0804cc80
TLS copy is 0xe77516e4
os_switch_seg_to_base: switching to dr, setting fs to 0x73
os_switch_seg_to_base to DR: set_thread_area successful for thread 17188 base 0xe77516e4
=> <failure here>
os_switch_seg_to_base: switching to dr, setting gs to 0x63
os_switch_seg_to_base to DR: set_thread_area successful for thread 17188 base 0xe77c2b70
os_switch_seg_to_base: switching to dr, setting fs to 0x73
os_switch_seg_to_base to DR: set_thread_area successful for thread 17188 base 0xe77bd000
os_file_exists failed: 0xfffffffe
os_file_exists failed: 0xfffffffe
os_file_exists failed: 0xfffffffe
Just flushed targetf, next_tag is 0x08048901
Entry into F1647(0x08048901).0xe79cbc77 (shared)
The failure happens when the 2nd nudge arrives when the thread is in app state.
The #2089 changes mean that a nudge signal arriving in a native thread
will always cause this assert. Previously dr_switch_to_app_state() did
nothing on Linux which is why nothing fired.
For the DrM failure, it's a leak scan nudge, which calls
dr_switch_to_app_state() for Windows TEB. For now I will just have DrM
only do the swap to app state on Windows. Longer term we need a better
signal-in-cur-app-thread strategy.
We have another short-term problem too: alloc_replace.c calls dr_switch_to_app_state() all over the place.
Maybe dr_switch_to_app_state() for a full-control client shouldn't swap the DR TLS on Linux.
Though that won't help solve the start/stop signal handling it will avoid this regression. I can add a dr_state_flags_t controlling it, as a negative to keep compatibility, or only maintaining source compat.
This all means there's also a racy window pre-clone where a nudge or other signal will fail, so we want a broader solution.
549ce61 disables DR TLS swapping by default for clients. Leaving this issue open for more general handling including covering signals during the mentioned race window.
The latest DR TLS changes from #2089 break the DrM nudge test:
Trying to update the DR used by DrM results in the nudge test failing (I
had to add custom output to see the nudge-out file contents):
I can repro like this:
The failure happens when the 2nd nudge arrives when the thread is in app state.
The #2089 changes mean that a nudge signal arriving in a native thread
will always cause this assert. Previously dr_switch_to_app_state() did
nothing on Linux which is why nothing fired.
For the DrM failure, it's a leak scan nudge, which calls
dr_switch_to_app_state() for Windows TEB. For now I will just have DrM
only do the swap to app state on Windows. Longer term we need a better
signal-in-cur-app-thread strategy.
Xref the related #1921.
The text was updated successfully, but these errors were encountered: