code_api|pthreads.ptsig failing on x86 Travis #2921

Closed
toshipiazza opened this issue Apr 11, 2018 · 3 comments

Comments

@toshipiazza
Contributor

Seen once in #2906

 93/294 Test  #88: code_api|pthreads.ptsig ..........................................***Failed  Required regular expression not found.Regex=[^Estimation of pi is 3\.142425985001098
$
]  0.64 sec
<ERROR: master_signal_handler with no siginfo (i#26?): tid=19519, sig=10>
Estimation of pi is 3.142425985001098
hgreving2304 self-assigned this Jan 22, 2019
hgreving2304 pushed a commit that referenced this issue Jan 29, 2019
The test occasionally hit the case where an async signal was received while the main thread was exiting. The patch adds a global variable that is used to wait until the signal has been received.

Fixes #2921
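
A minimal sketch of the wait pattern this commit message describes, assuming a handler that sets a global flag and a waiter that spins until the flag is set. The names `signal_received`, `install_handler`, and `wait_until_signal_received` are hypothetical and are not taken from the ptsig test source; in the real test the signal is sent by the child threads rather than raised locally.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical global flag: set by the handler, polled by the main thread. */
static volatile sig_atomic_t signal_received = 0;

/* Handler only performs an async-signal-safe store to the flag. */
static void on_signal(int sig)
{
    (void)sig;
    signal_received = 1;
}

/* Installs on_signal for `sig`. Returns 0 on success, -1 on failure. */
int install_handler(int sig)
{
    struct sigaction act;
    act.sa_handler = on_signal;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    return sigaction(sig, &act, NULL);
}

/* Called by the main thread before it exits, so that the signal cannot
 * arrive while the thread is already tearing down. */
void wait_until_signal_received(void)
{
    while (!signal_received)
        sched_yield();
}
```

Calling `wait_until_signal_received()` just before the main thread returns is what closes the window in which the signal would otherwise land on an exiting thread.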
@hgreving2304

hgreving2304 commented Jan 31, 2019

There are 2 separate problems with this test.

First problem. Each of the 2 threads it spawns sends a signal. Occasionally the signal is delayed long enough that the receiving main thread is already on its way to exit, and we receive the signal w/o a dcontext.

Second problem. The 2 threads are spawned one right after the other, and it appears that occasionally the first child thread sends its signal so early that it is received in the main thread while the main thread is cloning the second child thread. Specifically, the signal arrives while the thread is cloning and memcpy'ing its tls memory, with its tls magic set to TLS_MAGIC_INVALID. DR's signal handler code only handles the case where a SUSPEND signal arrives while a thread is in this temporary state. Other signals cause the error above.
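
To make the second problem concrete, here is a toy model of the dispatch decision described above, as it stood before the fix. It is a sketch under stated assumptions, not DR's actual handler: the struct layout, the magic values, and every name in it (`sketch_tls`, `classify_signal`, the `ACTION_*` constants) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder sentinel values; DR's real TLS magic constants differ. */
enum {
    SKETCH_TLS_MAGIC_VALID = 0x1111,
    SKETCH_TLS_MAGIC_INVALID = 0x2222
};

/* Hypothetical stand-in for a thread's TLS block. */
typedef struct {
    uint32_t magic;
    void *dcontext;
} sketch_tls;

typedef enum {
    ACTION_DELIVER,       /* normal path: the thread has a usable dcontext */
    ACTION_REPLY_SUSPEND, /* suspend request handled without a dcontext */
    ACTION_ERROR          /* "master_signal_handler with no siginfo" path */
} sketch_action;

/* Models the pre-fix behavior described in the comment: with an invalid
 * magic, only the suspend signal is handled and anything else is an error. */
sketch_action classify_signal(const sketch_tls *tls, int sig, int suspend_sig)
{
    if (tls != NULL && tls->magic == SKETCH_TLS_MAGIC_VALID &&
        tls->dcontext != NULL)
        return ACTION_DELIVER;
    if (sig == suspend_sig)
        return ACTION_REPLY_SUSPEND;
    return ACTION_ERROR;
}
```

In this model the failing run corresponds to an ordinary app signal reaching a thread whose magic is still the invalid sentinel, which falls through to `ACTION_ERROR`.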

@derekbruening
Contributor

The 2nd problem: xref #1921, #1242. It is a regression from #2089's changes: before that this would not have been a problem.

@derekbruening
Contributor

Also xref #2400

hgreving2304 pushed a commit that referenced this issue Feb 1, 2019
Fixes an asynch signal arriving late when a thread is on its way to exit by blocking all signals during exit. This is ok because we're past the app's thread_exit system call. When doing a detach, the app's signal mask is restored before going native. For terminate events using kill, we are not blocking the kill signal. The suspend signal is also excluded because a detaching thread may try to synchronize with an exiting thread, and as long as the signal is getting delivered to this thread, we need to reply to it from the signal handler.

Adds support to the signal handler for an asynch signal arriving in the middle of a clone system call within DR, while the spawning thread's tls magic is invalid. We're now detecting this condition in the signal handler, but we need to be careful in the future that no thread is added to the global all_threads list while its tls magic is invalid.

Fixes #2921
hgreving2304 pushed a commit that referenced this issue Feb 5, 2019
…one syscall (#3364)

Fixes an asynch signal arriving late when thread is on its way to exit by blocking all signals during exit. This is ok, because we're at the app's thread_exit system call. When doing a detach, the app's signal mask is restored before going native. For terminate events using kill, we are not blocking the kill signal. The suspend signal is also excluded, because a detaching thread may try to synchronize with an exiting thread and as long as the signal is getting delivered to this thread, we need to reply to it from the signal handler.

Adds support to the signal handler for an asynch signal arriving in the middle of a clone system call or temporarily-native thread, while the spawning thread's tls magic sentinel is invalid. We're now detecting this condition in the signal handler by making assumptions as stated in this patch.

Fixes #2921
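
The "block all signals during exit, except a few" idea from the commit message above can be sketched with the standard POSIX signal-set calls. This is an illustration only, assuming the caller passes in the signals that must stay deliverable (the suspend signal and the signal used for terminate events); `build_exit_mask` and `block_signals_for_exit` are hypothetical names, not DR functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Fills `mask` with every signal, then removes the ones listed in `keep`
 * so they stay deliverable. Returns 0 on success, -1 on failure. */
int build_exit_mask(sigset_t *mask, const int *keep, size_t n_keep)
{
    if (sigfillset(mask) != 0)
        return -1;
    for (size_t i = 0; i < n_keep; i++) {
        if (sigdelset(mask, keep[i]) != 0)
            return -1;
    }
    return 0;
}

/* Installs the exit mask and saves the previous mask in `old_mask`, so it
 * can be restored later (for example before going native on detach).
 * sigprocmask keeps the sketch free of a pthread dependency; a
 * multithreaded program would normally call pthread_sigmask instead. */
int block_signals_for_exit(const int *keep, size_t n_keep, sigset_t *old_mask)
{
    sigset_t mask;
    if (build_exit_mask(&mask, keep, n_keep) != 0)
        return -1;
    return sigprocmask(SIG_SETMASK, &mask, old_mask);
}
```

Saving the old mask mirrors the note that the app's signal mask is restored before going native during a detach.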

3 participants