Thread takeover on attach fails b/c SIGUSR2 is blocked: switch to another signal? #5458
Comments
Changes the signal that DR uses to suspend a thread from SIGUSR2, which is sometimes blocked by the app at attach time, to SIGFPE, which as a fatal normally-synchronous signal is less likely to be blocked. Manually tested on an attach to mysqld which failed with SIGUSR2 but succeeds now. Fixes #5458
Unfortunately it's looking like QEMU does not handle DR sending SIGFPE via SYS_kill: it just crashes right up front.
Could try:
Maybe the best thing is to change it from a compile-time constant to a
I think it is a little too complex to have a runtime option-controlled signal with all the constraints: going to go with hardcoded SIGSTKFLT on Linux and SIGFPE on Mac for now.
Changes the signal that DR uses to suspend a thread from SIGUSR2, which is sometimes blocked by the app at attach time, to SIGSTKFLT on Linux and SIGFPE on Mac. (SIGFPE was the first choice on Linux but QEMU crashes when we use it.) These are fatal normally-synchronous signals and so are less likely to be blocked. Manually tested on an attach to mysqld which failed with SIGUSR2 but succeeds now. Fixes #5458
Attaching (via ptrace #38 or for statically-linked DR) to a process that has masked most non-fatal signals fails to take over the rest of the app threads. We could try to use ptrace to take them over, but that is difficult for the static-link case. Or we could switch from SIGUSR2 to a signal less likely to be masked, like SIGFPE. We would distinguish our suspend signal from a genuine synchronous one by looking at si_code (set as far back as the 2.2 kernel; some other siginfo fields were unreliable back then but not this one) and other fields.
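As a rough illustration of the si_code check described above, here is a minimal Linux-only sketch. It is not DR's actual code: the helper names `signal_was_sent`, `record_origin`, and `raise_and_classify` are hypothetical, and the sketch assumes that a signal sent via kill, sigqueue, or tgkill carries SI_USER, SI_QUEUE, or SI_TKILL respectively, while a real arithmetic fault carries an FPE_* code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the signal was sent by a process or thread
 * (kill, sigqueue, tgkill) rather than raised by a hardware fault. */
static bool
signal_was_sent(const siginfo_t *info)
{
    return info->si_code == SI_USER || info->si_code == SI_QUEUE ||
        info->si_code == SI_TKILL;
}

/* -1 = handler not run, 0 = synchronous fault, 1 = sent by a process. */
static volatile sig_atomic_t last_was_sent = -1;

static void
record_origin(int sig, siginfo_t *info, void *ucxt)
{
    (void)sig;
    (void)ucxt;
    last_was_sent = signal_was_sent(info) ? 1 : 0;
}

/* Installs a temporary SA_SIGINFO handler, sends sig to the calling thread,
 * and reports how the handler classified it. */
static int
raise_and_classify(int sig)
{
    struct sigaction act, old;
    sigset_t set;

    assert(sig > 0);
    memset(&act, 0, sizeof(act));
    act.sa_sigaction = record_origin;
    act.sa_flags = SA_SIGINFO;
    sigemptyset(&act.sa_mask);
    if (sigaction(sig, &act, &old) != 0)
        return -1;

    /* Make sure the signal is deliverable in this thread. */
    sigemptyset(&set);
    sigaddset(&set, sig);
    sigprocmask(SIG_UNBLOCK, &set, NULL);

    last_was_sent = -1;
    raise(sig); /* Handler runs before raise() returns. */
    sigaction(sig, &old, NULL);
    return last_was_sent;
}
```

Calling `raise_and_classify(SIGFPE)` should report the signal as sent rather than synchronous, which is the distinction a suspend-signal handler would need before deciding whether to pass the signal on to the app.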
E.g., this is hit attaching to mysqld, which blocks all non-fatal signals. The ptrace attach succeeds but then DR's takeover times out and fails.
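To make the failure mode concrete, the following sketch shows a blocked signal staying pending rather than being delivered, which is presumably why a SIGUSR2-based takeover never reaches a thread that has masked it. The function name `blocked_signal_stays_pending` is hypothetical, and the sketch only exercises the calling thread, not another thread in a target process.

```c
#define _GNU_SOURCE
#include <signal.h>

/* Blocks sig in the calling thread, sends it, and returns 1 if it is left
 * pending (i.e., not delivered), 0 if not pending, -1 on error. */
static int
blocked_signal_stays_pending(int sig)
{
    sigset_t set, old, pending;
    int got = 0;
    int is_pending;

    sigemptyset(&set);
    sigaddset(&set, sig);
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    raise(sig); /* Queued, but not delivered while blocked. */

    if (sigpending(&pending) != 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    is_pending = sigismember(&pending, sig);

    /* Consume the pending signal so it is not delivered on unblock. */
    if (is_pending == 1)
        sigwait(&set, &got);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return is_pending;
}
```

An app that blocks every non-fatal signal in all of its threads leaves a suspend request in this pending state indefinitely, so the sender can only time out.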