i#5505 PT tracing: Skip interrupted thread-final syscall trace #7027
Merged
Skips dumping the PT trace for the thread-final syscall. Syscall PT traces from thread-final syscalls such as futex and epoll_wait have been observed to fail to decode. They also do not represent correct app behavior, as they were interrupted by the detach signal, so we skip dumping them to the raw trace. Issue: #5505
abhinav92003 changed the title from "i#5505 PT tracing: Skip thread-final syscall trace" to "i#5505 PT tracing: Skip interrupted thread-final syscall trace" on Oct 8, 2024
derekbruening approved these changes on Oct 8, 2024
abhinav92003 added a commit that referenced this pull request on Oct 10, 2024
Adds a test where one of the threads is waiting on a futex when detach occurs. PT traces for such futex syscalls have been observed to fail in libipt decode. We also do not want such PT traces because they do not represent real app behavior, as the syscall was interrupted by DR's detach signal. #7027 added logic to skip them from the written trace. This PR adds a regression test.

Unfortunately this test still does not reproduce the original libipt decode issue that was seen on a large app. Most errors seen were on a modified kernel and only a few on a regular futex. But it is still useful to add this test that ensures that the thread-final interrupted syscall is skipped.

This test also uncovers a possible transparency violation seen in the behavior of an interrupted-and-restarted futex call, where the blocked thread doesn't remember that it was supposed to wait on a different futex specified by a later FUTEX_CMP_REQUEUE call than the one specified by it in the original futex syscall.

Since the new test requires Intel-PT, verified that it passes by running it manually locally:

```
The following tests passed:
    code_api|tool.drcacheoff.burst_syscall_pt_SUDO

The following tests passed:
    code_api|tool.drcacheoff.kernel.simple_SUDO
    code_api|tool.drcacheoff.kernel.opcode-mix_SUDO
    code_api|tool.drcacheoff.kernel.syscall-mix_SUDO
    code_api|tool.drcacheoff.kernel.invariant-checker_SUDO
```

Issue: #5505
Issue: #7034
Skips dumping the PT trace for the interrupted thread-final syscall. Syscall PT traces from interrupted thread-final syscalls such as futex and epoll_wait have been observed to fail to decode, with libipt reporting pte_bad_context. They also do not represent correct app behavior, as they were interrupted by the detach signal, so we skip dumping them to the raw trace.
Verified on a system that supports Intel-PT that the relevant tests continue to pass.
Also verified on a large app that user+syscall PT traces gathered with this change do not have the decode issue previously seen on the PT trace of the interrupted last syscall.
Issue: #5505
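
The skip described above can be sketched roughly as follows. This is only an illustration and not the code in this PR: `syscall_pt_record_t`, its `returned` flag, and `select_records_to_dump` are hypothetical names, and the sketch assumes the tracer can tell at thread exit whether the thread ever resumed from its last syscall.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-syscall record; not DynamoRIO's actual data structure.
struct syscall_pt_record_t {
    int sysnum;                   // Syscall number (illustrative values only).
    bool returned;                // False if the thread never resumed from the syscall.
    std::vector<uint8_t> pt_data; // Raw PT bytes collected for this syscall.
};

// Returns the records that would be written to the raw trace at thread exit.
// Only the last record is a candidate for skipping, and only when it was
// interrupted (assumed here to mean it never returned before detach).
std::vector<syscall_pt_record_t>
select_records_to_dump(const std::vector<syscall_pt_record_t> &records)
{
    std::vector<syscall_pt_record_t> kept;
    for (std::size_t i = 0; i < records.size(); ++i) {
        const bool is_thread_final = (i + 1 == records.size());
        if (is_thread_final && !records[i].returned)
            continue; // Interrupted thread-final syscall: do not dump its PT data.
        kept.push_back(records[i]);
    }
    return kept;
}

// Usage example: the second (final) syscall was interrupted, so only the
// first record is kept.
void
example_usage()
{
    std::vector<syscall_pt_record_t> records = {
        { 1, true, { 0x02, 0x82 } },
        { 2, false, { 0x02 } },
    };
    std::vector<syscall_pt_record_t> kept = select_records_to_dump(records);
    assert(kept.size() == 1);
    assert(kept[0].sysnum == 1);
}
```

A thread-final syscall that completed normally is still dumped in this sketch; only the interrupted one is dropped, matching the title change from "thread-final" to "interrupted thread-final".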