-
Notifications
You must be signed in to change notification settings - Fork 570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build on musl libc systems #2916
Conversation
While this can succesfully build against musl, there are probably some things broken at runtime.
Presumably it's the core/lib/globals_shared.h modification making the travis build sad? The reason for that change was that pid_t was undefined for me without <sys/types.h>. Possibly just leaving out the <signal.h> inclusion would avoid the ucontext stuff. |
run aarch64 tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your PR. There are some build failures on Travis with this patch. Could you take a look at those?
core/lib/globals_shared.h
Outdated
@@ -120,12 +120,12 @@ | |||
# define X86_64 | |||
#endif | |||
|
|||
/* DR_API EXPORT VERBATIM */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed? Do we really want to expose those headers as part of the DynamoRIO API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently, using musl headers pid_t is undefined without including <sys/types.h>
(pid_t being required to define DR's process_id_t)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I don't believe there is any reason to include <signal.h>, and will remove that.)
core/unix/os.c
Outdated
@@ -6191,6 +6191,7 @@ handle_close_pre(dcontext_t *dcontext) | |||
|
|||
/* Xref PR 258731 - duplicate STDOUT/STDERR when app closes them so we (or | |||
* a client) can continue to use them for logging. */ | |||
#ifdef STDFILE_FILENO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should not allow the option in the first place, if we do not support it, see dynamorio/core/optionsx.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks for pointing out the option location.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, as Derek noted below, there's a way to leave this option partially functional, so I'm going to try that first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the patch. I think the dup-on-close feature should remain and only the private libc portion of it needs any change.
core/unix/os.c
Outdated
@@ -6191,6 +6191,7 @@ handle_close_pre(dcontext_t *dcontext) | |||
|
|||
/* Xref PR 258731 - duplicate STDOUT/STDERR when app closes them so we (or | |||
* a client) can continue to use them for logging. */ | |||
#ifdef STDFILE_FILENO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is important functionality for some use cases. Many DR tools on UNIX, like similar tools such as strace
or valgrind
, provide their results by printing to the console, either stdout or stderr. Without this dup-on-close, the output is lost when running apps like shells that like to close these files. Compare:
$ bin64/drrun -stderr_mask 15 -c api/bin/libbbcount.so -- tcsh
<Starting application /bin/tcsh (66788)>
Client bbcount is running
<Stopping application /bin/tcsh (66789)>
Instrumentation results:
1739101 basic block executions
2071 basic blocks needed flag saving
4287 basic blocks did not
% date
<Starting application /bin/date (66790)>
Client bbcount is running
Mon Apr 9 14:18:45 EDT 2018
<Stopping application /bin/date (66790)>
Instrumentation results:
48939 basic block executions
659 basic blocks needed flag saving
2621 basic blocks did not
% exit
exit
<Stopping application /bin/tcsh (66788)>
Instrumentation results:
6347122 basic block executions
3473 basic blocks needed flag saving
7672 basic blocks did not
$ bin64/drrun -stderr_mask 15 -no_dup_stdout_on_close -no_dup_stderr_on_close -c api/bin/libbbcount.so -- tcsh
<Starting application /bin/tcsh (66795)>
Client bbcount is running
<Stopping application /bin/tcsh (66796)>
Instrumentation results:
1739099 basic block executions
2071 basic blocks needed flag saving
4285 basic blocks did not
% date
<Starting application /bin/date (66797)>
Client bbcount is running
Mon Apr 9 14:18:52 EDT 2018
% exit
exit
As you can see, without this feature the tool's results are just lost once in tcsh.
I see no reason to disable the key part of this feature: the dup of STDOUT and STDERR. The only thing here that musl impacts is the update of the stdout/stderr used in the private libc copy, the privmod_std* update. That is also less likely to affect the regular operation of many tools.
Also, rather than silently not working, could we have some kind of visible warning so that someone building this knows it's missing functionality that may cause existing tools to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Yeah, agreed. I didn't look at the code closely enough to notice that leaving this feature partially intact was a better option.
By "some kind of visible warning" do you mean a #warning, a LOG(THREAD, ...
like I see nearby, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "some kind of visible warning" do you mean a #warning, a LOG(THREAD, ... like I see nearby, or something else?
#warning and maybe also a SYSLOG_INTERNAL_WARNING
which will both print to the log file and print to the console (in debug build only) (pre-close so should show up :)) so the user is aware something may be missing -- maybe that's too noisy but I would vote for that.
core/unix/signal_private.h
Outdated
@@ -535,7 +535,7 @@ libc_sigismember(const sigset_t *set, int _sig) | |||
return TEST(1UL << sig, *set); | |||
#else | |||
uint bits_per = 8*sizeof(ulong); | |||
return TEST(1UL << (sig % bits_per), set->__val[sig / bits_per]); | |||
return TEST(1UL << (sig % bits_per), ((const ulong *)set)[sig / bits_per]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment would help explain why this isn't using ->val
for glibc and prevent someone from reverting this in the future since looking just at glibc this is less clean.
core/unix/injector.c
Outdated
@@ -154,7 +154,7 @@ static bool | |||
inject_ptrace(dr_inject_info_t *info, const char *library_path); | |||
|
|||
static long | |||
our_ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data); | |||
our_ptrace(int request, pid_t pid, void *addr, void *data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the musl ptrace constants #defines and not an enum? Should we treat like we do for Mac and create an enum so everyone has the same type? Or at least put in a comment so we know why it's not using the proper type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, they're #defines. The enum solution won't be as clean as it is for Mac, since you'd have to #undef the existing constants. I'll go with a comment, I guess.
Just for reference: my basic test that this actually compiles and runs with musl is a dockerfile with
|
Ideally, if you could add a CDash test suite build (see https://github.com/DynamoRIO/dynamorio/wiki/Test-Suite#cdash-dashboard) (or Jenkins post-commit), then developers can tell when this breaks: otherwise there's no guarantee it's not going to break in the future. |
Ah, cool. Although, I think getting the full test suite set up is a task that'll have to wait for someone who cares more about musl support than I do. I was satisfied at "fixed the fixable-looking compiler errors; now it builds and can apparently run at least one DR client" 😁 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some style issues.
core/unix/os.c
Outdated
static void | ||
set_stdfile_fileno(stdfile_t **stdfile, file_t file_no) { | ||
#ifdef STDFILE_FILENO | ||
(*stdfile)->STDFILE_FILENO = file_no; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: indentation is off throughout this function
core/unix/os.c
Outdated
@@ -6168,6 +6168,22 @@ cleanup_after_vfork_execve(dcontext_t *dcontext) | |||
HEAPACCT(ACCT_THREAD_MGT)); | |||
} | |||
|
|||
static void | |||
set_stdfile_fileno(stdfile_t **stdfile, file_t file_no) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: { on own line
core/unix/os.c
Outdated
(*stdfile)->STDFILE_FILENO = file_no; | ||
#else | ||
/* this occurs on musl libc */ | ||
#warning stdfile_t is opaque; DynamoRIO will skip setting fds of libc FILEs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: # warning
core/lib/globals_shared.h
Outdated
# include <signal.h> | ||
#endif | ||
/* DR_API EXPORT VERBATIM */ | ||
#ifdef UNIX | ||
# include <sys/types.h> /* Fix for case 5341. And for pid_t (non-glibc) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now in an exposed header so let's remove the ancient "case 5341"
core/unix/os.c
Outdated
#ifdef STDFILE_FILENO | ||
(*stdfile)->STDFILE_FILENO = file_no; | ||
#else | ||
/* this occurs on musl libc */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include i#1973
in the comment
core/unix/os.c
Outdated
"work as expected, because the app is closing the UNIX fds backing these." | ||
); | ||
/* i#1973: musl libc support (and potentially other non-glibcs) */ | ||
# warning stdfile_t is opaque; DynamoRIO will not set fds of libc FILEs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, one final nit: "The # character must be in the first column" from https://github.com/DynamoRIO/dynamorio/wiki/Code-Style (I'm surprised the Vera++ style checks on Travis didn't complain about it).
run aarch64 tests |
run aarch32 tests |
The Travis burst_traces failure is strange: eyeballing the diff does not show why it failed but presumably it is something flaky about the output pattern match. |
run aarch32 tests |
run aarch64 tests |
Hmm now it's ptsig -- Travis reliability used to be higher:
That is #2921 |
Fixed in #2923. |
These changes seem sufficient to allow building dynamorio on an alpine linux container, and to successfully run some basic clients. No further testing was done.
Addresses #1973 , sort of. I don't believe musl libc distros expose
struct _IO_FILE
fields at all, so I opted to just disable the related feature. (FWIW, copying in some specific version of musl'sstruct _IO_FILE
would "work".)