Add valgrind constant-time test to make check
#723
Conversation
Ping @theuni: any ideas about libtool?
(force-pushed from e4af0f4 to 540dbaf)
@real-or-random libtool --mode=execute is needed because when libtool is used to build a shared library, the "binary" you run (e.g. the benches, or the valgrind executable) isn't a binary -- look at it, it's a shell script, and the real binary is hidden in .libs. The reason for this is so that it can run the real binary linked against your non-installed copy of the library rather than getting linked to the system library. If you valgrind the wrapper script, that won't be terribly informative! :) There is probably some official way to use libtool --mode=execute in a make check. I hadn't even noticed that there was a local copy, that's probably part of it.
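To make the wrapper-script behaviour concrete, here is a minimal illustration; the bench_verify name and the exact layout of .libs are just examples of what a libtool-built program typically looks like, not commands taken from this PR:

```sh
# In a libtool shared-library build, the top-level "binary" is a wrapper script;
# the real executable is hidden in .libs/ (sometimes named lt-<name>) and wired
# up to use the freshly built, not-yet-installed library:
file ./bench_verify          # typically reports: POSIX shell script
file ./.libs/bench_verify    # the actual executable

# Valgrinding the wrapper script itself is uninformative, so libtool is asked
# to run the real binary under valgrind instead:
./libtool --mode=execute valgrind ./bench_verify
```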
Ah thanks! We had just figured that out in IRC at the same time you replied. But we were wondering if there is a reason why we don't simply pass -static.
Yup, passing -static makes sense. I did the same thing just the other day for benches when it was interfering with my test wrapping: theuni@a9577fd
Then you're no longer testing the production libraries, and small compiler differences change the constant-time behaviour. The tests are compiled with special instrumentation and don't use the libraries in any case (and completely static builds are not supported by some OSes today, e.g. Fedora doesn't ship static versions of most libraries anymore).
ACK
(force-pushed from bf9c81e to 110bc4f)
I couldn't find anything but the current way seems to work, so I think we should use it.
Okay, yes, I suspected that this is the reason. Makes sense.
That's fine if it matters for instrumentation. But be aware (because it's really non-obvious) that in this case "-static" is a libtool flag, not a linker one. Libtool builds two versions of the lib regardless; -static just tells the particular binaries which to link against. It doesn't actually attempt to create a static binary.
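As a sketch of that distinction (file names are hypothetical and this is libtool's generic link mode, not a command from this PR):

```sh
# -static here is interpreted by libtool, not by the linker. libtool still builds
# both the shared and the static library; -static only makes this one program link
# against the static one, so the output is a real executable rather than a wrapper
# script. It does NOT produce a fully static binary (libc etc. stay dynamic).
./libtool --mode=link gcc -static -o example_test example_test.o libsecp256k1.la
file example_test            # an ordinary dynamically linked executable, no .libs wrapper
```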
(force-pushed from 110bc4f to c81ae4c)
Can you try using
Moreover, this changes the way Travis runs the constant-time tests by adding a `--error-exitcode=1` parameter to valgrind. Without that parameter, valgrind simply passes the exit code through, even in the case of valgrind errors. That is, Travis would succeed even if valgrind found problems.
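For example (the valgrind_ctime_test name matches the constant-time test binary discussed in this PR; the exit codes in the comments describe the expected behaviour):

```sh
# Without --error-exitcode, valgrind passes through the program's own exit status,
# so CI sees success even when valgrind reported errors:
valgrind ./valgrind_ctime_test
echo $?                      # 0 if the program exited cleanly, errors or not

# With --error-exitcode, valgrind overrides the exit status whenever it reports
# errors, so make check / Travis actually fails:
valgrind --error-exitcode=1 ./valgrind_ctime_test
echo $?                      # 1 if valgrind found problems
```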
(force-pushed from c81ae4c to 35a648f)
This has not been caught by the new constant-time tests because valgrind currently gives us a zero exit code even if it finds errors, see bitcoin-core#723 (comment). This commit also simplifies the arithmetic in memczero. Note that the timing leak here was the bit indicating whether a secret key was out of range. This leak is harmless and not exploitable; it is just our overcautious practice to prefer constant-time code even here.
FWIW, according to my tests in adding macOS to Travis, the local libtool works on macOS but the system's libtool doesn't. Apparently Apple has its own tool of the same name (see https://formulae.brew.sh/formula/libtool).
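For reference, a quick way to see the name clash on macOS (assuming Homebrew is installed; paths are illustrative):

```sh
# Apple ships its own, unrelated libtool (a static-library archiver), which is why
# Homebrew installs GNU libtool under the name glibtool:
command -v libtool           # /usr/bin/libtool -> Apple's tool, not GNU libtool
command -v glibtool          # Homebrew's GNU libtool, if installed
# The ./libtool script generated in the build tree is always the GNU one configured
# for this build, which is why it works where the system tool doesn't.
```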
Needs a rebase.
Summary:
* Suppress a harmless variable-time optimization by clang in memczero. This has not been caught by the new constant-time tests because valgrind currently gives us a zero exit code even if it finds errors, see bitcoin-core/secp256k1#723 (comment). This commit also simplifies the arithmetic in memczero. Note that the timing leak here was the bit indicating whether a secret key was out of range. This leak is harmless and not exploitable; it is just our overcautious practice to prefer constant-time code even here.
* Add test for memczero()
This is a backport of libsecp256k1 [[bitcoin-core/secp256k1#728 | PR728]]
Test Plan: ninja all check check-secp256k1
Reviewers: #bitcoin_abc, Fabien
Reviewed By: #bitcoin_abc, Fabien
Differential Revision: https://reviews.bitcoinabc.org/D6363
@real-or-random What's the status here?
I'm a little dubious about this running in make check in Bitcoin Core. You're going to get reports of errors which will be hard to address, e.g. there are ones with current GCC on ppc64. On one hand it's good to learn about them; OTOH there are already known ones which AFAIK no one is working on resolving. It's really bad to give users errors that you won't respond to, because it will habituate them to not reporting ones that you will respond to (including by them googling the error and then finding messages saying it's known/ignore it).
Won't documenting those properly in the README (or an issue) solve the problem? (And maybe someone will actually fix those.)
Unless the failure is extremely rare and specialized, documenting a failure thrown out by make check will just perpetually stop users from reporting other vaguely similar-looking issues, because they will find it and assume the authors already know or don't care. There is also plenty of collateral damage possible. A test failure here will cause some users to feel this library is unsafe and go pick some other code that not only doesn't have constant-time testing but doesn't make any effort to be constant time and isn't. In the past we had issues where Debian started refusing to upgrade their Bitcoin package after we added tests that failed on big-endian (at the time BE was totally unsupported by the software): Debian had been shipping a completely non-functional BE package; once make check failed, the packaging process failed, and they refused to upgrade to a newer version that failed, even though all the failure did was document the existing broken behaviour. As far as users submitting fixes go: if the world's foremost experts in the software won't fix an issue, what chance does anyone else have? These sorts of issues are also at high risk of being no one's bug: the language makes no promises about constant-time behaviour, so a compiler is not faulty just for making obviously constant-time code variable time. Simultaneously, if we can't get constant-time behaviour from the compiler, there is sometimes little we can do (other than write more of the impacted routines in asm). The test serves a purpose because it helps find the introduction of behaviours that are grossly non-constant or subtle but easily avoided -- as a tool for reviewers. If extensive effort had gone into making sure the tests passed on a huge variety of targets and all issues were fixed, such that the discovery was limiting additional improvements, then there would be a case for using users to expand the search. Without first having our own comprehensive-ish coverage we couldn't really tell that any proposed fix wasn't actually making the matter worse someplace else. But that isn't where this test is right now. I'd much rather see this PR make it a separate check-like target -- but while there are known issues I think it would be counterproductive to have uninformed end users running it. Besides, valgrind isn't normally a dependency: so if the test optionally runs when valgrind is present, you'll have make check fail or pass randomly based on whether valgrind is installed.
Hm, I'd really like to see this test enabled by default because then there's a chance that it will be included in the packaging process of distros. If the Debian/Fedora/Arch Linux secp256k1 packaging process uses a compiler that produces variable-time code, then I'd really like to know about this. At the moment we're just blind. The fact that C does not guarantee constant-time compilation is exactly the reason why the user should run these tests.
Yes, given unlimited resources, we should do this extensive effort, and then I see merit in your argument. But I don't believe that's realistic. Unless someone funds another three cryptographers working full time on this library, this won't happen. With that said, I prefer to learn about issues now rather than never. As for users who switch to other libraries: okay, yes, there's a risk. But if we want to see adoption, there are other low-hanging fruits and we just need to get our hands dirty. There are so many open issues here that make people use other code: no example code, experimental ECDH, no public hash module (okay, debatable), no docs for manual building, and most importantly no release. And let's be honest, who's really working on this? A handful of people who have hundreds of other things that are more interesting than maintaining a library. We had better phases this year and last year, but at the moment we don't even have the resources to review incoming PRs.
edit: If we want, we could still disable the test in Core, e.g., by setting
ps: Urghs, I accidentally edited your comment instead of mine. Sorry, reverted...
As it stands now the test fails on platforms these distributions target! So the result would be what I pointed out above with Debian: suddenly the distribution refuses to upgrade to new versions because the tests fail. Having Bitcoin Core not run libsecp256k1 tests at all because a constant-time test fails is not a good tradeoff: to avoid a test that we know fails but only indicates behaviour that is usually inconsequential in most places Bitcoin Core is deployed, we'd disable detection of miscompilation that can cause a total consensus fault, direct leak of secret keys, etc. -- reports of issues that won't get ignored. Again: we already know the test fails on some systems. You say "If the Debian/Fedora/Arch Linux secp256k1 packaging process uses a compiler that produces variable-time code, then I'd really like to know about this" -- but this isn't true: we already know the Fedora compiler on PPC64LE fails this test, and no one is interested in doing anything about it. So what happens then when users show up reporting the same issue because make check fails? We already know it will fail in at least Fedora's build process (and likely Debian's, since it already supports PPC64LE). Even if it weren't the case that it already failed: if no one maintaining this library can afford the time to spend a couple of afternoons first getting this likely-to-fail test run one time on those distro targets (e.g. by finding people with relevant systems and begging them to run the tests, or asking a distro packager to do an experimental package build that runs the test), how can you expect that anyone is going to find the time to fix any additional reported issues? At best all this does is teach users not to run make check because it reports issues that the developers don't care about, and send off a signal that this library is unmaintained. Or it pins distributors to old, worse versions that don't fail, or worst of all it drives users to software which isn't even remotely constant time or doesn't even have tests (which would be pretty much every alternative to this library).
Could it be enabled by default just on platforms where we expect it to work?
When it fails, will someone fix the failures? If not, there are only downsides from running it.
I think this should be fixed; I'm very interested in doing something about it! But I didn't know about it. It's not documented anywhere (except that you hinted at it above). We need to write this down at least.
Why is it set in stone that it's a couple of afternoons? The last one took me a few minutes (#728). Sure, this was on a system I could test on, but in the end I looked more at compiler output than at running tests. But if we don't get these reports, we don't even know about the issues.
That sounds reasonable as a first thing to do.
I believe when it fails on a platform where we expect it to work, that's a bug, and sure, we'll fix bugs.
I reported it; see http://gnusha.org/secp256k1/2020-01-11.log at 11:29 and 11:37 (and probably several other places if I were to go search). Working on this and repeatedly getting zero response from anyone is extremely demoralizing. I expect that it also fails elsewhere, but why bother testing more exotic platforms when there is a known failure in a less exotic one that no one seems interested in? I stopped testing more systems once I saw no interest in this known failure. Similarly, why waste my time arguing with me in a loop over this without pointing out that you didn't know that it already failed? Your response is much less surprising to me now that I realize you didn't know that it was already known to fail. If I thought that the test passed everywhere I would have been somewhat more supportive of running it by default.
It usually doesn't take much effort to fix a constant-time issue ... except for complicated ones like "oh gee, we can't use < in constant-time code" -- for that I think we're facing a big complicated rewrite, potential performance loss, and having to retest all over to make sure the fix doesn't move the non-constant-time bug to another platform. But whatever effort it takes to fix an issue, it takes much less effort to get a single passing test result, which was really my point. If the project can't find the effort to get some tests run, then it's not going to be able to muster the effort to fix issues that crop up. Without the ability to retest fixes broadly, you can't be confident that a "fix" isn't making the problem worse except on platforms you can test. Having a year-long iteration cycle where you try a fix for a user's platform and then six months later learn you broke a different user's platform (potentially an established one with many users) isn't a convergent process. The best you can really do is have a set of platforms you're able to test (either directly or via community help) and be able to make some promises that the tests pass on those platforms. Anything you can't test a proposed fix on is just a crapshoot, because any change to the software, particularly any constant-time fix for another uncommon platform, might break them.
Needs a rebase?
Well, yeah, if we want this at all. Enabling it by default just on platforms where we expect it to work still sounds reasonable to me.
I think we should do this. If not, we should at least enable it in dev-mode, so that we contributors run tests on a regular basis. |
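A minimal sketch of what such gating could look like, assuming an automake-driven make check; the wrapper name and platform list are hypothetical and not part of this PR, while exit code 77 is automake's convention for a skipped test:

```sh
#!/bin/sh
# run_ctime_test.sh (hypothetical): only run the valgrind constant-time test
# where we expect it to pass, and report SKIP everywhere else.

command -v valgrind >/dev/null 2>&1 || exit 77   # valgrind not installed: skip

case "$(uname -m)" in
  x86_64|i?86) ;;                                # platforms assumed to be supported
  *) exit 77 ;;                                  # everything else: skip for now
esac

exec ./libtool --mode=execute valgrind --error-exitcode=1 ./valgrind_ctime_test
```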
Widely available versions of GCC and Clang beat our field asm on -O2. In particular, GCC 10.0.5, which is Bitcoin Core's current compiler for official x86_64 builds, produces code that is > 20% faster for fe_mul and > 10% faster for signature verification (see bitcoin-core#723).
These are the alternatives to this PR:
We could replace our current asm with the fastest compiler output that we can find. This is potentially faster, but it has multiple drawbacks:
- It's more coding work because it needs detailed benchmarks (e.g., with many compilers/options).
- It's more review work because we need to deal with inline asm (including clobbers etc.) and there's a lack of expert reviewers in this area.
- It's not unlikely that we'll fall behind again in a few compiler versions, and then we have to deal with this again, i.e., redo the benchmarks. Given our history here, I doubt that we'll resolve this in a timely manner.
We could change the default of the asm build option to off. But this will also disable the scalar asm, which is still faster.
We could split the build option into two separate options for field and scalar asm and only disable the field asm by default. But this adds complexity to the build and to the test matrix.
My conclusion is that this PR gets the low-hanging fruit in terms of performance. It simplifies our code significantly. It's clearly an improvement, and it's very easy to review. Whether re-introducing better asm (whether from a compiler or from CryptOpt) is worth the hassle can be evaluated separately, and should not hold up this improvement.
Solves bitcoin-core#723.
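A rough way to reproduce the comparison on one machine; the configure option names below match the autotools build as it exists while the asm is still present (an assumption worth double-checking), and absolute numbers will vary with compiler and CPU:

```sh
# Build with the hand-written x86_64 field asm and benchmark the internals:
./autogen.sh
./configure --with-asm=x86_64
make -j && ./bench_internal          # note the field_mul rows; ./bench covers verification

# Rebuild with the asm disabled so the C field code is used instead:
./configure --with-asm=no
make clean && make -j && ./bench_internal
```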
Tracked in #1560
I'm not sure about the second commit. The local ./libtool works for me and I see no reason why we should use the system libtool. Let's see what travis says. @gmaxwell, I don't really understand libtool --mode=execute, to be honest. Can you explain why we need this here?