Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add valgrind constant-time test to make check #723

Closed

Conversation

real-or-random
Copy link
Contributor

I'm not sure about the second commit. The local ./libtool works for me and I see no reason why we should use the system libtool. Let's see what travis says.

@gmaxwell
I don't understand really libtool --mode=execute to be honest. Can you explain why we need this here?

@sipa
Copy link
Contributor

sipa commented Mar 4, 2020

Ping @theuni: any ideas about libtool?

@gmaxwell
Copy link
Contributor

gmaxwell commented Mar 4, 2020

@real-or-random libtool --mode-execute is needed because when libtool is used to build a shared library the "binary" (e.g. the benches, or valgrind executable) you run isn't a binary. -- look at it, it's a shell script and the real binary is hidden in .libs. The reason for this is so that it can run the real binary linked against your non-installed copy of the library rather than getting linked to the system library.

If you valgrind the wrapper script, that won't be terribly informative! :)

There is probably some official way to use libtool --mode=execute in a make check. I hadn't even noticed that there was a local copy, that's probably part of it.

@real-or-random
Copy link
Contributor Author

real-or-random commented Mar 4, 2020

Ah thanks! We had just figured that out in IRC at the same time when you replied.

But we were wondering if there is a reason why we don't simply pass -static as we do for the other tests?

@theuni
Copy link
Contributor

theuni commented Mar 4, 2020

Yup, passing -static makes sense. I did the same thing just the other day for benches when it was interfering with my test wrapping: theuni@a9577fd

@gmaxwell
Copy link
Contributor

gmaxwell commented Mar 5, 2020

Then you're no longer testing the production libraries and small compiler differences change the constant time behaviour. The tests are compiled with special instrumentation and don't use the libraries in any case (and completely static builds are not supported by some OSes today, e.g. fedora doesn't ship static versions of most libraries anymore).

Copy link
Contributor

@sipa sipa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK

.travis.yml Outdated Show resolved Hide resolved
@real-or-random real-or-random force-pushed the 202003-ct-make-check branch 2 times, most recently from bf9c81e to 110bc4f Compare March 5, 2020 11:53
@real-or-random
Copy link
Contributor Author

There is probably some official way to use libtool --mode=execute in a make check. I hadn't even noticed that there was a local copy, that's probably part of it.

I couldn't find anything but the current way seems to work, so I think we should use it.

Then you're no longer testing the production libraries and small compiler differences change the constant time behaviour. The tests are compiled with special instrumentation and don't use the libraries in any case (and completely static builds are not supported by some OSes today, e.g. fedora doesn't ship static versions of most libraries anymore).

Okay, yes, I was suspecting that this is the reason. Makes sense.

@theuni
Copy link
Contributor

theuni commented Mar 5, 2020

That's fine if it matters for instrumentation.

But be aware (because it's really non-obvious) that in this case "-static" is a libtool flag, not a linker one. Libtool builds 2 versions of the lib regardless. -static just tells the particular binaries which to link against. It doesn't actually attempt to create a static binary.

@elichai
Copy link
Contributor

elichai commented Mar 8, 2020

Can you try using $(LIBTOOL) it Makefile.am? I think this is how autotools expects it to be used.
Although that means not calling the script so not sure

Moreover, this changes the way Travis runs the constant-time tests
by adding a `--error-exitcode=1` parameter to valgrind. Without that
parameter, valgrind simply passes the exit code through, even in the
case of valgrind errors. That is, Travis would succeed even if
valgrind found problems.
real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Mar 25, 2020
This has been not been caught by the new constant-time tests because
valgrind currently gives us a zero exit code even if finds errors, see
bitcoin-core#723 (comment) .

Note that the timing leak here was the bit whether a secret key was
out of range. This leak is harmless and not exploitable. It is just
our overcautious practice to prefer constant-time code even here.
real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Mar 25, 2020
This has been not been caught by the new constant-time tests because
valgrind currently gives us a zero exit code even if finds errors, see
bitcoin-core#723 (comment) .

This commit also simplifies the arithmetic in memczero.

Note that the timing leak here was the bit whether a secret key was
out of range. This leak is harmless and not exploitable. It is just
our overcautious practice to prefer constant-time code even here.
real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Mar 27, 2020
This has been not been caught by the new constant-time tests because
valgrind currently gives us a zero exit code even if finds errors, see
bitcoin-core#723 (comment) .

This commit also simplifies the arithmetic in memczero.

Note that the timing leak here was the bit whether a secret key was
out of range. This leak is harmless and not exploitable. It is just
our overcautious practice to prefer constant-time code even here.
@elichai
Copy link
Contributor

elichai commented May 2, 2020

FWIW, According to my tests in adding macOS to travis, the local libtool works on macOS but the system's libtool doesn't. apparently Apple has it's own tool named libtool which has nothing to do with the GNU libtool

from https://formulae.brew.sh/formula/libtool:

In order to prevent conflicts with Apple's own libtool we have prepended a "g"
so, you have instead: glibtool and glibtoolize.

also racket/racket#337 (comment)

@elichai elichai mentioned this pull request May 3, 2020
@elichai
Copy link
Contributor

elichai commented Jun 4, 2020

Needs a rebase.
Also, maybe instead of mapping over error 127 and mapping to 77 etc we check in autotools if valgrind exists?
we currently look for the valgrind header in VALGRIND_ENABLED but we could also look for the actual executable via https://www.gnu.org/software/autoconf-archive/ax_valgrind_check.html

deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jun 4, 2020
Summary:
 * Suppress a harmless variable-time optimization by clang in memczero

This has been not been caught by the new constant-time tests because
valgrind currently gives us a zero exit code even if finds errors, see
bitcoin-core/secp256k1#723 (comment) .

This commit also simplifies the arithmetic in memczero.

Note that the timing leak here was the bit whether a secret key was
out of range. This leak is harmless and not exploitable. It is just
our overcautious practice to prefer constant-time code even here.

 * Add test for memczero()

This is a backport of libsecp256k1 [[bitcoin-core/secp256k1#728 | PR728]]

Test Plan:
  ninja all check check-secp256k1

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D6363
deadalnix pushed a commit to Bitcoin-ABC/secp256k1 that referenced this pull request Jun 5, 2020
Summary:
 * Suppress a harmless variable-time optimization by clang in memczero

This has been not been caught by the new constant-time tests because
valgrind currently gives us a zero exit code even if finds errors, see
bitcoin-core/secp256k1#723 (comment) .

This commit also simplifies the arithmetic in memczero.

Note that the timing leak here was the bit whether a secret key was
out of range. This leak is harmless and not exploitable. It is just
our overcautious practice to prefer constant-time code even here.

 * Add test for memczero()

This is a backport of libsecp256k1 [[bitcoin-core/secp256k1#728 | PR728]]

Test Plan:
  ninja all check check-secp256k1

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D6363
@elichai
Copy link
Contributor

elichai commented Jul 23, 2020

@real-or-random What's the status here?
Core might pull another libsecp update to fix because of #769 and I'd really like to see the ct tests being run by everyone running make check in core.

@gmaxwell
Copy link
Contributor

I'm a little dubious about this running in make check in bitcoin core. You're going to get reports of errors which will be hard to address. e.g. there are ones with current gcc on ppc64. On one hand good to learn about them, OTOH there are already known ones which AFAIK no one is working on resolving. It's really bad to give users errors that you won't respond to, because it will habituate them to not reporting ones that you will respond to. (including by them googling the error and then finding messages saying it's known/ignore it)

@elichai
Copy link
Contributor

elichai commented Jul 23, 2020

I'm a little dubious about this running in make check in bitcoin core. You're going to get reports of errors which will be hard to address. e.g. there are ones with current gcc on ppc64. On one hand good to learn about them, OTOH there are already known ones which AFAIK no one is working on resolving. It's really bad to give users errors that you won't respond to, because it will habituate them to not reporting ones that you will respond to. (including by them googling the error and then finding messages saying it's known/ignore it)

Won't documenting those properly in the README(or an issue) will solve the problem? (and maybe someone will actually fix those)

@gmaxwell
Copy link
Contributor

gmaxwell commented Jul 23, 2020

Unless the failure is extremely rare and specialized, documenting a failure thrown out by make check will just perpetually stop users from reporting other vaguely similar looking issues because they will find it and assume the authors already know or don't care.

There is also plenty of collateral damage possible. A test failure here will cause some users to feel this library is unsafe and go pick some other code that not only doesn't have constant time testing but which doesn't make any effort to be constant time and isn't. In the past we had issues where debian started refusing to upgrade their Bitcoin package after we added tests that failed on bigendian (at the time BE was totally unsupported by the software)-- debian had been shipping a completely non-functional BE package, once make check failed, the packaging process failed, and they refused to upgrade to a newer version that failed even though all the failure did was document the existing broken behaviour.

As far as users submitting fixes go-- If the worlds foremost experts in the software won't fix an issue, what chance does anyone else have?

These sorts of issues also are at high risk of being no one's bug: The language makes no promises about constant time behaviour, so a compiler is not faulty just for making obviously constant time code variable time. Simultaneously, if we can't get constant time behaviour from the compiler, there is sometimes little we can do (other than write more of the impacted routines in ASM).

The test serves a purpose because it helps find the introduction of behaviours that are grossly non-constant or subtle but easily avoided-- as a tool for reviewers.

If it were the case that that extensive effort had gone into making sure the tests passed on a huge variety of targets and all issues fixed such that the discovery was limiting additional improvements then there would be a case for using users to expand the search. Without first having our own comprehensive-ish coverage we couldn't really tell that any proposed fix wasn't actually making the matter worse someplace else. But that isn't where this test is right now.

I'd much rather see this PR make it a separate check-like target-- but while there are known issues I think it would be counterproductive to have uninformed end users running it. -- besides, valgrind isn't normally a dependency: So if the test optionally runs when valgrind is present you'll have make check fail or pass randomly based on if valgrind is installed.

@real-or-random
Copy link
Contributor Author

real-or-random commented Jul 24, 2020

Hm, I'd really like to see this test enabled by default because then there's a chance that it will be included in packaging process of distros. If the Debian/Fedora/Arch Linux secp256k1 packaging process uses a compiler that produces variable-time code, then I'd really like to know about this. At the moment we're just blind. The fact that C does not guarantee constant-time compilation is exactly the reason why the user should run these tests.

If it were the case that that extensive effort had gone into making sure the tests passed on a huge variety of targets and all issues fixed such that the discovery was limiting additional improvements then there would be a case for using users to expand the search.

Yes, given unlimited resources, we should do this extensive effort, and then I see merit in your argument. But I don't believe that's realistic. Unless someone funds another three cryptographic people working full time on this library, this won't happen. With that said, I prefer to learn about issues now than never.

For users who switch to other libraries. Ok yes, there's a risk. But if we want to see adoption, there are other low-hanging fruits and we just need to get our hands dirty. There are so many open issues here that make people use other code: no example code, experimental ECDH, no public hash module (okay, debatable), no docs for manual building, and most importantly no release. And let's be honest, who's really working on this? A handful of people who have hundreds of other things that are more interesting than maintaining a library. We had better phases this year and last year but at the moment, we don't even have the resources to review incoming PRs.

edit: If we want, we could still disable the test in Core, e.g., by setting TESTS. But I think it should be enabled here by default.

ps: Urghs, I accidentally edited your comment instead of mine. Sorry, reverted...

@gmaxwell
Copy link
Contributor

As it stands now the test fails on platforms these distributions targets! So the result would be what I pointed out above w/ debian-- suddenly the distribution refuses to upgrade to new versions because the tests fail.

Having bitcoin core not running libsecp256k1 tests at all because a constant time test fails is not a good tradeoff: To avoid a test that we know fails but only indicates behaviour that is usually inconsequential in most places bitcoin core is deployed, we'd disable detection of miscompilation that can cause a total consensus fault, direct leak of secret keys, etc. -- reports of issues that won't get ignored.

Again: We already know the test fails on some systems. You say "If the Debian/Fedora/Arch Linux secp256k1 packaging process uses a compiler that produces variable-time code, then I'd really like to know about this"-- but this isn't true: We already know the fedora compiler on PPC64LE fails this test, and no one is interested in doing anything about it.

So what happens then when users show up reporting the same issue because make check fails? We already know it will fail in at least fedora's build process (and likely debian since it already supports PPC64LE). Even if it weren't the case that it already failed, if no one maintaining this library can afford the time to spend a couple afternoons first getting this likely-to-fail test run one time on those distro targets (e.g. by finding people with relevant systems and begging them to run the tests, or asking a distro packager to do an experimental package build that runs the test)-- how can you expect that anyone is going to find the time to fix any additional reported issues?

At best all this does is teach them to not run make check because it reports issues that the developers don't care about and send off a signal that this library is unmaintained. Or it fixes distributors on old worse versions that don't fail, or worst it drives users to software which isn't even remotely constant time or doesn't even have tests (which would be pretty much every alternative to this library).

@sipa
Copy link
Contributor

sipa commented Jul 24, 2020

Could it be enabled by default just on platforms where we expect it to work?

@gmaxwell
Copy link
Contributor

When it fails will someone fix the failures? If not, there is only downsides from running it.

@real-or-random
Copy link
Contributor Author

Again: We already know the test fails on some systems. You say "If the Debian/Fedora/Arch Linux secp256k1 packaging process uses a compiler that produces variable-time code, then I'd really like to know about this"-- but this isn't true: We already know the fedora compiler on PPC64LE fails this test, and no one is interested in doing anything about it.

I think this should be fixed I'm very interested in doing something about it! But I didn't know it. It's not documented anywhere (except that you hinted at it above). We need to write this down at least.

Even if it weren't the case that it already failed, if no one maintaining this library can afford the time to spend a couple afternoons first getting this likely-to-fail test run one time on those distro targets (e.g. by finding people with relevant systems and begging them to run the tests, or asking a distro packager to do an experimental package build that runs the test)-- how can you expect that anyone is going to find the time to fix any additional reported issues?

Why is it set in stone that it's a couple of afternoons? The last one took me a few minutes (#728). Sure, this was on a system I could test on but in the end I looked more at compiler output than running tests. But if we don't get these reports, we don't even know about the issues.

@real-or-random
Copy link
Contributor Author

Could it be enabled by default just on platforms where we expect it to work?

That sounds reasonable as a first thing to do.

When it fails will someone fix the failures? If not, there is only downsides from running it.

I believe when it fails on a platform where we except it to work, that's a bug, and sure we'll fix bugs.

@gmaxwell
Copy link
Contributor

gmaxwell commented Jul 24, 2020

I think this should be fixed I'm very interested in doing something about it! But I didn't know it. It's not documented anywhere (except that you hinted at it above).

I reported it.

#708 (comment)

and

http://gnusha.org/secp256k1/2020-01-11.log at 11:29 and 11:37

(and probably in several other places if I were to go search)

Working on this and repetitively getting zero response from anyone is extremely demoralizing.

I expect that it also fails elsewhere, but why bother testing more exotic platforms when there is a known failure in a less exotic one that no one seems interested in? I stopped testing more systems once I saw no interest in this known failure.

Similarly, why waste my time arguing me in a loop over this without pointing out that you didn't know that it already failed? Your response is much less surprising to me now that I realize you didn't know that it was already known to fail. If I thought that the test passed everywhere I would have been somewhat more supportive of running it by default.

Why is it set in stone that it's a couple of afternoons? The last one took me a few minutes

It usually doesn't take much effort to fix a constant time behaviour ... Except for complicated ones like "oh gee, we can't use < in constant time code"-- for that I think we're facing a big complicated rewrite, potential performance loss, and having to retest all over to make sure the fix doesn't move the non-constant time bug to another platform.

But whatever effort it takes to fix an issue it takes much less effort to get a single passing test result, which was really my point.

If the project can't find the effort to get some tests run then it's not going to be able to muster the effort to fix issues that crop up. Without the ability to retest fixes broadly you can't be confident that a "fix" isn't making the problem worse except on platforms you can test. Having a year long iteration cycle where you try a fix for a users platform then 6 months later learn you broke a different user's platform (potentially an established one with many users) isn't a convergent process.

The best you can really do is have a set of platforms you're able to test (either directly or via community help) and be able to make some promises that the tests pass on those platforms. Anything you can't test a proposed fix on is just a crapshoot because any change to the software, particularly any constant time fix for another uncommon platform, might break them.

@hebasto
Copy link
Member

hebasto commented Feb 22, 2023

Needs a rebase?

@real-or-random
Copy link
Contributor Author

Well, yeah, if we want this at all.

Enabling it by default just on platforms where we expect it to work still sounds reasonable to me.

@real-or-random
Copy link
Contributor Author

Could it be enabled by default just on platforms where we expect it to work?

I think we should do this. If not, we should at least enable it in dev-mode, so that we contributors run tests on a regular basis.

real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Nov 23, 2023
Widely available version of GCC and Clang beat our field asm on -O2.
In particular, GCC 10.0.5, which is Bitcoin Core's current compiler
for official x86_64 builds, produces code that is > 20% faster for
fe_mul and > 10% faster for signature verification (see bitcoin-core#723).

These are the alternatives to this PR:

We could replace our current asm with the fastest compiler output
that we can find. This is potentially faster, but it has multiple
drawbacks:
 - It's more coding work because it needs detailed benchmarks (e.g.,
   with many compiler/options).
 - It's more review work because we need to deal with inline asm
   (including clobbers etc.) and there's lack of experts reviewers
   in this area.
 - It's not unlikely that we'll fall behind again in a few compiler
   version, and then we have to deal with this again, i.e., redo the
   benchmarks. Given our history here, I doubt that we'll revolve
   this timely.

We could change the default of the asm build option to off. But this
will also disable the scalar asm, which is still faster.

We could split the build option into two separate options for field
and scalar asm and only disable the field asm by default. But this
adds complexity to the build and to the test matrix.

My conclusion is that this PR gets the low-hanging fruit in terms of
performance. It simplifies our code significantly. It's clearly an
improvement, and it's very easy to review. Whether re-introducing
better asm (whether from a compiler or from CryptOpt) is worth the
hassle can be evaluated separately, and should not hold up this
improvement.

Solves bitcoin-core#723.
real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Nov 23, 2023
Widely available version of GCC and Clang beat our field asm on -O2.
In particular, GCC 10.0.5, which is Bitcoin Core's current compiler
for official x86_64 builds, produces code that is > 20% faster for
fe_mul and > 10% faster for signature verification (see bitcoin-core#723).

These are the alternatives to this PR:

We could replace our current asm with the fastest compiler output
that we can find. This is potentially faster, but it has multiple
drawbacks:
 - It's more coding work because it needs detailed benchmarks (e.g.,
   with many compiler/options).
 - It's more review work because we need to deal with inline asm
   (including clobbers etc.) and there's lack of experts reviewers
   in this area.
 - It's not unlikely that we'll fall behind again in a few compiler
   version, and then we have to deal with this again, i.e., redo the
   benchmarks. Given our history here, I doubt that we'll revolve
   this timely.

We could change the default of the asm build option to off. But this
will also disable the scalar asm, which is still faster.

We could split the build option into two separate options for field
and scalar asm and only disable the field asm by default. But this
adds complexity to the build and to the test matrix.

My conclusion is that this PR gets the low-hanging fruit in terms of
performance. It simplifies our code significantly. It's clearly an
improvement, and it's very easy to review. Whether re-introducing
better asm (whether from a compiler or from CryptOpt) is worth the
hassle can be evaluated separately, and should not hold up this
improvement.

Solves bitcoin-core#723.
real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Nov 23, 2023
Widely available versions of GCC and Clang beat our field asm on -O2.
In particular, GCC 10.0.5, which is Bitcoin Core's current compiler
for official x86_64 builds, produces code that is > 20% faster for
fe_mul and > 10% faster for signature verification (see bitcoin-core#723).

These are the alternatives to this PR:

We could replace our current asm with the fastest compiler output
that we can find. This is potentially faster, but it has multiple
drawbacks:
 - It's more coding work because it needs detailed benchmarks (e.g.,
   with many compiler/options).
 - It's more review work because we need to deal with inline asm
   (including clobbers etc.) and there's lack of experts reviewers
   in this area.
 - It's not unlikely that we'll fall behind again in a few compiler
   versions, and then we have to deal with this again, i.e., redo the
   benchmarks. Given our history here, I doubt that we'll revolve
   this timely.

We could change the default of the asm build option to off. But this
will also disable the scalar asm, which is still faster.

We could split the build option into two separate options for field
and scalar asm and only disable the field asm by default. But this
adds complexity to the build and to the test matrix.

My conclusion is that this PR gets the low-hanging fruit in terms of
performance. It simplifies our code significantly. It's clearly an
improvement, and it's very easy to review. Whether re-introducing
better asm (whether from a compiler or from CryptOpt) is worth the
hassle can be evaluated separately, and should not hold up this
improvement.

Solves bitcoin-core#723.
real-or-random added a commit to real-or-random/secp256k1 that referenced this pull request Nov 23, 2023
Widely available versions of GCC and Clang beat our field asm on -O2.
In particular, GCC 10.0.5, which is Bitcoin Core's current compiler
for official x86_64 builds, produces code that is > 20% faster for
fe_mul and > 10% faster for signature verification (see bitcoin-core#723).

These are the alternatives to this PR:

We could replace our current asm with the fastest compiler output
that we can find. This is potentially faster, but it has multiple
drawbacks:
 - It's more coding work because it needs detailed benchmarks (e.g.,
   with many compiler/options).
 - It's more review work because we need to deal with inline asm
   (including clobbers etc.) and there's a lack of experts reviewers
   in this area.
 - It's not unlikely that we'll fall behind again in a few compiler
   versions, and then we have to deal with this again, i.e., redo the
   benchmarks. Given our history here, I doubt that we'll revolve
   this timely.

We could change the default of the asm build option to off. But this
will also disable the scalar asm, which is still faster.

We could split the build option into two separate options for field
and scalar asm and only disable the field asm by default. But this
adds complexity to the build and to the test matrix.

My conclusion is that this PR gets the low-hanging fruit in terms of
performance. It simplifies our code significantly. It's clearly an
improvement, and it's very easy to review. Whether re-introducing
better asm (whether from a compiler or from CryptOpt) is worth the
hassle can be evaluated separately, and should not hold up this
improvement.

Solves bitcoin-core#723.
@real-or-random
Copy link
Contributor Author

Tracked in #1560

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants