Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test: Run a self-test for our valgrind setup #4182

Merged
merged 3 commits into from
Jul 17, 2024

Conversation

reneme
Copy link
Collaborator

@reneme reneme commented Jul 5, 2024

This runs some basic scenarios we'd like to find with valgrind and checks that they are detected as expected. Otherwise, to the best of our understanding, there wouldn't be any negative signal from valgrind in a typical CI run.

We introduced a new build target "valgrind_selftest" for that and added a python-based steering script that runs each of the C++ test cases individually. The "valgrind" CI job builds and runs the selftest before running the unit tests. This is built as a minimal framework to add more test cases over time.

One of the test cases isn't an obvious side channel, it is a reproduction of the latest Kyber issue found by PQShield and will trigger valgrind only in certain compiler configurations. In a follow-up, we might add a nightly valgrind run with such a configuration in the hope to increase the valgrind signal overall. The python-based steering script detects vulnerable compiler settings (from build-config.json) and skips this test otherwise.

Another one is a de-facto regression test for the recently-introduced value-barrier. The regression is obviously not visible unless run with vulnerable compiler settings either. Then, with a disabled value-barrier, valgrind does detect a secret-dependent branch successfully.

@coveralls
Copy link

Coverage Status

coverage: 91.726%. remained the same
when pulling defef18 on Rohde-Schwarz:test/valgrind_selftest
into 4cc6142 on randombit:master.

Comment on lines 152 to 165
// This mimicks what went wrong in Kyber's secret message expansion
// that was found by PQShield in Kyber's reference implementation and
// was fixed in https://github.com/randombit/botan/pull/4107.
//
// Certain versions of Clang, namely 15, 16, 17 and 18 (maybe more) with
// specific optimization flags (-Os, -O1, -O2 -fno-vectorize, ...) do
// realize that `poisoned_mask` can only ever be all-zero or all-one and
// conditionally jump over the loop execution below.
//
// See: https://pqshield.com/pqshield-plugs-timing-leaks-in-kyber-ml-kem-to-improve-pqc-implementation-maturity/
const uint8_t poisoned_mask = -static_cast<uint8_t>(poisoned_byte & 1);
for(size_t i = 0; i < sizeof(output_bytes); ++i) {
output_bytes[i] &= poisoned_mask;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the right [*] compiler and flags, this actually triggers valgrind.

[*]

./configure.py --without-documentation --cc=clang --compiler-cache=ccache --build-tool=ninja --build-targets=static,valgrind_selftest --with-valgrind --minimized-build --enable-modules=auto_rng,hmac,sha2_64,system_rng --with-debug-info
ninja valgrind_selftest
valgrind --error-exitcode=2 --track-origins=yes ./botan_valgrind_selftest  clang_vs_bare_metal_ct_mask

Comment on lines 183 to 189
// Before the introduction of CT::value_barrier, this did generate a
// conditional jump when compiled with clang using certain compiler
// optimizations. See the test case above for further details.
auto poisoned_mask = Botan::CT::Mask<uint8_t>::expand(poisoned_byte & 1);
for(size_t i = 0; i < sizeof(output_bytes); ++i) {
output_bytes[i] = poisoned_mask.select(output_bytes[i], 0);
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you comment-out the gist of CT::value_barrier() this also triggers valgrind under the same conditions explained above: 487eb50#r1666833485

@coveralls
Copy link

Coverage Status

coverage: 91.73% (+0.004%) from 91.726%
when pulling 487eb50 on Rohde-Schwarz:test/valgrind_selftest
into 4cc6142 on randombit:master.

@coveralls
Copy link

Coverage Status

coverage: 91.728% (+0.002%) from 91.726%
when pulling a970549 on Rohde-Schwarz:test/valgrind_selftest
into 4cc6142 on randombit:master.

@reneme reneme requested a review from randombit July 8, 2024 07:02
@reneme reneme added this to the Botan 3.6.0 milestone Jul 8, 2024
@reneme reneme self-assigned this Jul 8, 2024
@reneme reneme marked this pull request as ready for review July 8, 2024 07:14
Comment on lines 121 to 124
# This is certainly not an exhaustive list of compiler configurations
# that could be problematic for this specific test
if cc_macro == "CLANG" and "-Os" in cc_compile_flags:
test_list.append(ValgrindTest("clang_vs_bare_metal_ct_mask", True))
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a sanity check, we could run ./botan_valgrind_selftest --list-special and double-check that the "clang_vs_bare_metal_ct_mask" test is in this list. Also, we could take the info whether or not it is expected to fail from there, rather than hard-coding it.

@randombit
Copy link
Owner

Generally a good idea. I'll review fully later. For now, a bikeshed:

This should be named something like ct_selftest - that we are currently using valgrind for our constant time testing is an implementation detail. In fact I recently learned ASan also supports similar annotations, and might be a better option already. In the future I hope we'll be able to adopt a special purpose tool which could additionally detect use of variable time opcodes like divisions, as well as memory access/jumps which are conditional on the secret.

@reneme
Copy link
Collaborator Author

reneme commented Jul 10, 2024

This should be named something like ct_selftest

Fair. Renamed the target (and the python steering script to ct_selftest), the actual cpp-file stayed with valgrind_selftest.cpp because there it's not just an implementation detail. :)

@randombit
Copy link
Owner

because there it's not just an implementation detail.

How so? Oh I see, because we have to see if we're running under valgrind or not. Let's just fix that

ct_utils.h should export a macro and a function

  • BOTAN_HAS_CT_POSION - currently just set if BOTAN_HAS_VALGRIND is set
  • CT::ct_poison_has_effect() returns RUNNING_ON_VALGRIND if BOTAN_HAS_VALGRIND otherwise false

Let's not make our abstractions leakier than they have to be :) there should be no need for this test to know how we've implemented the constant time checking.

In particular this would (in the future) allow us to run this same test using multiple tools, which will be helpful to check that they are doing what we expect.

@coveralls
Copy link

coveralls commented Jul 10, 2024

Coverage Status

coverage: 91.717% (+0.004%) from 91.713%
when pulling 2257d83 on Rohde-Schwarz:test/valgrind_selftest
into 5477167 on randombit:master.

@reneme
Copy link
Collaborator Author

reneme commented Jul 11, 2024

Let's not make our abstractions leakier than they have to be :) there should be no need for this test to know how we've implemented the constant time checking.

Yeah, that's indeed a very valid point. We've implemented it as suggested, except that the define is called BOTAN_CT_POISON_ENABLED as BOTAN_HAS_CT_POISON did sound like CT::poison() wouldn't be defined at all; as is the convention for all other modules. Also, there was a usage in bigint.h that should actually use this new abstraction.

Edit: In order to reach BOTAN_CT_POISON_ENABLED in public headers, we had to define it in build.h rather than ct_utils.h (for bigint.h).

@reneme reneme force-pushed the test/valgrind_selftest branch 2 times, most recently from a36cea0 to 78fa574 Compare July 11, 2024 10:26
@reneme
Copy link
Collaborator Author

reneme commented Jul 12, 2024

As said in #4202 (comment), let's keep this open so that we can add another test that puts the new poison() abstractions introduced in #4197 through its paces.

Copy link
Owner

@randombit randombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

src/ct_selftest/valgrind_selftest.cpp Outdated Show resolved Hide resolved
src/ct_selftest/ct_selftest.cpp Outdated Show resolved Hide resolved
src/ct_selftest/ct_selftest.cpp Outdated Show resolved Hide resolved
src/ct_selftest/ct_selftest.cpp Outdated Show resolved Hide resolved
src/ct_selftest/ct_selftest.cpp Outdated Show resolved Hide resolved
src/lib/math/bigint/bigint.h Outdated Show resolved Hide resolved
This allows abstracting from the concrete implementation of the CT::poison()
checks. Currently this can only be done with valgrind, but in the future we
may want to use other tools for that.

Co-Authored-By: Fabian Albert <[email protected]>
@reneme
Copy link
Collaborator Author

reneme commented Jul 16, 2024

Rebased to master, solved conflicts with the newly introduced CT::poison() overloads (#4197) and addressed review comments.

@reneme reneme force-pushed the test/valgrind_selftest branch 2 times, most recently from c6d4c3b to fed1ee1 Compare July 16, 2024 12:24
Copy link
Owner

@randombit randombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is really helpful for improving assurance that our tests and CT techniques are doing what we expect

... this selftest deliberately creates negative signals for typical
cases where CT::poison() should result in a warning from valgrind
(or other tools).

Co-Authored-By: Fabian Albert <[email protected]>
@reneme
Copy link
Collaborator Author

reneme commented Jul 16, 2024

Thanks for the approval, @randombit. I was working on some more tests, though. Sorry. 😅

As said in #4202 (comment), let's keep this open so that we can add another test that puts the new poison() abstractions introduced in #4197 through its paces.

The low-level wrappers (for ranges and integers, as well as scoped_poison()) are now also tested in ct_selftest.

The higher-level abstractions (_const_time_poison(), poison_all(), poison_range(), and again scoped_poison()) are tested as a "normal" unit test. By creating a sentinel object that implements an observable side-effect in _const_time_poison().

Note that particularly the scoped_poison() in ct_selftest would have been able to find the regression in scoped_cleanup. See #4202.

src/tests/test_ct_utils.cpp Show resolved Hide resolved
@randombit
Copy link
Owner

Squash plz

@reneme reneme merged commit 25541af into randombit:master Jul 17, 2024
39 checks passed
@reneme reneme deleted the test/valgrind_selftest branch July 17, 2024 07:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants