Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PQC: Classic McEliece #3883

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

FAlbertDev
Copy link
Collaborator

@FAlbertDev FAlbertDev commented Jan 11, 2024

This PR relates to the Classic McEliece KEM as specified in this ISO draft. It also contains the instances defined in the NIST Round 4 submission. The test cases were generated using the NIST submissions reference implementation. Note that Classic McEliece (module cmce) is not the original McEliece Algorithm that is implemented in Botan's mce module. See the Classic McEliece homepage, for a brief comparison.

TODO Tracker

@coveralls
Copy link

coveralls commented Jan 11, 2024

Coverage Status

coverage: 91.85% (+0.1%) from 91.713%
when pulling 9a6f4df on Rohde-Schwarz:pqc/classic_mceliece
into 6945e90 on randombit:master.

@reneme reneme added this to the Botan 3.4.0 milestone Jan 11, 2024
@FAlbertDev FAlbertDev force-pushed the pqc/classic_mceliece branch 4 times, most recently from f73829f to e505390 Compare January 19, 2024 14:53
@atreiber94 atreiber94 force-pushed the pqc/classic_mceliece branch 2 times, most recently from 04b9314 to 190261d Compare January 19, 2024 15:59
@reneme reneme self-requested a review January 22, 2024 09:33
@FAlbertDev

This comment was marked as resolved.

@reneme reneme force-pushed the pqc/classic_mceliece branch 6 times, most recently from 7662592 to 2a3e60f Compare January 23, 2024 15:55
@reneme

This comment was marked as resolved.

Copy link
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mostly looks at cmce_decaps.cpp, cmce_encaps.cpp and cmce.cpp, noting many minor code style things and a few suggestions for alternative code structuring. Not looking into the Classic McEliece specifics at all here.

src/build-data/oids.txt Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_types.h Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_encaps.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_encaps.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_encaps.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_encaps.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_encaps.cpp Outdated Show resolved Hide resolved
src/lib/utils/strong_type.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass on cmce_field_orderings.cpp. I'm somewhat concerned that this isn't very efficient. But it may well be "good enough". Should we profile that?

src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_field_ordering.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More comments. I didn't look at cmce_parameter_set.*, cmce_parameters_* and cmce_poly.*, yet.

src/lib/utils/strong_type.h Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce.cpp Outdated Show resolved Hide resolved
Comment on lines 41 to 49
// TODO: Only for test instances. Remove on final PR
size_t m = Classic_McEliece_GF::log_q_from_mod(mod);

for(int i = static_cast<int>(m) - 2; i >= 0; --i) {
x ^= CT::Mask<uint32_t>::expand((uint32_t(1) << (i + m)) & x)
.if_set_return(static_cast<uint32_t>(mod.get()) << i);
}

return GF_Elem(static_cast<uint16_t>(x));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to remove (and probably replace by some exception?)

src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_matrix.cpp Outdated Show resolved Hide resolved
Comment on lines 247 to 258
Code_Word Classic_McEliece_Matrix::mul(const Classic_McEliece_Parameters& params, const Error_Vector& e) const {
auto s = e.subvector(0, params.pk_no_rows());
auto e_T = e.subvector(params.pk_no_rows());
auto pk_slicer = BufferSlicer(m_mat_bytes);

for(size_t i = 0; i < params.pk_no_rows(); ++i) {
auto pk_current_bytes = pk_slicer.take(params.pk_row_size_bytes());
auto row = secure_bitvector(pk_current_bytes, params.n() - params.pk_no_rows());
row &= e_T;
s.at(i) = s.at(i) ^ row.has_odd_hamming_weight();
}

BOTAN_ASSERT_NOMSG(pk_slicer.empty());
return s.as<Code_Word>();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This produces quite a few copies and allocations. If that's in the implementation's hot path, we should probably want to have another look. Here's which (I believe cause allocations/copies):

  1. .subvector() always creates a new bitvector and copies the content into a newly allocated buffer,
  2. the c'tor of bitvector copies the passed-in buffer (particularly pk_current_bytes
  3. .as<> produces a copy of the bitvector with a new type

For (3): This could already be the right type using e.subvector<Code_Word>(), no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some context: This function is called once per encapsulation. Also, e and s are pretty small, while m_mat_bytes is gigantic. Therefore, 1) and 3) are not critical; I'll apply your suggestion for 3) anyway. I don't know how we can prevent 2) without having something like a bitvector view or dropping the bitvector altogether.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll have a timeboxed look into a bitvector that doesn't own its underlying storage. Perhaps that isn't too hard to achieve, especially when we can limit it to byte-aligned subvectors.

Copy link
Collaborator

@reneme reneme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with a first pass on the implementation. Don't get put off by the number of comments. Most are just C++ style nits and smaller programming suggestions.

That's really good work! 😃

Comment on lines 48 to 57
/// Reduced instances for side channel analysis (Self-created test instance with
/// m=8, n=128, t=8, f(z)=z^8+z^7+z^2+z+1, F(y)=y^8+y^4+y^3+y^2+1)
/// Minimal instance without semi-systematic matrix creation and no plaintext confirmation
test,
/// Minimal instance with semi-systematic matrix creation and no plaintext confirmation
testf,
/// Minimal instance without semi-systematic matrix creation and with plaintext confirmation
testpc,
/// Minimal instance with semi-systematic matrix creation and with plaintext confirmation
testpcf
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: remove before merging

src/lib/pubkey/classic_mceliece/cmce_parameter_set.h Outdated Show resolved Hide resolved
Comment on lines 27 to 32
/**
* @returns ceil(n/d)
* TODO: Remove once LMS is merged
*/
constexpr size_t ceil_div(size_t n, size_t d) {
return (n + d - 1) / d;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-evaluate before merging this.

src/lib/pubkey/classic_mceliece/cmce_parameters.h Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_parameters.h Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_poly.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_poly.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/cmce_poly.cpp Outdated Show resolved Hide resolved
src/lib/pubkey/classic_mceliece/info.txt Outdated Show resolved Hide resolved
src/lib/utils/bit_ops.h Outdated Show resolved Hide resolved
@FAlbertDev
Copy link
Collaborator Author

Thanks a lot for your extensive review, @reneme! I addressed your suggestions and am optimistic that this PR is ready to drop its Draft status 🎉. Note that some suggestions that depend on other PRs are still open, which are not critical, though. Also, note that an extensive side-channel analysis is still in progress.

@FAlbertDev FAlbertDev marked this pull request as ready for review February 5, 2024 14:49
constexpr bitref& operator^=(bool other) noexcept { return assign(this->is_set() ^ other); }

private:
constexpr bitref& assign(bool bit) noexcept { return (bit) ? set() : unset(); }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, DATA identified this line as a leakage during our SCA review.

Due to the ? operator, a control-flow difference is observed based on the boolean input bit variable.

The assign() routine is used within the push_back() routine, which in turn is used within the decode() routine. This may allow an adversary to observe the error vector e and, hence recover the shared secret.

We suggest to perform both the set() and unset() functions with the input as a mask, for example:

         private:
            constexpr bitref& assign(bool bit) noexcept {
                const block_type assign_mask = 0 - static_cast<block_type>(bit);
                this->m_block |=  (this->m_mask &  assign_mask);
                this->m_block &= ~(this->m_mask & ~assign_mask);
                return *this;
            }

In our case, this results in the following instructions - without any conditional branch based on the input:

[ ... ] 
                                         const block_type assign_mask = 0 - static_cast<block_type>(bit);
41d397: 41 f7 dc                   neg %r12d
[ ... ]
                                         this->m_block |= (this->m_mask & assign_mask);
41d3ab: 44 89 e1                   mov %r12d,%ecx
41d3ae: 21 c1                      and %eax,%ecx
                                         this->m_block &= ~(this->m_mask & ~assign_mask);
41d3b0: f7 d0                      not %eax
                                         this->m_block |= (this->m_mask & assign_mask);
41d3b2: 0a 0a                      or (%rdx),%cl
                                         this->m_block &= ~(this->m_mask & ~assign_mask);
41d3b4: 44 09 e0                   or %r12d,%eax
41d3b7: 21 c8                      and %ecx,%eax
[ ... ]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your analysis and your report! I applied your fix using Botan's constant-time helper class (7a0fe59)

@@ -1,6 +1,8 @@
/*
* An abstraction for an arbitrarily large bitvector that can
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reneme the code still contains some artefacts for varying the underlying data type that we'd ideally want to get rid of.

@FAlbertDev FAlbertDev force-pushed the pqc/classic_mceliece branch 2 times, most recently from 393f698 to 8804b64 Compare May 27, 2024 13:46
@FAlbertDev
Copy link
Collaborator Author

@FAlbertDev This needs another rebase. Also, with #3985 merged, it'll need to implement the new method Public_Key::raw_public_key_bits() as well as the method PK_Key_Generation_Test::public_key_from_raw() in the CMCE_Generic_Keygen_Tests test case.

Done! @reneme Do you still want to reduce the test times further for the --run-long-test? Without a --nightly test CLI option or something similar, I see no elegant way to do so without cutting out KATs. I don't know if we want to introduce multiple layers of long tests, though.

@randombit
Copy link
Owner

Looks like another rebase is required, have not investigate the reason

@randombit
Copy link
Owner

Re the long tests we should probably just go ahead with a --nightly or some similar option. Doesn't have to happen in this PR.

@FAlbertDev
Copy link
Collaborator Author

Rebased to master.

@FAlbertDev
Copy link
Collaborator Author

Cleaned-up history.

Copy link
Owner

@randombit randombit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a partial review I have yet to get into the meat of the implementation

* - Relatively slow key generation
* - Algorithm is complex and hard to implement side-channel resistant
*/
class BOTAN_PUBLIC_API(3, 4) Classic_McEliece_PublicKey : public virtual Public_Key {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should bump this to match 3,5 (optimistically)


protected:
std::shared_ptr<Classic_McEliece_PublicKeyInternal>
m_public; // NOLINT(misc-non-private-member-variables-in-classes)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Readability nit I think this would be better with // NOLINTNEXTLINE so the declaration itself isn't split

BOTAN_DIAGNOSTIC_PUSH
BOTAN_DIAGNOSTIC_IGNORE_INHERITED_VIA_DOMINANCE

class BOTAN_PUBLIC_API(3, 4) Classic_McEliece_PrivateKey final : public virtual Classic_McEliece_PublicKey,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3,5


namespace {
// Only for moduli 0b0010000000011011 and 0b0001000000001001
inline CmceGfElem internal_reduce(uint32_t x, CmceGfMod mod) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for mod to be a compile-time template parameter instead?

Alternately, if CmceGfMod could only take on these specific values we could be assured that unexpected inputs could not occur.

}

Classic_McEliece_GF Classic_McEliece_GF::inv() const {
// Compute the inverse using fermat's little theorem: a^(q-1) = 1 => a^(q-2) = a^-1
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fermat

src/lib/utils/bitvector.h Outdated Show resolved Hide resolved
/**
* Renders this bitvector into a sequence of "0"s and "1"s.
*/
std::string to_string() const {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would this be useful? If just for tests/debugging maybe better to define it elsewhere

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's mostly meant for debugging, yes. Where would you prefer to have this implemented?
Why is that bad to have as a member? It's an internal class, after all.

I added a note to the docstring for now.

src/lib/utils/bitvector.h Outdated Show resolved Hide resolved
/**
* @returns true iff no bit is set
*/
bool none() const {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This convention of foo plus ct_foo seems very easy to make a mistake and call foo in the wrong context and introduce a side channel.

In newer code (eg in pcurves) I've moved to foo is the constant time implementation, and if a variable time foo is required this is explicit as eg vartime_foo or foo_vartime. The explicit vartime jumps out at the reader and it's much harder to accidentally be variable time when not intended.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now applied this convention to none(), all(), any() and equals() and removed the ct_ prefix from hamming_weight() and has_odd_hamming_weight() (without providing a ct-alternative). All other (non re-allocating) methods should be CT (review appreciated for this statement).

template <std::unsigned_integral BlockT>
requires(is_byte_aligned())
size_t is_memory_aligned_to() const {
void* ptr = const_cast<void*>(static_cast<const void*>(m_source.as_byte_span().data() + read_bytepos()));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the const_cast here? Probably worth a comment

Copy link
Collaborator

@reneme reneme Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::align is treating ptr as an out-param (taking it as void*& 😨). The data pointed to by ptr isn't changed, but it re-sets ptr to the next aligned address. I changed the order of operations a little and added a comment.

@reneme
Copy link
Collaborator

reneme commented Jun 28, 2024

Rebased to master, after #4154 got merged.

@randombit randombit modified the milestones: Botan 3.5.0, Botan 3.6.0 Jul 6, 2024

std::vector<Test::Result> test_bitvector_bitwise_accessors() {
return {
Botan_Tests::CHECK("default constructed bitvector",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Botan_Tests:: prefix is unnecessary and causes more indentation than would otherwise be used

result.confirm("1", bv.at(1));
result.confirm("2", bv.at(2));
result.confirm("3", bv.at(3));
result.confirm("4", !bv.at(4));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this test, and many of the others here, could usefully be randomized.

  1. Choose a random bit vector length n
  2. Choose a hamming weight h between 0 and n
  3. Choose h of n indexes randomly
  4. Iterate over n verify that all are either set or unset appropriately

I don't insist on it I guess but it just feels odd to me to test this one situation (set 1,2,3 otherwise unset) especially when it skips edge cases like the topmost bit. Currently (Lines of code) / (Signal of things working correctly if this test passes) feels quite high.

template <typename T>
auto random_pc(Test::Result& result) {
std::array<uint8_t, sizeof(T)> buf;
Test::rng().randomize(buf);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can use random_array from #4152

FAlbertDev and others added 4 commits July 12, 2024 14:40
- constant time conditional swap with mask
- floor_log2

Co-Authored-By: Amos Treiber <[email protected]>
This is an implementation of the Classic McEliece KEM according to the
NIST Round 4 submission and the ISO draft 20230419.

Co-Authored-By: Amos Treiber <[email protected]>
Comment on lines +109 to +120
/**
* @brief Create permutation pi as in (Section 8.2, Step 3).
*/
CmcePermutation create_pi(std::span<const uint32_t> a) {
auto a_pi_zipped = enumerate(a);
CMCE_CT::bitonic_sort_pair(std::span(a_pi_zipped));

CmcePermutation pi_sorted;
std::tie(a, pi_sorted.get()) = unzip(std::span(a_pi_zipped));

return pi_sorted;
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to me: This method is messed up. We want to return the sorted a values too, otherwise the std::adjacent_find will not find repetitions. Also add a regression test for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I just saw we used to have secure_vector<uint32_t>& a as input

Comment on lines +103 to +110
pivot_indices.at(row_idx) = count_lsb_zeros(row_acc);

// Add subsequent rows to the current row, until the pivot
// bit is set.
for(size_t next_row = row_idx + 1; next_row < Classic_McEliece_Parameters::mu(); ++next_row) {
sub_mat.at(row_idx).ct_conditional_xor(!sub_mat.at(row_idx).at(pivot_indices.at(row_idx)).as_choice(),
sub_mat.at(next_row));
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to future me: This seems to be a side-channel. (conditional memory access based on pivot_idices.at(row_idx))

Co-Authored-By: Fabian Albert <[email protected]>
This allows extracting subvectors of the bitvector as unsigned
integral bit masks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants