Add EC_Scalar and EC_AffinePoint types #4042

randombit · 2024-05-05T19:28:22Z

This is the other side of #3979 where we push the interface that's exposed to the rest of the library to a non-BigInt oriented model.

#4027

coveralls · 2024-05-05T20:21:33Z

coverage: 91.693% (-0.03%) from 91.726%
when pulling 8042e9d on jack/new-ec-types
into 7fad1d2 on master.

reneme · 2024-06-11T10:08:44Z

I will go through this in the coming days. Now with the underlying PR merged. 👍

reneme

A first review pass. Looks really good! Especially the simplified algorithm implementations!

Left some nits and suggestions. There's one finding that might point to a missing test, though.

reneme · 2024-06-12T07:07:20Z

src/lib/pubkey/ec_group/ec_inner_data.h

+
+         BufferStuffer stuffer(bytes);
+         stuffer.append(y_is_odd ? 0x03 : 0x02);
+         stuffer.append(std::span{m_xy}.last(fe_bytes));


Compressed encoding is oddness-header + x-value, no? Surprised that it didn't fail any tests.

Suggested change

-         stuffer.append(std::span{m_xy}.last(fe_bytes));
+         stuffer.append(std::span{m_xy}.first(fe_bytes));

See also: #4042 (comment)

LOL WTF that definitely should have caused test failures, need to look at this

Ah OK the reason this was missed is that currently this serialization logic is unused, because keys/etc are still using EC_Point.

We're in for a somewhat awkward situation in the short term as we'll have 3 distinct elliptic curve point types - original flavor EC_Point, this EC_AffinePoint, plus the (internal only) pcurves AffinePoint, with a lot of redundant/duplicated logic.

For 3.6.0 onward the situation improves:

The various pubkey algorithms switch to storing their points as EC_AffinePoint. At that point EC_Point becomes unused except for testing; unfortunately we can't remove it completely until Botan 4. Maybe we'll be able to segregate it into a deprecated submodule of ec_group though.

EC_AffinePoint gains a bridge to pcurve. This affords removing some redundancies between pcurves and what we actually need in practice. For example pcurves has scalar_from_bits_with_trunc but this is just modular reduction plus some shifts. It'll likely be simpler, once the bridge exists, to remove that and do the shifting in some higher point in the call stack so the logic is shared between the pcurves and BigInt based approaches.

src/lib/pubkey/ec_group/ec_inner_data.h

src/lib/pubkey/gost_3410/gost_3410.cpp

src/lib/pubkey/rfc6979/rfc6979.cpp

reneme · 2024-06-12T11:11:36Z

src/lib/pubkey/rfc6979/rfc6979.cpp

 const BigInt& RFC6979_Nonce_Generator::nonce_for(const BigInt& m) {
-   m.serialize_to(std::span{m_rng_in}.subspan(m_rlen));
+   std::vector<uint8_t> m_bytes(m_rlen);
+   m.serialize_to(m_bytes);
+   return this->nonce_for(m_bytes);
+}


Not sure it's worth omitting the extra allocation, but you could serialize straight into m_rng_in and then establish an internal method that just assumes m_rng_in is properly set up. Similarly, the std::span<> overload could just copy_mem into m_rng_in and then also call this private method.

Like so:

m.serialize_to(std::span{m_rng_in}.subspan(m_rlen));
return this->generate_nonce();

While we're at it: it might be worthwhile to keep the span "std::span{m_rng_in}.subspan(m_rlen)" as a member variable to avoid misuse (by malforming the subspan() invocation) here and in the other overload.

Good idea. I think the original version of this was written before we finalized on the write-to-span idiom for BigInt.

Actually looked at this more and it doesn't seem worth bothering with.

There are two uses of RFC 6979 right now, ECDSA and DSA.

ECDSA uses the bytes variant (or will, with this PR merged)

DSA actually sets up a complete new RFC6979_Nonce_Generator object for each signature! Saving one extra allocation is the least of our worries 🙈

Any future users of RFC 6979 (e.g. ECGDSA) will use the bytes interface, so this second overload of nonce_for really is (or will become) DSA specific at this point.

I think this can be cleaned up further - in particular in the long run I'd like to avoid the multiple conversions between BigInt and EC_Scalar implied by

const auto k = EC_Scalar::from_bigint(m_group, m_rfc6979->nonce_for(m.serialize()));

It's interesting that as code becomes faster, tiny optimizations become more meaningful. We're already under 250K cycles for ECDSA signatures for secp256r1 (for pcurves). At this point saving even a thousand cycles can be noticeable in the overall throughput.

I left a todo regarding this in #4027

I was somewhat surprised in #4143 that ECDSA once bridged was slower than our "demo" ECDSA directly using pcurves. I initially ascribed it to overhead of the abstractions but it turns out almost all of the overhead is RFC 6979. Almost 50K cycles on my laptop. I think, while implementing this in terms of HMAC_DRBG was convenient, it's not very fast...

randombit · 2024-06-16T13:36:44Z

BTW I should mention in terms of reviewing, the overall structure behind the new types isn't that important since it's going to change quite a bit to handle the possibility of different backend implementations. That part is literally just the first thing that came to mind that works. Whereas the EC_Scalar and EC_AffinePoint interfaces, and how they are used to implement the various schemes, are IMO "final" and so deserve more scrutiny.

randombit · 2024-06-22T11:35:43Z

OK with eb4f1bd in place, previous comment can be disregarded. I think this is more or less the "final" [*] approach for implementing the new scalar and point types. You can see in #4143 what the bridge looks like for pcurves.

[*] Not ideal in any way, I'd like there to be fewer allocations, dynamic_cast, etc in here - but it works and doesn't seem to have that much overhead in practice. I expect we'll iterate on this in the future. A lot of what I'd like to do isn't possible until we've eliminated the real weirdo curves and all the deprecated interfaces, EC_Point, etc. [**]

[**] Not technically true - we could get away with a third internal EC implementation - classic BigInt approach, pcurves, plus a third generic curve but bounded size impl. But it just doesn't seem worth it from a complexity perspective. I expect that in Botan 4, with the restrictions on curves in place, we'll transition away from BigInt entirely within EC.

randombit · 2024-06-25T12:37:26Z

Deferring to 3.6 - it's better that all of this cook in master for a few months before we commit to anything wrt public API

reneme · 2024-06-25T10:04:23Z

src/lib/pubkey/rfc6979/rfc6979.cpp

+         uint8_t carry = 0;
+         for(size_t i = 0; i != m_rng_out.size(); ++i) {
+            const uint8_t w = m_rng_out[i];
+            m_rng_out[i] = (w >> shift) | carry;
+            carry = w << (8 - shift);
+         }


Not at all sure this is worth it, performance wise. But we could do the shifting on 64-bit words (for instance) for as long as possible (I'm assuming that m_rlen isn't necessarily divisible by 8).

Suggested change

-         uint8_t carry = 0;
-         for(size_t i = 0; i != m_rng_out.size(); ++i) {
-            const uint8_t w = m_rng_out[i];
-            m_rng_out[i] = (w >> shift) | carry;
-            carry = w << (8 - shift);
-         }
+         uint8_t carry = 0;
+         BufferSlicer slicer(m_rng_out);
+         BufferStuffer stuffer(m_rng_out);
+         while(slicer.remaining() >= 8) {
+            const auto w = load_be(slicer.take<8>());
+            stuffer.append(store_be((w >> shift) | carry));
+            carry = w << (64 - shift);
+         }
+         while(!slicer.empty()) {
+            const uint8_t w = slicer.take_byte();
+            stuffer.append((w >> shift) | carry);
+            carry = w << (8 - shift);
+         }

The BufferTransformer proposed in #3942 could slim this down.

In more general terms: this tiered transformation could even be a feature of the BufferTransformer, where one passes a lambda and the BufferTransformer takes care of the load/stores and the optimal strides. Like so:

BufferTransformer bt(m_rng_out);
bt.transform([carry = 0, shift](std::unsigned_integral auto i) mutable -> decltype(i) {
   const auto r = (i >> shift) | carry;
   carry = i << (sizeof(i) * 8 - shift);
   return r;
});

... just thinking out loud, I'm aware that this might be a vast over-engineering. However, I believe, in #3883 I could use this for the bitvector<> implementation that needs something similar for its bit operations.

Also, e.g. for block ciphers BT::transform could be overloaded with a statically-sized span to factor out the loop and buffer handling.

In this specific case, this shift only occurs for curves where the bit length of the group order is not a multiple of 8, which consists of a set of weird curves (P239, etc., almost all of which are deprecated already) plus P-521. So here I don't think it matters much, especially when HMAC_DRBG is so expensive.

However I do think this general concept is worth exploring. A related (but possibly dissimilar enough that it's not worth attempting to handle in the same abstraction) issue is in the mp code where we have explicit unrolling for 8 and sometimes 4 words for various algorithms followed by a loop that handles the tail one word at a time. If we did this systematically for 16,8,4,2,1 word increments, that would likely lead to some nice performance improvements.

For the record: This draft of BufferTransformer actually demonstrates how we could systematically do the tiered block processing. Along those lines:

BufferTransformer bt(m_rng_out);
uint8_t carry = 0;
bt.process_blocks_of<8, 1>([&]<size_t s>(std::span<const uint8_t, s> in, std::span<uint8_t, s> out) {
   const auto i = load_be(in);
   using block_type = decltype(i);
   const auto r = static_cast<block_type>((i >> shift) | static_cast<block_type>(carry));
   carry = i << (sizeof(block_type) * 8 - shift);
   store_be(out, r);
});

randombit · 2024-07-10T07:17:39Z

@reneme Was not sure if you were planning further review of this

reneme · 2024-07-10T08:52:58Z

Was not sure if you were planning further review of this

I believe I wasn't fully done but no need to block the merge.

randombit · 2024-07-10T11:38:07Z

OK I'm going to go ahead and merge. Feel free to do a retro-review anytime, we have plenty of time between now and 3.6

randombit requested a review from reneme May 5, 2024 19:28

randombit changed the title ~~Jack/new ec types~~ Add EC_Scalar and EC_AffinePoint types May 5, 2024

randombit mentioned this pull request May 5, 2024

Replace BigInt based elliptic curve library #4027

Open

17 tasks

randombit force-pushed the jack/new-ec-types branch 2 times, most recently from 03b58cd to bae2ce8 Compare May 5, 2024 19:54

randombit force-pushed the jack/new-ec-types branch 4 times, most recently from a70db6d to b09a762 Compare May 6, 2024 11:55

randombit force-pushed the jack/new-ec-types branch 6 times, most recently from 4db015d to b594b6b Compare May 19, 2024 11:21

randombit added this to the Botan 3.5.0 milestone May 22, 2024

randombit force-pushed the jack/new-ec-types branch 3 times, most recently from d5fa5d9 to 94c81d6 Compare May 29, 2024 10:17

reneme mentioned this pull request May 30, 2024

Add library for compile time instantiation of elliptic curves #3979

Merged

randombit commented Jun 3, 2024

src/lib/pubkey/ec_group/ec_apoint.h

src/lib/pubkey/ec_group/ec_group.h

src/lib/pubkey/ec_group/ec_inner_data.h

src/lib/pubkey/ec_group/ec_scalar.h

randombit force-pushed the jack/new-ec-types branch 4 times, most recently from 9d2a5c2 to a66bee9 Compare June 11, 2024 09:38

reneme reviewed Jun 12, 2024

randombit force-pushed the jack/new-ec-types branch 2 times, most recently from 2450e74 to 5f46292 Compare June 15, 2024 13:13

randombit force-pushed the jack/new-ec-types branch 5 times, most recently from 08980de to eb4f1bd Compare June 21, 2024 14:36

randombit modified the milestones: Botan 3.5.0, Botan 3.6.0 Jun 25, 2024

reneme reviewed Jun 26, 2024

reneme mentioned this pull request Jun 26, 2024

Chore: BufferTransformer and Blowfish Refactor #4151

Draft

randombit force-pushed the jack/new-ec-types branch 2 times, most recently from f78faf5 to 2ada5e5 Compare July 10, 2024 10:51

randombit and others added 10 commits July 10, 2024 08:38

Add new EC_Scalar and EC_AffinePoint types

502b49c

These have a much more restrictive interface, as compared to our existing EC_Point and BigInt

Co-authored-by: René Meusel <[email protected]>

Modify RFC6979 generator to accept EC_Scalar

a311bb3

Co-Authored-By: René Meusel <[email protected]>

Use new EC types in ECDSA

19ea895

Co-authored-by: René Meusel <[email protected]>

Use new EC types in ECKCDSA

9758800

Co-authored-by: René Meusel <[email protected]>

Use new EC types in ECGDSA

0b7a833

Use new EC types in GOST 34.10

460ac4a

Co-authored-by: René Meusel <[email protected]>

Use new EC types in SM2

621f9d5

Use new EC types in ECDH

056fcc5

Use new EC types in ECIES

5b1d15b

Hide EC Scalar and Point data behind a virtual interface

8042e9d

This will allow switching in pcurves later

randombit force-pushed the jack/new-ec-types branch from 2ada5e5 to 8042e9d Compare July 10, 2024 12:40

randombit merged commit 45be74e into master Jul 10, 2024
39 checks passed

randombit deleted the jack/new-ec-types branch July 10, 2024 13:24
