On Mon, Dec 22, 2025 at 12:23 AM Collin Funk <[email protected]> wrote:
>
> Pádraig Brady <[email protected]> writes:
>
> > On 21/12/2025 00:02, Collin Funk wrote:
> >> Hi,
> >> I'm considering changing the crypto modules to use the OpenSSL EVP
> >> APIs
> >> [1].
> >> A recent coreutils bug report found that 'sha256sum', etc. were not
> >> using SHA-NI instructions despite being supported by the CPU [2]. The
> >> cause of this was OpenSSL 3.6 silently removing
> >> "__attribute__ ((__constructor__)" on the function which calls cpuid
> >> (or equivalent instruction for non-X86 machines) [3][4].
> >
> > Did they need to remove the constructor for some reason,
> > or is it just a bug in openssl where they inadvertently dropped that?
>
> The full rationale can be found from this PR [1]. It was prompted by
> RISC-V's OPENSSL_cpuid_setup depending on BIO_snprintf which was not yet
> initialized when using the FIPS module, leading to a crash.
>
> >> Before that change the CPU features would be detected upon loading the
> >> shared library. Then the deprecated, yet still commonly used,
> >> $DIGEST_(Init|Update|Final) APIs would use them. After the change, this
> >> is not the case. One would have to call OPENSSL_init_crypto explicitly,
> >> which is not recommended [5], or use the EVP APIs which do that
> >> automatically.
> >
> > Looking at [5] is interesting. We saw previously that using
> > the EVP APIs resulted in an extra 4500 allocations!
> > https://lists.gnu.org/archive/html/bug-gnulib/2025-09/msg00058.html
> > [5] also suggests though that if we explicitly init early enough,
> > we might be able to pass OPENSSL_INIT_NO_ADD_ALL_CIPHERS or
> > OPENSSL_INIT_NO_LOAD_CONFIG etc. to avoid some of the overhead?
>
> Ouch, I should have revisited that thread. That is a lot of allocations.
>
> I'll do some experimenting with the OPENSSL_init_crypto flags and see if
> there is a light weight one that does OPENSSL_cpuid_setup and not much
> more.
>
> Collin
>
> [1] https://github.com/openssl/openssl/pull/27466

There may be another way, if you are willing to do extra work.  You
can use the Cryptogams sources directly for the high-speed crypto.  The
rub is that you have to run the Perl scripts yourself to generate the
assembly sources from the *.pl files.  You also have to add your own
cpuid code and decide at run time when you can call the optimized
routine.  But you avoid _all_ of the allocations that OpenSSL does.
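
To make the "add your own cpuid code" step concrete, here is a rough
sketch of the detection and dispatch for x86 with GCC or Clang.  It
assumes the compiler's <cpuid.h> helper __get_cpuid_count, and it checks
the SHA extensions flag (CPUID leaf 7, subleaf 0, EBX bit 29).  The names
sha256_block_generic, sha256_block_shaext, sha256_block_fn and
select_sha256_block are hypothetical placeholders, not Cryptogams or
gnulib symbols; the real entry points and their prototypes come from
whatever the Perl scripts generate.

```c
#include <stdbool.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>   /* GCC/Clang helper header; an assumption of this sketch */
#endif

/* Return true if the CPU reports the SHA extensions (SHA-NI).
   On non-x86 targets this sketch simply reports false.  */
static bool
cpu_has_sha_ni (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  /* CPUID leaf 7, subleaf 0: EBX bit 29 is the SHA feature flag.
     __get_cpuid_count returns 0 if the leaf is not supported.  */
  if (__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return (ebx >> 29) & 1;
#endif
  return false;
}

/* Hypothetical signature for a SHA-256 block function.  */
typedef void (*sha256_block_fn) (unsigned int state[8],
                                 const void *data, size_t nblocks);

/* Placeholders standing in for the portable C routine and for the
   generated assembly routine.  They do nothing here.  */
static void
sha256_block_generic (unsigned int state[8], const void *data,
                      size_t nblocks)
{
  (void) state; (void) data; (void) nblocks;
}

static void
sha256_block_shaext (unsigned int state[8], const void *data,
                     size_t nblocks)
{
  (void) state; (void) data; (void) nblocks;
}

/* Pick the implementation once, based on the detected feature.  */
static sha256_block_fn
select_sha256_block (bool has_sha_ni)
{
  return has_sha_ni ? sha256_block_shaext : sha256_block_generic;
}
```

A caller would do something like
"sha256_block_fn f = select_sha256_block (cpu_has_sha_ni ());" once at
startup and then call f for every block.  No allocation is involved, and
the check does not depend on a library constructor having run.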

For an example using AES, see
<https://wiki.openssl.org/index.php/Cryptogams_AES> on the old OpenSSL
wiki.  SHA is not much different from AES.

And nowadays, I believe you can avoid the OpenSSL sources and use Andy
Polyakov's Cryptogams sources directly from
<https://github.com/dot-asm/cryptogams>.

Jeff
