Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding a runtime dis/enabler of DIT Capability on AArch64. #1783

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

nebeid
Copy link
Contributor

@nebeid nebeid commented Aug 20, 2024

Issues:

Addresses #CryptoAlg-2503

Description of changes:

  • Provide runtime functions that mask out (and back in) the DIT CPU capability by clearing (setting) an additional bit in OPENSSL_armcap_P. This mechanism was chosen for the following reasons:

    • It does not require an additional global variable.
    • It avoids extra checks on the path of setting/resetting the DIT bit.
    • It avoids re-evaluating the CPU capability if we were to clear the DIT capability bit itself. That latter bit is now left intact.
      There were write locks added around changing OPENSSL_armcap_P. However, Thread Sanitizer warned about data race possibilities when trying to run a test with concurrent threads where one disables DIT at runtime and the other tries to check for the capability. Therefore, they are documented with a warning to use them only in initialization contexts.
  • Make the DIT functions (enable/disable and set/restore) available regardless of whether the build flag DENABLE_DATA_INDEPENDENT_TIMING_AARCH64=ON was used or not.

    • If the build flag was not used, then the DIT flag is not set and reset with every function and the instructions used for setting it and resetting after checking the capability are omitted and don't incur extra cost.
    • The user now has the choice, regardless of the build flag, to place armv8_set/restore_dit in the user's code.

Call-outs:

  • The (external) API armv8_enable_dit is renamed to armv8_set_dit.
  • armv8_enable_dit now means enable back the capability at runtime.

Testing:

  • The new functions armv8_disable_dit() and arm_enable_dit() were placed in Speed(), the benchmarking function, and their effects were confirmed on Apple M2.
  • There are also thread tests to confirm that the CPU DIT bit is context-switched at the thread level. The runtime dis/enabler is at the process scope because it manipulates OPENSSL_armcap_P.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

@nebeid nebeid requested a review from a team as a code owner August 20, 2024 20:30
@codecov-commenter
Copy link

codecov-commenter commented Aug 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.43%. Comparing base (7090b90) to head (434e11e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1783      +/-   ##
==========================================
- Coverage   78.46%   78.43%   -0.03%     
==========================================
  Files         585      585              
  Lines       99457    99457              
  Branches    14236    14237       +1     
==========================================
- Hits        78038    78013      -25     
- Misses      20784    20805      +21     
- Partials      635      639       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@WillChilds-Klein WillChilds-Klein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see changes to cpu_aarch64_linux.c and cpu_aarch64_apple.c. do we need to udpate cpu_aarch64_win.c as well?

CMakeLists.txt Show resolved Hide resolved
@@ -2682,10 +2681,12 @@ bool Speed(const std::vector<std::string> &args) {
}
}
#if defined(DIT_OPTION)
armv8_disable_dit(); // disable DIT capability at run-time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this call?

@@ -2545,8 +2544,8 @@ static const argument_t kArguments[] = {
{
"-dit",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if this flag is set on non-aarch64 platform?

@@ -90,7 +90,15 @@
#define ARMV8_NEOVERSE_V2 (1 << 14)

// ARMV8_DIT indicates support for the Data-Independent Timing (DIT) flag.
#define ARMV8_DIT (1 << 14)
#define ARMV8_DIT (1 << 15)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this indicates that DIT is supported on the current CPU, right? if so, suggest making the name more descriptive like ARMV8_DIT_SUPPORTED

// (TODO): See if we can detect the DIT capability in Windows environment

// armv8_enable_dit sets the DIT flag to 1 and returns its original value
// armv8_set_dit sets the DIT flag to 1 and returns its original value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which DIT flag? we have two now.

// before it was called.
uint64_t armv8_enable_dit(void);
OPENSSL_EXPORT uint64_t armv8_set_dit(void);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to make the caller track this? can we store it ourselves as a thread-local variable?

Copy link
Contributor

@justsmth justsmth Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, armv8_set_dit and armv8_restore_dit should be internal-only implementation details. They shouldn't be declared in this file. Consumers use-cases can be handled by armv8_disable_dit and armv8_enable_dit.

@@ -82,11 +82,20 @@ uint64_t armv8_enable_dit(void) {
}

void armv8_restore_dit(volatile uint64_t *original_dit) {
if (CRYPTO_is_ARMv8_DIT_capable() && *original_dit != 1) {
if (*original_dit != 1 && CRYPTO_is_ARMv8_DIT_capable()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (*original_dit != 1 && CRYPTO_is_ARMv8_DIT_capable()) {
if (original_dit != NULL && *original_dit != 1 && CRYPTO_is_ARMv8_DIT_capable()) {

#endif // MAKE_DIT_AVAILABLE && !OPENSSL_WINDOWS

void armv8_disable_dit(void) {
OPENSSL_armcap_P &= ~ARMV8_DIT_ALLOWED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here and for armv8_enable_dit, is this operation thread-safe? not sure what the HIDDEN macro expands to, but it's not clear that OPENSSL_armcap_P is thread-local.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OPENSSL_armcap_P is a global variable of type uint32_t which is shared a cross the entire address space.

Firstly, since the memory is shared it doesn't allow a thread to control whether it wants to enable DIT mode or not; its choice can be overwritten by any other thread.

Secondly, multiple writes/reads can now occur concurrently on OPENSSL_armcap_P since it's mostly just referenced directly and there therefore aren't any synchonisation. This was fine because it was only really readable after the initial construction of the memory content (the initialisation of the arm capability vector). This PR breaks that assumption.

Finally, is this a "threading issue". Well, yes and no. One thread can read the value while another writes to it. The writing operation only mutate a single byte of the 32-bit value. So, the value when read is either the bit set or not. It won't be a bogus value. So, no. On the other hand, from the first point I made above, any thread can mutate the value. So one thread can enable, while another immediately afterwards can disable the DIT. But that might just be what we can build - DIT is a process global CPU control no?

I think we weakly considered OPENSSL_armcap_P nominally const after the initial initialisation. It might not be the ideal memory to store this information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from the first point I made above, any thread can mutate the value. So one thread can enable, while another immediately afterwards can disable the DIT.

yes, this is my concern.

But that might just be what we can build - DIT is a process global CPU control no?

My understanding from nevine's description is that it can be configured per-thread. I suppose we could leave it up to the caller to not call these functions from >1 thread, and document this limitation very clearly. I don't think this would be an issue for ACCP, as we'd just set or unset DIT once on initialization, but it could be a concern for other consumers.

…isable is used for the (perceived) CPU capability.
@nebeid nebeid force-pushed the dit-flag-2 branch 3 times, most recently from 67b68c1 to 37f707e Compare September 13, 2024 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants