Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DRAFT] feat: begin modernization of symmetric crypto [#268][#289][#357] #411

Open
wants to merge 60 commits into
base: main
Choose a base branch
from

Conversation

rmlibre
Copy link
Contributor

@rmlibre rmlibre commented Jul 16, 2024

Description

Generally, this PR is aimed at bringing best practices to the handling of secrets. That is done through making them more inaccessible, or by clearing them from memory as soon as possible. This PR also brings domain separation, input canonicalization, context commitment, & authenticated associated data to distinct encryption / decryption contexts.

More details & updates to come. Comments & reviews are open.

Domain Separation:

When using cryptography to authenticate data, or derive secrets, etc, it's important to tie those processes to their intended usage contexts by declaring the context with an unambiguous label. This allows both the production & consumption of cryptographic artifacts to ensure that they're in agreement over the label (domain) the artifact is supposed to be used within.

References:

What is meant by domain separation ...
What non-trivial benefit does including ...
Separate Your Domains: NIST PQC ...

Input Canonicalization:

Cryptography works best when its users are 100% sure & clear about what information they're processing. That can be subtlety, and surprisingly, challenging to do.

Example:

Say we're not using canonical input encoding, and we're hashing (H) a username: "green" & an email address: "housekeeper[at]email[dot]com". Here, if the delineation is unclear, then the information is not clear. Another user with username: "greenhouse" and email address: "keeper[at]email[dot]com" will produce the same hash.

H("green" + "[email protected]") == H("greenhouse" + "[email protected]")

References:

What is Canonicalization Attack?
Canonicalization Attacks Against ...

AEAD & Context Commitment:

Confusingly, the two most widely deployed authenticated ciphers (AES-GCM & ChaCha20-Poly1305) do not create ciphertexts which commit to a key. Which means that knowing the key allows one to create a single ciphertext which successfully decrypts to various different plaintext messages. In most cases this isn't considered an issue because it requires knowledge of the key. But, it does violate intuitive notions of what it means for something to be authentic when forgeries are easy to produce. It has also been shown to lead to practical problems:

Partitioning oracles can arise when encryption schemes are not committing with respect to their keys. We detail adaptive chosen ciphertext attacks that exploit partitioning oracles to efficiently recover passwords and de-anonymize anonymous communications.

~ Source: USENIX Security 2021

Because Fernet uses HMAC-SHA-256, instead of the universal hashes of the previously discussed ciphers, it doesn't suffer from this problem. But, it also doesn't have a direct interface for authenticating domain labels or other associated data. Borrowing partially from Efficient Schemes for Committing ..., which proposes deriving a fresh key by running a keyed PRF over all of the inputs, it can elegantly support the authentication of additional values.

References:

Key Committing AEADs
How to Abuse and Fix Authenticated Encryption ...
Succinctly-Committing Authenticated Encryption
Key commitment in GCM (or AEAD in general)

Remediations

... detailed descriptions coming soon


Passing Workflows:

@rmlibre rmlibre marked this pull request as ready for review July 16, 2024 10:21
Copy link
Collaborator

@jeremywmoore jeremywmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a first pass review. Another pass may yield some more suggestions but I think there are some things to work out around managing the salt.

hushline/__init__.py Outdated Show resolved Hide resolved
hushline/__init__.py Outdated Show resolved Hide resolved
hushline/__init__.py Outdated Show resolved Hide resolved
hushline/crypto/secrets_manager.py Outdated Show resolved Hide resolved
hushline/crypto/secrets_manager.py Outdated Show resolved Hide resolved
hushline/crypto/secrets_manager.py Show resolved Hide resolved
hushline/crypto/secrets_manager.py Outdated Show resolved Hide resolved
Copy link

gitguardian bot commented Jul 18, 2024

⚠️ GitGuardian has uncovered 6 secrets following the scan of your pull request.

Please consider investigating the findings and remediating the incidents. Failure to do so may lead to compromising the associated services or software components.

Since your pull request originates from a forked repository, GitGuardian is not able to associate the secrets uncovered with secret incidents on your GitGuardian dashboard.
Skipping this check run and merging your pull request will create secret incidents on your GitGuardian dashboard.

🔎 Detected hardcoded secrets in your pull request
GitGuardian id GitGuardian status Secret Commit Filename
- - Generic High Entropy Secret 7f35ab2 docker-compose.yaml View secret
- - Generic High Entropy Secret d6af3e3 dev_env.sh View secret
- - Generic High Entropy Secret 0924906 dev_env.sh View secret
- - Generic High Entropy Secret 7f35ab2 dev_env.sh View secret
- - Generic High Entropy Secret 294338b dev_env.sh View secret
- - Generic High Entropy Secret 294338b docker-compose.yaml View secret
🛠 Guidelines to remediate hardcoded secrets
  1. Understand the implications of revoking this secret by investigating where it is used in your code.
  2. Replace and store your secrets safely. Learn here the best practices.
  3. Revoke and rotate these secrets.
  4. If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.

To avoid such incidents in the future consider


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

hushline/model.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@jeremywmoore jeremywmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does having a single shared salt compare to generating a per-user salt? Does using a kdf which incorporates the domain (and possible aad) obviate this need?

Although since we may still want to generate a per-user salt for passwords to combat precompute attacks.

hushline/__init__.py Outdated Show resolved Hide resolved
hushline/crypto/secrets_manager.py Outdated Show resolved Hide resolved
hushline/crypto/secrets_manager.py Outdated Show resolved Hide resolved
hushline/crypto/secrets_manager.py Show resolved Hide resolved
hushline/model.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@rmlibre rmlibre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit to the password hash algorithm's associated encodings

hushline/model.py Outdated Show resolved Hide resolved
hushline/model.py Outdated Show resolved Hide resolved
rmlibre added a commit to rmlibre/hushline that referenced this pull request Jul 22, 2024
rmlibre added a commit to rmlibre/hushline that referenced this pull request Jul 23, 2024
Add a hushline specific domain label, & add name of destination hash
to pre-hash domain label. These are additional good practice measures
to mitigate hash shucking by eliminating external datasets from being
possible sources of correlation.

References:
a. https://www.youtube.com/watch?v=OQD3qDYMyYQ
b. https://security.stackexchange.com/a/234795

Reference (b) excerpt:
"But the crucial point is that [a pre-hash] can be an otherwise
uncracked [pre-hash] that happens to be present in some other leak,
which you can then attack at massively increased speeds."
…idsg#411]

The embedded timestamps in Fernet ciphertexts can be utilized in
the case of server compromise to correlate analyzed web traffic
originating from a targeted user, even when using Tor, with the
encrypted database values hosted by servers running hushline.

This commit switches from the Fernet cipher to the AEAD cipher
ChaCha20Poly1305 which doesn't automatically generate timestamps,
removing this source of unintended activity tracking correlation.
Using the DKNA method (derived inputs: key, nonce, associated
data), the 12 byte nonce is easily upgraded to a 32 byte salt,
making repeats infeasible: >=2**-85.33 chance for each additional
message after ~2**85.33 encryptions in the same context. This is
referred to as the optimal bound for birthday bound security.
The DKNA method also provides resistance to commitment attacks
by binding the non-message inputs together into pseudo-random
internal cipher states.

This switch also comes with space efficiency improvements for
the database. 32.65% more plaintext can fit into the same
fields, due to the new raw bytes ciphertext being on average
~31.12% more efficient than Fernet's base64 encoded ciphertext.

Fernet:
-------        (Base64 encoded)
  1 B   |    8 B    | 16 B |     X B    | 32 B
Version ‖ Timestamp ‖  IV  ‖ Ciphertext ‖ HMAC

ChaCha20Poly1305 with DKNA:
------- (Raw bytes) -------
32 B  |     X B    |   16 B
Salt  ‖ Ciphertext ‖ Poly-Tag

References:
https://dataprot.net/articles/no-log-vpn/
https://martinolivier.com/open/timestamps.pdf
https://github.com/Attacks-on-Tor/Attacks-on-Tor#correlation-attacks
https://github.com/rmlibre/hushline/blob/7799379088e833282a0a8b9e8f9fd21756321522/docs/2-threat-model.md?plain=1#L75-L79
https://soatok.blog/2024/07/01/blowing-out-the-candles-on-the-birthday-bound/
https://crypto.stackexchange.com/q/112497
rmlibre added a commit to rmlibre/hushline that referenced this pull request Jul 30, 2024
Quick-fix addressing privacy concerns discussed in PR branch [scidsg#411]
commit (c772523).

References:
scidsg@c772523

Co-authored-by: Jeremy Moore <[email protected]>
jeremywmoore added a commit that referenced this pull request Jul 30, 2024
Quick-fix addressing privacy concerns discussed in PR branch [#411]
commit (c772523).

References:
c772523

Co-authored-by: Jeremy Moore <[email protected]>
jeremywmoore pushed a commit that referenced this pull request Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants