Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New osrandom_engine in C #3229

Merged
merged 12 commits into from
Dec 9, 2016
Merged

New osrandom_engine in C #3229

merged 12 commits into from
Dec 9, 2016

Conversation

tiran
Copy link
Contributor

@tiran tiran commented Nov 10, 2016

Inspired by Python/random.c and the old implementation. Let's ask travis if it works on all platforms.

Signed-off-by: Christian Heimes [email protected]

  • Test /dev/urandom path
  • Test getrandom() path
  • Test getentropy() path
  • Test CryptGenRandom path

@mention-bot
Copy link

@tiran, thanks for your PR! By analyzing the history of the files in this pull request, we identified @reaperhulk, @glyph and @alex to be potential reviewers.

@glyph
Copy link
Contributor

glyph commented Nov 10, 2016

On the face of it, this is a terrible idea; ~400 lines of C to replace ~40 lines of Python; 2 open file descriptors for your whole process lifetime instead of 1.

However, knowing the history of this I am sure there's a very good reason for this change. Can you link it to an issue describing said reasons, please? I think it's super important to thoroughly document what is going on here, since it is incredibly confusing; without very clear explication I am concerned that a future maintainer will haplessly reintroduce the bug that this is fixing.

(Secretly, of course, I want to read that description and have a brilliant idea that will salvage the Python implementation and not have to drag around half a thousand lines of duplicative tricky wrapper C code; alas, if I haven't by now, I very much doubt I'll be able to.)

@tiran
Copy link
Contributor Author

tiran commented Nov 10, 2016

Ha, mentionbot betrayed me to @glyph :)

@reaperhulk ask me to look into the matter. My patch is still in a very early phase. Basically I'm just abusing the build and CI testing to test my quick hack on other platforms. I'll take care of documentation later.

@reaperhulk
Copy link
Member

@glyph: The answer is that we run into endless problems with calling into Python from C on the margins. The most recent of which were apps like Kodi, which embed Python as a scripting option. A Kodi plugin adds cryptography, which registers a callback into Python...but there's only one instance of OpenSSL in the process so now things that aren't even aware of Python are attempting to call back into Python. You could argue that disabling the osrandom engine is the right solution in such a scenario but "right" and "things people will do" are not always overlapping sets.

We also continue to have subinterpreter problems which are probably soluble but I am so very tired of trying to fix it when moving it entirely to C will instantly resolve this entire class of issue.

Combine that with our interpreter shutdown issues in newer Pythons and the fact that we need to entirely abandon our Python-based locking callbacks (see: #3226 and linked issues) and there's a strong incentive to move this entirely to C because frankly users will have less problems.

The cost is that we are once again hoisting the complexity CPython formerly provided for us into our own domain and will be responsible for managing that. I don't believe we can continue to say "this isn't a big deal" though. Every time cryptography fails in a weird, undebuggable way when dangerous, unsupported libraries like PyCrypto do not we lose.

Sorry for the semi-rambling response, but I thought you deserved a reasonable dump of my thoughts on why this has become a desirable scenario. The tl;dr is "being reliably functional in all the environments Python is used in trumps the ideological purity of using os.urandom"

@reaperhulk
Copy link
Member

@tiran Thanks for the awesome work here so far. I look forward to reviewing it.

As an aside: CCRandomGenerateBytes might be an avenue we could pursue in a future PR on the mac side (available in 10.10+). SecRandomCopyBytes is also available since 10.7 but we'd have to investigate the implementation of each of those to see if they're truly suitable.

@tiran tiran force-pushed the new_osrandom_engine branch 4 times, most recently from 77e9f2b to 1d38d6f Compare November 12, 2016 16:37
@tiran tiran changed the title New osrandom_engine in C (WIP) New osrandom_engine in C Nov 12, 2016
@tiran
Copy link
Contributor Author

tiran commented Nov 12, 2016

@reaperhulk The PR is ready for review. I added a CCRandomGenerateBytes implementation.

@tiran
Copy link
Contributor Author

tiran commented Nov 12, 2016

All Net/Free/Open BSD seem to use the same syscall name SYS_getentropy. I'm using #ifdef SYS_getentropy to select getentropy() instead of /dev/urandom.
https://bz-attachments.freebsd.org/attachment.cgi?id=149067

Copy link
Member

@reaperhulk reaperhulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First round of review, haven't looked closely at the actual functionality yet

/****************************************************************************
* BSD getentropy
*/
#if CRYPTOGRAPHY_OSRANDOM_ENGINE == CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY

static int osrandom_rand_status(void) {
return 1;
}
#endif /* CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be CRYPTOGRAPHY_OSRANDOM_ENGINE_CC_RANDOM

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, I messed up some find and replace operations. The error was consistent over all files so it just worked.

static int osrandom_rand_status(void) {
return 1;
}
#endif /* CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY

}
}
}
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a CRYPTOGRAPHY_OSRANDOM_NEEDS_DEV_URANDOM comment?

* Linux getrandom engine with fallback to dev_urandom
*/

#if CRYPTOGRAPHY_OSRANDOM_ENGINE == CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

}
return 1;
}
#endif /* CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

/* for defined(BSD) */
#include <sys/param.h>

#ifdef BSD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does FreeBSD define this? I think they typically use __FreeBSD__. What about macOS (which apparently has SYS_getentropy as of 10.12)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_CC_RANDOM
#elif defined(BSD) && defined(SYS_getentropy)
/* OpenBSD 5.6+ */
#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY with the naming fixes requested above

#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM
#elif defined(__linux__) && defined(SYS_getrandom)
/* Linux 3.4.17+ */
#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this should be CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

#define CRYPTOGRAPHY_OSRANDOM_NEEDS_DEV_URANDOM 1
#endif

/* Open SSL 1.1.0+ has no ERR_R_RAND_LIB */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about always passing 0 and not worrying about a macro here?

@reaperhulk
Copy link
Member

Coverage is failing because _osrandom_rand_bytes and _osrandom_rand_status need to be removed from src/cryptography/hazmat/bindings/openssl/binding.py

@tiran tiran force-pushed the new_osrandom_engine branch 2 times, most recently from da3f779 to 807f437 Compare November 13, 2016 12:50
assert name
if sys.platform.startswith('linux'):
assert name in ['getrandom', '/dev/urandom']
elif sys.platform == 'win32':
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our CI can't cover this branch (we don't get coverage from our windows builders, although this is being executed) so we'll need to figure out a way to rewrite this. Maybe make the asserts a separate function that takes a string _engine_name_assert(sys.platform) and then

def _engine_name_assert(platform):
    if platform.startswith('linux'):
        assert name in ['getrandom', '/dev/urandom']
   elif platform == 'win32':
        assert name == 'CryptGenRandom'

and then a separate test for _engine_name_assert that explicitly passes win32?

@alex
Copy link
Member

alex commented Nov 13, 2016

Failing docs tests are relevant; looks like fallback needs to be added to the wordlist :-)

@tiran tiran force-pushed the new_osrandom_engine branch 2 times, most recently from 952ebe9 to 2f00fae Compare November 14, 2016 10:43
@tiran
Copy link
Contributor Author

tiran commented Nov 14, 2016

I added fallback to spelling wordlist. On my machine the doc tests are failing with a different error. My spell checker knows about fallback but does not recognize personalization:

hazmat/primitives/cryptographic-hashes.rst:129: (personalization)

@reaperhulk
Copy link
Member

https://codecov.io/gh/pyca/cryptography/compare/3a15b03e92c9fdeadff04ddd2ce505028b279b86...075de30232f1fd6f1e0d541844e01eab93a006b3#74657374732F68617A6D61742F6261636B656E64732F746573745F6F70656E73736C2E7079-280...281 is our coverage problem that's causing the status check failures.

We do have FreeBSD 10 (but not 11) in our jenkins cluster, but no OpenBSD and we don't get coverage from those builders right now. 😢

In an ideal world what would CI look like here?

@tiran tiran force-pushed the new_osrandom_engine branch 3 times, most recently from 669295e to d4ae881 Compare November 15, 2016 07:48
if (urandom_cache.fd >= 0) {
do {
n = close(fd);
} while (n < 0 && errno == EINTR);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be kind to call PyErr_CheckSignals() to fail fast on CTRL+C: PEP 475.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The engine does not not use any Python API at all. I don't consider SIGINT a problem here. The engine is a low level part of OpenSSL's CPRNG. The CPRNG API should take care of of interrupts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is closign an fd in a loop safe? I seem to recall it being dangerous :-/

Copy link
Member

@reaperhulk reaperhulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this needs a rebase as well now

@@ -40,6 +40,10 @@ greater.
Activates the OS random engine. This will effectively disable OpenSSL's
default CSPRNG.

.. method:: osrandom_engine_implementation()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. versionadded:: 1.7 please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

The code used in the OpenSSL locking callback is derived from the same in
Python itself, and is licensed under the terms of the PSF License Agreement.
The code used in the OpenSSL locking callback and OS random engine is derived
from the same in Python itself, and is licensed under the terms of the PSF
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, nit, not technically a bug in your PR (but if I go fix it separately it'll just cause a conflict). Can you s/Python/CPython/

| Windows | ``CryptGenRandom()`` |
+------------------------------------------+------------------------------+
| Linux >= 3.4.17 with working | ``getrandom(GRND_NONBLOCK)`` |
| ``SYS_getrandom`` syscall | |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the leading SYS here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you really insist, I'd rather keep SYS_getrandom because that is what the random engine actually uses. It's going to make it easier to distinguish between platforms with the syscall and platforms with a glibc wrapper later.

#ifndef GRND_NONBLOCK
#define GRND_NONBLOCK 0x0001
#endif /* GRND_NONBLOCK */
#endif /* _linux__ */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing leading _ here

#ifdef __linux__
/* for SYS_getrandom */
#include <sys/syscall.h>
#ifndef GRND_NONBLOCK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what circumstances it this not defined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constant is defined in user land headers. Sometimes the running Kernel has the syscall but the userland hasn't caught up yet.

@@ -282,6 +271,23 @@ def test_activate_builtin_random_already_active(self):
e = backend._lib.ENGINE_get_default_RAND()
assert e == backend._ffi.NULL

def test_osrandom_engine_implementation(self):
name = backend.osrandom_engine_implementation()
assert name in ['/dev/urandom', 'CCRandomGenerateBytes',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CCRandomGenerateBytes is not used anywhere.

}
}

static const char* osurandom_get_implementation(void) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* next to the function name, rather than the type please.

if (urandom_cache.fd >= 0) {
do {
n = close(fd);
} while (n < 0 && errno == EINTR);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is closign an fd in a loop safe? I seem to recall it being dangerous :-/

/* Can fail with:
* - ENOSYS: Kernel does not support the syscall.
* - ENOPERM: seccomp prevents syscall.
* - EAGAIN: Kernel CRPNG has not been seeded yet.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the even of EAGAIN I'm pretty sure we want to still use getrandom, not fall back.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's discuss this matter on IRC later. I don't want to repeat the /dev/urandom discussion from python-dev on github. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you know how to ping me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Please give link to that conversation.
  2. It would be nice to place explanation in documentation.

long n;
while (size > 0) {
do {
n = syscall(SYS_getrandom, buffer, size, GRND_NONBLOCK);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do on large (>256) size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

http://man7.org/linux/man-pages/man2/getrandom.2.html

If the
/dev/urandom pool has been initialized, reads of up to 256 bytes will
always return as many bytes as requested and will not be interrupted
by signals. No such guarantees apply for larger buffer sizes. For
example, if the call is interrupted by a signal handler, it may
return a partially filled buffer, or fail with the error EINTR.

}

static int osrandom_rand_status(void) {
if (urandom_cache.fd < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplifies to return urandom_cache.fd > 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>= 0 or != -1. Theoretically the fd can by 0,

tiran and others added 11 commits December 6, 2016 10:30
Inspired by Python/random.c and the old implementation.

Signed-off-by: Christian Heimes <[email protected]>
* Fix naming bug caused by search 'n replace mistake
* Make it easier to override osrandom auto-detection
* Add engine ctrl and backend API to get implementation from ENGINE

Signed-off-by: Christian Heimes <[email protected]>
Signed-off-by: Christian Heimes <[email protected]>
* read() returns size_t

Signed-off-by: Christian Heimes <[email protected]>
This change allows us to test all the engines in our CI:
* getentropy (tested by macOS sierra)
* getrandom (tested on several linux builders)
* /dev/urandom (tested on FreeBSD, OS X 10.11 and below, & older linux)
* CryptGenRandom (tested on windows builders)

I also fixed bugs preventing compilation in the getentropy code
@alex
Copy link
Member

alex commented Dec 7, 2016

The EAGAIN issue is the last remaining issue for me.

Add error reporting strings for various error cases. This gives us much
nicer and understandable error messages.

SYS_getrandom() EAGAIN is now an error. Cryptography refuses to
initialize its osrandom engine when the Kernel's CPRNG hasn't been
seeded yet.

Signed-off-by: Christian Heimes <[email protected]>
@reaperhulk reaperhulk merged commit 2e71776 into pyca:master Dec 9, 2016
@tiran tiran deleted the new_osrandom_engine branch December 9, 2016 16:51
jsonn referenced this pull request in jsonn/pkgsrc Dec 19, 2016
1.7.1 - 2016-12-13
~~~~~~~~~~~~~~~~~~

* Fixed a regression in ``int_from_bytes`` where it failed to accept
  ``bytearray``.

1.7 - 2016-12-12
~~~~~~~~~~~~~~~~

* Support for OpenSSL 1.0.0 has been removed. Users on older version of OpenSSL
  will need to upgrade.
* Added support for Diffie-Hellman key exchange using
  :meth:`~cryptography.hazmat.primitives.asymmetric.dh.DHPrivateKeyWithSerialization.exchange`
* The OS random engine for OpenSSL has been rewritten to improve compatibility
  with embedded Python and other edge cases. More information about this change
  can be found in the
  `pull request <https://github.com/pyca/cryptography/pull/3229>`_.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 4, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants