Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mod_ssl: ClientHello variable collection #483

Closed
wants to merge 22 commits into from

Conversation

csmutz
Copy link

@csmutz csmutz commented Sep 25, 2024

This patch implements collection of variables from the ClientHello which are later made available in the same manner as the other environment variables of mod_ssl. A new directive, SSLClientHelloVars, enables collection of the raw variables during the clienthello callback which are then formated and provided as environment variables for all the requests in that connection, subject to the standard StdEnvVars functionality. If SSLClientHelloVars is not enabled or if openssl prior to 1.1.1 is used, this option should have little impact--no clienthello information is collected. The environment variables are populated with null if openssl < 1.1.1 is used or SSLClientHelloVars is not set to on.

The ClientHello variables are provided as hex encoded data the same as the raw network protocol/what is returned from openssl.

This patch has been tested on ubuntu 24.10 apache 2.4.62/openssl 3.3.1 verifying correct operation of the config directive when not set (default to off), set to off, and set to on. The variables have been tested in the context of both environment variables for cgi scripts and CustomLog entries. The variables have been used to succesfully generate the correct ja3 and ja4 fingerprints for a small number of common clients (as validated by comparison to zeek implementations of ja3/ja4).

This patch complements my prior pull request, #477.

@notroj
Copy link
Collaborator

notroj commented Sep 25, 2024

Very interesting indeed - thanks for another PR.

Higher-level comment: does this need to be a new directive? I think it would be better as another SSLOptions option, it's quite similar to SSLOptions +ExportCertData . Secondly, if the only point of exporting these is to create the JA3/JA4 fingerprints, would it be better to do that within mod_ssl? If the algorithm is simple enough and stable, it would not be inappropriate feature I guess.

Copy link
Collaborator

@notroj notroj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also would be easier to review if you squashed it :)


if (strEQ(var, "VERSION")) {
return apr_psprintf(p, "%04x", (uint16_t) clienthello_vars->version);
} else if (strEQ(var, "CIPHERS") && (clienthello_vars->ciphers_len > 0)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code style nit: newline before else if

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

value = apr_palloc(p, clienthello_vars->extids_len * 4 + 1);
for (i = 0; i < clienthello_vars->extids_len; i++)
{
snprintf(value + i * 4, 5, "%04x", (uint16_t) clienthello_vars->extids_data[i]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use apr_snprintf here

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, using apr_snprintf

} else if (strEQ(var, "EXTENSIONS") && (clienthello_vars->extids_len > 0)) {
value = apr_palloc(p, clienthello_vars->extids_len * 4 + 1);
for (i = 0; i < clienthello_vars->extids_len; i++)
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style nit: no newline before {

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@notroj
Copy link
Collaborator

notroj commented Sep 25, 2024

I pushed your fork to get a CI run and there are a couple of compiler warnings too https://github.com/notroj/httpd/actions/runs/11028425131

sslcon->clienthello_vars = apr_palloc(c->pool, sizeof(*clienthello_vars));
clienthello_vars = sslcon->clienthello_vars;

if (clienthello_vars) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apr_palloc() cannot return NULL in httpd, no need for this conditional.

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, removed unnecessary conditional

SSL_client_hello_get0_ext(ssl, 0x10, &data, &clienthello_vars->alpn_len);
clienthello_vars->alpn_data = apr_pmemdup(c->pool, data, clienthello_vars->alpn_len);
SSL_client_hello_get0_ext(ssl, 0x2b, &data, &clienthello_vars->versions_len);
clienthello_vars->versions_data = apr_pmemdup(c->pool, data, clienthello_vars->versions_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't these magic 0x.. extensions have a #define somewhere?

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, using extension definitions from openssl (consistent with definition used in clienthello callback to get SNI).

@@ -2520,6 +2566,12 @@ int ssl_callback_ClientHello(SSL *ssl, int *al, void *arg)

give_up:
init_vhost(c, ssl, servername);

s = mySrvFromConn(c);
sc = mySrvConfig(s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using mySrvConfigFromConn(c)->clienthello_vars directly could do probably.

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, using much more concise code as recommended

apr_size_t ciphers_len;
const unsigned char *ciphers_data;
apr_size_t extids_len;
int *extids_data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const int *?

Copy link
Author

@csmutz csmutz Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, changed to const int *

@csmutz
Copy link
Author

csmutz commented Sep 25, 2024

These are the critical questions in implementation approach here--I'm willing to change.

Higher-level comment: does this need to be a new directive? I think it would be better as another SSLOptions option, it's quite similar to SSLOptions +ExportCertData .

I used a directive to help clarify that the SSLClientHelloVars directive only applies at the server/virtual host level, not at directory level, and because this directive has preformance impacts for every connection where it is enabled, regardless if StdEnvVars is on or not. This is because the clienthello variables have to be collected at the start of the connection (you can't retrieve them later if they aren't stored) and the default needs to be to not collect these (most users don't want the additional overhead, even if small). I viewed this as different from the other SSLOptions which are focused on ENV variable creation/clutter only (the certs are available, ExportCertData just switches additional parsing, env vars).

If you want me to convert this to an SSLOptions, I will do so. Then we would just document that this specific option only applies at the server, virtualhost level (is ignored elsewhere) and this option impacts more than env var clutter, it has a performance and memory impact on every connection for the life of the connection?

Secondly, if the only point of exporting these is to create the JA3/JA4 fingerprints, would it be better to do that within mod_ssl? If the algorithm is simple enough and stable, it would not be inappropriate feature I guess.

The fact that we have both ja3 and ja4 speak to the longevity of these. I don't know how long it will be until there is yet another TLS fingerprint method. For my purposes, both ja3 and ja4 don't meet all my needs and I would like to be able to use the raw values in both logs and cgi scripts. I think there are others who will want to do so. Ex. to answer questions like what percentage of clients still advertise TLS 1.0 or TLS 1.1 in supported versions? I think exposing the raw values is best to cover all possible use cases. At the very least, this patch makes implementing ja3 and ja4 easier as the source values are readily available. Regarding implementation difficulty, ja3 is pretty simple. ja4 has a lot of moving parts and edge cases (I've seen not quite right ja4 implementations). If you want to make acceptance of PR contingent upon implementation of ja3 and ja4 in addition to exposing the raw data I would probably do that, but removing access to the raw variables would be a mistake IMO.

I'll work on the fixes requested. Can you give me a pointer to methodology to compile/test against trunk so I can replicate your test pipeline? I've been testing on 2.4 release and simply manually splicing diffs in trunk. This worked in #477 but this is much bigger patch.

@notroj
Copy link
Collaborator

notroj commented Sep 25, 2024

Thanks a lot for the detailed response @csmutz. Makes sense to me - keep it as a new directive, and keep it exporting everything.

Regarding dev methodology, I recommend starting with git or SVN trunk. If you develop against trunk (whether from git or SVN), you can run the test suite directly. Something like this is a good starting point:

$ svn co https://svn.apache.org/repos/asf/httpd/test/framework/trunk test/framework
$ ./configure --with-test-suite=./test/framework --enable-maintainer-mode=c11 --enable-mods-shared=reallyall ...
$ make && make check

... should be documented somewhere but maybe isn't. Unfortunately because of quirks of our hybrid SVN/git config the CI doesn't run for non-committer PRs currently. I think it should run when you push to your personal fork but maybe you need GitHub Actions enabled or something.

@csmutz
Copy link
Author

csmutz commented Sep 26, 2024

@notroj

In addition to all the above fixes, I have two things I'd like you to double check for me. I really don't want to introduce any subtle errors.

  1. Mismatch in define statements for openssl version 1.1.1.

In the clienthello callback I piggyback on the existing define statement that includes a check for LIBRESSL_VERSION_NUMBER:

#if OPENSSL_VERSION_NUMBER >= 0x10101000L && !defined(LIBRESSL_VERSION_NUMBER)

everywhere else the define statements I use only check openssl version:

#if OPENSSL_VERSION_NUMBER >= 0x10101000L

Is there a potential to cause a problem (failure to compile) with this mismatch? I don't understand why both openssl and libressl versions would be defined at the same time but if that never occurs, why would this condition exist? I could easily duplicate the libressl check but I don't want to duplicate it if I don't understand it.

  1. Reliance on calloc in initialization of sslconn to test for allocation and population of clienthello_vars struct.

At variable formatting time (per request) I check to see if the clienthello variables were populated previously by testing the pointer to the clienthello_vars ssl_engine_vars.c line 991 struct which should fail if clienthello_vars weren't populated because of the calloc used to create sslconn mod_ssl.c line 482. I've convinced myself the logic here is sound (and testing doesn't result in crashes) and I think this follows conventions I see in other places, but could you double check that you agree there's no way an uninitialized pointer could be used here?

Thanks very much for the help!

@csmutz
Copy link
Author

csmutz commented Sep 26, 2024

I've fixed all the requested changes and have fixed the compiler warnings.

clienthello_vars->ciphers_len = SSL_client_hello_get0_ciphers(ssl, &data);
clienthello_vars->ciphers_data = apr_pmemdup(c->pool, data, clienthello_vars->ciphers_len);
if (SSL_client_hello_get1_extensions_present(ssl, &ids, &clienthello_vars->extids_len) == 1) {
clienthello_vars->extids_data = apr_pmemdup(c->pool, ids, clienthello_vars->extids_len * sizeof(int));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work if ->extids_len == 0 can be returned succesfully?

Also, to avoid a possible dangling ->extids_data I would either:

else {
    clienthello_vars->extids_data = NULL;
    clienthello_vars->extids_len = 0;
}

or use apr_pcalloc() above, the additional/implicit memset() would be in the noise in this path anyway (I think), possibly more future extensions proof too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. We have a big issue if extids_len is never set.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were a couple serious bugs and other issues here. Much of this code assumed that the openssl functions would set the associated lengths correctly even if the functions returned failure which is not right.

To fix these issues, I've rewritten this whole code block doing the following:

  • using calloc for struct (strictly shouldn't be necessary but agreed this is not expensive)
  • correctly check return codes from various SSL_client_hello_* functions, explicitly set len vars to 0 on failure
  • conditions to check len > 0 before memdup, explicitly set pointers to null if len == 0

I also have tested with diversity of clients including some lacking some collected extensions that prior to fixes would cause crash and other undefined behavior. These cases were all handled correctly after fixes.

#if OPENSSL_VERSION_NUMBER >= 0x10101000L
char *value;
modssl_clienthello_vars *clienthello_vars;
int i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apr_size_t i to match the type of ->extids_len

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@ylavic
Copy link
Member

ylavic commented Sep 26, 2024

1. Mismatch in define statements for openssl version 1.1.1.

The problem here is that libressl always defines OPENSSL_VERSION_NUMBER to 0x20000000L (i.e. 2.0 has been appropriated while openssl was still in 1.x), that's why openssl went to 3.x directly after 1.x too.
Anyway, we have the situation where OPENSSL_VERSION_NUMBER >= 0x10101000L is always true for libressl, although all the new openssl-1.x APIs have not made it to libressl..
So using OPENSSL_VERSION_NUMBER >= 0x10101000L with or without checking LIBRESSL_VERSION_NUMBER depends on whether libressl actually has the enclosed feature.
If you don't (want to) know, require !defined(LIBRESSL_VERSION_NUMBER) and someone who care about libressl will either remove it or change to the exact LIBRESSL_VERSION_NUMBER where implemented.

2. Reliance on calloc in initialization of sslconn to test for allocation and population of clienthello_vars struct.

Looks fine here for the "sslconn->clienthello_vars" pointer itself.
Maybe zero the pointed to data here too (see new comment there).

@csmutz
Copy link
Author

csmutz commented Sep 26, 2024

The problem here is that libressl always defines OPENSSL_VERSION_NUMBER to 0x20000000L (i.e. 2.0 has been appropriated while openssl was still in 1.x), that's why openssl went to 3.x directly after 1.x too.

Oh, that's ugly.

Since the clienthello callback was already wrapped in the !libressl check, I'll replicate this same check across all the other places. As it is, this won't compile for libressl.

@csmutz
Copy link
Author

csmutz commented Sep 27, 2024

Thanks for your patience and help.

I've made the preprocessor checks for openssl version consistent (libressl won't have this functionality) and addressed the requested items above.

if (clienthello_vars->ciphers_len > 0) {
clienthello_vars->ciphers_data = apr_pmemdup(c->pool, data, clienthello_vars->ciphers_len);
} else {
clienthello_vars->ciphers_data = NULL;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're now using apr_pcalloc the explicit setting of NULL / 0 through here is not necessary.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I've removed the explicit/redundant assignments of _len to 0 and _data to NULL which should be fine as long as failed calls to the openssl client hello functions never modify _len. I've left the checks for _len >0 before memdup.

I've tested on conditions that triggered bugs due to uninitialized or 0 _len before by testing connections where some of the collected client hello extensions is not present and these cases functioned correctly.

Copy link
Member

@ylavic ylavic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, thanks for this PR!

@notroj
Copy link
Collaborator

notroj commented Oct 1, 2024

Seconded, thanks again.

@asfgit asfgit closed this in e9915b2 Oct 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants