Cannot bootstrap with SSL using 1.4.5 on python 3.7 #1741
I suspect this is a small bug that crept in somewhere through this change: #1736 Can you post DEBUG-level logs? Does this happen immediately upon start, or does it work fine for a while and then you hit this? (Since the purpose of the change was to infinitely retry) Do you also hit this with If there's a bug, it will be a |
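For anyone following along, DEBUG-level logs for the client can be turned on with the standard logging module. This is a minimal sketch, not something taken from the project docs: it assumes the library's loggers live under the "kafka" name (the [kafka.conn:389] markers in the logs posted in this thread suggest so), and the helper name enable_kafka_debug_logging is made up for illustration.

```python
import logging


def enable_kafka_debug_logging():
    """Send DEBUG records from the "kafka" logger hierarchy to stderr.

    Hypothetical helper for illustration; only stdlib logging calls are used.
    """
    # Root logger stays at INFO so other libraries are not made noisier.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s - %(message)s [%(name)s:%(lineno)d]",
    )
    # Child loggers such as "kafka.conn" inherit this level.
    kafka_logger = logging.getLogger("kafka")
    kafka_logger.setLevel(logging.DEBUG)
    return kafka_logger


logger = enable_kafka_debug_logging()
```

Call this before constructing the producer so that the connection attempts during bootstrap are captured.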
Would also love to get integration tests for ssl brokers... If you have
time / motivation, maybe give a shot at configuring a ssl broker fixture
for our test suite? As of now we do not have automated testing for SSL, so
regressions / bugs do happen from time to time. (I personally do not use
ssl brokers)
…On Fri, Mar 15, 2019, 12:30 PM Jeff Widman ***@***.***> wrote:
Can you post DEBUG-level logs?
I suspect this is a small bug that crept in somewhere through this change:
#1736 <#1736>
|
Can you try 1.4.5 w/ python 3.6 and see if you get the same results? My hunch is this is related to ssl changes in python 3.7. |
This happens immediately after KafkaProducer tries to initialize. I haven't tried this with the Consumer as I only need to post some messages.
This last log keeps going indefinitely until it crashes my app. |
With this configuration this error keeps getting logged:
For now, the only combination that works is python 3.6 / kafka-python 1.4.4 |
Thanks for the logs! It looks like the Re: the difference between python 3.6 and 3.7 in kafka-python, PR #1694 looks like it covers ssl.CertificateError in 3.7 but not 3.6, which means this error is silently ignored on 3.7 when really it should be raised. |
Also, you might try setting |
Adding this option works for 3.6 / 1.4.5 and resolves the ssl.CertificateError above. But with python 3.7 I keep getting the same log as in #1741 (comment). The certificate files are valid, as I succeeded in sending messages to a kafka topic with 3.6 / 1.4.4 and with 1.4.5 (with the ssl_check_hostname=False option). |
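For reference, the setup being discussed can be sketched as below. The keyword names are the same ones used in the original report; the helper build_producer_config, the host name, and the file names are placeholders made up for this example. Turning off ssl_check_hostname skips hostname verification, so it is a diagnostic step rather than a fix.

```python
def build_producer_config(host, port, ca_file, cert_file, key_file,
                          check_hostname=True):
    """Collect KafkaProducer keyword arguments for an SSL broker.

    Hypothetical helper; the keys mirror the options in the original report.
    """
    return {
        "bootstrap_servers": "{}:{}".format(host, port),
        "api_version": (0, 10, 1),
        "security_protocol": "SSL",
        # False disables hostname verification (diagnostic only).
        "ssl_check_hostname": check_hostname,
        "ssl_cafile": ca_file,
        "ssl_certfile": cert_file,
        "ssl_keyfile": key_file,
    }


# Placeholder values, not taken from the report.
config = build_producer_config(
    "broker.example.com", 9093, "ca.pem", "cert.pem", "key.pem",
    check_hostname=False,
)

# With kafka-python installed and a reachable broker:
# from kafka import KafkaProducer
# producer = KafkaProducer(**config)
```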
Interesting... Are you able to hotpatch kafka/conn.py ? If so, try removing these lines from 1.4.5 and run w/ py 3.7 (i expect that it will still fail, but we should get more insight into the underlying exception): kafka/conn.py
We probably should not have accepted this PR. I'm sorry about that. But again I want to emphasize that SSL connections are currently untested and will remain so until some kind soul such as yourself helps us out by submitting a PR to setup SSL broker testing via our test/fixtures.py ... |
Looking deeper into cpython ssl implementation, it looks like a change in 3.7 may have simply broken non-blocking ssl connections where the first connect_ex attempt does not succeed: python/cpython#5252 sigh. If I'm reading ssl.py correctly, I don't think there will be an easy workaround for this. The best I can think of is to somehow switch to blocking connects when using ssl. |
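To make the non-blocking case concrete, here is a rough sketch of how a handshake on a non-blocking SSL socket is usually driven with the standard library. It is not the code in kafka/conn.py and it does not reproduce the cpython change mentioned above; the function names handshake_step and complete_handshake are made up for illustration.

```python
import select
import ssl


def handshake_step(ssl_sock):
    """Try to advance a non-blocking TLS handshake by one step.

    Returns "done", "want_read" or "want_write".
    """
    try:
        ssl_sock.do_handshake()
    except ssl.SSLWantReadError:
        return "want_read"
    except ssl.SSLWantWriteError:
        return "want_write"
    return "done"


def complete_handshake(ssl_sock, timeout=10.0):
    """Repeat handshake_step until the handshake finishes.

    The timeout applies to each individual wait, not to the whole handshake.
    """
    while True:
        state = handshake_step(ssl_sock)
        if state == "done":
            return
        readers = [ssl_sock] if state == "want_read" else []
        writers = [ssl_sock] if state == "want_write" else []
        ready = select.select(readers, writers, [], timeout)
        if not any(ready):
            raise TimeoutError("TLS handshake timed out")
```

Any other ssl.SSLError (for example a certificate verification failure) is deliberately left to propagate, so that the underlying cause stays visible instead of being swallowed.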
The error caught at this point was:
|
I managed to patch v1.4.4 to work with py3.7 by adding an except ValueError around connect_ex in kafka/conn.py:
and an except OSError around do_handshake:
This fixed all problems for me. The new release 1.4.5 already has these excepts, so I suppose some new bugs were introduced through a PR. I will try to localize them. |
Thanks for investigating. Unfortunately I don't think catching ValueError
or OSError is a permanent solution. And it will be difficult for us to
address this correctly without ssl integration tests. So if you are
inclined to invest more time here, taking a stab at adding ssl support to
test/fixtures would go a long long way!
|
Hey, I resolved my problem with #1745 . The problem was in the _generate_bootstrap_brokers function. |
Right on. Unfortunately your changes to _generate_bootstrap_brokers would break in other contexts, so it is not a general solution. |
After some debugging I think I understand the underlying problem here. Starting with version 3.7, python uses openssl to match hostnames during SSLSocket.do_handshake() (https://docs.python.org/3/library/ssl.html). You cannot provide an IP address as server_hostname to wrap_socket(); it must be a domain name, though this worked up until python 3.6 (https://bugs.python.org/issue34440).
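One way to spot this situation before connecting is to check whether the configured host is an IP literal while hostname checking is on. The sketch below uses only the standard ipaddress module; the function names is_ip_literal and hostname_check_warning are hypothetical and are not part of kafka-python.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def hostname_check_warning(host, ssl_check_hostname=True):
    """Return a warning string if host is an IP and hostname checking is on.

    Returns None when there is nothing to warn about.
    """
    if ssl_check_hostname and is_ip_literal(host):
        return (
            "host {!r} is an IP address; hostname verification may fail "
            "unless the certificate is valid for that address".format(host)
        )
    return None
```

For example, hostname_check_warning("10.0.0.1") returns a warning string, while hostname_check_warning("broker.example.com") returns None.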
|
Correct!
…On Wed, Mar 20, 2019, 9:50 AM Marija Ševković ***@***.***> wrote:
After some debugging I think I understand the underlying problem here.
From version 3.7 python uses openssl to match hostnames during
SSLSocket.do_handshake() (https://docs.python.org/3/library/ssl.html).
You can not provide ip_address as server_hostname to wrap_socket() it
must be domain name, though this worked up until python3.6
(https://bugs.python.org/issue34440).
When I logged all the errors around do_handshake() in kafka.conn.py I got:
DEBUG - <BrokerConnection node_id=bootstrap host=xx.xx.xxx.xxx:9093 <connecting> [IPv4 ('xx.xx.xxx.xxx', 9093)]>: initiating SSL handshake [kafka.conn:389]
ERROR - The operation did not complete (read) (_ssl.c:1056) [kafka.conn:495]
ERROR - [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: IP address mismatch, certificate is not valid for 'xx.xx.xxx.xxx'. (_ssl.c:1056) [kafka.conn:499]
ERROR - [SSL: UNKNOWN_STATE] unknown state (_ssl.c:1056) [kafka.conn:499]
By calling dns_lookup in _generate_bootstrap_brokers and yielding
BrokerMetadata(...host=ip...) you implicitly set up BrokerConnection with
host=ip (kafka.client_async._maybe_connect), and this creates problems later
on during the ssl handshake if the ssl_check_hostname option is set to True,
which it is by default.
P.S. Setting the ssl_check_hostname=False option makes things work
with py3.7, but this bypasses the problem rather than fixing it. (I made a
mistake in an earlier post by saying this works only on versions less than 3.7.)
|
Should be fixed on master. You may need to configure |
Hey, thanks a lot man. When are you planning to release this? |
After switching to the new kafka-python 1.4.5 and upgrading to python 3.7, I keep getting these logs in an infinite loop:
I have noticed that broker list is always empty and node-id=bootstrap . Running the same code with 1.4.4 and python 3.6 worked perfectly.
KafkaProducer(
    bootstrap_servers='{}:{}'.format(
        config.get('tracing_host'),
        config.get('tracing_port')
    ),
    api_version=(0, 10, 1),
    security_protocol='SSL',
    ssl_check_hostname=True,
    ssl_cafile=config.get('ssl_ca_file'),
    ssl_certfile=config.get('ssl_cert_file'),
    ssl_keyfile=config.get('ssl_key_file')
)
Any ideas what to try next?