Version 2.1.0 segfaults when subscribing to non-existent topic #1547

Open
4 of 7 tasks
ffissore opened this issue Apr 6, 2023 · 14 comments

Comments

@ffissore
Contributor

ffissore commented Apr 6, 2023

Description

Since confluent-kafka 2.1.0, subscribing to a non-existent topic causes Python to segfault.

How to reproduce

Using the files in this gist, run docker compose, then run the script test.py:

  1. at first, with no modifications
  2. then, with the admin client part commented out: in short, don't create the topic

The first run will be successful. The second run will log Segmentation fault (core dumped)

With confluent-kafka 2.0.2, poll returns a message with value b'Subscribed topic not available: test-cf67de5e-e79e-48f7-9300-763fb5b8bc05: Broker: Unknown topic or partition'
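
For readers without access to the gist, the pattern under test can be sketched roughly as follows. This is not the gist's test.py: the broker address, the group id, and the helper name `describe_poll_result` are assumptions made for illustration. With 2.0.2 the unknown-topic case would be expected to reach the error branch; with 2.1.0 the report above is that the process segfaults instead.

```python
import uuid


def describe_poll_result(msg):
    """Summarize what Consumer.poll() returned (hypothetical helper name)."""
    if msg is None:
        return "no message within timeout"
    err = msg.error()
    if err is not None:
        # Error events such as an unknown topic arrive as messages with error() set.
        return f"error: {err}"
    return f"message: {msg.value()!r}"


def main():
    # Imported lazily so the helper above can be exercised without the client installed.
    from confluent_kafka import Consumer

    topic = f"test-{uuid.uuid4()}"  # random name, so the topic should not exist
    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",  # assumed broker address
        "group.id": "segfault-repro",           # assumed group id
        "auto.offset.reset": "earliest",
    })
    try:
        consumer.subscribe([topic])
        print(describe_poll_result(consumer.poll(10.0)))
    finally:
        consumer.close()
```

Calling `main()` requires a running broker; the helper can be checked on its own with any object that exposes `error()` and `value()`.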

Checklist

Please provide the following information:

  • confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): ('2.1.0', 33619968) ('2.1.0', 33620223)
  • Apache Kafka broker version: 2.1.1 (confluent 5.1.2)
  • Client configuration: {...}: in the gist
  • Operating system: ubuntu 20.04
  • Provide client logs (with 'debug': '..' as necessary)
  • Provide broker log excerpts
  • Critical issue
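
The integers in the first checklist item are packed hexadecimal version numbers. librdkafka documents its layout as 0xMMmmrrxx (major, minor, revision, pre-release id, with 0xff marking a final release); that the Python client's own integer follows the same layout is an assumption here, and `decode_version` is a hypothetical helper name.

```python
def decode_version(version_int):
    """Split a packed 0xMMmmrrxx version integer into its four bytes."""
    major = (version_int >> 24) & 0xFF
    minor = (version_int >> 16) & 0xFF
    revision = (version_int >> 8) & 0xFF
    prerelease = version_int & 0xFF  # librdkafka uses 0xff for a final release
    return major, minor, revision, prerelease


# Integers copied from the checklist above.
print(decode_version(33619968))  # confluent_kafka.version()[1]    -> (2, 1, 0, 0)
print(decode_version(33620223))  # confluent_kafka.libversion()[1] -> (2, 1, 0, 255)
```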
@ffissore
Contributor Author

ffissore commented Apr 6, 2023

Using gdb I got this far in debugging (not much actually)

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff62f7115 in rd_kafka_message_leader_epoch () from /home/federico/.pyenv/versions/3.11.2-debug/envs/test-env/lib/python3.11/site-packages/confluent_kafka/../confluent_kafka.libs/librdkafka-e24e6ccd.so.1
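
As a lighter-weight complement to gdb, Python's standard `faulthandler` module can dump the Python-level traceback when the interpreter receives SIGSEGV, which shows which Python call was active at the time of the crash. A minimal sketch, not something used in this thread:

```python
import faulthandler
import sys

# Dump the Python traceback of all threads to stderr on fatal signals such as SIGSEGV.
faulthandler.enable(file=sys.stderr, all_threads=True)

# ... run the consumer code that crashes after this point ...
```

The same effect is available without editing the script by running `python -X faulthandler test.py`.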

@emasab
Contributor

emasab commented Apr 6, 2023

Hello Federico, thanks. I'm aware of the issue. I've already fixed it in the .NET client, and I'm going to fix it in librdkafka too.

@Richetto

Same issue here, upgrading to 2.1.0 from 1.6.0. Any update on the fix?
Thank you

@emasab
Contributor

emasab commented Apr 23, 2023

Yes, we've merged the fix here and are planning a maintenance release soon.

@lpsinger
Contributor

lpsinger commented May 4, 2023

This is still broken in confluent-kafka-python 2.1.1.

lpsinger added a commit to lpsinger/gcn-kafka-python that referenced this issue May 4, 2023
Version 2.1.1 of confluent-kafka-python still suffers from a
segfault bug (see confluentinc/confluent-kafka-python#1547).
lpsinger added a commit to lpsinger/gcn-kafka-python that referenced this issue May 4, 2023
Version 2.1.1 of confluent-kafka-python still suffers from a
segfault bug. See confluentinc/confluent-kafka-python#1547.
lpsinger added a commit to nasa-gcn/gcn-kafka-python that referenced this issue May 4, 2023
Version 2.1.1 of confluent-kafka-python still suffers from a
segfault bug. See confluentinc/confluent-kafka-python#1547.
@emasab
Contributor

emasab commented May 4, 2023

@lpsinger are you sure? I've just tried with a non-existent topic and it gives

Traceback (most recent call last):
  File "consumer.py", line 98, in <module>
    raise KafkaException(msg.error())
cimpl.KafkaException: KafkaError{code=UNKNOWN_TOPIC_OR_PART,val=3,str="Subscribed topic not available: this_topic_doesnt_exist: Broker: Unknown topic or partition"}

@lpsinger
Contributor

lpsinger commented May 4, 2023

We have a lightweight client library that wraps confluent-kafka-python and adds the configuration presets for our Kafka cluster: https://github.com/nasa-gcn/gcn-kafka-python

It's segfaulting with unknown topics. So it might be something with confluent-kafka-python 2.1.1 plus unknown topics plus OpenID Connect.

@lpsinger
Contributor

lpsinger commented May 4, 2023

To test, go to https://gcn.nasa.gov, click "Start streaming GCN Notices", and follow the instructions.

@emasab
Contributor

emasab commented May 5, 2023

@lpsinger Thanks for that, I've installed it and reproduced the crash. The error happens in a different place from the initial one, and is due to a fix we made to batch consumption in 2.1.0. With non-existent topics, it happens with the Consumer.consume method but not with Consumer.poll.
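
The distinction matters because the two methods return differently shaped results: `Consumer.poll` returns a single message or `None`, while `Consumer.consume` returns a list of messages. A rough sketch of handling such a batch so that error events (for example an unknown topic) are separated from payloads follows; the helper names are hypothetical, and this illustrates the calling pattern rather than the code path that crashed.

```python
def split_batch(messages):
    """Separate a Consumer.consume() batch into payloads and errors."""
    payloads, errors = [], []
    for msg in messages:
        err = msg.error()
        if err is not None:
            errors.append(err)
        else:
            payloads.append(msg.value())
    return payloads, errors


def consume_once(consumer, batch_size=10, timeout=5.0):
    # consumer is expected to be an already-subscribed confluent_kafka.Consumer.
    return split_batch(consumer.consume(num_messages=batch_size, timeout=timeout))
```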

@ffissore
Contributor Author

ffissore commented May 5, 2023

I confirm that our code is working with version 2.1.1: our tests are green.
As I'm not sure if you prefer another issue to track the bug with Consumer.consume, I'll leave closing this issue to you.

@milibopp

milibopp commented Aug 2, 2023

When I run the test suite on version 2.1.1, one of the tests seems to segfault; see gist. The same happens when I upgrade both librdkafka and this package to 2.2.0.

It seems like the same issue.

@emasab
Contributor

emasab commented Aug 2, 2023

@milibopp It doesn't seem to be the same issue: that test, test_oauth_cb_principal_sasl_extensions, doesn't subscribe to any topics. I couldn't reproduce it by running that test; could you provide some hints for reproducing it?

@ksajan

ksajan commented Sep 7, 2023

I am facing a similar issue with the current package version, v2.2.0, on Python 3.10.

@Vikash08Mishra

I still see a similar issue even in versions >= 2.0.2 when using SSL; more details in ticket #1690.


7 participants