fix(ingest): loosen confluent-kafka dep requirement (#5489)
hsheth2 authored Jul 26, 2022
1 parent 639feaf commit 97d508f
Showing 1 changed file with 29 additions and 1 deletion.
30 changes: 29 additions & 1 deletion metadata-ingestion/setup.py
@@ -64,12 +64,40 @@ def get_long_description():
}

kafka_common = {
# The confluent_kafka package provides a number of pre-built wheels for
# various platforms and architectures. However, it does not provide wheels
# for arm64 (including M1 Macs) or aarch64 (Docker's linux/arm64). This has
# remained an open issue on the confluent_kafka project for a year:
# - https://github.com/confluentinc/confluent-kafka-python/issues/1182
# - https://github.com/confluentinc/confluent-kafka-python/pull/1161
#
# When a wheel is not available, we must build from source instead.
# Building from source requires librdkafka to be installed.
# Most platforms have an easy way to install librdkafka:
# - macOS: `brew install librdkafka` gives latest, which is 1.9.x or newer.
# - Debian: `apt install librdkafka-dev` gives 1.6.0 (https://packages.debian.org/bullseye/librdkafka-dev).
# - Ubuntu: `apt install librdkafka-dev` gives 1.8.0 (https://launchpad.net/ubuntu/+source/librdkafka).
#
# Moreover, confluent_kafka 1.9.0 introduced a hard compatibility break, and
# requires librdkafka >=1.9.0. As such, installing confluent_kafka 1.9.x on
# most arm64 Linux machines will fail, since it will build from source but then
# fail because librdkafka is too old. Hence, we have added an extra requirement
# that requires confluent_kafka<1.9.0 on non-MacOS arm64/aarch64 machines, which
# should ideally allow the builds to succeed in default conditions. We still
# want to allow confluent_kafka >= 1.9.0 for M1 Macs, which is why we can't
# broadly restrict confluent_kafka to <1.9.0.
#
# Note that this is somewhat of a hack, since we don't actually require the
# older version of confluent_kafka on those machines. Additionally, we will
# need to monitor the Debian/Ubuntu package repositories and modify this rule
# if they start to ship librdkafka >= 1.9.0.
"confluent_kafka>=1.5.0",
'confluent_kafka<1.9.0; platform_system != "Darwin" and (platform_machine == "aarch64" or platform_machine == "arm64")',
# We currently require both Avro libraries. The codegen uses the schema parsers
# from avro-python3 (above) at runtime to generate and read JSON into Python objects.
# At the same time, we use Kafka's AvroSerializer, which internally relies on
# fastavro for serialization. We do not use confluent_kafka[avro], since it
# is incompatible with its own dep on avro-python3.
- "confluent_kafka>=1.5.0,<1.9.0",
"fastavro>=1.2.0",
}
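
The effect of the two `confluent_kafka` requirement lines can be sketched in plain Python. This is a minimal illustration only, not part of the commit: the helper name `confluent_kafka_specifiers` is hypothetical, and it assumes the PEP 508 markers `platform_system` and `platform_machine` correspond to `platform.system()` and `platform.machine()`. pip evaluates the real markers itself at install time.

```python
import platform


def confluent_kafka_specifiers(system=None, machine=None):
    """Return the version specifiers that would apply to confluent_kafka.

    Hypothetical helper mirroring the marker in setup.py:
    platform_system != "Darwin" and
    (platform_machine == "aarch64" or platform_machine == "arm64")
    """
    # Default to the current interpreter's platform when no override is given.
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine

    # The unconditional lower bound applies everywhere.
    specs = [">=1.5.0"]

    # The upper bound only applies on non-macOS arm64/aarch64 machines,
    # where a source build against an older librdkafka is likely.
    if system != "Darwin" and machine in ("aarch64", "arm64"):
        specs.append("<1.9.0")
    return specs


if __name__ == "__main__":
    print("This machine:", confluent_kafka_specifiers())
    print("Linux aarch64:", confluent_kafka_specifiers("Linux", "aarch64"))
    print("macOS arm64:", confluent_kafka_specifiers("Darwin", "arm64"))
```

Under these assumptions, an M1 Mac and an x86_64 Linux host both keep only the `>=1.5.0` bound, while a `linux/arm64` Docker container additionally gets `<1.9.0`.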
