feat(iceberg): Upgrade Iceberg ingestion source to pyiceberg 0.4.0 #8357

Merged
merged 42 commits into from
Aug 31, 2023
Changes from 32 commits
e9ba858
Instead of raising an exception when a workunit is not created, just …
cccs-Dustin Nov 23, 2022
de36ef7
Added a comment to show that the code changes made in this PR are cus…
cccs-Dustin Nov 24, 2022
a560505
Upgrade to pyiceberg (#12)
cccs-eric Feb 16, 2023
d8b3f6b
Update to HEAD
cccs-eric Jul 3, 2023
1b10b63
Merge branch 'master' into dh-pr-pyiceberg
cccs-eric Jul 3, 2023
89bc932
Update to pyiceberg 0.4.0
cccs-eric Jul 3, 2023
516c6e6
Remove ADLS specific code
cccs-eric Jul 4, 2023
9c2d51e
Update to integration tests
cccs-eric Jul 4, 2023
2560467
Remove obsolete code
cccs-eric Jul 4, 2023
fc00001
Change documentation to reflect pyiceberg's catalog configuration
cccs-eric Jul 4, 2023
2eb4813
Change documentation to reflect pyiceberg's catalog configuration
cccs-eric Jul 4, 2023
ea3e664
Update metadata-ingestion/src/datahub/ingestion/source/iceberg/iceber…
cccs-eric Jul 5, 2023
bf88b57
Update metadata-ingestion/src/datahub/ingestion/source/iceberg/iceber…
cccs-eric Jul 5, 2023
a0571d6
Update metadata-ingestion/tests/integration/iceberg/docker-compose.yml
cccs-eric Jul 5, 2023
d8f5c94
Update metadata-ingestion/tests/integration/iceberg/docker-compose.yml
cccs-eric Jul 5, 2023
acbe2f0
Code review changes
cccs-eric Jul 5, 2023
56577be
Update metadata-ingestion/src/datahub/ingestion/source/iceberg/iceber…
cccs-eric Jul 7, 2023
5ac237b
Update metadata-ingestion/src/datahub/ingestion/source/iceberg/iceber…
cccs-eric Jul 7, 2023
51d403b
Update metadata-ingestion/src/datahub/ingestion/source/iceberg/iceber…
cccs-eric Jul 7, 2023
da8603d
Update metadata-ingestion/src/datahub/ingestion/source/iceberg/iceber…
cccs-eric Jul 7, 2023
640ddd9
Code review changes
cccs-eric Jul 7, 2023
68b55f4
Merge branch 'dh-pr-pyiceberg' of github.com:CybercentreCanada/datahu…
cccs-eric Jul 7, 2023
1356e44
More code review changes
cccs-eric Jul 7, 2023
eac3e1f
Merge branch 'master' into dh-pr-pyiceberg
cccs-eric Jul 7, 2023
5bfe541
Merge branch 'master' into dh-pr-pyiceberg
cccs-eric Jul 10, 2023
4cf684c
Remove boto3 constraint
cccs-eric Jul 11, 2023
721888b
pyiceberg not compatible with Python <=3.7
cccs-eric Jul 11, 2023
5277128
pyiceberg not compatible with Python <=3.7
cccs-eric Jul 11, 2023
802a80e
pyiceberg not compatible with Python <=3.7
cccs-eric Jul 11, 2023
706fdc9
Disable pyiceberg imports when Python < 3.8
cccs-eric Jul 13, 2023
4a2359c
Disable pyiceberg imports when Python < 3.8
cccs-eric Jul 13, 2023
b4ca73f
Disable pyiceberg imports when Python < 3.8
cccs-eric Jul 13, 2023
fca07d2
Introduce get datasets generator and better exception handling
cccs-eric Aug 21, 2023
25ab98d
Remove unused report method
cccs-eric Aug 21, 2023
f26d4da
Tweaking test mark for unit tests
cccs-eric Aug 21, 2023
a3a6261
Tweaking test mark for unit tests
cccs-eric Aug 21, 2023
06d4da7
PR review changes
cccs-eric Aug 21, 2023
08b412e
Merge branch 'master' into dh-pr-pyiceberg
cccs-eric Aug 21, 2023
e1e6715
Disable Iceberg unit test for Python < 3.8
cccs-eric Aug 22, 2023
de4a9fa
Merge branch 'master' into dh-pr-pyiceberg
cccs-eric Aug 22, 2023
e4a7022
Lint fix
cccs-eric Aug 22, 2023
3c3c4c4
Merge branch 'master' into dh-pr-pyiceberg
cccs-eric Aug 31, 2023
19 changes: 11 additions & 8 deletions metadata-ingestion/docs/sources/iceberg/iceberg_recipe.yml
@@ -2,14 +2,17 @@ source:
   type: "iceberg"
   config:
     env: PROD
-    adls:
-      # Will be translated to https://{account_name}.dfs.core.windows.net
-      account_name: my_adls_account
-      # Can use sas_token or account_key
-      sas_token: "${SAS_TOKEN}"
-      # account_key: "${ACCOUNT_KEY}"
-      container_name: warehouse
-      base_path: iceberg
+    catalog:
+      name: my_iceberg_catalog
+      type: rest
+      # Catalog configuration follows pyiceberg's documentation (https://py.iceberg.apache.org/configuration)
+      config:
+        uri: http://localhost:8181
+        s3.access-key-id: admin
+        s3.secret-access-key: password
+        s3.region: us-east-1
+        warehouse: s3a://warehouse/wh/
+        s3.endpoint: http://localhost:9000
     platform_instance: my_iceberg_catalog
     table_pattern:
       allow:
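For readers of the recipe change: a minimal sketch of how a `catalog` block like the one in this recipe could be flattened into the name plus keyword properties that pyiceberg's `load_catalog` accepts. The helper names `catalog_properties` and `load_from_recipe` are hypothetical and are not taken from this PR; the actual source may wire the configuration differently.

```python
from typing import Any, Dict


def catalog_properties(catalog_section: Dict[str, Any]) -> Dict[str, str]:
    """Flatten a recipe-style `catalog` block into pyiceberg-style properties.

    Hypothetical helper: the nested `config` mapping is copied as-is and the
    catalog `type` is added alongside it. The `name` is kept separate because
    load_catalog takes it as its first argument.
    """
    props = {str(k): str(v) for k, v in catalog_section.get("config", {}).items()}
    props["type"] = str(catalog_section["type"])
    return props


def load_from_recipe(catalog_section: Dict[str, Any]):
    """Open the catalog described by the recipe (needs pyiceberg and a live catalog)."""
    # Imported lazily so the rest of this sketch works without pyiceberg installed.
    from pyiceberg.catalog import load_catalog

    return load_catalog(catalog_section["name"], **catalog_properties(catalog_section))


# Values copied from the example recipe in this diff.
recipe_catalog = {
    "name": "my_iceberg_catalog",
    "type": "rest",
    "config": {
        "uri": "http://localhost:8181",
        "s3.access-key-id": "admin",
        "s3.secret-access-key": "password",
        "s3.region": "us-east-1",
        "warehouse": "s3a://warehouse/wh/",
        "s3.endpoint": "http://localhost:9000",
    },
}

props = catalog_properties(recipe_catalog)
```

Calling `load_from_recipe(recipe_catalog)` would only succeed against a running REST catalog at the configured `uri`; `catalog_properties` on its own has no external dependencies.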
9 changes: 5 additions & 4 deletions metadata-ingestion/setup.py
@@ -215,8 +215,8 @@ def get_long_description():
 
 iceberg_common = {
     # Iceberg Python SDK
-    "acryl-iceberg-legacy==0.0.4",
-    "azure-identity==1.10.0",
+    "pyiceberg==0.4.0",
+    "pyarrow>=9.0.0, <13.0.0",
 }
 
 s3_base = {
@@ -462,7 +462,7 @@ def get_long_description():
             "druid",
             "elasticsearch",
             "feast" if sys.version_info >= (3, 8) else None,
-            "iceberg",
+            "iceberg" if sys.version_info >= (3, 8) else None,
             "json-schema",
             "ldap",
             "looker",
@@ -517,7 +517,7 @@ def get_long_description():
             "druid",
             "hana",
             "hive",
-            "iceberg",
+            "iceberg" if sys.version_info >= (3, 8) else None,
             "kafka-connect",
             "ldap",
             "mongodb",
@@ -527,6 +527,7 @@ def get_long_description():
             "redash",
             # "vertica",
         ]
+        if plugin
         for dependency in plugins[plugin]
     ),
 }
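The last hunk adds an `if plugin` filter because the plugin lists can now contain `None` on Python older than 3.8. A small, self-contained sketch of that pattern follows; the trimmed `plugins` table, the `pyhive` entry, and the `collect_requirements` function are illustrative assumptions, not the contents of the real `setup.py`.

```python
import sys
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical, trimmed-down plugin table; the real one in setup.py is much larger.
plugins: Dict[str, Set[str]] = {
    "hive": {"pyhive"},  # placeholder dependency for illustration only
    "iceberg": {"pyiceberg==0.4.0", "pyarrow>=9.0.0, <13.0.0"},
}


def collect_requirements(version_info: Tuple[int, ...] = sys.version_info) -> Set[str]:
    """Gather dependencies for the selected plugins, skipping disabled ones."""
    selected: List[Optional[str]] = [
        "hive",
        # Evaluates to None on interpreters older than 3.8.
        "iceberg" if version_info >= (3, 8) else None,
    ]
    return {
        dependency
        for plugin in selected
        if plugin  # drop the None placeholders before the lookup
        for dependency in plugins[plugin]
    }


old_python = collect_requirements((3, 7))
new_python = collect_requirements((3, 8))
```

Without the `if plugin` guard, the `None` entry would reach `plugins[plugin]` and raise a `KeyError`.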
Empty file.

This file was deleted.
