[HUDI-2757] [Stacked over 4175] Implement Sync for AWS Glue Catalog #4080

Closed
wants to merge 25 commits into from

Contributor

@rmahindra123 rmahindra123 commented Nov 23, 2021

What is the purpose of the pull request

This feature will start out as experimental.

Extend the hive sync tool to enable AWS Glue meta sync.
Deltastreamer can also be configured to sync with the AWS Glue metastore asynchronously.

Brief change log

  • Implement HoodieGlueMetaSync that extends the abstract hive sync class to sync meta with AWS Glue stores.
  • Move some common functionality with HoodieHiveSync to the abstract class.
  • Integrate with HiveSyncTool and deltastreamer (using GlueSyncTool)
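
The structure described in the change log (shared sync logic in an abstract class, with catalog-specific calls left to subclasses) could look roughly like the sketch below. This is only an illustration, not the code in this PR: every class and method name here is hypothetical, and an in-memory map stands in for the Glue catalog so that no AWS SDK calls are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; these are not the classes added by this PR.
public class MetaSyncSketch {

    /** Shared sync flow; subclasses supply the catalog-specific operations. */
    abstract static class AbstractMetaSync {

        /** Creates the table if needed, registers missing partitions, returns the ones added. */
        final List<String> sync(String table, List<String> partitionsOnStorage) {
            if (!tableExists(table)) {
                createTable(table);
            }
            List<String> known = listPartitions(table);
            List<String> missing = new ArrayList<>();
            for (String partition : partitionsOnStorage) {
                if (!known.contains(partition)) {
                    missing.add(partition);
                }
            }
            if (!missing.isEmpty()) {
                addPartitions(table, missing);
            }
            return missing;
        }

        abstract boolean tableExists(String table);
        abstract void createTable(String table);
        abstract List<String> listPartitions(String table);
        abstract void addPartitions(String table, List<String> partitions);
    }

    /** Assumption: an in-memory stand-in for a Glue-backed implementation. */
    static class InMemoryGlueSync extends AbstractMetaSync {
        private final Map<String, List<String>> catalog = new HashMap<>();

        @Override boolean tableExists(String table) { return catalog.containsKey(table); }
        @Override void createTable(String table) { catalog.put(table, new ArrayList<>()); }
        @Override List<String> listPartitions(String table) { return new ArrayList<>(catalog.get(table)); }
        @Override void addPartitions(String table, List<String> partitions) { catalog.get(table).addAll(partitions); }
    }

    public static void main(String[] args) {
        AbstractMetaSync sync = new InMemoryGlueSync();
        // First run creates the table and adds both partitions.
        System.out.println(sync.sync("trips", Arrays.asList("2021/11/22", "2021/11/23")));
        // Second run adds only the partition that is not yet registered.
        System.out.println(sync.sync("trips", Arrays.asList("2021/11/23", "2021/11/24")));
    }
}
```

Keeping the diffing logic in the base class means a Hive-backed and a Glue-backed implementation would differ only in the four catalog calls.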

@rmahindra123 rmahindra123 changed the title [HUDI-2757] Extend the hive sync tool to enable AWS Glue meta sync [HUDI-2757] WIP Extend the hive sync tool to enable AWS Glue meta sync Nov 23, 2021
@rmahindra123 rmahindra123 changed the title [HUDI-2757] WIP Extend the hive sync tool to enable AWS Glue meta sync [HUDI-2757] Extend the hive sync tool to enable AWS Glue meta sync Nov 24, 2021
Member

@xushiyan xushiyan left a comment

lgtm. just some naming suggestions.

Member

@vinothchandar vinothchandar left a comment

I am not sure we should land this one this late. Some classes are being moved around here, and that probably needs a closer look.

@rmahindra123 rmahindra123 changed the title [HUDI-2757] Extend the hive sync tool to enable AWS Glue meta sync [HUDI-2757] [WIP] Extend the hive sync tool to enable AWS Glue meta sync Nov 27, 2021
@rmahindra123 rmahindra123 marked this pull request as draft November 27, 2021 00:02
@rmahindra123 rmahindra123 changed the title [HUDI-2757] [WIP] Extend the hive sync tool to enable AWS Glue meta sync [HUDI-2757] [WIP] Implement Sync for AWS Glue Catalog Dec 7, 2021
@rmahindra123 rmahindra123 changed the title [HUDI-2757] [WIP] Implement Sync for AWS Glue Catalog [HUDI-2757] [Stacked over 4175] Implement Sync for AWS Glue Catalog Dec 7, 2021
@hudi-bot

hudi-bot commented Dec 7, 2021

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@rubenssoto

Hey Guys, do you think it will be merged any time soon?

Thank you

@sshah90

sshah90 commented Jan 21, 2022

Hi, any update on this PR?

Just wondering if we are still on track to release this feature in Hudi 0.11.

@nsivabalan nsivabalan added the priority:critical production down; pipelines stalled; Need help asap. label Feb 8, 2022
@rubenssoto

Hey Guys,

Any chance to merge this soon?

@xushiyan
Member

xushiyan commented Mar 20, 2022

> Hey Guys,
>
> Any chance to merge this soon?

@rubenssoto @sshah90 Taking over this. Porting over to #5076

@xushiyan xushiyan closed this Mar 20, 2022
@xushiyan xushiyan added priority:blocker and removed priority:critical production down; pipelines stalled; Need help asap. labels Mar 22, 2022
8 participants