[HUDI-8266] Ensure CleanPlanner is serializable #12015

Merged
merged 1 commit into from
Sep 27, 2024

the-other-tim-brown
Contributor

Change Logs

  • Avoids serializing a FileSystemView as part of the CleanPlanner object by calling the HoodieTable methods when a view is needed
  • Also avoids serializing the full commit timeline
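A minimal sketch of the pattern described in the change log, not the actual Hudi code: the planner keeps only a reference to a serializable table and asks it for the view when needed, so the view never becomes part of the planner's serialized state. `SketchFileSystemView`, `SketchTable`, `SketchCleanPlanner`, and `PlannerSerializationSketch` are hypothetical stand-ins, and plain Java serialization is assumed here only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;

// Hypothetical stand-in for a file system view; deliberately NOT Serializable.
class SketchFileSystemView {
  List<String> latestFiles(String partition) {
    return List.of(partition + "/file-1", partition + "/file-2");
  }
}

// Hypothetical stand-in for HoodieTable: serializable, rebuilds the view on demand.
class SketchTable implements Serializable {
  private transient SketchFileSystemView view; // transient: never written to the stream

  synchronized SketchFileSystemView getView() {
    if (view == null) {
      view = new SketchFileSystemView();
    }
    return view;
  }
}

// Hypothetical planner: holds only the table, never the view itself.
class SketchCleanPlanner implements Serializable {
  private final SketchTable table;

  SketchCleanPlanner(SketchTable table) {
    this.table = table;
  }

  List<String> filesToInspect(String partition) {
    // The view is looked up through the table at the point of use.
    return table.getView().latestFiles(partition);
  }
}

class PlannerSerializationSketch {
  // Serializes and deserializes a value with plain Java serialization.
  static <T> T roundTrip(T value) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  public static void main(String[] args) throws Exception {
    SketchCleanPlanner planner = new SketchCleanPlanner(new SketchTable());
    planner.filesToInspect("2024/09/27"); // materializes the view before serialization
    SketchCleanPlanner copy = roundTrip(planner);
    System.out.println(copy.filesToInspect("2024/09/27"));
  }
}
```

In this sketch the round trip succeeds even after the view has been materialized, because the planner's serialized state contains only the table reference.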

Impact

  • Avoids serialization issues if SpillableMap is used for the file system view

Risk level (write none, low, medium or high below)

None

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the ticket number here, and follow the instructions to make changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

@github-actions github-actions bot added the size:S PR with lines of changes in (10, 100] label Sep 27, 2024
@apache apache deleted a comment from hudi-bot Sep 27, 2024
@hudi-bot

CI report:

Bot commands

@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

@danny0405
Contributor

danny0405 commented Sep 27, 2024

@the-other-tim-brown Thanks for the contribution. Can you elaborate a little more on why the CleanPlanner needs to be serializable?

Okay, it makes sense because it implements the Serializable interface.

@danny0405 danny0405 merged commit 215deed into apache:master Sep 27, 2024
43 checks passed
@the-other-tim-brown
Contributor Author

> @the-other-tim-brown Thanks for the contribution. Can you elaborate a little more on why the CleanPlanner needs to be serializable?
>
> Okay, it makes sense because it implements the Serializable interface.

More specifically, the CleanPlanner is referenced from a lambda function passed to Hoodiecontext.map, so it has to be serialized along with that lambda.
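To illustrate the point about lambdas, here is a small sketch in plain Java, independent of Hudi. A serializable lambda that calls an instance method captures `this`, so serializing the lambda also serializes the enclosing object and every non-transient field it holds. `SketchSerializableFunction`, `SketchView`, `SketchPlanner`, and `LambdaCaptureSketch` are hypothetical names, and Java's built-in serialization is assumed only for the demonstration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

// Hypothetical serializable function type, similar in spirit to what a distributed map call accepts.
interface SketchSerializableFunction<I, O> extends Function<I, O>, Serializable {}

// Hypothetical non-serializable member, standing in for a view the planner should not hold.
class SketchView {}

class SketchPlanner implements Serializable {
  private final SketchView view = new SketchView(); // non-serializable, non-transient field

  String plan(String partition) {
    return "clean " + partition;
  }

  // The lambda calls an instance method, so it captures `this`.
  SketchSerializableFunction<String, String> asFunction() {
    return partition -> plan(partition);
  }
}

class LambdaCaptureSketch {
  // Returns false if Java serialization rejects the value or anything it references.
  static boolean canSerialize(Object value) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(value);
      return true;
    } catch (NotSerializableException e) {
      return false;
    }
  }

  public static void main(String[] args) throws IOException {
    // Captures the planner, which drags its non-serializable field into the stream.
    System.out.println(canSerialize(new SketchPlanner().asFunction()));

    // Captures nothing, so only the lambda itself is serialized.
    SketchSerializableFunction<String, String> standalone = partition -> "clean " + partition;
    System.out.println(canSerialize(standalone));
  }
}
```

The first lambda cannot be serialized because the captured planner holds a non-serializable field, while the second captures nothing and serializes without trouble.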

@the-other-tim-brown the-other-tim-brown deleted the HUDI-8266 branch September 27, 2024 12:20
4 participants