[HUDI-5686] Fixes data loss due to rollbacks #7828

Closed
wants to merge 5 commits into from

@pushpavanthar pushpavanthar commented Feb 2, 2023

Change Logs

Fixes #7757

Refer to JIRA HUDI-5686 for a detailed description of the issue.
The approach here is to avoid creating new instants (timestamps) for rollbacks when the timeline contains incomplete commits from previous runs that need to be rolled back. Instead of creating a new instant, I reuse the commit's timestamp for the rollback instant, which is consistent with the chronological order of the commits.
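To make the change concrete, here is a minimal standalone sketch of the timestamp selection before and after this patch. It is not Hudi code: `java.util.Optional` stands in for Hudi's `Option`, the class and method names are hypothetical, and the timestamps are made-up values.

```java
import java.util.Optional;

// Hypothetical stand-in for the timestamp selection in rollback(); not Hudi code.
public class RollbackInstantTimeSketch {

    // Before the patch: reuse a pending rollback's timestamp if one exists,
    // otherwise fall back to a freshly generated instant time.
    public static String beforePatch(Optional<String> pendingRollbackTime, String freshInstantTime) {
        return pendingRollbackTime.orElse(freshInstantTime);
    }

    // After the patch: reuse a pending rollback's timestamp if one exists,
    // otherwise fall back to the timestamp of the commit being rolled back.
    public static String afterPatch(Optional<String> pendingRollbackTime, String commitInstantTime) {
        return pendingRollbackTime.orElse(commitInstantTime);
    }

    public static void main(String[] args) {
        String commit = "20230202101500000"; // made-up timestamp of the incomplete commit
        String fresh = "20230202113000000";  // made-up freshly generated timestamp

        System.out.println(beforePatch(Optional.empty(), fresh));
        System.out.println(afterPatch(Optional.empty(), commit));
        System.out.println(afterPatch(Optional.of("20230202104500000"), commit));
    }
}
```

In both variants an existing pending rollback wins; only the fallback value differs.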

Impact

Describe any public API or user-facing feature change or any performance impact.
No breaking changes

Risk level (write none, low medium or high below)

If medium or high, explain what verification was done to mitigate the risks.
Low. Verified data correctness against the source database in production for 50+ HoodieDeltaStreamer jobs running in both batch and continuous modes.

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

  • The config description must be updated if new configs are added or the default value of the configs are changed
  • Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
    ticket number here and follow the instruction to make
    changes to the website.

Contributor's checklist

  • Read through contributor's guide
  • Change Logs and Impact were stated clearly
  • Adequate tests were added if applicable
  • CI passed

pushpavanthar commented Feb 2, 2023

@codope kindly review

@@ -734,7 +734,7 @@ protected List<String> getInstantsToRollback(HoodieTableMetaClient metaClient, H
@Deprecated
public boolean rollback(final String commitInstantTime, Option<HoodiePendingRollbackInfo> pendingRollbackInfo, boolean skipLocking) throws HoodieRollbackException {
LOG.info("Begin rollback of instant " + commitInstantTime);
- final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()).orElse(HoodieActiveTimeline.createNewInstantTime());
+ final String rollbackInstantTime = pendingRollbackInfo.map(entry -> entry.getRollbackInstant().getTimestamp()).orElse(commitInstantTime);
final Timer.Context timerContext = this.metrics.getRollbackCtx();
Contributor

I'm confused: why roll back the instant itself under the current transaction?

Author

@pushpavanthar pushpavanthar Feb 3, 2023

Hi @danny0405, I just realised that picking the timestamp from the instant itself will make features like rollbackToInstant fail, because it does not place the rollback instant in chronological order in the timeline.
I'll update the PR to use an approach similar to the method below, which takes care of rollbacks first, before creating the instant for the commit:
org.apache.hudi.client.BaseHoodieWriteClient#startCommit(java.lang.String, org.apache.hudi.common.table.HoodieTableMetaClient)
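The ordering problem can be illustrated with a small standalone sketch. It assumes instant times are fixed-width timestamp strings that sort lexicographically in time order; the class name, method name, and timestamps are hypothetical and nothing here calls Hudi itself.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of timeline ordering; not Hudi code.
public class RollbackOrderingSketch {

    // Returns true when the candidate timestamp sorts strictly after every
    // instant already on the timeline, i.e. it would be the latest instant.
    public static boolean isLatest(List<String> existingInstants, String candidate) {
        return existingInstants.stream().allMatch(t -> t.compareTo(candidate) < 0);
    }

    public static void main(String[] args) {
        String failedCommit = "20230201090000000"; // made-up incomplete commit
        String laterCommit = "20230202090000000";  // made-up commit completed afterwards
        List<String> timeline = Arrays.asList(failedCommit, laterCommit);

        // Reusing the failed commit's timestamp places the rollback before laterCommit.
        System.out.println(isLatest(timeline, failedCommit));

        // A freshly generated timestamp places the rollback at the end of the timeline.
        System.out.println(isLatest(timeline, "20230203090000000"));
    }
}
```

Under that assumption, a rollback that reuses the old commit's timestamp is recorded out of order relative to when it actually ran, which is the situation described above.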

@pushpavanthar
Author

@danny0405 @nsivabalan can you please take a look at the revised changes?

@pratyakshsharma
Contributor

@hudi-bot run azure

@danny0405
Contributor

The data loss is not caused by rollback; the timeline server refresh is problematic in release 0.11.x. I have put a fix in release 0.12.0: #6179

@danny0405 danny0405 self-assigned this Feb 6, 2023
@danny0405 danny0405 added the data-loss label (loss of data only; use the data-consistency label for inconsistent view) Feb 6, 2023
@hudi-bot

hudi-bot commented Feb 6, 2023

CI report:

@nsivabalan
Contributor

Can you close out the patch if it's not valid? I assume you are testing it with Hudi 0.12.0 or a higher version.

@nsivabalan nsivabalan added the pr:wip (Work in Progress/PRs) and priority:minor (everything else; usability gaps; questions; feature reqs) labels and removed the data-loss (loss of data only; use the data-consistency label for inconsistent view) label Feb 8, 2023
@danny0405
Contributor

Closing this because it has been fixed; feel free to re-open if you find any issues.

@danny0405 danny0405 closed this Feb 8, 2023
Labels
pr:wip Work in Progress/PRs priority:minor everything else; usability gaps; questions; feature reqs
Projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

[SUPPORT] missing records when HoodieDeltaStreamer run in continuous mode
5 participants