[HUDI-1794] Moved static COMMIT_FORMATTER to thread local variable as SimpleDateFormat is not thread safe. #2819
Codecov Report
@@ Coverage Diff @@
## master #2819 +/- ##
============================================
- Coverage 52.55% 52.53% -0.02%
- Complexity 3708 3709 +1
============================================
Files 484 485 +1
Lines 23182 23227 +45
Branches 2461 2465 +4
============================================
+ Hits 12183 12203 +20
- Misses 9926 9948 +22
- Partials 1073 1076 +3
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieActiveTimeline.java
eb7d058 to 3deb5e7
Codecov Report
@@ Coverage Diff @@
## master #2819 +/- ##
============================================
+ Coverage 50.04% 54.60% +4.56%
- Complexity 3685 4010 +325
============================================
Files 526 530 +4
Lines 25466 26026 +560
Branches 2886 2992 +106
============================================
+ Hits 12744 14212 +1468
+ Misses 11454 10411 -1043
- Partials 1268 1403 +135
3deb5e7 to 199e377
@hudi-bot run azure
@vinothchandar @ssdong This is ready for review again.
@prashantwason COMMIT_FORMATTER is still being used in a few places:
StreamerUtil
HoodieSqlUtils
TestTimeTravelQuery
They were probably added after this patch. Could you please fix them as well?
@prashantwason: Can you please address the feedback? We want to get this in for the upcoming release.
…ds to parse and generate Instant timestamps. Replaced SimpleDateFormat with DateTimeFormatter as the former is not thread safe. Added unit test to ensure new instant time can be generated in multiple threads correctly.
199e377 to d59d595
@nsivabalan Updated the PR.
@prashantwason Can you take a look at the CI failure?
I think the expectation is that an instant should be 14 characters long (yyyyMMddHHmmss), but I'm not sure why this was not failing earlier.
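For reference, here is a minimal sketch of the 14-character `yyyyMMddHHmmss` format built on `java.time.format.DateTimeFormatter`, which is immutable and safe to share across threads. The class `InstantTimeSketch` and its methods are hypothetical names used for illustration only; this is not the code in this PR, and the mismatch counter is just one possible way to exercise the formatter from several threads.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class; names are assumptions, not Hudi APIs.
class InstantTimeSketch {
    // DateTimeFormatter is immutable, so one shared static instance is safe.
    static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    static String format(LocalDateTime time) {
        return FORMATTER.format(time);
    }

    static LocalDateTime parse(String instant) {
        return LocalDateTime.parse(instant, FORMATTER);
    }

    // Round-trips distinct timestamps on several threads and counts failures.
    static int countMismatches(int threads, int iterations) {
        LocalDateTime base = LocalDateTime.of(2021, 4, 13, 10, 30, 0);
        AtomicInteger mismatches = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    LocalDateTime expected = base.plusSeconds((long) offset * iterations + i);
                    try {
                        String formatted = format(expected);
                        if (formatted.length() != 14 || !parse(formatted).equals(expected)) {
                            mismatches.incrementAndGet();
                        }
                    } catch (RuntimeException e) {
                        // A formatting or parsing failure also counts as a mismatch.
                        mismatches.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDateTime.of(2021, 4, 13, 10, 30, 0))); // prints 20210413103000
        System.out.println("mismatches: " + countMismatches(4, 1000));
    }
}
```

A test comparing against a fixed timestamp constant would need that constant to have the same 14-character shape for string comparisons to behave as expected.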
…nst the SOLO_COMMIT_TIMESTAMP for metadata table.
@codope Fixed the test.
@hudi-bot azure run
@hudi-bot run azure
All comments addressed. CI succeeded. Good to land.
Added unit test to ensure new instant time can be generated in multiple threads correctly.
What is the purpose of the pull request
When generating a new instant time in HoodieActiveTimeline, a static instance of SimpleDateFormat is used. This class is not thread safe.
We have a production use case where multiple Hudi datasets are processed in parallel in different threads of a ThreadPool. Each of these threads creates its own SparkRDDBackedWriteClient and calls startCommit(), which generates a new commit time. Because the shared SimpleDateFormat is not thread safe, several threads end up with corrupted instant times.
The solution is to use a thread-specific instance of the SimpleDateFormat for generating new instant times.
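The thread-local approach described above could look roughly like the following sketch. The class and method names are hypothetical and are not taken from HoodieActiveTimeline; the `yyyyMMddHHmmss` pattern is assumed from the review discussion.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical illustration class; not the actual Hudi implementation.
class ThreadLocalFormatterSketch {
    // Each thread lazily gets its own SimpleDateFormat, so no instance is shared.
    private static final ThreadLocal<SimpleDateFormat> COMMIT_FORMATTER =
        ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyyMMddHHmmss"));

    static String formatInstant(Date date) {
        return COMMIT_FORMATTER.get().format(date);
    }

    static Date parseInstant(String instant) {
        try {
            return COMMIT_FORMATTER.get().parse(instant);
        } catch (ParseException e) {
            // Wrapped so callers in this sketch need not handle a checked exception.
            throw new IllegalArgumentException("Not a valid instant time: " + instant, e);
        }
    }

    public static void main(String[] args) {
        String now = formatInstant(new Date());
        System.out.println(now + " has " + now.length() + " characters");
    }
}
```

The trade-off is one formatter instance per thread, which stays alive for as long as the thread does; that matters mostly for long-lived pool threads.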
Brief change log
Moved static COMMIT_FORMATTER to thread local variable
Verify this pull request
This change added tests and can be verified as follows:
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.