Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

master pull #5

Merged
merged 33 commits into from
Mar 23, 2022
Merged

master pull #5

merged 33 commits into from
Mar 23, 2022

Conversation

fengjian428
Copy link
Owner

Tips

What is the purpose of the pull request

(For example: This pull request adds quick-start document.)

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

nsivabalan and others added 30 commits March 13, 2022 21:11
….compact.inline.max.delta.commits (#4976)

* Update CompactionHoodiePathCommand.scala

fix NPE when run schdule using spark-sql if the commits time < hoodie.compact.inline.max.delta.commits

* Update CompactionHoodiePathCommand.scala

fix IndexOutOfBoundsException when there`s no schedule for compaction

* Update CompactionHoodiePathCommand.scala

fix CI issue
… Maxwell json string (#4987)

* [HUDI-3547] Introduce MaxwellSourcePostProcessor to extract data from Maxwell json string

* add ut

* Address comment
* [HUDI-3633] Allow non-string values to be set in TypedProperties

* Override getProperty to ignore instanceof string check
* [HUDI-3607] Support backend switch in HoodieFlinkStreamer

* [HUDI-3607] Support backend switch in HoodieFlinkStreamer
1. checkstyle fix

* [HUDI-3607] Support backend switch in HoodieFlinkStreamer
1. change the msg
…afe HashMap (#5028)

- Change HashMap in HoodieROTablePathFilter to ConcurrentHashMap
…t executors (#4856)

- Adopt HoodieData in Spark action commit executors
- Make Spark independent DeleteHelper, WriteHelper, MergeHelper in hudi-client-common
- Make HoodieTable in WriteClient APIs have raw type to decouple with Client's generic types
…lways be consistent with input operator (#5049)

for chaining purpose

Co-authored-by: jerryyue <[email protected]>
…cation (#4877)

Refactoring Spark DataSource Relations to avoid code duplication. 

Following Relations were in scope:

- BaseFileOnlyViewRelation
- MergeOnReadSnapshotRelaation
- MergeOnReadIncrementalRelation
…able commit (#5070)

* Fixed metadata conversion util to extract schema from `HoodieCommitMetadata`

* Fixed failure to fetch columns to index in empty table

* Abort indexing seq in case there are no columns to index

* Fallback to index at least primary key columns, in case no writer schema could be obtained to index all columns

* Fixed `getRecordFields` incorrectly ignoring default value

* Make sure Hudi metadata fields are also indexed
…eption

Actually method FlinkWriteHelper#deduplicateRecords does not guarantee the records sequence, but there is a
implicit constraint: all the records in one bucket should have the same bucket type(instant time here),
the BucketStreamWriteFunction breaks the rule and fails to comply with this constraint.

close #5018
pratyakshsharma and others added 3 commits March 21, 2022 20:06
- Provided option to trigger clean every nth commit with default number of commits as 1 so that existing users are not affected.
Co-authored-by: sivabalan <[email protected]>
…andardize configs (#4175)

- Refactor hive sync tool / config to use reflection and standardize configs

Co-authored-by: sivabalan <[email protected]>
Co-authored-by: Rajesh Mahindra <[email protected]>
Co-authored-by: Raymond Xu <[email protected]>
@fengjian428 fengjian428 merged commit 4885c98 into fengjian428:master Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.