[HUDI-3869] Improve error handling of loading Hudi conf #5311

Merged

@yihua yihua commented Apr 13, 2022

What is the purpose of the pull request

This PR improves error handling when loading the Hudi conf, addressing review comments from #4167.

Brief change log

  • Catches the exception thrown when loading the default conf fails, and uses a try-with-resources statement for the reader in DFSPropertiesConfiguration.
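
For readers unfamiliar with the pattern, here is a minimal sketch of what "catch the load error and use try-with-resources for the reader" can look like. It is not the actual code in DFSPropertiesConfiguration: the class `ConfLoaderSketch`, the method `loadOrEmpty`, and the key `example.key` are hypothetical names used only for illustration, and it reads a local file through `java.nio` rather than a Hadoop FileSystem.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper for illustration only; not Hudi's DFSPropertiesConfiguration.
final class ConfLoaderSketch {

  // Returns the properties read from confFile, or an empty Properties
  // object if the file is missing or cannot be read.
  static Properties loadOrEmpty(Path confFile) {
    // try-with-resources closes the reader on both the success and failure paths.
    try (BufferedReader reader = Files.newBufferedReader(confFile, StandardCharsets.UTF_8)) {
      Properties loaded = new Properties();
      loaded.load(reader);
      return loaded;
    } catch (IOException e) {
      // A missing or unreadable defaults file is reported as a warning
      // instead of failing the caller (assumed behavior for this sketch).
      System.err.println("WARN: could not load " + confFile + ", ignoring: " + e);
      return new Properties();
    }
  }

  public static void main(String[] args) throws IOException {
    // Case 1: the file exists and holds one key/value pair.
    Path present = Files.createTempFile("hudi-defaults", ".conf");
    Files.write(present, "example.key=value\n".getBytes(StandardCharsets.UTF_8));
    System.out.println(loadOrEmpty(present).getProperty("example.key"));

    // Case 2: the file does not exist; the loader warns and returns no properties.
    Path missing = Paths.get("no-such-dir", "hudi-defaults.conf");
    System.out.println(loadOrEmpty(missing).isEmpty());

    Files.deleteIfExists(present);
  }
}
```

The point of the sketch is that a missing defaults file degrades to a warning and an empty configuration rather than an exception, and the reader is closed without an explicit finally block.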

Verify this pull request

Using spark-shell, when the default config file is not present:

scala> df.write.format("hudi").
     |         options(getQuickstartWriteConfigs).
     |         option(PRECOMBINE_FIELD.key(), "ts").
     |         option(RECORDKEY_FIELD.key(), "uuid").
     |         option(PARTITIONPATH_FIELD.key(), "partitionpath").
     |         option(TBL_NAME.key(), tableName).
     |         mode(Overwrite).
     |         save(basePath)
22/04/12 23:12:16 WARN DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
22/04/12 23:12:16 WARN DFSPropertiesConfiguration: Properties file file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
22/04/12 23:12:17 WARN HoodieBackedTableMetadata: Metadata table was not found at path file:///tmp/hudi_trips_cow/.hoodie/metadata
22/04/12 23:12:19 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties

When the default config file is present:

scala> df.write.format("hudi").
     |         options(getQuickstartWriteConfigs).
     |         option(PRECOMBINE_FIELD.key(), "ts").
     |         option(RECORDKEY_FIELD.key(), "uuid").
     |         option(PARTITIONPATH_FIELD.key(), "partitionpath").
     |         option(TBL_NAME.key(), tableName).
     |         mode(Overwrite).
     |         save(basePath)
22/04/12 23:17:35 WARN HoodieSparkSqlWriter$: hoodie table at file:/tmp/hudi_trips_cow already exists. Deleting existing data & overwriting with new data.
22/04/12 23:17:35 WARN HoodieBackedTableMetadata: Metadata table was not found at path file:///tmp/hudi_trips_cow/.hoodie/metadata

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@yihua yihua force-pushed the HUDI-3869-address-error-handling-dfs-conf branch from 6478a8f to dca808f Compare April 13, 2022 06:03