Commit
Parquet files with different schemas fail in Databricks loader (close #1085)
Loading fails when we read data from multiple Parquet files with different schemas (an optional column exists in only some of the files). Databricks throws the following exception: `com.databricks.backend.common.rpc.SparkDriverExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: [MISSING_COLUMN] Column 'unstruct_event_com_lego_3dcatalogue_like_product_1' does not exist. Did you mean one of the following?` Reproducing the issue in a Databricks notebook and testing different options showed that adding FORMAT_OPTIONS with mergeSchema to the load statement fixes it.
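As a rough illustration of the fix described above, the sketch below builds a Databricks `COPY INTO` statement with `FORMAT_OPTIONS ('mergeSchema' = 'true')`, so that the Parquet reader merges the schemas of all input files instead of taking the schema of a single file. The helper name `build_copy_statement`, the table name, and the source path are hypothetical and only for illustration; this is not the loader's actual statement, which is not shown in this commit message.

```python
def build_copy_statement(table: str, source_path: str) -> str:
    """Build a COPY INTO statement that merges Parquet schemas across files.

    `table` and `source_path` are hypothetical placeholders; the values are
    interpolated as-is, so this sketch is not safe for untrusted input.
    """
    return (
        f"COPY INTO {table}\n"
        f"FROM '{source_path}'\n"
        "FILEFORMAT = PARQUET\n"
        # Without this option, a column present in only some of the files
        # can be missing from the inferred schema.
        "FORMAT_OPTIONS ('mergeSchema' = 'true')"
    )


# Hypothetical table and bucket, used only to show the generated SQL.
statement = build_copy_statement(
    "atomic.events", "s3://example-bucket/transformed/run=1/"
)
print(statement)
```

The generated statement can be pasted into a Databricks notebook cell to check the behaviour against files whose schemas differ.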