[SPARK-17992][SQL] Return all partitions from HiveShim when Hive throws a metastore exception when attempting to fetch partitions by filter #15673
Test build #67713 has finished for PR 15673 at commit
    .asInstanceOf[JArrayList[Partition]]
} catch {
  case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] =>
    logWarning("Caught MetaException attempting to get partitions by filter from Hive", ex)
Change the message to say we are falling back to fetching all partitions' metadata?
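For context on the guard in the snippet under review: `java.lang.reflect.Method.invoke` wraps any exception thrown by the target method in an `InvocationTargetException`, which is why the `catch` inspects `ex.getCause` rather than matching the metastore exception directly. The sketch below illustrates only that wrapping. `FakeMetaException`, `FakeClient`, and `ReflectionWrapDemo` are hypothetical stand-ins, not Hive or Spark classes.

```scala
import java.lang.reflect.InvocationTargetException

// Hypothetical stand-in for Hive's MetaException (assumption, not the real class).
class FakeMetaException(msg: String) extends Exception(msg)

// Hypothetical client whose filtered lookup always fails.
class FakeClient {
  def getPartitionsByFilter(filter: String): java.util.List[String] =
    throw new FakeMetaException(s"unsupported filter: $filter")
}

object ReflectionWrapDemo {
  def describeFailure(): String = {
    val client = new FakeClient
    val method = classOf[FakeClient].getMethod("getPartitionsByFilter", classOf[String])
    try {
      // The reflective call does not throw FakeMetaException directly.
      method.invoke(client, "part = 1")
      "no failure"
    } catch {
      // The original exception is only reachable through getCause.
      case ex: InvocationTargetException if ex.getCause.isInstanceOf[FakeMetaException] =>
        s"wrapped: ${ex.getCause.getMessage}"
    }
  }
}
```

Calling `ReflectionWrapDemo.describeFailure()` takes the `catch` branch, showing that a plain `case ex: FakeMetaException` would not have matched.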
cc @ericl
Could we enable this fallback only when the conf is set to false? Otherwise, it might mask legitimate bugs. I also wonder if some of our flaky tests around this issue are due to the conf being leaked by some suites...
Certainly, but my intent with this PR is to prevent a (painful and confusing) regression for some Hive users of Spark 2.1 which can occur, because Spark 2.1 enables our new partition pruning implementation by default. I mentioned one case where this will happen, but we can't be sure this is the only case. If we make the conditions under which we use a fallback too narrow, we are assuming that other configurations of Hive are compatible with partition pruning outside of the specific conditions we check. I think that's a bit too risky. In fact, before submitting this PR I had written the catch block to catch and fall back for all types of
The current merge conflict is from d2d438d, which touches the same code. I'll wait for that to be settled before rebasing.
Test build #67772 has finished for PR 15673 at commit
For large tables, the degraded performance should be considered a bug as well. How about this.
That way, we will know if there are cases where metastore pruning fails with direct sql enabled.
@ericl I've pushed a commit with the changes you recommended.
Test build #67834 has finished for PR 15673 at commit
It looks like all the unit tests passed; however, one of the forked test Java processes exited with a nonzero status for some unknown reason.
  case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] &&
      tryDirectSql =>
    throw new RuntimeException("Caught Hive MetaException attempting to get partition " +
      "metadata by filter from Hive. Set the Spark configuration setting " +
You probably want to word it to suggest disabling partition management as a workaround only.
Good point.
I made some revisions. LMK what you think.
This looks good to me. cc @cloud-fan
Test build #3382 has finished for PR 15673 at commit
LGTM
Test build #67859 has finished for PR 15673 at commit
@mallman can you bring this up-to-date?
@rxin I believe https://issues.apache.org/jira/browse/SPARK-18168 will need to be resolved before I can rebase this PR.
@mallman shall we go ahead and revert that in this PR? It didn't help with debugging the flaky test much.
@ericl I can do that, yes. I'm currently tied down. I will push a new commit later today or tonight.
Force-pushed from 1ed3301 to 8d468ac.
Rebased.
Test build #67949 has finished for PR 15673 at commit
Merging in master. Thanks.
Happy to help.
…ws a metastore exception when attempting to fetch partitions by filter

(Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-17992)

## What changes were proposed in this pull request?

We recently added table partition pruning for partitioned Hive tables converted to using `TableFileCatalog`. When the Hive configuration option `hive.metastore.try.direct.sql` is set to `false`, Hive will throw an exception for unsupported filter expressions. For example, attempting to filter on an integer partition column will throw a `org.apache.hadoop.hive.metastore.api.MetaException`.

I discovered this behavior because VideoAmp uses the CDH version of Hive with a Postgresql metastore DB. In this configuration, CDH sets `hive.metastore.try.direct.sql` to `false` by default, and queries that filter on a non-string partition column will fail.

Rather than throw an exception in query planning, this patch catches this exception, logs a warning and returns all table partitions instead. Clients of this method are already expected to handle the possibility that the filters will not be honored.

## How was this patch tested?

A unit test was added.

Author: Michael Allman <[email protected]>

Closes apache#15673 from mallman/spark-17992-catch_hive_partition_filter_exception.
    getAllPartitionsMethod.invoke(hive, table).asInstanceOf[JSet[Partition]]
  case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] &&
      tryDirectSql =>
    throw new RuntimeException("Caught Hive MetaException attempting to get partition " +
@mallman sorry to disturb you here, but why is only a warning logged when direct sql isn't set? And why is a runtime exception raised when direct sql is set, instead of just a warning as in the no-direct-sql case?
Hi @rezasafi
I believe the reasoning is that if the user has disabled direct sql, we will try to fetch the partitions for the requested partition predicate anyway. However, since we don't expect that call to succeed, we just log a warning and fall back to the legacy behavior.
On the other hand, if the user has enabled direct sql, then we expect the call to Hive to succeed. If it fails, we consider that an error and throw an exception.
I hope that helps clarify things.
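The two branches described in this reply can be sketched roughly as follows. This is a simplified illustration assembled from the diff fragments quoted in this thread, not the merged code. `StubMetaException`, `PartitionFetchSketch`, and the `fetchByFilter` / `fetchAll` callbacks are hypothetical stand-ins for Hive's `MetaException` and the reflective Hive calls, and the messages are placeholders.

```scala
import java.lang.reflect.InvocationTargetException

// Hypothetical stand-in for Hive's MetaException (assumption, not the real class).
class StubMetaException(msg: String) extends Exception(msg)

object PartitionFetchSketch {
  // fetchByFilter and fetchAll are hypothetical callbacks replacing the
  // reflective metastore calls made by the real shim.
  def getPartitions(
      tryDirectSql: Boolean,
      fetchByFilter: () => Seq[String],
      fetchAll: () => Seq[String]): Seq[String] = {
    try {
      fetchByFilter()
    } catch {
      case ex: InvocationTargetException
          if ex.getCause.isInstanceOf[StubMetaException] && !tryDirectSql =>
        // Direct SQL is disabled, so the failure is expected:
        // warn and fall back to fetching every partition.
        Console.err.println(s"WARN: falling back to all partitions: ${ex.getCause.getMessage}")
        fetchAll()
      case ex: InvocationTargetException
          if ex.getCause.isInstanceOf[StubMetaException] && tryDirectSql =>
        // Direct SQL is enabled, so the failure is treated as an error.
        throw new RuntimeException("Partition fetch by filter failed (placeholder message)", ex)
    }
  }
}
```

With `tryDirectSql = false` a failing filtered fetch yields the full partition list. With `tryDirectSql = true` the same failure surfaces as a `RuntimeException`.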
Thank you very much for the explanation @mallman. I appreciate it.
@mallman Your assumption is incorrect. If Hive's direct sql fails, Hive will retry with ORM. In this case, I am able to reproduce an issue with Postgres where direct sql fails and, when Hive retries with ORM, Spark fails! Hive has fallback behavior for direct sql.
Filed SPARK-25561
(Link to Jira issue: https://issues.apache.org/jira/browse/SPARK-17992)
What changes were proposed in this pull request?
We recently added table partition pruning for partitioned Hive tables converted to using `TableFileCatalog`. When the Hive configuration option `hive.metastore.try.direct.sql` is set to `false`, Hive will throw an exception for unsupported filter expressions. For example, attempting to filter on an integer partition column will throw a `org.apache.hadoop.hive.metastore.api.MetaException`.
I discovered this behavior because VideoAmp uses the CDH version of Hive with a Postgresql metastore DB. In this configuration, CDH sets `hive.metastore.try.direct.sql` to `false` by default, and queries that filter on a non-string partition column will fail.
Rather than throw an exception in query planning, this patch catches this exception, logs a warning and returns all table partitions instead. Clients of this method are already expected to handle the possibility that the filters will not be honored.
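To illustrate the last point, that callers must cope with filters not being honored, here is a minimal sketch of a caller re-applying its predicate to whatever the metastore returns. `PartitionSpec` and `ClientSidePruning` are hypothetical names for illustration and do not correspond to Spark classes.

```scala
// Hypothetical, simplified partition description: column name -> value.
final case class PartitionSpec(values: Map[String, String])

object ClientSidePruning {
  // Re-apply the predicate on the caller's side. Filtering is idempotent, so the
  // result is the same whether or not the metastore already pruned the list.
  def prune(
      returned: Seq[PartitionSpec],
      predicate: PartitionSpec => Boolean): Seq[PartitionSpec] =
    returned.filter(predicate)
}
```

For example, if an unfiltered fetch returns partitions for `year` 2015 and 2016, pruning with the predicate `year >= 2016` leaves only the 2016 partition, exactly as if the metastore had applied the filter itself.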
How was this patch tested?
A unit test was added.