[SPARK-48806][SQL] Pass actual exception when url_decode fails
### What changes were proposed in this pull request?

Pass the underlying exception as the cause of the error raised when url_decode fails.

Follow-up to https://issues.apache.org/jira/browse/SPARK-40156

### Why are the changes needed?

Currently, the url_decode function discards the underlying exception, which contains information that is useful for quickly locating the problem.

For example, run this SQL:
```
select url_decode('https%3A%2F%2spark.apache.org');
```
We get only this error message:
```
org.apache.spark.SparkIllegalArgumentException: [CANNOT_DECODE_URL] The provided URL cannot be decoded: https%3A%2F%2spark.apache.org. Please ensure that the URL is properly formatted and try again.
    at org.apache.spark.sql.errors.QueryExecutionErrors$.illegalUrlError(QueryExecutionErrors.scala:376)
    at org.apache.spark.sql.catalyst.expressions.UrlCodec$.decode(urlExpressions.scala:118)
    at org.apache.spark.sql.catalyst.expressions.UrlCodec.decode(urlExpressions.scala)
```
However, the useful information in the underlying exception is dropped:
```
java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 1 in: "2s"
```
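
The dropped message originates in the JDK's `java.net.URLDecoder`, which the patched code calls. A minimal sketch, independent of Spark, that reproduces it is shown below; the object name `RawDecodeFailure` and the helper `failureMessage` are hypothetical and exist only for illustration. The text after the fixed prefix may differ between JDK versions.

```scala
import java.net.URLDecoder
import scala.util.{Failure, Try}

// Hypothetical helper object; not part of Spark.
object RawDecodeFailure {
  // "%2s" is not a valid escape because 's' is not a hexadecimal digit.
  val malformed = "https%3A%2F%2spark.apache.org"

  // Returns the JDK exception message when decoding fails with an
  // IllegalArgumentException, and None otherwise.
  def failureMessage(url: String): Option[String] =
    Try(URLDecoder.decode(url, "UTF-8")) match {
      case Failure(e: IllegalArgumentException) => Some(e.getMessage)
      case _ => None
    }
}
```

Calling `RawDecodeFailure.failureMessage(RawDecodeFailure.malformed)` should yield a message that starts with `URLDecoder: Illegal hex characters in escape (%) pattern - `.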

After this PR, the error also reports the underlying cause:

```
org.apache.spark.SparkIllegalArgumentException: [CANNOT_DECODE_URL] The provided URL cannot be decoded: https%3A%2F%2spark.apache.org. Please ensure that the URL is properly formatted and try again. SQLSTATE: 22546
	at org.apache.spark.sql.errors.QueryExecutionErrors$.illegalUrlError(QueryExecutionErrors.scala:372)
	at org.apache.spark.sql.catalyst.expressions.UrlCodec$.decode(urlExpressions.scala:119)
	at org.apache.spark.sql.catalyst.expressions.UrlCodec.decode(urlExpressions.scala)
	...
Caused by: java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - Error at index 1 in: "2s"
	at java.base/java.net.URLDecoder.decode(URLDecoder.java:237)
	at java.base/java.net.URLDecoder.decode(URLDecoder.java:147)
	at org.apache.spark.sql.catalyst.expressions.UrlCodec$.decode(urlExpressions.scala:116)
	... 135 more
```
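
The `Caused by:` section above appears because the original exception is passed along as the cause of the new one. The pattern can be sketched without Spark as follows; `DecodeUrlException` is a hypothetical stand-in for `SparkIllegalArgumentException`, and `UrlDecodeSketch` is an assumed name, not Spark's `UrlCodec`.

```scala
import java.net.URLDecoder

// Hypothetical stand-in for SparkIllegalArgumentException; not Spark's class.
final class DecodeUrlException(url: String, cause: Throwable)
  extends IllegalArgumentException(s"The provided URL cannot be decoded: $url.", cause)

// Assumed name for illustration; mirrors the shape of the fix only.
object UrlDecodeSketch {
  // Catch the JDK exception and attach it as the cause instead of dropping it.
  def decode(src: String, enc: String = "UTF-8"): String =
    try URLDecoder.decode(src, enc)
    catch {
      case e: IllegalArgumentException => throw new DecodeUrlException(src, e)
    }
}
```

A caller that catches `DecodeUrlException` can then read `getCause.getMessage` to see which escape sequence was rejected, which is what the new unit test in this commit checks for the real exception type.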

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added a unit test to StringFunctionsSuite.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47211 from wForget/SPARK-48806.

Lead-authored-by: wforget <[email protected]>
Co-authored-by: Kent Yao <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
wForget and yaooqinn committed Jul 4, 2024
1 parent c73c412 commit 310f8ea
Showing 3 changed files with 14 additions and 4 deletions.
@@ -116,7 +116,7 @@ object UrlCodec {
       UTF8String.fromString(URLDecoder.decode(src.toString, enc.toString))
     } catch {
       case e: IllegalArgumentException =>
-        throw QueryExecutionErrors.illegalUrlError(src)
+        throw QueryExecutionErrors.illegalUrlError(src, e)
     }
   }
 }
@@ -374,10 +374,11 @@ private[sql] object QueryExecutionErrors extends QueryErrorsBase with ExecutionE
       cause = e)
   }

-  def illegalUrlError(url: UTF8String): Throwable = {
+  def illegalUrlError(url: UTF8String, e: IllegalArgumentException): Throwable = {
     new SparkIllegalArgumentException(
       errorClass = "CANNOT_DECODE_URL",
-      messageParameters = Map("url" -> url.toString)
+      messageParameters = Map("url" -> url.toString),
+      cause = e
     )
   }

@@ -17,7 +17,7 @@

 package org.apache.spark.sql

-import org.apache.spark.{SPARK_DOC_ROOT, SparkRuntimeException}
+import org.apache.spark.{SPARK_DOC_ROOT, SparkIllegalArgumentException, SparkRuntimeException}
 import org.apache.spark.sql.catalyst.expressions.Cast._
 import org.apache.spark.sql.execution.FormattedMode
 import org.apache.spark.sql.functions._
@@ -1273,4 +1273,13 @@ class StringFunctionsSuite extends QueryTest with SharedSparkSession {
       )
     )
   }
+
+  test("SPARK-48806: url_decode exception") {
+    val e = intercept[SparkIllegalArgumentException] {
+      sql("select url_decode('https%3A%2F%2spark.apache.org')").collect()
+    }
+    assert(e.getCause.isInstanceOf[IllegalArgumentException] &&
+      e.getCause.getMessage
+        .startsWith("URLDecoder: Illegal hex characters in escape (%) pattern - "))
+  }
 }