Failed to query fields of type Decimal and data is 0 #265
I think
BTW, thanks for reporting this issue, a PR is welcome~
I'm not sure what caused it. In my local environment, it works after modifying it to asText.
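For context, a minimal sketch of why that change matters (assuming jsonNode here is a Jackson JsonNode, which the decodeValue frame in the trace below suggests): textValue returns null for any non-textual node, while asText falls back to the node's string form.

    import com.fasterxml.jackson.databind.ObjectMapper

    object DecimalNodeDemo extends App {
      // "0" parses to a numeric node (IntNode), not a textual one
      val node = new ObjectMapper().readTree("0")

      // textValue is non-null only for textual nodes, so a bare number yields null ...
      assert(node.textValue == null)
      // ... while asText falls back to the node's string representation
      assert(node.asText == "0")

      // Feeding that null into BigDecimal reproduces the NPE in the stack trace below:
      // new java.math.BigDecimal(node.textValue)  // => NullPointerException
    }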
Could you try this way?

  case d: DecimalType if jsonNode.isBigDecimal =>
    Decimal(jsonNode.decimalValue, d.precision, d.scale)
  case d: DecimalType if jsonNode.isFloat | jsonNode.isDouble =>
    Decimal(BigDecimal(jsonNode.doubleValue, new MathContext(d.precision)), d.precision, d.scale)
+ case d: DecimalType if jsonNode.isInt =>
+   Decimal(BigDecimal(jsonNode.intValue, new MathContext(d.precision)), d.precision, d.scale)
  case d: DecimalType =>
    Decimal(BigDecimal(jsonNode.textValue, new MathContext(d.precision)), d.precision, d.scale)
It seems there are other unhandled cases here, like isLong, isBigInteger ...
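One possible consolidation of all those cases (a sketch, not the committed fix; it assumes jsonNode is a Jackson JsonNode, whose numeric node subtypes all implement decimalValue):

    // Assumed imports: java.math.MathContext, org.apache.spark.sql.types.{Decimal, DecimalType}
    // Every Jackson numeric node (int, long, BigInteger, float, double, BigDecimal)
    // implements decimalValue, so a single isNumber guard covers all of them.
    case d: DecimalType if jsonNode.isNumber =>
      Decimal(jsonNode.decimalValue, d.precision, d.scale)
    // Fallback: asText never returns null, unlike textValue
    case d: DecimalType =>
      Decimal(BigDecimal(jsonNode.asText, new MathContext(d.precision)), d.precision, d.scale)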
So that's the root cause. Would you like to send a PR to fix it, and the other potential cases?
PTAL |
I found that the issue is still present. Any updates?
@eye-gu why is this issue closed? It's still present in 0.7.2 with the 0.4.5 JDBC driver and Spark 3.4.2.
@paf91 I need to find some time to publish a new version containing this patch, maybe in a few days. You can try building the master branch before the release is published.
@paf91 FYI, 0.7.3 is available now, it includes this patch. |
@pan3793 could you please tell me, if you know, the best practice for saving a resulting DataFrame into ClickHouse? I couldn't find it in the docs https://housepower.github.io/spark-clickhouse-connector/quick_start/02_play_with_spark_sql/ and the best I could do was to use clickhouse-jdbc like
@paf91 Well, the "best practices" depend on a lot of stuff. The built-in JDBC data source is maintained by the Apache Spark community as a generic solution for interacting with RDBMS; just keep using it if it works well for your cases. Instead of providing best practices, I'd like to list some points that I think are worth careful consideration.
- Performance. Performance in a distributed system is a big topic; I wrote an article explaining how this connector improves query performance.
- Convenience. For example, the data engineer may want to use
- Transactions. For a long time, transactions were not strict or even missing in the big data world. In a distributed system, the failure of a single node is normal. Without the guarantee of write transactions, the resulting retries may lead to eventual data duplication.
I had some thoughts about this topic previously in #145.
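For the write path specifically, a minimal sketch under stated assumptions: it registers the connector catalog as "clickhouse" the way the quick-start docs do (verify the config keys against the connector version in use), and appends into the zhouwq.user_test table from this thread via Spark's DataFrameWriterV2 API — not necessarily a "best practice", just the connector-native route instead of the generic JDBC sink:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    object WriteSketch extends App {
      // Catalog registration as shown in the quick-start docs; host/port are placeholders
      val spark = SparkSession.builder()
        .master("local[*]")
        .config("spark.sql.catalog.clickhouse", "xenon.clickhouse.ClickHouseCatalog")
        .config("spark.sql.catalog.clickhouse.host", "127.0.0.1")
        .config("spark.sql.catalog.clickhouse.protocol", "http")
        .config("spark.sql.catalog.clickhouse.http_port", "8123")
        .getOrCreate()

      // Hypothetical DataFrame shaped like the table discussed in this thread
      val df = spark.range(10)
        .select(col("id").cast("decimal(10, 2)").as("number_col1"))

      // DataFrameWriterV2 routes the write through the connector's catalog,
      // bypassing the generic JDBC sink
      df.writeTo("clickhouse.zhouwq.user_test").append()
    }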
spark sql:

select number_col1 from clickhouse.zhouwq.user_test;

log:

spark-sql> select number_col1 from clickhouse.zhouwq.user_test;
23/08/10 18:11:38 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
java.lang.NullPointerException
at java.math.BigDecimal.<init>(BigDecimal.java:831)
at scala.math.BigDecimal$.apply(BigDecimal.scala:290)
at xenon.clickhouse.read.format.ClickHouseJsonReader.decodeValue(ClickHouseJsonReader.scala:74)
at xenon.clickhouse.read.format.ClickHouseJsonReader.decode(ClickHouseJsonReader.scala:48)
at xenon.clickhouse.read.format.ClickHouseJsonReader.decode(ClickHouseJsonReader.scala:33)
at xenon.clickhouse.read.ClickHouseReader.get(ClickHouseReader.scala:89)
at xenon.clickhouse.read.ClickHouseReader.get(ClickHouseReader.scala:29)