-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug fix for supported expressions #408
Conversation
Signed-off-by: Niranjan Artal <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One general comment not very related to this PR.
I notice that we use toLowerCase
frequently; especially in pluginTypeChecker.scala
Everytime we use toLowerCase
means that we iterate over the string which can be a significant slow down if it is redundant.
We may need to file an issue to revisit how we process the expressions, so that we can remove redundant.
Targets
Occurrences of 'toLowerCase' in Project with mask '*.scala'
Found Occurrences in Project with mask '*.scala' (26 usages found)
Unclassified (26 usages found)
rapids-4-spark-tools_2.12 (26 usages found)
com.nvidia.spark.rapids.tool.planparser (3 usages found)
DataWritingCommandExecParser.scala (1 usage found)
41 new ExecInfo(sqlID, s"${node.name.trim} ${wStub.dataFormat.toLowerCase.trim}", "",
SQLPlanParser.scala (2 usages found)
356 ignoreExpressions.contains(expr.toLowerCase)
704 functionMatches.map(_.group(1)).filter(_.toLowerCase() != "cast")
com.nvidia.spark.rapids.tool.profiling (8 usages found)
AutoTuner.scala (3 usages found)
626 .getOrElse("false").toLowerCase
1063 val sizesArr = size.toLowerCase.split("(?=[a-z])")
1075 val sizesArr = size.toLowerCase.split("(?=[a-z])")
ProfileOutputWriter.scala (1 usage found)
63 val suffix = header.replace(" ", "_").toLowerCase
QualificationInfoUtils.scala (4 usages found)
171 if (logType.toLowerCase.equals("dataset")) {
173 } else if (logType.toLowerCase.equals("udfds")) {
175 } else if (logType.toLowerCase.equals("udffunc")) {
177 } else if (logType.toLowerCase.equals("dsanddf")) {
com.nvidia.spark.rapids.tool.qualification (14 usages found)
PluginTypeChecker.scala (12 usages found)
115 x => (x._1.toLowerCase.replaceAll("\\`", ""), x._2))
136 val header = fileContents.head.split(",").map(_.toLowerCase)
189 val header = fileContents.head.split(",").map(_.toLowerCase)
197 val format = cols(0).toLowerCase
198 val direction = cols(1).toLowerCase()
203 }.keys.toSeq.map(_.toLowerCase)
236 val schemaLower = schema.toLowerCase
237 val formatInLower = format.toLowerCase
242 val nsFiltered = dtSupMap(NS).filter(t => schemaLower.contains(t.toLowerCase()))
261 val format = writeFormat.toLowerCase.trim
266 writeFormat.map(x => x.toLowerCase.trim).filterNot(
303 expr.toLowerCase.replace("_", "")
QualificationArgs.scala (2 usages found)
193 order.toLowerCase.startsWith("asc")
197 order.toLowerCase.startsWith("desc")
org.apache.spark.sql.rapids.tool.qualification (1 usage found)
QualificationAppInfo.scala (1 usage found)
179 s"${ds.format.toLowerCase()}[${ds.schema}]"
core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/PluginTypeChecker.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/com/nvidia/spark/rapids/tool/qualification/PluginTypeChecker.scala
Show resolved
Hide resolved
Filed followon issue - #409 |
This PR fixes #381 and fixes #382
For #381 -> Earlier a Map was created for Expression/SQL function name for all supported expressions. Here it was assumed that there is co-relation beween Expression and SQL function name For Example: Expression name : DateAdd and SQL function name: date_add. So we would remove
_
from the SQL function and create a Map. But this doesn't apply for few operators. Example: Expression: First , SQL function name: first_value. So we need 2 different keys in this case. So the fix is to include both SQL function name and expression name in the map.For #382 -> There was a bug where the keys of the map contains space i.e key of mod was
mod
. So it would be considered as unsupported expressions asmod
!=mod
. Updated the code to remove whitespaces for keys.