[SPARK-35143][SQL][SHELL] Add default log level config for spark-sql #32248

Closed
wants to merge 2 commits into from

hddong
Contributor

@hddong hddong commented Apr 20, 2021

What changes were proposed in this pull request?

Add a default log level config entry for spark-sql to log4j.properties.template.

Why are the changes needed?

The default log level for spark-sql is WARN, but it is not obvious how to change it, so we need a default config entry that shows how.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Set log4j.logger.org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver=INFO in log4j.properties and confirmed that spark-sql's default log level changed accordingly.
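
For reference, the setting used in this test can be sketched as a log4j.properties fragment. This is a minimal sketch that assumes the log4j 1.x properties syntax implied by the logger name in this description; where the file lives (for example, a conf/log4j.properties copied from the template) is an assumption and depends on the deployment.

```properties
# Raise the spark-sql CLI's startup log level from the default WARN to INFO.
# The logger name is the one quoted in this PR description.
log4j.logger.org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver=INFO
```

Setting the value back to WARN (or removing the line) should restore the default behaviour described above.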

HyukjinKwon
HyukjinKwon previously approved these changes Apr 21, 2021
@HyukjinKwon
Member

@SparkQA

SparkQA commented Apr 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42232/

@SparkQA

SparkQA commented Apr 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42232/

@HyukjinKwon HyukjinKwon dismissed their stale review April 21, 2021 01:55

Oh wait. We already have -S mode, right? ./bin/spark-sql -S

@HyukjinKwon
Member

cc @wangyum FYI

@wangyum
Member

wangyum commented Apr 21, 2021

Yes. We already have -S mode.

@hddong
Contributor Author

hddong commented Apr 21, 2021

@HyukjinKwon: I think this differs from -S in a few ways.

  1. -S cannot change the log level to INFO (or any other level).
  2. With log4j we can change spark-sql's log level during initialization.
  3. The two can work together.

With the default config:

> spark-sql
21/04/21 10:52:50 WARN Utils: Your hostname, hongddMac.local resolves to a loopback address: 127.0.0.1; using 172.20.30.177 instead (on interface en0)
21/04/21 10:52:50 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
21/04/21 10:52:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/04/21 10:52:53 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
21/04/21 10:52:53 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
21/04/21 10:52:54 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
21/04/21 10:52:54 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore [email protected]
Spark master: local[*], Application Id: local-1618973572072
spark-sql> show databases;
default
Time taken: 1.719 seconds, Fetched 1 row(s)

With -S, the line "Time taken: 1.719 seconds, Fetched 1 row(s)" is not printed.

With the log4j level changed to INFO:

>bin/spark-sql
21/04/21 10:56:14 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/04/21 10:56:15 INFO HiveConf: Found configuration file null
21/04/21 10:56:15 INFO SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir.
21/04/21 10:56:15 INFO SharedState: Warehouse path is 'file:/user/hive/warehouse'.
21/04/21 10:56:15 INFO SessionState: Created HDFS directory: /tmp/hive/hongdd/91f89606-aa0c-4514-96a7-1ff3bbff46bd
21/04/21 10:56:15 INFO SessionState: Created local directory: /var/folders/xl/jv1jvw6s6hv42ht1ty6gnk100000gn/T/hongdd/91f89606-aa0c-4514-96a7-1ff3bbff46bd
21/04/21 10:56:15 INFO SessionState: Created HDFS directory: /tmp/hive/hongdd/91f89606-aa0c-4514-96a7-1ff3bbff46bd/_tmp_space.db
21/04/21 10:56:15 INFO SparkContext: Running Spark version 3.2.0-SNAPSHOT
21/04/21 10:56:15 INFO ResourceUtils: ==============================================================
21/04/21 10:56:15 INFO ResourceUtils: No custom resources configured for spark.driver.
21/04/21 10:56:15 INFO ResourceUtils: ==============================================================
21/04/21 10:56:15 INFO SparkContext: Submitted application: SparkSQL::172.20.30.177
....
spark-sql> show databases;
21/04/21 10:56:52 INFO HiveMetaStore: 0: get_databases: *
21/04/21 10:56:52 INFO audit: ugi=hongdd        ip=unknown-ip-addr      cmd=get_databases: *
21/04/21 10:56:53 INFO CodeGenerator: Code generated in 129.111772 ms
21/04/21 10:56:53 INFO CodeGenerator: Code generated in 5.292995 ms
default
Time taken: 1.735 seconds, Fetched 1 row(s)
21/04/21 10:56:53 INFO SparkSQLCLIDriver: Time taken: 1.735 seconds, Fetched 1 row(s)

And -S only takes effect when you run a query, not during initialization:

1/04/21 10:57:40 INFO HiveMetaStore: 0: get_all_functions
21/04/21 10:57:40 INFO audit: ugi=hongdd        ip=unknown-ip-addr      cmd=get_all_functions
21/04/21 10:57:40 INFO HiveMetaStore: 0: get_database: default
21/04/21 10:57:40 INFO audit: ugi=hongdd        ip=unknown-ip-addr      cmd=get_database: default
21/04/21 10:57:40 INFO SparkSQLCLIDriver: Spark master: local[*], Application Id: local-1618973857876
spark-sql> show databases;
default
21/04/21 10:57:46 INFO SparkSQLCLIDriver: Time taken: 1.698 seconds, Fetched 1 row(s)

@SparkQA

SparkQA commented Apr 21, 2021

Test build #137704 has finished for PR 32248 at commit cec327b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hddong
Contributor Author

hddong commented Apr 23, 2021

@HyukjinKwon: What do you think about this PR compared with -S mode? If I have misunderstood something, please point it out.

@wangyum
Member

wangyum commented Apr 23, 2021

cc @sarutak

@sarutak
Member

sarutak commented Apr 23, 2021

The log level for spark-shell is determined here.
Users can change the log level via log4j.logger.org.apache.spark.repl.Main, and log4j.properties.template contains an example.

On the other hand, the log level for spark-sql is determined in the same place as for spark-shell, and users can change it via log4j.logger.org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver. But log4j.properties.template contains no example for it.

So, I think it's reasonable to have an example in log4j.properties.template like this PR suggests.
What do you think? @HyukjinKwon @wangyum
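
To make the suggestion concrete, here is a minimal sketch of what such a template entry could look like. It assumes log4j 1.x properties syntax and uses only the two logger names mentioned in this comment; the exact wording in the merged template may differ.

```properties
# Default log level for the interactive shells (spark-shell and spark-sql).
# When one of these shells starts, the level set here is used to overwrite
# the root logger's level, so shells and regular Spark apps can have
# different defaults.
log4j.logger.org.apache.spark.repl.Main=WARN
log4j.logger.org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver=WARN
```

Keeping both entries side by side gives spark-sql users the same documented starting point that spark-shell users already have.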

# log level for this class is used to overwrite the root logger's log level, so that
# the user can have different defaults for the shell and regular Spark apps.
# Set the default spark-shell/spark-sql log level to WARN. When running the
# spark-shell/spark-sql, the log level for this class is used to overwrite
Member

this class -> these classes?

Member

@HyukjinKwon HyukjinKwon left a comment

Okay looks fine.

@HyukjinKwon HyukjinKwon changed the title [SPARK-35143][SQL][SHELL]Add default log level config for spark-sql [SPARK-35143][SQL][SHELL] Add default log level config for spark-sql Apr 23, 2021
@HyukjinKwon
Member

Merged to master.

@SparkQA

SparkQA commented Apr 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42375/

@SparkQA

SparkQA commented Apr 23, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42375/

@SparkQA

SparkQA commented Apr 23, 2021

Test build #137845 has finished for PR 32248 at commit d95b28d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
