setup the benchmark #99

Open · idragus opened this issue Apr 3, 2017 · 3 comments

idragus commented Apr 3, 2017

Hi,

I'm using Spark 1.6.0 and I want to run the benchmark. However, I first need to set up the benchmark (I guess).

The tutorial says we have to execute these lines:

import com.databricks.spark.sql.perf.tpcds.Tables
// dsdgenDir, scaleFactor, location, format, databaseName, etc. are
// placeholders in the tutorial; they must be defined beforehand.
val tables = new Tables(sqlContext, dsdgenDir, scaleFactor)
tables.genData(location, format, overwrite, partitionTables, useDoubleForDecimal, clusterByPartitionColumns, filterOutNullPartitionValues)
// Create metastore tables in a specified database for your data.
// Once tables are created, the current database will be switched to the specified database.
tables.createExternalTables(location, format, databaseName, overwrite)
// Or, if you want to create temporary tables
tables.createTemporaryTables(location, format)
// Set up the TPC-DS experiment
import com.databricks.spark.sql.perf.tpcds.TPCDS
val tpcds = new TPCDS(sqlContext = sqlContext)
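
For reference, the tutorial then runs the experiment roughly like this (my reading of the project README; tpcds1_4Queries, runExperiment and waitForFinish may be version-dependent, so treat this as a sketch):

// Sketch: run the TPC-DS query set once the setup above succeeds
val experiment = tpcds.runExperiment(tpcds.tpcds1_4Queries)
experiment.waitForFinish(60 * 60)  // wait up to one hour for the run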

I understood that I have to run spark-shell first in order to execute those lines, but when I do import com.databricks.spark.sql.perf.tpcds.Tables I get the error "error: object sql is not a member of package com.databricks.spark". Under com.databricks.spark there is only the avro package (I don't really know what that is).

Could you help me, please? Maybe I misunderstood something.

Thanks

jeevanks commented

Make sure you build a jar of spark-sql-perf (using sbt). When starting spark-shell, use the --jars option and point it to that jar.
e.g., ./bin/spark-shell --jars /Users/xxx/yyy/zzz/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.0-SNAPSHOT.jar
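
In full, the steps would look something like this (a sketch; the exact jar name and Scala version depend on your checkout):

cd spark-sql-perf
sbt package    # builds target/scala-2.11/spark-sql-perf_2.11-0.5.0-SNAPSHOT.jar
./bin/spark-shell --jars /absolute/path/to/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.0-SNAPSHOT.jar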

gdchaochao commented

I solved this problem with :require /path/to/file.jar in spark-shell.
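
That is, from inside the running shell (the path should be absolute or resolvable from the shell's working directory; the jar path here is just an example):

scala> :require /Users/xxx/yyy/zzz/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.0-SNAPSHOT.jar
scala> import com.databricks.spark.sql.perf.tpcds.Tables  // should now resolve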

juliuszsompolski (Contributor) commented

@gdchaochao maybe using an absolute path in --jars would also solve it?
In your previous comment you wrote that your command was spark-shell --conf spark.executor.cores=3 --conf spark.executor.memory=8g --conf spark.executor.memoryOverhead=2g --jars ./spark-perf/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.1-SNAPSHOT.jar, with a relative path to the jar.
