Hi,
I'm using Spark 1.6.0 and I want to run the benchmark. However, I guess I first need to set up the benchmark.
The tutorial says we have to execute these lines:
import com.databricks.spark.sql.perf.tpcds.Tables
val tables = new Tables(sqlContext, dsdgenDir, scaleFactor)
tables.genData(location, format, overwrite, partitionTables, useDoubleForDecimal, clusterByPartitionColumns, filterOutNullPartitionValues)
// Create metastore tables in a specified database for your data.
// Once tables are created, the current database will be switched to the specified database.
tables.createExternalTables(location, format, databaseName, overwrite)
// Or, if you want to create temporary tables
tables.createTemporaryTables(location, format)
// Setup TPC-DS experiment
import com.databricks.spark.sql.perf.tpcds.TPCDS
val tpcds = new TPCDS(sqlContext = sqlContext)
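(For context, the snippet above assumes a few values are already defined. A minimal sketch with purely illustrative placeholders, which would need to be adapted to your own paths and cluster, might look like this:)
// Illustrative placeholders only - adjust paths and settings for your environment.
val dsdgenDir = "/opt/tpcds-kit/tools"        // hypothetical path to a compiled dsdgen binary
val scaleFactor = 1                           // dataset size in GB; some versions expect a String here
val location = "hdfs:///tpcds/sf1"            // hypothetical output directory for the generated data
val format = "parquet"                        // storage format for the generated tables
val databaseName = "tpcds_sf1"                // hypothetical metastore database name
val overwrite = true
val partitionTables = false
val useDoubleForDecimal = false
val clusterByPartitionColumns = false
val filterOutNullPartitionValues = false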
I understood that I have to start "spark-shell" first in order to run those lines, but the problem is that when I run "import com.databricks.spark.sql.perf.tpcds.Tables" I get the error "error: object sql is not a member of package com.databricks.spark". Under "com.databricks.spark" there is only the "avro" package (I don't really know what that is).
Could you help me, please? Maybe I misunderstood something.
Thanks
Make sure you create a jar of spark-sql-perf (using sbt). When starting spark-shell, use the --jars option and point it to that jar.
e.g., ./bin/spark-shell --jars /Users/xxx/yyy/zzz/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.0-SNAPSHOT.jar
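For the build step, running sbt from the spark-sql-perf checkout should produce that jar under target/scala-2.11/ (the exact version suffix depends on your checkout); roughly:
cd spark-sql-perf
sbt package        # or sbt +package to cross-build for several Scala versions
ls target/scala-2.11/spark-sql-perf_*.jar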
@gdchaochao maybe using an absolute path in --jars would also solve it?
In your previous comment you wrote that your command was spark-shell --conf spark.executor.cores=3 --conf spark.executor.memory=8g --conf spark.executor.memoryOverhead=2g --jars ./spark-perf/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.1-SNAPSHOT.jar - with a relative path to the jar.
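For example, something like this, where the absolute path is just illustrative and should be replaced with wherever your jar actually lives:
spark-shell --conf spark.executor.cores=3 --conf spark.executor.memory=8g --conf spark.executor.memoryOverhead=2g --jars /home/youruser/spark-perf/spark-sql-perf/target/scala-2.11/spark-sql-perf_2.11-0.5.1-SNAPSHOT.jar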