Spark Evaluation Results

This page records some of the results of running the Spark evaluation scripts.

Date	Pipeline	Input data	Cluster	Command (see below)	Time (min)	Notes
2017-11-02	MD, BQSR, HC	Exome (18.4 GB)	10 nodes n1-standard-16	1	28.45
2017-11-02	Reads Pipeline	Exome (18.4 GB)	10 nodes n1-standard-16	2	24.08	15% faster than command 1
2017-11-02	MD, BQSR, HC	Genome (133.6 GB)	20 nodes n1-standard-16	3	145.92
2017-11-02	Reads Pipeline	Genome (133.6 GB)	20 nodes n1-standard-16	4	99.29	32% faster than command 3
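
The percentages in the Notes column appear to be the reduction in run time relative to the MD, BQSR, HC run on the same input and cluster. A minimal sketch of that arithmetic follows, assuming this interpretation; `percent_faster` is a hypothetical helper written for illustration and is not part of the evaluation scripts.

```shell
#!/bin/sh
# percent_faster BASELINE_MIN NEW_MIN
# Hypothetical helper: prints how much shorter NEW_MIN is than
# BASELINE_MIN, as a percentage rounded to a whole number.
percent_faster() {
  awk -v base="$1" -v fast="$2" \
    'BEGIN { printf "%.0f\n", (base - fast) / base * 100 }'
}

# Exome: MD, BQSR, HC (28.45 min) vs. Reads Pipeline (24.08 min)
percent_faster 28.45 24.08

# Genome: MD, BQSR, HC (145.92 min) vs. Reads Pipeline (99.29 min)
percent_faster 145.92 99.29
```

With the times in the table, the two calls print 15 and 32, matching the Notes column.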

Commands

# 1
nohup ./run_gcs_cluster.sh copy_exome_to_hdfs_on_gcs.sh exome_md-bqsr-hc_hdfs.sh &
# 2
nohup ./run_gcs_cluster.sh copy_exome_to_hdfs_on_gcs.sh exome_reads-pipeline_hdfs.sh &
# 3
NUM_WORKERS=20 nohup ./run_gcs_cluster.sh copy_genome_to_hdfs_on_gcs.sh genome_md-bqsr-hc_hdfs.sh &
# 4
NUM_WORKERS=20 nohup ./run_gcs_cluster.sh copy_genome_to_hdfs_on_gcs.sh genome_reads-pipeline_hdfs.sh &
