Fabian Buske edited this page Nov 28, 2013 · 5 revisions

Adding a new task/mod to NGSANE requires the following steps:

Write the shell script.

  1. The mods folder contains a shell script template, _template.sh, that provides a skeleton. Copy it to YourNewTask.sh.
  2. Open YourNewTask.sh in a text editor, follow the TODO comments, and fill in the commands for your new task. Have a look at the existing tasks in the mods folder if you need inspiration.
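
The two steps above can be sketched as a small shell helper. This is only an illustration: the function name new_task_from_template and its arguments are hypothetical and not part of NGSANE; only the mods folder and _template.sh come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: new_task_from_template is a hypothetical helper, not part of NGSANE.
# It copies mods/_template.sh to mods/<task>.sh and lists the TODO comments.
new_task_from_template() {
    local ngsane_dir="$1" task_name="$2"
    local template="$ngsane_dir/mods/_template.sh"
    local target="$ngsane_dir/mods/$task_name.sh"

    if [ ! -f "$template" ]; then
        echo "template not found: $template" >&2
        return 1
    fi
    if [ -e "$target" ]; then
        echo "refusing to overwrite: $target" >&2
        return 1
    fi

    cp "$template" "$target"
    # Print the line numbers of the TODO comments that still need attention.
    grep -n "TODO" "$target" || true
}
```

It would be called with the path to the NGSANE checkout and the task name, for instance new_task_from_template /path/to/ngsane YourNewTask. Copying rather than renaming keeps the template available for the next task.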

Register the new task in the trigger.sh

  1. Open the bin/trigger.sh script
  2. Add a section for your new task at the end of the trigger.sh script. Make sure to specify the file suffix of the input files with the -e option (INPUT_FILE_SUFFIX). The suffix should be customisable in the corresponding configuration script for your task. NGSANE uses this pattern to automatically detect the samples/libraries for processing.

################################################################################
#   yourNewTask
################################################################################
if [ -n "$RUNYourNewTask" ]; then
    if [ -z "$TASK_YourNewTask" ] || [ -z "$NODES_YourNewTask" ] || [ -z "$CPU_YourNewTask" ] || [ -z "$MEMORY_YourNewTask" ] || [ -z "$WALLTIME_YourNewTask" ]; then echo -e "\e[91m[ERROR]\e[0m Server misconfigured"; exit 1; fi

    $QSUB $ARMED -k $CONFIG -t $TASK_YourNewTask -i $INPUT_YourNewTask -e INPUT_FILE_SUFFIX -n $NODES_YourNewTask -c $CPU_YourNewTask -m $MEMORY_YourNewTask"G" -w $WALLTIME_YourNewTask \
        --command "${NGSANE_BASE}/mods/YourNewTask.sh -k $CONFIG -f  -o $OUT/<DIR>/$TASK_YourNewTask"
fi

For example:


################################################################################
#   Mapping using Bowtie v1
################################################################################
if [ -n "$RUNMAPPINGBOWTIE" ]; then
    if [ -z "$TASK_BOWTIE" ] || [ -z "$NODES_BOWTIE" ] || [ -z "$CPU_BOWTIE" ] || [ -z "$MEMORY_BOWTIE" ] || [ -z "$WALLTIME_BOWTIE" ]; then echo -e "\e[91m[ERROR]\e[0m Server misconfigured"; exit 1; fi

    $QSUB $ARMED -k $CONFIG -t $TASK_BOWTIE -i $INPUT_BOWTIE -e $READONE.$FASTQ -n $NODES_BOWTIE -c $CPU_BOWTIE -m $MEMORY_BOWTIE"G" -w $WALLTIME_BOWTIE \
        --command "${NGSANE_BASE}/mods/bowtie.sh -k $CONFIG -f  -o $OUT/<DIR>/$TASK_BOWTIE"
fi
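
To illustrate why the suffix passed with -e matters, here is a rough sketch of suffix-based input detection. It is not NGSANE's own detection code, which may work differently; the helper list_inputs and the suffix value used in the usage note are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: list_inputs is a hypothetical helper, not part of NGSANE.
# It prints the names of all files in a folder that end with the given suffix.
list_inputs() {
    local dir="$1" suffix="$2" f
    for f in "$dir"/*"$suffix"; do
        [ -e "$f" ] || continue   # the glob matched nothing
        basename "$f"
    done
}
```

With an assumed suffix such as _R1.fastq.gz, calling list_inputs fastq/sampleA "_R1.fastq.gz" would print only the first-read files and skip everything else in the folder, which is why a wrong suffix leaves a task with no input.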

Specify resource defaults in the header.conf and sampleHeader.conf

  1. Open the conf/header.sh script
  2. If the YourNewTask script uses programs that are new to NGSANE, register the corresponding cluster modules: add one variable per additional environment module to the Software Modules section, and add the reference for each program to the Software reference section, as follows:

##############################################################
# Software Modules
##############################################################
NG_NEWSOFTWARE="software_module/version"
...
##############################################################
# Software reference
##############################################################
NG_CITE_NEWSOFTWARE=""

For example:


##############################################################
# Software Modules
##############################################################
NG_BOWTIE="bowtie/1.0.0"
...
##############################################################
# Software reference
##############################################################
NG_CITE_BOWTIE="Genome Biol. 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25. Epub 2009 Mar 4. 'Ultrafast and memory-efficient alignment of short DNA sequences to the human genome.'; Langmead B, Trapnell C, Pop M, Salzberg SL."

  3. Register the new task and specify its default resource allocations by adding a file conf/header.d/yourNewTask. In that file, define the default environment for the new task by adding the block shown after this list, which sets the following variables:
  • TASK_YourNewTask: name of the folder in which all results will be located
  • WALLTIME_YourNewTask: maximum time allowed for processing the task (walltime, written as hh:mm:ss in the examples)
  • MEMORY_YourNewTask: memory in GB
  • CPU_YourNewTask: number of CPUs
  • NODES_YourNewTask: number of nodes and CPUs per node (e.g. "nodes=1:ppn=1")
  • INPUT_YourNewTask: the previous task that provides the input files, or alternatively the name of the input folder (e.g. "fastq")
  • MODULE_YourNewTask: the list of environment modules that need to be loaded to get access to the required bioinformatics software

##############################################################
# YourNewTask
# URL of the utilized software
TASK_YourNewTask="taskname"
WALLTIME_YourNewTask=40:00:00
MEMORY_YourNewTask=40
CPU_YourNewTask=1
NODES_YourNewTask="nodes=1:ppn=1"
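INPUT_YourNewTask=$TASK_InputTask  # placeholder name: the TASK_ variable of the task providing the input, or a folder name such as "fastq"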
MODULE_YourNewTask="${NG_NEWSOFTWARE1} ${NG_NEWSOFTWARE2}"

For example:


##############################################################
# Bowtie1 (1.0.0)
# http://bowtie-bio.sourceforge.net/index.shtml
TASK_BOWTIE="bowtie"
WALLTIME_BOWTIE=72:00:00
MEMORY_BOWTIE=60
CPU_BOWTIE=8
NODES_BOWTIE="nodes=1:ppn=8"
INPUT_BOWTIE="fastq"
MODULE_BOWTIE="${NG_JAVA} ${NG_SAMTOOLS} ${NG_IGVTOOLS} ${NG_R} ${NG_IMAGEMAGIC} ${NG_PICARD} ${NG_SAMSTAT} ${NG_UCSCTOOLS} ${NG_BEDTOOLS} ${NG_BOWTIE}"
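
Before triggering a new task, it can help to confirm that its defaults file sets every variable from the list above. The following is a minimal sketch under the assumption that the file contains plain shell assignments; check_task_defaults is a hypothetical helper and not part of NGSANE.

```shell
#!/usr/bin/env bash
# Sketch only: check_task_defaults is a hypothetical helper, not part of NGSANE.
# It sources a defaults file and reports every expected variable that is empty.
check_task_defaults() {
    local file="$1" task="$2" prefix var value missing=0

    # Only accept plain identifiers, because the name is passed to eval below.
    case "$task" in
        ''|*[!A-Za-z0-9_]*) echo "invalid task name: $task" >&2; return 2 ;;
    esac

    . "$file" || return 1   # note: defines the variables in the current shell

    for prefix in TASK WALLTIME MEMORY CPU NODES INPUT MODULE; do
        var="${prefix}_${task}"
        eval "value=\${$var:-}"   # look up the variable whose name is in $var
        if [ -z "$value" ]; then
            echo "missing: $var" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the Bowtie example, check_task_defaults conf/header.d/bowtie BOWTIE would return 0 only if all seven variables are non-empty; the file name bowtie is assumed here. Because MODULE_BOWTIE is built from NG_ variables, the Software Modules definitions need to be loaded first for that check to be meaningful.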

Create a sample configuration file

To allow other users to use your new task without having to study your shell script (although they ought to), create a sample configuration file that lists all variables the task requires.

  1. There is a template_config file in the sampleConfigs folder that provides a skeleton. Copy or rename it to YourNewTask_config.txt.
  2. Open YourNewTask_config.txt with a text editor and populate it with mandatory and optional variables that you access in your YourNewTask.sh script.
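
As a rough idea of what such a file might contain, the fragment below switches the task on and overrides some resource defaults. It is a sketch with placeholder values: it assumes the configuration file is read after the defaults so that its assignments take precedence, and the variables your task actually needs depend on what YourNewTask.sh accesses.

```shell
# YourNewTask_config.txt (sketch; all values are placeholders)

# Switch the task on: trigger.sh runs the task only if this variable is non-empty.
RUNYourNewTask="1"

# Optional: override the resource defaults defined for the task.
WALLTIME_YourNewTask=10:00:00
MEMORY_YourNewTask=8
CPU_YourNewTask=2
NODES_YourNewTask="nodes=1:ppn=2"
```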

Voilà, the new task should now be accessible from trigger.sh.