Tune Juicer for Cheaha #2
Comments
Proposed changes are available in pull request #1.
The Juicer forum is a potential resource for customizing the SLURM support.
I've opened this issue requesting correction or clarification on running Juicer with test data.
Just for guidance: the SLURM version of the awk script that de-dups chimera reads splits the SAM data about every 1 million reads, at a known non-duplicate boundary. It checks whether any of the last 6 fields of the "cb" record differ from the prior record. If they do, the two records cannot be duplicates. It places all the reads up to that boundary in a file and submits a job to process those records. It repeats this step until all reads are submitted for de-duping, so the maximum time for dedup will be the time it takes to process 1 million records. The code should work fine, but we will need to improve the following line in that script: it has a hard-coded email address and host name, which should be driven by parameters.
juicer/SLURM/scripts/split_rmdups_sam.awk Line 98 in 290e443
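To make the splitting rule above concrete, here is a minimal Python sketch of the idea. It is not the awk script itself: the function name `split_at_safe_boundaries`, the whitespace-delimited record layout, and the `chunk_size` and `key_fields` parameters are assumptions chosen for illustration. It only shows the boundary logic (cut a chunk once it is full and the comparison fields change), not job submission.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def split_at_safe_boundaries(
    records: Iterable[str],
    chunk_size: int = 1_000_000,  # assumed threshold, matching the "1 million reads" above
    key_fields: int = 6,          # assumed: compare the last 6 whitespace-separated fields
) -> Iterator[List[str]]:
    """Yield chunks of records, cutting only where adjacent records cannot be duplicates."""
    chunk: List[str] = []
    prev_key: Optional[Tuple[str, ...]] = None
    for line in records:
        key = tuple(line.split()[-key_fields:])
        # Cut only when the chunk is full AND the key differs from the prior
        # record, so a run of potential duplicates is never split across chunks.
        if len(chunk) >= chunk_size and key != prev_key:
            yield chunk
            chunk = []
        chunk.append(line)
        prev_key = key
    if chunk:
        # Final partial chunk.
        yield chunk
```

In this sketch each yielded chunk would correspond to one file handed to a de-dup job. Because the cut waits for a key change, a chunk can exceed `chunk_size` when a long run of identical keys straddles the threshold.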
Is your feature request related to a problem? Please describe.
Juicer can't use the SLURM scheduler on Cheaha
Describe the solution you'd like
Run juicer.sh in a screen/tmux/byobu session on the login node and have all work submitted as jobs to the cluster.
Describe alternatives you've considered
Running on a single node, but that takes too long.
Additional context
We need to be able to demonstrate successful operation of juicer.sh on Cheaha. This requires customizing the Juicer environment to use the partitions and modules available on Cheaha, as described here:
https://github.com/aidenlab/juicer/wiki/Running-Juicer-on-a-cluster
This demonstration needs to include an example data set that can be run quickly but accurately reflects a full-scale run.
The sample data listed at the above wiki docs link is no longer available.