Jonathan Trow edited this page Oct 7, 2021 · 46 revisions

Welcome to the sra-tools wiki!

ANNOUNCEMENTS:

SRA data are now available either with full base quality scores (SRA Normalized Format) or with simplified quality scores (SRA Lite), depending on user preference. Both formats can be streamed on demand to the same file types (fastq, sam, etc.), so both are compatible with existing workflows and applications that expect quality scores. However, the SRA Lite format is much smaller, which reduces storage footprint and data transfer times and allows dumps to complete more rapidly.

The SRA Toolkit defaults to the SRA Normalized Format, which includes full per-base quality scores, but users who do not require full base quality scores for their analysis can request the SRA Lite version to save time on data transfers. To request SRA Lite data when using the SRA Toolkit, set the "Prefer SRA Lite files with simplified base quality scores" option on the main page of the toolkit configuration. This instructs the tools to preferentially use the SRA Lite format when available (please be sure to use toolkit version 2.11.2 or later to access this feature).

The quality scores generated from SRA Lite files are the same for every base within a given read: 30 if the Read Filter flag is set to 'pass', or 3 if it is set to 'reject'. Data in the SRA Normalized Format with full base quality scores will continue to have a .sra file extension, while SRA Lite files have a .sralite file extension. For more information please see our data formats page.
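To make the simplified scores concrete, here is a minimal Python sketch of the quality line you might expect to see in FASTQ output generated from an SRA Lite file. It assumes the common Phred+33 ASCII encoding for FASTQ quality strings; the function name `sralite_quality_string` is hypothetical and is not part of the toolkit.

```python
# Assumption: FASTQ quality characters use the Phred+33 ASCII encoding.
PHRED_OFFSET = 33


def sralite_quality_string(read_length: int, passes_filter: bool) -> str:
    """Return a uniform quality string for one read (hypothetical helper).

    Every base gets the same score: 30 when the Read Filter flag is
    'pass', 3 when it is 'reject'.
    """
    score = 30 if passes_filter else 3
    return chr(score + PHRED_OFFSET) * read_length


print(sralite_quality_string(8, True))   # prints ????????
print(sralite_quality_string(8, False))  # prints $$$$$$$$
```

Under that encoding, a read that passes the filter carries a run of `?` characters and a rejected read carries a run of `$` characters, which is one quick way to recognize SRA Lite-derived output.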

SRA Toolkit 2.11.2 October 7, 2021

align, axf, sra-pileup, vdb, vfs: resolve reference sequences within output directory

cloud, kns, sra-tools: do not acquire CE more often than necessary

kget: renamed to vdb-get

kns, sra-tools: improved reporting of peer certificate information

kns, sra-tools: improved timeout management in CacheTeeFile

ncbi-vdb, ngs, ngs-tools, sra-tools: configure prints the version of compiler

prefetch: better control of reference sequences

prefetch: fixed failure when protected repository exists

prefetch, vdb, vfs: prefetch with "-O" will now correctly place references in output directory

prefetch, vfs: fixed error message 'multiple response SRR URLs for the same service...' when downloading

prefetch: will download any missed dependencies

prefetch: will not hang when failing to download dependencies

sratools: allow driver tool to handle debug output for modules

vdb-dump: using --info with URLs now works correctly

vfs, sra-tools: updated interaction with SRA Data Locator


With release 2.9.1 of sra-tools, we made available the tool fasterq-dump, a replacement for the much older fastq-dump tool. As its name implies, it runs faster, and it is better suited to the large-scale conversion of SRA objects into FASTQ files that is common at sites with enough disk space for temporary files. fasterq-dump is multi-threaded and performs joins in bulk, which improves performance compared to fastq-dump, which is single-threaded and performs joins on a per-record basis.

fastq-dump is still supported because it handles more corner cases than fasterq-dump, but it is likely to be deprecated in the future.

You can get more information about fasterq-dump in this Wiki at https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump.
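As a rough illustration of driving fasterq-dump from a script, the sketch below assembles a command line with an explicit thread count, output directory, and temporary directory. The helper name `build_fasterq_dump_command`, the accession `SRR000001`, and the directory names are placeholders chosen for this example; check `fasterq-dump --help` for the options available in your installed version.

```python
import shlex


def build_fasterq_dump_command(accession, outdir, threads=6, temp_dir=None):
    """Assemble a fasterq-dump argument list (hypothetical helper).

    accession: run accession to convert (placeholder value in this example)
    outdir:    directory that receives the FASTQ files
    threads:   number of worker threads to request
    temp_dir:  optional location for temporary files
    """
    cmd = ["fasterq-dump", accession, "--outdir", outdir, "--threads", str(threads)]
    if temp_dir is not None:
        cmd += ["--temp", temp_dir]
    return cmd


cmd = build_fasterq_dump_command("SRR000001", "fastq_out", threads=4, temp_dir="scratch")
# Print the command instead of running it; pass `cmd` to subprocess.run(cmd, check=True)
# on a machine where the toolkit is installed.
print(shlex.join(cmd))
```

Pointing the temporary directory at fast local storage with enough free space is the main tuning choice here, since the tool relies on temporary files during conversion.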