Does anyone know how to install uclust now that it's no longer available on Robert Edgar's website? #21

jcmcnch · 2021-02-09T19:44:41Z

I'm using pyNAST for a software pipeline, but am running into an issue. pyNAST depends on uclust, but it's no longer available - the links on the qiime1 install page and the pyNAST biocore page now redirect to USEARCH on Robert Edgar's website. Any suggestions @rcedgar @gregcaporaso @kylebittinger @jairideout ?

Thanks,
Jesse

rcedgar · 2021-02-09T19:52:30Z

Jesse -- What is the goal of the pipeline? If it's making NAST alignments the results are quite likely not very good and I'll bet there's a better way to approach the analysis. Robert.

jcmcnch · 2021-02-09T21:06:59Z

Hi Robert, thanks for your prompt reply.

The goal is to align SSU rRNA fragments extracted from metagenomes and metatranscriptomes to a stable reference sequence so I can then subset to regions containing binding sites for oligonucleotide primers. The ultimate goal is to compare those sequences with primers typically used for PCR amplicon generation, to evaluate how well the primers should work on a given environmental dataset (paper currently in review).

When I was constructing the pipeline I did a lot of manual inspection of the alignments, and as far as I could tell pyNAST did an acceptable job: some sequences got shifted by a few bases relative to the reference (e.g. E. coli 16S), but they all got aligned to roughly the right area of the SSU rRNA molecule. This was fine as a way of subsetting the pile of SSU rRNA I recovered into general regions of the molecule that I could then query for the oligo sequence.

I would be very open to improving the pipeline in the future, so if you have suggestions on how to get a better stable alignment or improve the pipeline in general I would be all ears. For the near term though, it would be fantastic if there was a way to access the older uclust executable, as we need a quick way to make this analysis reproducible in order to respond to peer reviews that are due in a few weeks. I have an older uclust executable on my system that still works, but my collaborators are having trouble running the pipeline without uclust, and it's not clear how to get this executable to them without violating your license agreement since it's no longer available on your website.

Any thoughts?

Best, Jesse
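To make the "query for the oligo sequence" step concrete, here is a minimal Python sketch of comparing a degenerate primer against an extracted fragment. It is an illustration only, not the code used in the pipeline discussed in this thread: the function names, the toy primer `ACGYTM`, and the toy fragment are all hypothetical, and it does ungapped matching only (no indels).

```python
# Minimal sketch: score a degenerate primer against every window of a fragment.
# All names and sequences here are hypothetical illustrations.

# Standard IUPAC nucleotide codes, mapped to the bases each code allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code.

    Ambiguous bases in the site (e.g. N) are counted as mismatches.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    site = site.upper().replace("U", "T")
    return sum(1 for p, s in zip(primer.upper(), site) if s not in IUPAC[p])


def best_site(primer, fragment):
    """Return (start, mismatches) of the first lowest-mismatch window.

    Returns None if the fragment is shorter than the primer.
    """
    best = None
    for start in range(len(fragment) - len(primer) + 1):
        window = fragment[start:start + len(primer)]
        mm = count_mismatches(primer, window)
        if best is None or mm < best[1]:
            best = (start, mm)
    return best


toy_primer = "ACGYTM"        # hypothetical degenerate primer
toy_fragment = "TTACGCTAGG"  # hypothetical SSU rRNA fragment
print(best_site(toy_primer, toy_fragment))
```

In a real analysis the reverse primer would be reverse-complemented first, and a tool that handles gaps and both strands would be preferable to this brute-force scan.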

rcedgar · 2021-02-09T21:15:36Z

Jesse -- go ahead and share with your collaborators, this message grants permission for an exception to the license.

A much better approach than NAST is to make individual pair-wise alignments of each metagenomic sequence (call it Q) to the top hit (call it T) in a large reference such as SILVA. The quality of the Q+T pair-wise alignment will usually be very good because the identity will be high, vastly better than a NAST alignment, which is riddled with inadvertent errors (introduced in making the reference) and deliberately inserted errors (to force the query into the pre-set number of columns).

The usearch_global command in usearch will do a good job of this; I recommend using -fulldp for this use-case -- it's slower but avoids occasional artifacts introduced by the default alignment algorithm. The search_oligodb command in usearch can be used to align an existing primer set to SILVA to get the coordinates for each primer in each reference sequence.

Hope this helps. Robert.
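As a rough illustration of the two commands named above, the following Python sketch assembles the corresponding usearch command lines. Several things in it are assumptions rather than anything stated in this thread: the file names, the 0.9 identity threshold, the limit of 2 primer mismatches, and the output fields. The option names follow the usearch command-line conventions as I understand them and should be checked against the documentation for the installed version before use.

```python
# Minimal sketch: build (but do not run) usearch command lines for the
# approach described above. File names and thresholds are hypothetical.
import shlex


def usearch_global_cmd(queries, reference, out_aln, min_id=0.9, exe="usearch"):
    """Align each query (Q) to its top hit (T) in the reference database."""
    return [
        exe, "-usearch_global", queries,
        "-db", reference,
        "-id", str(min_id),   # assumed minimum identity; tune for the data
        "-strand", "both",
        "-top_hit_only",      # report only the best hit per query
        "-fulldp",            # full dynamic programming, as recommended above
        "-alnout", out_aln,   # human-readable pairwise alignments
    ]


def search_oligodb_cmd(reference, primers, out_tsv, max_diffs=2, exe="usearch"):
    """Locate each primer in each reference sequence."""
    return [
        exe, "-search_oligodb", reference,
        "-db", primers,
        "-strand", "both",
        "-maxdiffs", str(max_diffs),  # assumed mismatch tolerance
        "-userout", out_tsv,
        # qlo/qhi give the primer coordinates within the reference sequence
        "-userfields", "query+target+qstrand+diffs+qlo+qhi",
    ]


align = usearch_global_cmd("fragments.fasta", "silva.fasta", "top_hits.aln")
locate = search_oligodb_cmd("silva.fasta", "primers.fasta", "primer_sites.tsv")
print(shlex.join(align))
print(shlex.join(locate))
```

Either list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files. Building the arguments as a list, rather than a single string, avoids shell-quoting problems with unusual file names.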

jcmcnch · 2021-02-09T22:39:54Z

Hi Robert, thanks a lot for your response. That makes sense, and it sounds like usearch does exactly what I want to do in a much more robust way than pyNAST. I only wish I had asked you a year ago when I was constructing the pipeline! I will definitely keep this in mind for future iterations, which I may get to soon if there is sufficient community interest.

Thank you also for agreeing that the executable can be shared with my collaborators. There may also be other people who wish to use the current iteration of my pipeline once the paper comes out (it's not pretty, but it works!). In this case, the easiest thing would be to host a copy of the uclust executable on my GitHub page, but I don't know whether you'd be comfortable with this. If you'd prefer it not to be public, then maybe I could offer to share it by email so it cannot be freely downloaded? Would either of these options be acceptable to you?

Best, Jesse

rcedgar · 2021-02-09T22:50:48Z

email sounds fine.

jcmcnch · 2021-02-10T08:31:35Z

Thanks Robert, I will share the executable by email with anyone who needs it.

jcmcnch closed this as completed Feb 10, 2021

tillrobin mentioned this issue Apr 28, 2022

Template filepath does not exist: iMGMC-16SrRNA-alignment.fasta tillrobin/iMGMC#9

Closed

jcmcnch mentioned this issue May 20, 2022

License for use of uclust binary jcmcnch/MGPrimerEval#3

Closed
