no markers found #2

Open
linzhi2013 opened this issue Oct 21, 2016 · 4 comments

linzhi2013 commented Oct 21, 2016

Dear developers of DOMINO,

I ran DOMINO on 20 samples (species), and it finished today. However, it printed the messages below, and I am not sure whether they are caused by my data or by the parameters I used. Could you please give me some hints? Thank you very much!

My data are RNA-seq data from 20 insect species. Do you think DOMINO can help me find markers in these data?

When I tested with three species, DOMINO did give me markers, which seemed good.

My parameters for the 20 samples (-taxa_names, -user_contig_files, and -user_cleanRead_files are not shown here) were:

-VD 0.01 -CL 40 -VL 400 -CD 1 -SLCD 1e-06 -mp 4 -p 12 -DM discovery -option user_assembly_contigs -type_input pair_end -o test

The output of directory 201610130157_DM_mapping/ looks good, and it contains files like ARRAY_files_taxa_spA_clean_filtered.profile/ and taxa_spB_clean_filtered.sorted.bam.

However, directories such as /201610130157_DM_markers/markers_Ref_spC/MSA_markers were empty. Could it be that my parameters were too strict, so that DOMINO could not find markers shared by all 20 samples?

DOMINO printed the following:

###########################################################################     ######################
############################# Clustering markers for unique results    #############################
#################################################################################################
+ Merging different DOMINO markers according to the taxa of reference...
+ Generate a BLAST database...
Generating the database failed when trying to proccess the file... DOMINO would not stop in this step...

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! ERROR !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Early termination of the DOMINO Marker Scan...

Try 'perl DOMINO/bin/DM_MarkerScan_v1.0.1.pl -h|--help or -man' for more inform
Exit program.


!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!


Please note that DOMINO could not find any markers for the parameters provided. Please Re-Run DOMINO using other parameters
@JFsanchezherrero
Member

Dear Guanliang,

Nice to hear from you again. I was wondering how you were doing and whether you had successfully finished your DOMINO experiment.

I have looked at what you sent me, and apparently everything is OK.

Just a couple of tips:

  • VD option: this is the level of variation expected within any pairwise comparison. Bear in mind that you are asking for markers with at least 1% variation in ALL pairwise comparisons among the 20 samples you entered as input. You could try setting the VD option to some value < 0 [e.g. -VD -1], so that you retrieve markers with some variation within any pairwise comparison and a cumulative variation for the whole tree.
  • Bear in mind that you are asking for markers with a fixed length for the Variable (VL) and Conserved (CL) regions in all 20 samples provided. You could try setting -MCT|minimum_number_taxa_covered to any value < 20. You should then find markers covering at least the MCT number of taxa provided, up to the maximum (20).
  • Once the mapping is done, if you are not changing any mapping-related parameter such as mp or SLCD, provide the option -NPG|No_Profile_Generation to speed up the computation: since the profiles of variation are already generated, the mapping and filtering will not be repeated. You must keep the folder and file names exactly as DOMINO named them in previous runs for the NPG option to work.
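
To illustrate why the first two tips matter, here is a toy Python sketch. It is not DOMINO's actual code: the function names and the three short sequences are made up for illustration only. With 20 taxa there are 190 pairwise comparisons, and a criterion that must hold for every pair fails as soon as one pair is too similar, whereas accepting a subset of the taxa can rescue the region.

```python
from itertools import combinations
from math import comb


def pairwise_divergence(a, b):
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def passes_all_pairs(seqs, min_div):
    """True only if EVERY pair of taxa differs by at least min_div."""
    return all(
        pairwise_divergence(seqs[s], seqs[t]) >= min_div
        for s, t in combinations(sorted(seqs), 2)
    )


# Number of pairwise comparisons for 20 taxa.
n_pairs = comb(20, 2)
print(n_pairs)  # 190

# Hypothetical toy alignment: spA and spC are identical in this region.
toy = {
    "spA": "ACGTACGTAC",
    "spB": "ACGTACGTAT",
    "spC": "ACGTACGTAC",
}

# Requiring 1% variation in all pairs fails because of the spA/spC pair.
all_taxa_ok = passes_all_pairs(toy, 0.01)

# Dropping one taxon (the idea behind accepting fewer taxa) rescues the region.
subset = {name: toy[name] for name in ("spA", "spB")}
subset_ok = passes_all_pairs(subset, 0.01)

print(all_taxa_ok, subset_ok)  # False True
```

The more taxa are required, the more pairs must all pass at once, which is why a threshold that works for three species can return nothing for 20.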

I can see it has taken you a while to map and check all the samples provided. We are now working on parallel processing of the markers, and I hope this will speed up the computation several times over. Although we already use threads, not all steps can be parallelised, so we are working on that. In any case, please check for new commits and new releases.

We are also working on the VL and CL variables, as we understand that obtaining a marker with a fixed number of bases can be quite tricky. We will make it possible to provide a range of minimum and maximum bases for each region. So, again, check for new commits and releases of the software.

Please take the tips above into account and re-run DOMINO to search for new markers.
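
As a rough sketch of such a re-run, the Python snippet below assembles a command line from the parameters posted in this thread, with the suggested changes applied. Several things here are assumptions rather than confirmed usage: the value 15 for -MCT is only an arbitrary example of "a value < 20", -NPG is assumed to be a bare switch that takes no value, and the script path is the shortened one from the error message. The -taxa_names, -user_contig_files, and -user_cleanRead_files options are left out, as in the original post, and would have to be added back.

```python
import shlex

# Parameters copied from the original run reported in this thread.
original = [
    "-VD", "0.01", "-CL", "40", "-VL", "400", "-CD", "1",
    "-SLCD", "1e-06", "-mp", "4", "-p", "12",
    "-DM", "discovery", "-option", "user_assembly_contigs",
    "-type_input", "pair_end", "-o", "test",
]


def set_option(args, flag, value):
    """Return a copy of args with flag set to value (replaced if present, else appended)."""
    out = list(args)
    if flag in out:
        out[out.index(flag) + 1] = value
    else:
        out += [flag, value]
    return out


relaxed = set_option(original, "-VD", "-1")   # value < 0, as suggested above
relaxed = set_option(relaxed, "-MCT", "15")   # hypothetical example of a value < 20
relaxed = relaxed + ["-NPG"]                  # assumption: a switch with no value

# Build a shell-safe command string; nothing is executed here.
command = shlex.join(["perl", "DOMINO/bin/DM_MarkerScan_v1.0.1.pl", *relaxed])
print(command)
```

Because the mapping parameters (-mp, -SLCD) are unchanged, this matches the case in which the existing profiles can be reused, provided the output folder and file names from the previous run are kept.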

Looking forward to hearing from you with good news and hundreds of markers!

Best wishes,

Jose F. Sanchez

@linzhi2013
Author

Thank you for your tips, Jose!
I will try different parameters and let you know when I have updates.

Best wishes,
Guanliang

@JFsanchezherrero
Member

Dear Guanliang,

We have released a new sub-version of DOMINO that includes multi-threading, and we have made it possible to provide a range of minimum and maximum bases for the conserved and variable regions. You may need to repeat the mapping.

Please give it a try and check how it works for you. It may need some debugging, so please let us know about any further bugs or comments.

Regards,

Jose F. Sanchez
