KeyError: 'Target_Species' #12
Comments
Hi @larsmoret |
(checker) 130 lmoret@ubuntudesktopc:~/data/volume_2$ ls compleasmoutput/CBS1922/fungi_odb10/ Total file size is: with per file: |
**Hello, I am running into the same issue as @larsmoret. Attached is my submission script.** Here are the contents of the "arthropoda_odb10" directory: -rw-r--r-- 1 kcd88651 tcglab 9676547 Nov 4 17:49 miniprot_output.gff This is my error output: Traceback (most recent call last): The above exception was the direct cause of the following exception: Traceback (most recent call last): |
Thanks for providing the script. Could you specify a different output folder name for each input assembly, instead of using "$D2" for all the assemblies? |
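The suggestion above, one output folder per input assembly, could be sketched roughly as follows. This is only an illustration: `build_commands` is a hypothetical helper, not part of compleasm, and the only flags used are the ones that appear in the commands quoted in this thread (`run -a -o -l -t`). It builds the command lists without executing anything.

```python
from pathlib import Path


def build_commands(assemblies, out_root, lineage, threads):
    """Build one compleasm command per assembly, each with its own output folder.

    Hypothetical helper for illustration only.
    """
    commands = []
    for assembly in assemblies:
        # Name the output folder after the assembly file (without extension)
        # so that two runs never write into the same directory.
        out_dir = Path(out_root) / Path(assembly).stem
        commands.append([
            "compleasm", "run",
            "-a", str(assembly),
            "-o", str(out_dir),
            "-l", lineage,
            "-t", str(threads),
        ])
    return commands


cmds = build_commands(
    ["finalassemblies/CBS1922.fasta", "finalassemblies/CBS.fasta"],
    "compleasmoutput",
    "fungi",
    14,
)
for cmd in cmds:
    print(" ".join(cmd))
```

Each list could then be passed to `subprocess.run`. Note that two assemblies with the same file stem in different directories would still collide under this naming scheme.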
Hi @larsmoret @katiecdillon, I have added some checks in the code to help understand what went wrong. The reason for the KeyError "Target_species" is that there are no candidate alignment hits satisfying the BUSCO threshold. Could you clone the source code and re-run the failed case in the existing compleasm env? e.g.
Thanks! |
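For readers wondering how "no candidate hits" turns into a KeyError, here is a minimal sketch. It assumes the filtered records are collected in a list and turned into a pandas DataFrame; that is a guess about the internals, not the actual compleasm code, and `unique_target_species` is a hypothetical helper. An empty list yields a DataFrame with no columns at all, so looking up "Target_species" fails exactly as in the traceback.

```python
import pandas as pd


def unique_target_species(records):
    """Return the unique Target_species values, or [] if no hits survived filtering.

    Hypothetical helper; assumes `records` is a list of dicts.
    """
    records_df = pd.DataFrame(records)
    # A DataFrame built from an empty list has no columns, so check first
    # instead of indexing blindly.
    if "Target_species" not in records_df.columns:
        return []
    return list(records_df["Target_species"].unique())


# Unguarded lookup on an empty DataFrame raises KeyError.
try:
    pd.DataFrame([])["Target_species"]
    raised = False
except KeyError:
    raised = True

print(raised)
print(unique_target_species([]))
```

A guard like this only replaces the crash with an empty result. The underlying question of why no hits passed the threshold still has to be answered from the miniprot output.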
I've tried it, and now it loads fungi_odb10 but cannot build the index. Thanks in advance, (checker) 2 lmoret@ubuntudesktopc:~/data/volume_2/compleasm$ compleasm run -a ~/finalassemblies/CBS1922.fasta -l fungi -o ~/compleasmoutput/ -t 14 |
To @larsmoret The error "failed to open/build the index" is reported by miniprot. You can test the alignment manually with "miniprot --trans -u -I --outs=0.95 -t 20 --gff ~/finalassemblies/CBS1922.fasta mb_downloads/fungi_odb10/refseq_db.faa.gz > out.gff". I guess the problem occurs while creating the genome index. |
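After running the manual miniprot test above, one quick way to see whether any alignments were produced is to count gene-model lines in `out.gff`. The sketch below assumes the output follows the usual tab-separated GFF layout with the feature type in the third column and alignments reported as `mRNA` features. `count_gff_features` is a hypothetical helper and the example lines are made up for illustration, not taken from a real run.

```python
def count_gff_features(lines, feature="mRNA"):
    """Count GFF lines whose third column equals `feature`.

    Comment lines (starting with '#') and blank lines are skipped.
    """
    count = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[2] == feature:
            count += 1
    return count


# Made-up example lines, for illustration only.
example = [
    "##gff-version 3\n",
    "contig_1\tminiprot\tmRNA\t100\t900\t.\t+\t.\tID=MP000001\n",
    "contig_1\tminiprot\tCDS\t100\t900\t.\t+\t0\tParent=MP000001\n",
]
n = count_gff_features(example)
print(n)
```

On a real file, pass an open file handle, e.g. `count_gff_features(open("out.gff"))`. A count of zero would suggest that miniprot produced no alignments, or failed before writing any.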
Hello @huangnengCSU it looks like the output directory was in fact the issue. Thank you! |
Hi @huangnengCSU, Kind regards, [M::main] CMD: /data/volume_2/compleasm_kit/miniprot --trans -u -I --outs=0.95 -t 14 --gff finalassemblies/CBS.fasta mb_downloads/eukaryota_odb10/refseq_db.faa.gz S:0.00%, 0 Download lineage: 0.00(s)Run miniprot: 72.29(s)Analyze miniprot: 46.34(s)Total runtime: 118.63(s) |
Hi @larsmoret, All BUSCO genes are reported as missing because no gene could be aligned to the assembly and pass BUSCO's threshold, which means the reference genes are quite different from the assembly. This may be due to the quality of the assembly or to choosing the wrong lineage file. Also, if the assembly is highly divergent, miniprot may not align well. Did you try BUSCO, and if so, what was its assessment result? |
Dear all,
I must say, I am quite intrigued by how it compares to BUSCO.
However, I came across an error while trying to run it, and I have no idea where to look.
When I run compleasm, it suddenly stops and displays KeyError: 'Target_species'.
Has anyone had the same issue, or any idea where the problem might be?
Thanks in advance,
Lars Moret
P.S.
This is my entire log. Please note that I installed compleasm using conda.
(checker) lmoret@ubuntudesktopc:/data/volume_2$ compleasm run -a finalassemblies/CBS1922.fasta -o compleasmoutput/CBS1922 -l fungi -t 14
Searching for miniprot in the path where compleasm.py is located
Searching for miniprot in the current execution path
Searching for hmmsearch in the path where compleasm.py is located
Searching for hmmsearch in the current execution path
miniprot execute command:
/data/volume_2/compleasm_kit/miniprot
lineage: fungi_odb10
hmmsearch execute command:
/data/volume_2/compleasm_kit/hmmsearch
Traceback (most recent call last):
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
return self._engine.get_loc(casted_key)
File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Target_species'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lmoret/miniconda3/envs/checker/bin/compleasm", line 10, in <module>
sys.exit(main())
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/compleasm.py", line 2534, in main
args.func(args)
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/compleasm.py", line 2426, in run
mr.Run()
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/compleasm.py", line 2142, in Run
miniprot_alignment_parser.Run()
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/compleasm.py", line 1158, in Run
self.Run_busco_mode()
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/compleasm.py", line 1234, in Run_busco_mode
filtered_species = records_df["Target_species"].unique()
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/pandas/core/frame.py", line 3458, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/lmoret/miniconda3/envs/checker/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
raise KeyError(key) from err
KeyError: 'Target_species'
(checker) 1 lmoret@ubuntudesktopc:/data/volume_2$