Commit

Merge branch 'main' of github.com:zy-optimistic/GAEP into main
zy-optimistic committed Aug 23, 2023
2 parents a41cc48 + b67b113 commit 847199f
Showing 2 changed files with 34 additions and 7 deletions.
14 changes: 7 additions & 7 deletions README.md
@@ -15,13 +15,13 @@ cd GAEP
```
gaep <command> [options]
pipe let GAEP determine which module to run based on the input data
stat report genome basic information
macc base accuracy based on reads mapping
kacc base accuracy based on K-mer
bkp misassembly breakpoints detected
snvcov SNV-coverage dot plot
busco run busco v5
pipe (NGS,TGS,trans) let GAEP determine which module to run based on the input data
stat report genome basic information
macc (NGS) base accuracy based on reads mapping
kacc (NGS) base accuracy based on K-mer
bkp (TGS) misassembly breakpoints detected
snvcov (NGS,TGS) SNV-coverage dot plot
busco run busco v5
```

## Running example
27 changes: 27 additions & 0 deletions test/simulate/README.md
@@ -3,6 +3,11 @@ These scripts are used to simulate the misassemblies using a template genome.

# Running
```bash
#First, we recommend removing contigs shorter than 500 kb from template.fasta and splitting template.fasta at runs of Ns.

#Index
samtools faidx template.fasta

#Randomly generate the positions of misassemblies
perl simu_misassembly_posi.pl template.fasta > position.txt

@@ -11,4 +16,26 @@ sort -k1V -k2n position.txt | perl move_redundant.pl > position_redun.txt

#Introduce misassemblies by positions. Two FASTA files will be output: one for the reference and one for simulation.
perl simu_misassembly.pl template.fasta

mv *ref.fasta template.ref.fasta
mv *simu.fasta template_simu.fasta

#PacBio reads simulation
pbsim template.ref.fasta --prefix simu_pb --depth 50 --length-min 5000 --length-max 50000 --hmm_model XXX/P6C4.model --length-mean 20000

#template_simu.fasta is the final assembly with simulated misassemblies, and template.ref.fasta is the reference.
```
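
The recommendation to drop short contigs before simulating is not accompanied by a command. The following is a minimal sketch of one way to do it with plain `awk`, not the authors' method. The function name `filter_fasta_by_length` and the `demo*.fasta` file names are hypothetical, and it does not cover splitting the template at Ns.

```bash
# Hypothetical helper (not part of GAEP): print only the FASTA records
# whose total sequence length is at least min_len.
filter_fasta_by_length() {
    local min_len="$1" fasta="$2"
    awk -v min="$min_len" '
        # Pass 1 (first read of the file): sum the sequence length under each header.
        NR == FNR { if (/^>/) name = $0; else len[name] += length($0); next }
        # Pass 2 (second read): decide at each header whether to keep the record.
        /^>/ { keep = (len[$0] >= min) }
        # Print header and sequence lines of kept records only.
        keep
    ' "$fasta" "$fasta"
}

# Tiny demonstration: one 12 bp record and one 3 bp record, threshold 10 bp.
printf '>long\nACGTACGT\nACGT\n>short\nACG\n' > demo.fasta
filter_fasta_by_length 10 demo.fasta > demo.filtered.fasta
cat demo.filtered.fasta
```

On a real template this might be called as `filter_fasta_by_length 500000 template.fasta > template.filtered.fasta`, assuming "500k" means 500,000 bp. The file is read twice so that whole contigs never need to be held in memory.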

# Output
The positions of the simulated misassemblies are written to STDOUT.
```
#Format:
No strand contig type length start end
41 1 chr1 ins 48762 14640630 14640631
41 2 chr1 ins 48762 14697478 14746240
#For strand:
#1: Misassemblies corresponding to positions in the reference sequence.
#2: Misassembly positions in the simulated sequence.
```
Typically, keeping only the rows with a strand value of 2 gives the positions of the misassemblies in the final simulated assembly.
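
Selecting the strand-2 rows can be sketched with `awk`, assuming the STDOUT was redirected to a file. The file names `misassembly_positions.txt` and `misassembly_positions.simu.txt` are hypothetical, and the sample rows are copied from the format example above.

```bash
# Hypothetical file holding the redirected STDOUT of the simulation step.
cat > misassembly_positions.txt <<'EOF'
No strand contig type length start end
41 1 chr1 ins 48762 14640630 14640631
41 2 chr1 ins 48762 14697478 14746240
EOF

# Keep only rows whose second column (strand) equals 2.
# A header line, if present, is dropped because "strand" does not equal 2.
awk '$2 == 2' misassembly_positions.txt > misassembly_positions.simu.txt
cat misassembly_positions.simu.txt
```

Because `awk` splits on any run of whitespace by default, the same filter should work whether the columns are separated by spaces or tabs.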
