conda create -n GVA-prokka -c conda-forge -c bioconda -c defaults prokka
conda activate GVA-prokka
prokka --version
prokka --listdb
prokka --help
Note the somewhat novel '--listdb' call. Because prokka works largely by comparing sequences to reference databases, knowing which references it has access to is as important as having the program itself working. In such situations, the program and its associated databases may be updated independently.
prokka 1.14.6
Looking for databases in: /work2/01821/ded/stampede2/miniconda3/envs/GVA-prokka/db
* Kingdoms: Archaea Bacteria Mitochondria Viruses
* Genera: Enterococcus Escherichia Staphylococcus
* HMMs: HAMAP
* CMs: Archaea Bacteria Viruses

The help command should give a list of options you are familiar with by now.
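Because the databases matter as much as the program, it can help to check for a specific one before starting a run. The sketch below is one way to do that. `has_prokka_db` is a hypothetical helper name, not part of prokka, and it assumes the '--listdb' report uses the "* Category: names" line layout shown above. The `2>&1` is included on the assumption that prokka may write this report to standard error rather than standard output.

```shell
# Hypothetical helper (not part of prokka): reads a `prokka --listdb` report on
# stdin and succeeds if database NAME is listed under CATEGORY.
# Usage: ... | has_prokka_db CATEGORY NAME
has_prokka_db() {
  # Keep only the line for the requested category, then look for NAME as a whole word.
  grep -F -- "* $1:" | grep -qw -- "$2"
}

# Only query prokka if it is on the PATH (i.e. the conda environment is active).
if command -v prokka >/dev/null 2>&1; then
  if prokka --listdb 2>&1 | has_prokka_db Kingdoms Bacteria; then
    echo "Bacteria kingdom database found"
  else
    echo "Bacteria kingdom database NOT found" >&2
  fi
fi
```

The same helper works for the other categories, for example `has_prokka_db Genera Escherichia`.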
If you have already run the SPAdes tutorial for assembling full bacterial genomes from simulated reads, it is recommended that you use one or more of the resulting sets of assembled contigs.
The contigs.fa file corresponding to the "400_1500_3000" data set gives the highest-quality assembly, owing to its larger insert sizes and higher overall coverage.
mkdir $SCRATCH/GVA_Prokka
cd $SCRATCH/GVA_Prokka
cp ../GVA_SPAdes_tutorial
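Before annotating, a quick sanity check on the copied assembly can catch a missing or empty file. This is a minimal sketch, assuming the file is an ordinary FASTA in which every contig starts with a ">" header line. `count_contigs` is a hypothetical helper name, not something SPAdes or prokka provides.

```shell
# Hypothetical helper: print the number of sequences in a FASTA file,
# counted as lines that start with ">".
count_contigs() {
  if [ ! -s "$1" ]; then
    echo "error: '$1' is missing or empty" >&2
    return 1
  fi
  grep -c '^>' "$1"
}

# Report on contigs.fa only if a non-empty copy is in the current directory.
if [ -s contigs.fa ]; then
  echo "contigs.fa holds $(count_contigs contigs.fa) contigs"
fi
```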
Using the prokka --help command, what options seem particularly useful or important to you?
Options important for controlling files and program speed.
Options important for determining what predictions will be made.
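The full help text is long, so it can be easier to pull out only the options under consideration. The following is a minimal sketch. `show_options` is a hypothetical helper, and the `2>&1` assumes the help text may be written to standard error. The option names passed to it are ones prokka 1.14.6 accepts, but sorting them under the two headings above is an assumption of this sketch rather than the tutorial's own answer.

```shell
# Hypothetical helper: read help text on stdin and print only the lines that
# mention one of the given long options (names passed without the leading "--").
show_options() {
  # Join the names with "|" to build an alternation such as "--(cpus|outdir)".
  names=$(IFS='|'; printf '%s' "$*")
  # -w requires a whole-word match, so "--cpus" does not match a longer option name.
  grep -Ew -- "--(${names})"
}

# Only call prokka if it is on the PATH (i.e. the conda environment is active).
if command -v prokka >/dev/null 2>&1; then
  # Assumed grouping: options that control files and speed.
  prokka --help 2>&1 | show_options outdir prefix force cpus
  # Assumed grouping: options that shape which predictions are made.
  prokka --help 2>&1 | show_options kingdom proteins evalue coverage
fi
```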
For our example, we will leave proteins, evalue, and coverage all at their defaults, making our command rather simple.
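mkdir gene_annotations
# prokka stops with an error if --outdir already exists unless --force is given,
# so --force is added here to let it write into the directory created above.
prokka --outdir gene_annotations --prefix mygenome --force contigs.fa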