Running dnaapler
For all subcommands, dnaapler requires an input FASTA file, specified with -i or --input.
It is also highly recommended to specify an output directory with -o or --output; otherwise dnaapler will write the output to a directory named output.dnaapler by default.
You can change the prefix of the dnaapler output files with -p or --prefix.
You can run BLAST with multiple threads using -t or --threads, and change the BLAST e-value with -e or --evalue.
By default, dnaapler will not overwrite an output directory that already exists. To force an overwrite, use -f or --force.
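In a script, it can help to make the overwrite decision explicit rather than always passing -f. The following is a minimal sketch: the function name force_flag and the OVERWRITE environment variable are hypothetical helpers for illustration and are not part of dnaapler. Only the -f flag itself comes from the text above.

```shell
# Hypothetical helper (not part of dnaapler): print "-f" only when the
# output directory already exists AND the caller opted in with OVERWRITE=1.
force_flag() {
    outdir="$1"
    if [ ! -d "$outdir" ]; then
        # Nothing to overwrite, so no flag is needed.
        return 0
    fi
    if [ "${OVERWRITE:-0}" = "1" ]; then
        echo "-f"
        return 0
    fi
    echo "refusing to overwrite existing directory: $outdir" >&2
    return 1
}

# Usage sketch (paths are placeholders):
# flag=$(force_flag output_directory_path) &&
#     dnaapler chromosome -i input.fasta -o output_directory_path $flag
```

With this pattern, an existing directory is only overwritten when the caller has set OVERWRITE=1 deliberately.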
Finally, for the BLAST-based subcommands (chromosome, phage, plasmid, custom or all), dnaapler will by default exit with an error if no BLAST hit is found.
However, you can have dnaapler fall back automatically by passing -a or --autocomplete along with mystery or nearest, which will then run that subcommand to reorient your sequence.
You can also specify a seed value with --seed_value to ensure that dnaapler mystery (or autocompletion with -a mystery) is reproducible in workflows.
all
dnaapler all is designed to simultaneously orient multiple contigs that can be a mix of chromosomes, plasmids and phages. It will also work on a single contig.
If a contig has BLAST hits for both dnaA and terL or repA, dnaA will be chosen for reorientation.
If a contig has BLAST hits for both terL and repA (but not dnaA), repA will be chosen for reorientation.
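The two rules above amount to a fixed priority order of dnaA, then repA, then terL. The sketch below only illustrates that ordering; it is not dnaapler's implementation. The function name choose_gene is hypothetical, and it assumes that a contig with a single hit uses that gene and prints "none" when there are no hits.

```shell
# Illustrative only: pick the reorientation gene from a space-separated
# list of genes that had BLAST hits, using the priority dnaA > repA > terL.
choose_gene() {
    hits=" $1 "
    for gene in dnaA repA terL; do
        case "$hits" in
            # The first gene in priority order that appears in the list wins.
            *" $gene "*) echo "$gene"; return 0 ;;
        esac
    done
    # Assumption: report "none" when no known gene had a hit.
    echo "none"
}

choose_gene "terL dnaA"   # prints dnaA
choose_gene "terL repA"   # prints repA
```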
You can also specify a text file with --ignore that lists all contigs (based on their header) to be ignored during reorientation.
For example, the file (ignored_contigs.txt) needs to be formatted as follows:
contig_1
contig_2
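A file in this format can be derived from the FASTA headers themselves. The sketch below builds a small demo FASTA in a temporary directory and writes every contig name except contig_3 to ignored_contigs.txt. The demo sequences and file layout are made up for illustration. It also assumes that the contig name is the header text up to the first whitespace, without the leading ">", which should be checked against how dnaapler matches headers.

```shell
# Work in a temporary directory so nothing existing is touched.
workdir=$(mktemp -d)

# Demo input (hypothetical contents).
cat > "$workdir/input.fasta" <<'EOF'
>contig_1 length=12
ACGTACGTACGT
>contig_2 length=8
ACGTACGT
>contig_3 length=4
ACGT
EOF

# Extract contig names: header lines only, ">" removed, first token kept.
sed -n 's/^>\([^[:space:]]*\).*/\1/p' "$workdir/input.fasta" > "$workdir/all_contigs.txt"

# Ignore everything except contig_3 (one name per line).
grep -v -x 'contig_3' "$workdir/all_contigs.txt" > "$workdir/ignored_contigs.txt"

cat "$workdir/ignored_contigs.txt"
```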
Example usage to reorient a number of contigs in input.fasta, ignoring all contigs whose headers are listed in ignored_contigs.txt:
dnaapler all -i input.fasta -o output_directory_path -t 8 --ignore ignored_contigs.txt
Usage: dnaapler all [OPTIONS]
Reorients contigs to begin with any of dnaA, repA or terL
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
-e, --evalue TEXT e value for blastx [default: 1e-10]
--ignore PATH Text file listing contigs (one per row) that are to
be ignored
-a, --autocomplete TEXT Choose an option to autocomplete reorientation if
BLAST based approach fails. Must be one of: none,
mystery, largest, or nearest [default: none]
--seed_value INTEGER Rand
chromosome
Example usage with mystery as the autocomplete command, a random seed of 245 for reproducibility, and 8 threads for BLAST:
dnaapler chromosome -i input.fasta -o output_directory_path -p my_bacteria_name -t 8 -a mystery --seed_value 245
Usage: dnaapler chromosome [OPTIONS]
Reorients your genome to begin with the dnaA chromosomal replication
initiation gene
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
-e, --evalue TEXT e value for blastx [default: 1e-10]
-a, --autocomplete TEXT Choose an option to autocomplete reorientation if
BLAST based approach fails. Must be one of: none,
mystery or nearest [default: none]
--seed_value INTEGER Random seed to ensure reproducibility. [default:
13]
phage
Example usage with no autocomplete command:
dnaapler phage -i input.fasta -o output_directory_path -p my_phage_name -t 8
Usage: dnaapler phage [OPTIONS]
Reorients your genome to begin with the terL large terminase subunit
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
-e, --evalue TEXT e value for blastx [default: 1e-10]
-a, --autocomplete TEXT Choose an option to autocomplete reorientation if
BLAST based approach fails. Must be one of: none,
mystery or nearest [default: none]
--seed_value INTEGER Random seed to ensure reproducibility. [default:
13]
plasmid
Example usage with no autocomplete command:
dnaapler plasmid -i input.fasta -o output_directory_path -p my_plasmid_name -t 8
Usage: dnaapler plasmid [OPTIONS]
Reorients your genome to begin with the repA replication initiation gene
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
-e, --evalue TEXT e value for blastx [default: 1e-10]
-a, --autocomplete TEXT Choose an option to autocomplete reorientation if
BLAST based approach fails. Must be one of: none,
mystery or nearest [default: none]
--seed_value INTEGER Random seed to ensure reproducibility. [default:
13]
custom
To run dnaapler custom, you need to specify an amino acid FASTA file containing the desired custom database gene using -c or --custom_db.
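Because the custom database must contain amino acid sequences, a quick sanity check before running can catch a nucleotide FASTA passed by mistake. The following is a rough heuristic sketch, not something dnaapler provides: the function name looks_like_protein is hypothetical, and it simply checks whether any sequence line contains a character outside A, C, G, T, U and N. Nucleotide files that use other IUPAC ambiguity codes would be misclassified as protein.

```shell
# Heuristic (hypothetical helper): succeed if any non-header line contains
# a character that is not a plain nucleotide letter, "*" or "-".
looks_like_protein() {
    grep -v '^>' "$1" | grep -q '[^ACGTUNacgtun*-]'
}

# Usage sketch (custom_db.faa is a placeholder path):
# looks_like_protein custom_db.faa || echo "custom_db.faa looks like nucleotide sequence" >&2
```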
Example usage:
dnaapler custom -i input.fasta -o output_directory_path -p my_plasmid_name -t 8 -c custom_db.faa
Usage: dnaapler custom [OPTIONS]
Reorients your genome with a custom database
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
-e, --evalue TEXT e value for blastx [default: 1e-10]
-c, --custom_db PATH FASTA file with amino acids that will be used as a
custom blast database to reorient your sequence
however you want. [required]
-a, --autocomplete TEXT Choose an option to autocomplete reorientation if
BLAST based approach fails. Must be one of: none,
mystery or nearest [default: none]
--seed_value INTEGER Random seed to ensure reproducibility. [default:
13]
mystery
dnaapler mystery will reorient your genome to begin with a random coding sequence (CDS), as predicted by Pyrodigal.
Example usage:
dnaapler mystery -i input.fasta -o output_directory_path -t 8
Usage: dnaapler mystery [OPTIONS]
Reorients your genome with a random CDS
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
--seed_value INTEGER Random seed to ensure reproducibility. [default: 13]
nearest
dnaapler nearest will reorient your genome to begin with the first coding sequence (CDS), as predicted by Pyrodigal.
Example usage:
dnaapler nearest -i input.fasta -o output_directory_path -t 8
Usage: dnaapler nearest [OPTIONS]
Reorients your genome to begin with the first CDS as called by pyrodigal
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
largest
dnaapler largest will reorient your genome to begin with the largest coding sequence (CDS), as predicted by Pyrodigal.
Example usage:
dnaapler largest -i input.fasta -o output_directory_path -t 8
Usage: dnaapler largest [OPTIONS]
Reorients your genome to begin with the largest CDS as called by pyrodigal
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
bulk
dnaapler bulk is designed to simultaneously orient multiple genomes.
Use -m or --mode with chromosome, phage, plasmid or custom to tell dnaapler which mode to run. If omitted, it defaults to -m chromosome. Additionally, if you choose -m custom, you must also specify a custom database amino acid file using -c or --custom_db.
Your input FASTA must also have at least 2 contigs.
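The contig requirement can be checked up front by counting header lines in the input. This is a minimal sketch; the function name has_enough_contigs_for_bulk is hypothetical, and it assumes every contig starts with a line beginning with ">".

```shell
# Hypothetical pre-flight check: succeed only if the FASTA has >= 2 contigs.
has_enough_contigs_for_bulk() {
    # grep -c prints the number of header lines; "|| true" keeps a count
    # of 0 from being treated as a failure of the pipeline itself.
    n=$(grep -c '^>' "$1" || true)
    [ "${n:-0}" -ge 2 ]
}

# Usage sketch (input.fasta is a placeholder path):
# has_enough_contigs_for_bulk input.fasta &&
#     dnaapler bulk -i input.fasta -o output_directory_path -t 8 -m chromosome
```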
Example usage to reorient a number of bacterial chromosomes in input.fasta to begin with the dnaA gene:
dnaapler bulk -i input.fasta -o output_directory_path -t 8 -m chromosome
Usage: dnaapler bulk [OPTIONS]
Reorients multiple genomes to begin with the same gene
Options:
-h, --help Show this message and exit.
-V, --version Show the version and exit.
-i, --input PATH Path to input file in FASTA format [required]
-o, --output PATH Output directory [default: output.dnaapler]
-t, --threads INTEGER Number of threads to use with BLAST [default: 1]
-p, --prefix TEXT Prefix for output files [default: dnaapler]
-f, --force Force overwrites the output directory
-e, --evalue TEXT e value for blastx [default: 1e-10]
-m, --mode TEXT Choose a mode to reorient in bulk. Must be one of:
chromosome, plasmid, phage or custom [default:
chromosome]
-c, --custom_db PATH FASTA file with amino acids that will be used as a
custom blast database to reorient your sequence
however you want. Must be specified if -m custom is
specified.