Arguments:
|
|
|
The basename of the genome index to be searched. The basename is the name of any of the index files up to but not including the first period.
Bowtie looks for the index files first in the current directory, then in the indexes
subdirectory under the directory where the
currently running bowtie executable is located,
and finally in the directory specified in the
BOWTIE_INDEXES
(or BOWTIE2_INDEXES) environment variable.
It is highly recommended that a FASTA file containing the sequence(s) of the genome being indexed be present in the same
directory as the Bowtie index files, named after the index basename with a .fa extension. If it is not present, TopHat will automatically
rebuild this FASTA file from the Bowtie index files.
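
The lookup order described above can be sketched roughly as follows. This is only an illustration, not Bowtie's or TopHat's actual code: the function name `find_index` and the `executable_dir` parameter are hypothetical, and the sketch assumes an index is recognised by its first file (`.1.bt2` for Bowtie2, `.1.ebwt` for Bowtie1).

```python
import os
from pathlib import Path


def find_index(basename, executable_dir, suffixes=(".1.bt2", ".1.ebwt")):
    """Return the first directory holding an index file for `basename`, or None.

    Hypothetical helper: mirrors the documented search order only.
    """
    # 1. current directory, 2. "indexes" next to the bowtie executable
    candidates = [Path.cwd(), Path(executable_dir) / "indexes"]
    # 3. directory named by the environment variable, if it is set
    for var in ("BOWTIE_INDEXES", "BOWTIE2_INDEXES"):
        value = os.environ.get(var)
        if value:
            candidates.append(Path(value))
    for directory in candidates:
        if any((directory / (basename + suffix)).is_file() for suffix in suffixes):
            return directory
    return None
```

Checking with such a helper before launching a long run may catch a misplaced index early.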
|
|
A comma-separated list of files containing reads in FASTQ or FASTA format.
When running TopHat with paired-end reads, this should be the *_1 ("left")
set of files.
|
<[reads1_2,...readsN_2]>
|
A comma-separated list of files containing reads in FASTQ or FASTA format.
Only used when running TopHat with paired-end reads; it contains the
*_2 ("right") set of files. The *_2 files MUST appear
in the same order as the *_1 files.
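
One way to keep the two lists in the same order is to build both from a single list of pairs. A minimal sketch, assuming the file names below (which are made up for illustration):

```python
def build_read_args(pairs):
    """Turn [(left_file, right_file), ...] into the two comma-separated arguments."""
    lefts = [left for left, _ in pairs]
    rights = [right for _, right in pairs]
    # Both strings are derived from the same list, so the order always matches.
    return ",".join(lefts), ",".join(rights)


# Hypothetical file names
left_arg, right_arg = build_read_args([
    ("laneA_1.fq", "laneA_2.fq"),
    ("laneB_1.fq", "laneB_2.fq"),
])
print(left_arg)   # laneA_1.fq,laneB_1.fq
print(right_arg)  # laneA_2.fq,laneB_2.fq
```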
|
Options:
|
|
-h/--help
|
Prints the help message and exits
|
-v/--version
|
Prints the TopHat version number and exits
|
-N/--read-mismatches
|
Final read alignments with more than this many mismatches are discarded.
The default is 2.
|
--read-gap-length
|
Final read alignments whose gaps total more than this length are discarded.
The default is 2.
|
--read-edit-dist
|
Final read alignments with an edit distance greater than this value are discarded.
The default is 2.
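
The three thresholds above (-N/--read-mismatches, --read-gap-length, --read-edit-dist) act together as a filter on final alignments. The sketch below is an illustration only: the `Alignment` class and `keep` function are hypothetical, and it ASSUMES that edit distance is the number of mismatches plus the total gap length, which the text above does not state.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    mismatches: int
    gap_length: int  # total length of all gaps in the alignment

    @property
    def edit_distance(self):
        # Assumption for this sketch: mismatches plus total gap length
        return self.mismatches + self.gap_length


def keep(aln, read_mismatches=2, read_gap_length=2, read_edit_dist=2):
    """True if the alignment exceeds none of the three thresholds."""
    return (aln.mismatches <= read_mismatches
            and aln.gap_length <= read_gap_length
            and aln.edit_distance <= read_edit_dist)


print(keep(Alignment(mismatches=2, gap_length=0)))  # True
print(keep(Alignment(mismatches=1, gap_length=2)))  # False: edit distance is 3
```

Under this assumption, raising only one of the first two options has no effect unless --read-edit-dist is raised as well.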
|
--read-realign-edit-dist
|
Some reads spanning multiple exons may be mapped incorrectly as a
contiguous alignment to the genome even though the correct alignment
is a spliced one. This can happen in the presence of processed
pseudogenes that are rarely (if at all)
transcribed or expressed. This option directs TopHat to re-align
reads for which the edit distance of an alignment obtained in a previous
mapping step is greater than or equal to
this option value. If you set this option to 0, TopHat will map
every read in all the mapping steps (transcriptome if you provided gene
annotations,
genome, and finally splice variants detected by TopHat), reporting the
best possible alignment found in any of these mapping steps.
This may greatly increase the mapping accuracy at the expense of an increase in running time.
The default value for this option is set such that TopHat will not try to realign reads already mapped in earlier steps.
|
--bowtie1
|
Uses Bowtie1 instead of Bowtie2.
If you use colorspace reads, you need to use this option
as Bowtie2 does not support colorspace reads.
|
-o/--output-dir
|
Sets the name of the directory in which TopHat will write all of its
output. The default is "./tophat_out".
|
-r/--mate-inner-dist
|
This is the expected (mean) inner distance between mate pairs. For
example, for paired-end runs with fragments selected at 300bp, where each
end is 50bp, you should set -r to 200. The default is 50bp.
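
The arithmetic behind this example is simply the fragment length minus both read lengths. A small sketch (the function name is hypothetical):

```python
def mate_inner_dist(fragment_length, read_length):
    """Expected inner distance: fragment length minus the two sequenced ends."""
    return fragment_length - 2 * read_length


print(mate_inner_dist(300, 50))  # 200
```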
|
--mate-std-dev
|
The standard deviation of the distribution of inner distances between
mate pairs. The default is 20bp.
|
-a/--min-anchor-length
|
The "anchor length". TopHat will report junctions spanned by reads
with at least this many bases on each side of the junction. Note that
individual spliced alignments may span a junction with fewer than this
many bases on one side. However, every junction involved in spliced
alignments is supported by at least one read with this many bases on each
side. This must be at least 3 and the default is 8.
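
The rule above says a junction is reported if at least one read spans it with enough bases on both sides, even if other reads across the same junction have shorter anchors. A rough sketch of that rule, using a made-up input format of (junction, left bases, right bases) per spliced read; it is not TopHat's implementation:

```python
def reported_junctions(spans, min_anchor=8):
    """Return junctions supported by at least one read with >= min_anchor bases per side.

    `spans` is a hypothetical list of (junction_id, left_bases, right_bases) tuples.
    """
    return {junction for junction, left, right in spans
            if min(left, right) >= min_anchor}


spans = [
    ("J1", 5, 45),   # short anchor on the left side
    ("J1", 20, 30),  # sufficient on both sides, so J1 is reported
    ("J2", 47, 3),   # J2 has no read with a sufficient anchor on both sides
]
print(reported_junctions(spans))  # {'J1'}
```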
|
-m/--splice-mismatches
|
The maximum number of mismatches that may appear in the "anchor" region
of a spliced alignment. The default is 0.
|
-i/--min-intron-length
|
The minimum intron length. TopHat will ignore donor/acceptor pairs
closer than this many bases apart. The default is 70.
|
-I/--max-intron-length
|
The maximum intron length. When searching for junctions ab initio,
TopHat will ignore donor/acceptor pairs farther than this many bases
apart, except when such a pair is supported by a split segment alignment
of a long read. The default is 500000.
|
--max-insertion-length
|
The maximum insertion length. The default is 3.
|
--max-deletion-length
|
The maximum deletion length. The default is 3.
|
--solexa-quals
|
Use the Solexa scale for quality values in FASTQ files.
|
--solexa1.3-quals
|
As of the Illumina GA pipeline version 1.3, quality scores are encoded
as Phred-scaled values with an ASCII offset of 64 (Phred+64). Use this option for FASTQ files from pipeline 1.3 or later.
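
To see what this encoding means in practice, the sketch below decodes a quality string by subtracting the ASCII offset. The function is hypothetical and only illustrates the encoding; the offset of 33 in the second call is the Sanger-style encoding, shown for comparison.

```python
def decode_quals(qual_string, offset=64):
    """Convert an ASCII-encoded FASTQ quality string to integer scores."""
    return [ord(ch) - offset for ch in qual_string]


print(decode_quals("hh@"))      # [40, 40, 0]  (offset 64)
print(decode_quals("II!", 33))  # [40, 40, 0]  (offset 33, for comparison)
```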
|
-Q/--quals
|
Indicates that quality values are supplied in separate files; colorspace read files (CSFASTA) come with separate qual files.
|
--integer-quals
|
Quality values are space-delimited integer values; this becomes the default when you specify -C/--color.
|
-C/--color
|
Indicates colorspace reads. This mode uses a colorspace Bowtie index and requires Bowtie 0.12.6 or higher.
Common usage: tophat --color --quals [other options]*
[reads1_2,...readsN_2]
[quals1_2,...qualsN_2]
|
-p/--num-threads
|
Use this many threads to align reads. The default is 1.
|
-g/--max-multihits
|
Instructs TopHat to allow up to this many alignments to the reference
for a given read, and to choose among them by alignment
score if there are more than this number.
The default is 20 for read mapping. Unless you use
--report-secondary-alignments, TopHat will report only the alignments with
the best alignment score.
If more alignments share that score than this number,
TopHat will randomly report only this many of them.
With --report-secondary-alignments, TopHat will try to
report up to this many alignments, and may randomly
output some of the alignments with the same score to meet this number.
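
The selection behavior described above can be sketched as follows. This is an illustration of the documented rules, not TopHat's actual code: the function name and the (id, score) input format are hypothetical, and the sketch assumes that a higher score is better.

```python
import random


def select_alignments(scored, max_multihits=20, report_secondary=False, rng=random):
    """Pick alignments to report from a list of (alignment_id, score) tuples."""
    chosen = []
    # Walk through the score tiers from best to worst.
    for score in sorted({s for _, s in scored}, reverse=True):
        room = max_multihits - len(chosen)
        if room <= 0:
            break
        tier = [aln for aln in scored if aln[1] == score]
        if len(tier) > room:
            # Too many alignments share this score: keep a random subset.
            tier = rng.sample(tier, room)
        chosen.extend(tier)
        if not report_secondary:
            break  # only the best-scoring tier is reported by default
    return chosen


hits = [("a", 10), ("b", 10), ("c", 8)]
print(select_alignments(hits, max_multihits=5))                         # [('a', 10), ('b', 10)]
print(select_alignments(hits, max_multihits=5, report_secondary=True))  # [('a', 10), ('b', 10), ('c', 8)]
```

With `max_multihits=1`, the same input would yield one of the two best-scoring alignments, chosen at random.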
|
--report-secondary-alignments
|
By default, TopHat reports the best or primary alignments based on alignment scores (AS). Use this option
if you want to output additional or secondary alignments. Up to
20 alignments will be reported this way; this limit can be changed by
using the -g/--max-multihits option above.
|
--no-discordant
|
For paired reads, report only concordant mappings.
|
--no-mixed
|
For paired reads, only report read alignments
if both reads in a pair can be mapped (by default, if TopHat cannot find
a concordant or discordant alignment for both reads in a pair, it will find and report
alignments for each read separately; this option disables that
behavior).
|
--no-coverage-search
|
Disables the coverage-based search for junctions.
|
--coverage-search
|
Enables the coverage-based search for junctions. Use this option for maximum sensitivity when coverage search
is disabled by default (such as for reads 75bp or longer).
|
--microexon-search
|
With this option, the pipeline will attempt to find alignments incident
to micro-exons. Works only for reads 50bp or longer.
|
--library-type
|
The default is unstranded (fr-unstranded). If either fr-firststrand or fr-secondstrand is specified, every read alignment will have an XS attribute tag as explained below. Consider supplying library type options below to select the correct RNA-seq protocol.
|