Category
Mapping
Usage
STAR --runMode genomeGenerate --option1-name option1-value(s) ...
Manual
This document is generated with STAR 2.7.1a.
This command generates the genome indexes that STAR uses to align reads to the genome.
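As a rough illustration of the usage line above, the following Python sketch assembles a genome-generation command from options documented on this page. The helper name `build_genome_generate_cmd`, the file paths, and the thread count are hypothetical; the sketch only builds and prints the command and does not run STAR.

```python
import shlex


def build_genome_generate_cmd(genome_dir, fasta_files, threads=1,
                              gtf_file=None, sjdb_overhang=None):
    """Return a STAR genomeGenerate command as an argument list."""
    cmd = [
        "STAR",
        "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        # Multiple FASTA files are passed as separate, space-separated values.
        "--genomeFastaFiles", *fasta_files,
    ]
    if gtf_file is not None:
        cmd += ["--sjdbGTFfile", gtf_file]
    if sjdb_overhang is not None:
        cmd += ["--sjdbOverhang", str(sjdb_overhang)]
    return cmd


# Hypothetical paths and values, for illustration only.
cmd = build_genome_generate_cmd(
    genome_dir="./GenomeDir/",
    fasta_files=["chr1.fa", "chr2.fa"],
    threads=8,
    gtf_file="annotation.gtf",
    sjdb_overhang=99,
)
print(shlex.join(cmd))
```

Keeping the command as a list (rather than one string) avoids quoting problems if it is later passed to `subprocess.run`.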
Parameters
Run Parameters
- --runThreadN int: number of threads to run STAR. (default: 1)
- --runDirPerm string: permissions for the directories created at run time:
  - User_RWX: user-read/write/execute (default)
  - All_RWX: all-read/write/execute (same as chmod 777)
- --runRNGseed int: random number generator seed. (default: 777)
Genome Parameters
- --genomeDir string: path to the directory where genome files are stored. (default: ./GenomeDir/)
- --genomeFastaFiles string(s): path(s) to the FASTA files with the genome sequences, separated by spaces. These files should be plain-text FASTA files; they *cannot* be zipped. Required for the genome generation (--runMode genomeGenerate). Can also be used in the mapping (--runMode alignReads) to add extra (new) sequences to the genome (e.g. spike-ins). (default: -)
- --genomeConsensusFile string: VCF file with consensus SNPs (i.e. alternative allele is the major (AF>0.5) allele). Deprecated since 2.7.7a. Use --genomeTransformVCF and --genomeTransformType options instead.
Genome Indexing Parameters
- --genomeChrBinNbits int: =log2(chrBin), where chrBin is the size of the bins for genome storage: each chromosome will occupy an integer number of bins. For a genome with a large number of contigs, it is recommended to scale this parameter as min(18, log2[max(GenomeLength/NumberOfReferences,ReadLength)]). (default: 18)
- --genomeSAindexNbases int: length (bases) of the SA pre-indexing string. Typically between 10 and 15. Longer strings will use much more memory, but allow faster searches. For small genomes, the parameter --genomeSAindexNbases must be scaled down to min(14, log2(GenomeLength)/2 - 1). (default: 14)
- --genomeSAsparseD int>0: suffix array sparsity, i.e. distance between indices: use bigger numbers to decrease needed RAM at the cost of mapping speed reduction. (default: 1)
- --genomeSuffixLengthMax int: maximum length of the suffixes, has to be longer than read length. -1 = infinite. (default: -1)
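The two scaling formulas above can be evaluated with a short Python sketch. The function names are hypothetical, and because the formulas yield real numbers while both options take integers, rounding to the nearest integer is an assumption made here rather than something this page specifies.

```python
import math


def suggest_chr_bin_nbits(genome_length, n_references, read_length):
    """min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength)))."""
    per_reference = max(genome_length / n_references, read_length)
    # Rounding to the nearest integer is an assumption of this sketch.
    return int(round(min(18.0, math.log2(per_reference))))


def suggest_sa_index_nbases(genome_length):
    """min(14, log2(GenomeLength)/2 - 1)."""
    # Rounding to the nearest integer is an assumption of this sketch.
    return int(round(min(14.0, math.log2(genome_length) / 2 - 1)))


# A 3 Gb assembly split into 100,000 scaffolds, with 100-base reads.
print(suggest_chr_bin_nbits(3_000_000_000, 100_000, 100))
# A small 100 kb genome.
print(suggest_sa_index_nbases(100_000))
```

The resulting values would be passed as --genomeChrBinNbits and --genomeSAindexNbases at genome generation.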
Splice Junctions Database
- --sjdbFileChrStartEnd string(s): path to the files with genomic coordinates (chr start end strand) for the splice junction introns. Multiple files can be supplied and will be concatenated. (default: -)
- --sjdbGTFfile string: path to the GTF file with annotations. (default: -)
- --sjdbGTFchrPrefix string: prefix for chromosome names in a GTF file (e.g. 'chr' for using ENSEMBL annotations with UCSC genomes). (default: -)
- --sjdbGTFfeatureExon string: feature type in GTF file to be used as exons for building transcripts. (default: exon)
- --sjdbGTFtagExonParentTranscript string: GTF attribute name for parent transcript ID (default "transcript_id" works for GTF files) (default: transcript_id)
- --sjdbGTFtagExonParentGene string: GTF attribute name for parent gene ID (default "gene_id" works for GTF files) (default: gene_id)
- --sjdbGTFtagExonParentGeneName string(s): GTF attribute name for parent gene name. (default: gene_name)
- --sjdbGTFtagExonParentGeneType string(s): GTF attribute name for parent gene type. (default: gene_type gene_biotype)
- --sjdbOverhang int>0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate_length - 1). (default: 100)
- --sjdbScore int: extra alignment score for alignments that cross database junctions. (default: 2)
- --sjdbInsertSave string: which files to save when sjdb junctions are inserted on the fly at the mapping step:
  - Basic: only small junction / transcript files (default)
  - All: all files including big Genome, SA and SAindex - this will create a complete genome directory
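To make the --sjdbOverhang rule and the --sjdbFileChrStartEnd column order concrete, here is a minimal Python sketch. The function names and the intron coordinates are hypothetical, and the tab separator is an assumption: this page lists only the column order (chr start end strand).

```python
def sjdb_overhang(mate_length):
    """Ideal --sjdbOverhang for a given mate length: mate_length - 1."""
    if mate_length < 2:
        raise ValueError("mate_length must be at least 2 so the overhang is > 0")
    return mate_length - 1


def format_junctions(junctions):
    """Format (chr, start, end, strand) tuples, one tab-separated line each."""
    return "".join(
        f"{chrom}\t{start}\t{end}\t{strand}\n"
        for chrom, start, end, strand in junctions
    )


# Hypothetical intron coordinates, for illustration only.
junctions = [("chr1", 12228, 12612, "+"), ("chr1", 12722, 13220, "+")]
text = format_junctions(junctions)

print(sjdb_overhang(101))
print(text, end="")
```

For 2x101 paired-end reads, for example, the sketch yields an overhang of 100, which matches the default.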
Variation parameters
- --varVCFfile string: path to the VCF file that contains variation data. (default: -)
Limits
- --limitGenomeGenerateRAM int>0: maximum available RAM (bytes) for genome generation (default: 31000000000)
- --limitIObufferSize int>0: max available buffers size (bytes) for input/output, per thread (default: 150000000)
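Both limits are given in bytes, which is easy to get wrong by a few orders of magnitude. The sketch below converts a size in gigabytes (taken here as 10**9 bytes, an assumption of this example) into the arguments for --limitGenomeGenerateRAM; the helper name `ram_limit_args` is hypothetical.

```python
# Default taken from the parameter list above, in bytes.
DEFAULT_LIMIT_GENOME_GENERATE_RAM = 31_000_000_000


def ram_limit_args(gigabytes):
    """Return --limitGenomeGenerateRAM arguments for a size in GB (10**9 bytes)."""
    n_bytes = int(gigabytes * 10**9)
    if n_bytes <= 0:
        raise ValueError("the RAM limit must be a positive number of bytes")
    return ["--limitGenomeGenerateRAM", str(n_bytes)]


print(ram_limit_args(48))
```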
Output: general
- --outFileNamePrefix string: output files name prefix (including full or relative path). Can only be defined on the command line. (default: ./)
- --outTmpDir string: path to a directory that will be used as temporary by STAR. All contents of this directory will be removed! With the default value (-), the temporary directory is outFileNamePrefix_STARtmp. (default: -)
- --outTmpKeep string: whether to keep the temporary files after the STAR run is finished:
  - None: remove all temporary files (default)
  - All: keep all files
Windows, Anchors, Binning
- --winAnchorMultimapNmax int>0: max number of loci anchors are allowed to map to. (default: 50)
- --winBinNbits int>0: =log2(winBin), where winBin is the size of the bin for the windows/clustering, each window will occupy an integer number of bins. (default: 16)
- --winAnchorDistNbins int>0: max number of bins between two anchors that allows aggregation of anchors into one window. (default: 9)
- --winFlankNbins int>0: =log2(winFlank), where winFlank is the size of the left and right flanking regions for each window. (default: 4)
- --winReadCoverageRelativeMin real>=0: minimum relative coverage of the read sequence by the seeds in a window, for STARlong algorithm only. (default: 0.5)
- --winReadCoverageBasesMin int>0: minimum number of bases covered by the seeds in a window, for STARlong algorithm only. (default: 0)
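Because --winBinNbits is a base-2 logarithm, the actual sizes are easier to see once worked out. The following sketch computes the bin size and the largest anchor distance that still allows aggregation into one window, using the defaults listed above; the function name is hypothetical.

```python
def window_geometry(win_bin_nbits=16, win_anchor_dist_nbins=9):
    """Return (bin size in bases, max anchor distance in bases)."""
    win_bin = 2 ** win_bin_nbits  # winBinNbits = log2(winBin)
    max_anchor_distance = win_anchor_dist_nbins * win_bin
    return win_bin, max_anchor_distance


win_bin, max_dist = window_geometry()
print(win_bin, max_dist)  # 65536 589824
```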
Miscs
- --versionGenome string: earliest genome index version compatible with this STAR release. Please do not change this value!
- --parametersFiles string: name of a user-defined parameters file, "-": none. Can only be defined on the command line. (default: -)
- --sysShell string: path to the shell binary, preferably bash, e.g. /bin/bash. If not set, the default shell is executed, typically /bin/sh. This was reported to fail on some Ubuntu systems, in which case you need to specify the path to bash.
Notes
- STAR indexes generated by one version may not work with STAR of a different version!