Category

Mapping


Usage

STAR --runMode genomeGenerate --option1-name option1-value(s) ...


Manual

This document is generated with STAR 2.7.1a.

This command generates indexes for STAR to align reads to the genome.

Parameters

Run Parameters
  • --runThreadN int: number of threads to run STAR. (default: 1)
  • --runDirPerm string: permissions for the directories created at the run-time.
    • User_RWX: user-read/write/execute (default)
    • All_RWX: all-read/write/execute (same as chmod 777)
  • --runRNGseed int: random number generator seed. (default: 777)
Genome Parameters
  • --genomeDir string: path to the directory where genome files are stored. (default: ./GenomeDir/)
  • --genomeFastaFiles string(s): path(s) to the FASTA files with the genome sequences, separated by spaces. These must be plain-text FASTA files; they *cannot* be zipped. Required for genome generation (--runMode genomeGenerate). Can also be used at the mapping step (--runMode alignReads) to add extra (new) sequences to the genome (e.g. spike-ins). (default: -)
  • --genomeConsensusFile string: VCF file with consensus SNPs (i.e. alternative allele is the major (AF>0.5) allele). Deprecated since 2.7.7a. Use --genomeTransformVCF and --genomeTransformType options instead.
Genome Indexing Parameters
  • --genomeChrBinNbits int: =log2(chrBin), where chrBin is the size of the bins for genome storage: each chromosome will occupy an integer number of bins. For a genome with a large number of contigs, it is recommended to scale this parameter as min(18, log2[max(GenomeLength/NumberOfReferences,ReadLength)]). (default: 18)
  • --genomeSAindexNbases int: length (bases) of the SA pre-indexing string. Typically between 10 and 15. Longer strings will use much more memory, but allow faster searches. For small genomes, the parameter --genomeSAindexNbases must be scaled down to min(14, log2(GenomeLength)/2 - 1). (default: 14)
  • --genomeSAsparseD int>0: suffix array sparsity, i.e. distance between indices: use bigger numbers to decrease the RAM needed, at the cost of reduced mapping speed. (default: 1)
  • --genomeSuffixLengthMax int: maximum length of the suffixes, has to be longer than read length. -1 = infinite. (default: -1)
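
The two scaling formulas above (for --genomeSAindexNbases and --genomeChrBinNbits) can be evaluated with a short helper. This is a minimal sketch, not part of STAR: the function names are hypothetical, and because the text does not say how to turn the logarithm into an integer, rounding to the nearest integer is an assumption.

```python
import math


def genome_sa_index_nbases(genome_length: int) -> int:
    """min(14, log2(GenomeLength)/2 - 1); rounding to nearest is an assumption."""
    return min(14, round(math.log2(genome_length) / 2 - 1))


def genome_chr_bin_nbits(genome_length: int, n_references: int, read_length: int) -> int:
    """min(18, log2(max(GenomeLength/NumberOfReferences, ReadLength))); rounding is an assumption."""
    bin_size = max(genome_length / n_references, read_length)
    return min(18, round(math.log2(bin_size)))


# Small genome of 100 kilobases: log2(1e5)/2 - 1 is about 7.3
print(genome_sa_index_nbases(100_000))  # 7

# 3-gigabase genome split into 100,000 contigs, 100-base reads: log2(30000) is about 14.9
print(genome_chr_bin_nbits(3_000_000_000, 100_000, 100))  # 15
```

The resulting integers would be passed as the values of --genomeSAindexNbases and --genomeChrBinNbits when generating the index.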
Splice Junctions Database
  • --sjdbFileChrStartEnd string(s): path to the files with genomic coordinates (chr start end strand) for the splice junction introns. Multiple files can be supplied and will be concatenated. (default: -)
  • --sjdbGTFfile string: path to the GTF file with annotations. (default: -)
  • --sjdbGTFchrPrefix string: prefix for chromosome names in a GTF file (e.g. 'chr' for using ENSEMBL annotations with UCSC genomes). (default: -)
  • --sjdbGTFfeatureExon string: feature type in GTF file to be used as exons for building transcripts. (default: exon)
  • --sjdbGTFtagExonParentTranscript string: GTF attribute name for the parent transcript ID; the default works for GTF files. (default: transcript_id)
  • --sjdbGTFtagExonParentGene string: GTF attribute name for the parent gene ID; the default works for GTF files. (default: gene_id)
  • --sjdbGTFtagExonParentGeneName string(s): GTF attribute name for parent gene name. (default: gene_name)
  • --sjdbGTFtagExonParentGeneType string(s): GTF attribute name for parent gene type. (default: gene_type gene_biotype)
  • --sjdbOverhang int>0: length of the donor/acceptor sequence on each side of the junctions, ideally = (mate_length - 1). (default: 100)
  • --sjdbScore int: extra alignment score for alignments that cross database junctions. (default: 2)
  • --sjdbInsertSave string: which files to save when sjdb junctions are inserted on the fly at the mapping step
    • Basic: only small junction / transcript files (default)
    • All: all files including big Genome, SA and SAindex - this will create a complete genome directory
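
As an illustration of how the genome and splice junction options combine, the sketch below assembles a genomeGenerate command line and sets --sjdbOverhang to mate_length - 1 as suggested above. The helper name and the file names (genome.fa, annotation.gtf) are hypothetical; only flags listed in this document are used, and the command is printed rather than executed.

```python
import shlex


def build_genome_generate_cmd(genome_dir, fasta_files, gtf_file=None,
                              mate_length=101, threads=1):
    """Return the STAR genomeGenerate command as an argument list."""
    cmd = [
        "STAR", "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", *fasta_files,  # space-separated list of FASTA paths
    ]
    if gtf_file is not None:
        # Ideal overhang is one base shorter than the mate length.
        cmd += ["--sjdbGTFfile", gtf_file,
                "--sjdbOverhang", str(mate_length - 1)]
    return cmd


# Hypothetical inputs, for illustration only.
cmd = build_genome_generate_cmd(
    "./GenomeDir/", ["genome.fa"],
    gtf_file="annotation.gtf", mate_length=101, threads=8,
)
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run(cmd, check=True) on a machine where STAR is installed.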
Variation parameters
  • --varVCFfile string: path to the VCF file that contains variation data. (default: -)
Limits
  • --limitGenomeGenerateRAM int>0: maximum available RAM (bytes) for genome generation (default: 31000000000)
  • --limitIObufferSize int>0: max available buffers size (bytes) for input/output, per thread (default: 150000000)
Output: general
  • --outFileNamePrefix string: output files name prefix (including full or relative path). Can only be defined on the command line. (default: ./)
  • --outTmpDir string: path to a directory that STAR will use for temporary files. All contents of this directory will be removed! If not set, the temporary directory defaults to outFileNamePrefix_STARtmp. (default: -)
  • --outTmpKeep string: whether to keep the temporary files after the STAR run is finished
    • None: remove all temporary files (default)
    • All: keep all files
Windows, Anchors, Binning
  • --winAnchorMultimapNmax int>0: max number of loci anchors are allowed to map to. (default: 50)
  • --winBinNbits int>0: =log2(winBin), where winBin is the size of the bin for the windows/clustering, each window will occupy an integer number of bins. (default: 16)
  • --winAnchorDistNbins int>0: max number of bins between two anchors that allows aggregation of anchors into one window. (default: 9)
  • --winFlankNbins int>0: =log2(winFlank), where winFlank is the size of the left and right flanking regions for each window. (default: 4)
  • --winReadCoverageRelativeMin real>=0: minimum relative coverage of the read sequence by the seeds in a window, for STARlong algorithm only. (default: 0.5)
  • --winReadCoverageBasesMin int>0: minimum number of bases covered by the seeds in a window, for STARlong algorithm only. (default: 0)
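
Because the window parameters are given as log2 values and bin counts, it can help to convert them to bases. The sketch below does this for --winBinNbits and --winAnchorDistNbins using the defaults listed above; the function name is hypothetical, and reading the anchor distance as bins multiplied by bin size is an interpretation of the descriptions above rather than a statement about STAR's internals.

```python
def window_sizes(win_bin_nbits: int = 16, win_anchor_dist_nbins: int = 9):
    """Convert the log2 / bin-count parameters into sizes in bases."""
    win_bin = 2 ** win_bin_nbits                       # bin size in bases
    max_anchor_dist = win_anchor_dist_nbins * win_bin  # max anchor separation in bases
    return win_bin, max_anchor_dist


win_bin, max_anchor_dist = window_sizes()
print(win_bin, max_anchor_dist)  # 65536 589824
```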
Miscs
  • --versionGenome string: earliest genome index version compatible with this STAR release. Please do not change this value!
  • --parametersFiles string: name of a user-defined parameters file, "-": none. Can only be defined on the command line. (default: -)
  • --sysShell string: path to the shell binary, preferably bash, e.g. /bin/bash. If not set, the default shell is executed, typically /bin/sh. This was reported to fail on some Ubuntu systems; in that case, specify the path to bash.

Notes

  • STAR indexes generated by one version may not work with STAR of a different version!

File formats this tool works with
FASTA
