Genome Variant Analysis


java -jar GenomeAnalysisTK.jar -T MuTect2 -R reference.fasta -I:tumor tumor.bam -I:normal normal.bam [--dbsnp dbSNP.vcf] [--cosmic COSMIC.vcf] [-L targets.interval_list] -o output.vcf
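
The bracketed arguments in the usage line above are optional. As a minimal sketch, the invocation could be assembled programmatically as follows. The helper name `build_mutect2_command` and its parameter names are hypothetical; only the flags shown in the usage line are used, and the helper builds the argument list without running the tool.

```python
import shlex


def build_mutect2_command(reference, tumor_bam, normal_bam, output_vcf,
                          dbsnp=None, cosmic=None, intervals=None,
                          jar="GenomeAnalysisTK.jar"):
    """Return the MuTect2 argument list; hypothetical helper, not part of GATK."""
    # Required arguments, in the order of the usage line.
    cmd = ["java", "-jar", jar, "-T", "MuTect2",
           "-R", reference,
           "-I:tumor", tumor_bam,
           "-I:normal", normal_bam]
    # Optional arguments are appended only when a value is supplied.
    for flag, value in (("--dbsnp", dbsnp),
                        ("--cosmic", cosmic),
                        ("-L", intervals)):
        if value is not None:
            cmd += [flag, value]
    cmd += ["-o", output_vcf]
    return cmd


if __name__ == "__main__":
    cmd = build_mutect2_command("reference.fasta", "tumor.bam", "normal.bam",
                                "output.vcf", dbsnp="dbSNP.vcf")
    # Print a shell-quoted form for inspection (shlex.join needs Python 3.8+).
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; keeping the arguments as a list avoids shell quoting problems with file paths that contain spaces.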


Default value        Summary

Optional Inputs
none                 Set of alleles to use in genotyping
[]                   VCF file of COSMIC sites
none                 dbSNP file
[]                   VCF file of sites observed in normal

Optional Outputs
NA                   Output the active region to this IGV formatted file
NA                   Output the raw activity profile results in IGV format
NA                   Write debug assembly graph information to this file
stdout               File to which variants should be written

Optional Parameters
0.0                  Fraction of contamination to aggressively remove
5.5                  LOD threshold for calling normal non-variant at dbSNP sites
NA                   Trace this read name through the calling process
DISCOVERY            Specifies how to determine the alternate alleles to use for genotyping
[]                   One or more classes/groups of annotations to apply to variant calls
0.001                Heterozygosity value used to compute prior likelihoods for any locus
0.01                 Standard deviation of heterozygosity for SNP and indel calling
1.25E-4              Heterozygosity for indel calling
0.5                  Initial LOD threshold for calling normal variant
4.0                  Initial LOD threshold for calling tumor variant
0.03                 Threshold for maximum alternate allele fraction in normal
1                    Threshold for maximum alternate allele counts in normal
20                   Threshold for maximum alternate allele quality score sum in normal
1000                 Maximum reads in an active region
10                   Minimum base quality required to consider a base for calling
5                    Minimum number of reads sharing the same alignment start for each genomic location in an active region
2.2                  LOD threshold for calling normal non-germline
3.0                  Threshold for clustered read position artifact MAD
10.0                 Threshold for clustered read position artifact median
30                   Phred-scaled quality score constant to use in power calculations
2                    Ploidy per sample. For pooled data, set to (Number of samples in each pool * Sample Ploidy).
10.0                 The minimum Phred-scaled confidence threshold at which variants should be called
6.3                  LOD threshold for calling tumor variant

Optional Flags
false                Annotate number of alleles observed
false                Turn on clustered read position filter
false                Turn on strand artifact filter
false                Use new AF model instead of the so-called exact model

Advanced Inputs
NA                   Use this interval list file as the active regions to process
[]                   Comparison VCF file

Advanced Outputs
NA                   File to which assembled haplotypes should be written

Advanced Parameters
0.002                Threshold for the probability of a profile state being active
NA                   The active region extension; if not provided, defaults to the Walker annotated default
NA                   The active region maximum size; if not provided, defaults to the Walker annotated default
[DepthPerAlleleBySample, BaseQualitySumPerAlleleBySample, TandemRepeatAnnotator, OxoGReadCounts]  One or more specific annotations to apply to variant calls
CALLED_HAPLOTYPES    Which haplotypes should be written to the BAM
NA                   The sigma of the band pass filter Gaussian kernel; if not provided, defaults to the Walker annotated default
NA                   Contamination per sample
NONE                 Mode for emitting reference confidence scores
[SpanningDeletions]  One or more specific annotations to exclude
10                   Flat gap continuation penalty for use in the Pair HMM
[]                   Input prior for calls
[10, 25]             Kmer size to use in the read threading assembler
6                    Maximum number of alternate alleles to genotype
1024                 Maximum number of genotypes to consider at any site
100                  Maximum number of PL values to output
128                  Maximum number of haplotypes to consider for your population
30000                Maximum reads per sample given to traversal map() function
10000000             Maximum total reads given to traversal map() function
4                    Minimum length of a dangling branch to attempt recovery
2                    Minimum support to not prune paths in the graph
1                    Number of samples that must pass the minPruning threshold
EMIT_VARIANTS_ONLY   Which type of calls we should output
45                   The global assumed mismapping rate for reads

Advanced Flags
false                Allow graphs that have non-unique kmers in the reference
false                Annotate all sites with PLs
false                Enable artifact detection for creating panels of normals
false                1000G consensus mode
false                Print out very verbose debug information about each triggering active region
false                Don't skip calculations in ActiveRegions with no variants
false                Disable physical phasing
false                Disable iterating over kmer sizes when graph cycles are detected
false                If specified, we will not trim down the active region from the full region (active + extension) to just the active interval for genotyping
false                If specified, we will not analyze soft clipped bases in the reads
false                Emit reads that are dropped for filtering, trimming, realignment failure
false                If provided, all bases will be tagged as active
false                Print out very verbose M2 debug information
false                Use the contamination-filtered read maps for the purposes of annotating variants
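
Two of the parameter descriptions above involve simple arithmetic: Phred-scaled values and the ploidy setting for pooled data. The sketch below illustrates both, assuming the standard Phred definition Q = -10 * log10(p). The function names are hypothetical and are not part of the tool; how the tool applies these values internally is not described here.

```python
def phred_to_error_probability(q):
    """Convert a Phred-scaled score Q to a probability, p = 10^(-Q/10)."""
    return 10.0 ** (-q / 10.0)


def pooled_ploidy(samples_per_pool, sample_ploidy=2):
    """Ploidy value for pooled data: samples in each pool * sample ploidy."""
    return samples_per_pool * sample_ploidy


if __name__ == "__main__":
    # Default Phred-scaled constant listed for power calculations (30).
    print(phred_to_error_probability(30))
    # Default minimum Phred-scaled calling confidence (10.0).
    print(phred_to_error_probability(10.0))
    # A hypothetical pool of 4 diploid samples.
    print(pooled_ploidy(4, sample_ploidy=2))
```

Under this definition, a Phred score of 30 corresponds to a probability of about 0.001 and a score of 10 to about 0.1, so higher thresholds are stricter.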
