| Software | Usage | Function |
| --- | --- | --- |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T LeftAlignAndTrimVariants -R reference.fasta --variant input.vcf -o output.vcf --splitMultiallelics | Left-align indels in a variant callset |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T VariantsToVCF -R reference.fasta -o output.vcf --variant:RawHapMap input.hapmap | Convert variants from other file formats to VCF format |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T FastaStats -R reference.fasta [-o output.txt] | Calculate basic statistics about the reference sequence itself |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T LeftAlignAndTrimVariants -R reference.fasta --variant input.vcf -o output.vcf | Left-align indels in a variant callset |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T SelectHeaders -R reference.fasta -V input.vcf -o output.vcf -hn FILTER -hn FORMAT -hn INFO | Select headers from a VCF source |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T ValidationSiteSelectorWalker -R reference.fasta -V input1.vcf -V input2.vcf -sn NA12878 -o output.vcf --numValidationSites 200 -sampleMode POLY_BASED_ON_GT -freqMode KEEP_AF_SPECTRUM | Randomly select variant records according to specified options |
| GEMINI interactions | gemini interactions -g CTBP2 -r 3 example.db | Find genes among variants that are interacting partners |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T CountRODs -R reference.fasta -o output.txt --rod input.vcf | Count the number of ROD objects encountered |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T GenotypeConcordance -R reference.fasta -eval test_set.vcf -comp truth_set.vcf -o output.grp | Genotype concordance between two callsets |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R myreference.fasta -BQSR myrecal.table -plots BQSR.pdf | Create plots to visualize base recalibration results |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -R reference.fasta -T CountBases -I input.bam [-L input.intervals] | Count the number of bases in a set of reads |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T PrintReads -R reference.fasta -I input1.bam -I input2.bam -o output.bam --read_filter MappingQualityZero | Write out sequence read data (for filtering, merging, subsetting etc.) |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T PrintReads -R reference.fasta -I input.bam -o output.bam -n 2000 | Write out sequence read data: prints the first 2000 reads in the BAM file |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T PrintReads -R reference.fasta -I input.bam -o output.bam -dfrac 0.25 | Write out sequence read data: downsamples the BAM file to 25% |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -R ref.fasta -T SelectVariants --variant input.vcf --maxFilteredGenotypes 5 --minFilteredGenotypes 2 --maxFractionFilteredGenotypes 0.60 --minFractionFilteredGenotypes 0.10 | Select a subset of variants from a larger callset |
| VarScan | java -jar VarScan.jar copynumber [normal_pileup] [tumor_pileup] [output] OPTIONS | Call variants and identify their somatic status (Germline/LOH/Somatic) using pileup files from a matched tumor-normal pair |
| VarScan | java -jar VarScan.jar somaticFilter [mutations file] OPTIONS | Filter somatic mutation calls to remove clusters of false positives and SNV calls near indels. Note: this is a basic filter. More advanced filtering strategies consider mapping quality, read mismatches, soft-trimming, and other factors when deciding whether or not to filter a variant. |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T ErrorRatePerCycle -R reference.fasta -I my_sequence_reads.bam -o error_rates.gatkreport.txt | Compute the read error rate per position |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T LeftAlignAndTrimVariants -R reference.fasta --variant input.vcf -o output.vcf --dontTrimAlleles | Left-align indels in a variant callset |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T ValidationSiteSelectorWalker -R reference.fasta -V:foo input1.vcf -V:bar input2.vcf --numValidationSites 200 -sf samples.txt -o output.vcf -sampleMode POLY_BASED_ON_GT -freqMode UNIFORM -selectType INDEL | Randomly select variant records according to specified options |
| GEMINI autosomal_dominant | gemini autosomal_dominant test.auto_dom.db --columns "chrom,start,end,gene" | Find variants meeting an autosomal dominant model |
| java -jar GenomeAnalysisTK.jar | java -jar GenomeAnalysisTK.jar -T LeftAlignAndTrimVariants -R reference.fasta --variant input.vcf -o output.vcf --splitMultiallelics --dontTrimAlleles --keepOriginalAC | Left-align indels in a variant callset |