Software Usage Function
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -R reference.fasta -T VariantAnnotator -V input.vcf -o output.vcf --resource:foo resource.vcf --expression foo.AF --expression foo.FILTER Annotate variant calls with context information
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T GCContentByInterval -R reference.fasta -o output.txt -L input.intervals Calculates the GC content of the reference sequence for each interval
vt vt decompose_blocksub -a calls.vcf | vt normalize -r FASTA_FILE - > calls.clean.vcf Normalize the VCF output for comparison purposes. This is especially useful for more complex graphs, which can produce large variant blocks that contain many reference bases (Note: requires [vt](http://genome.sph.umich.edu/wiki/Vt)).
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T ValidateVariants -R reference.fasta -V input.vcf --dbsnp dbsnp.vcf Validate a VCF file with an extra strict set of criteria
read_NVC.py read_NVC.py -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam -o output Check for nucleotide composition bias. Because of random priming, certain patterns are overrepresented at the beginning (5' end) of reads. This bias can be examined with an NVC (nucleotide versus cycle) plot, which is generated by overlaying all reads and calculating the nucleotide composition at each read position (that is, each sequencing cycle). Under ideal conditions (the genome is random and RNA-seq reads are randomly sampled from it), we expect A% = C% = G% = T% = 25% at each read position.
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T SimulateReadsForVariants -R reference.fasta -V input_variants.vcf -o simulated_reads.bam --readDepth 50 --errorRate 25 Generate simulated reads for variants
GEMINI autosomal recessive gemini autosomal_recessive test.auto_rec.db --columns "chrom,start,end,gene" Find variants meeting an autosomal recessive model
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T LeftAlignAndTrimVariants -R reference.fasta --variant input.vcf -o output.vcf --reference_window_stop 208 Left-align indels in a variant callset
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T VariantsToBinaryPed -R reference.fasta -V variants.vcf -m metadata.fam -bed output.bed -bim output.bim -fam output.fam Convert VCF to binary pedigree file
VarScan java -jar VarScan.jar compare [file1] [file2] [type] [output] OPTIONS Performs set-comparison operations on two files of variants.
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T CountMales -R reference.fasta -I samples.bam -o output.txt Count the number of reads seen from male samples
VarScan java -jar VarScan.jar filter [variants file] OPTIONS Filter variants in a file by coverage, supporting reads, variant frequency, or average base quality. It is for use with output from pileup2snp or pileup2indel.
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R myreference.fasta -before recal2.table -after recal3.table -plots recalQC.pdf Create plots to visualize base recalibration results
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T VariantsToAllelicPrimitives -R reference.fasta -V input.vcf -o output.vcf Simplify multi-nucleotide variants (MNPs) into more basic/primitive alleles.
java -jar GenomeAnalysisTK.jar (1) java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R reference.fasta -I myinput.bam -knownSites bundle/my-trusted-snps.vcf -knownSites bundle/my-trusted-indels.vcf -o firstpass.table (2) java -jar GenomeAnalysisTK.jar -T BaseRecalibrator -R reference.fasta -I myinput.bam -knownSites bundle/my-trusted-snps.vcf -knownSites bundle/my-trusted-indels.vcf -BQSR firstpass.table -o secondpass.table (3) java -jar GenomeAnalysisTK.jar -T AnalyzeCovariates -R reference.fasta -before firstpass.table -after secondpass.table -csv BQSR.csv -plots BQSR.pdf Create plots to visualize base recalibration results in three steps: (1) generate the first-pass recalibration table (the -knownSites arguments are optional but recommended), (2) generate the second-pass recalibration table, (3) generate the plots and, optionally, keep a copy of the CSV (-csv).
GEMINI comp_hets gemini comp_hets --columns "chrom, start, end" test.comp_het_default.2.db Identify potential compound heterozygotes
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R reference.fasta -I input.bam --known indels.vcf -o forIndelRealigner.intervals Define intervals to target for local realignment
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T ValidateVariants -R reference.fasta -V input.vcf --validationTypeToExclude ALL Validate a VCF file; with --validationTypeToExclude ALL, the extra strict checks are skipped and only adherence to the VCF format is verified
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T FastaAlternateReferenceMaker -R reference.fasta -o output.fasta -L input.intervals -V input.vcf [--snpmask mask.vcf] Generate an alternative reference sequence over the specified interval
java -jar GenomeAnalysisTK.jar java -jar GenomeAnalysisTK.jar -T CheckPileup -R reference.fasta -I your_data.bam --pileup:SAMPileup pileup_file.txt -L chr1:257-275 -o output_file_name Compare GATK's internal pileup to a reference Samtools pileup
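
The NVC calculation described in the read_NVC.py row (overlay all reads, then compute the nucleotide composition at each read position) can be illustrated with a minimal Python sketch. This is not RSeQC's implementation: the function name `nucleotide_versus_cycle` and the toy reads are hypothetical, and the sketch takes plain read strings rather than a BAM file.

```python
from collections import Counter


def nucleotide_versus_cycle(reads):
    """Return, for each read position (cycle), the fraction of A, C, G and T.

    Hypothetical helper for illustration only; reads are plain strings
    written 5' end first. Bases other than A/C/G/T (for example N) are ignored.
    """
    if not reads:
        return []
    max_len = max(len(read) for read in reads)
    # One counter per sequencing cycle, i.e. per position in the read.
    counts = [Counter() for _ in range(max_len)]
    for read in reads:
        for cycle, base in enumerate(read.upper()):
            if base in "ACGT":
                counts[cycle][base] += 1
    fractions = []
    for counter in counts:
        total = sum(counter.values())
        fractions.append(
            {base: (counter[base] / total if total else 0.0) for base in "ACGT"}
        )
    return fractions


# Toy input (made up): every read starts with A, mimicking a 5'-end bias.
toy_reads = ["ACGT", "AGGT", "ATCA", "ACGT"]
for cycle, composition in enumerate(nucleotide_versus_cycle(toy_reads), start=1):
    print(cycle, composition)
```

In an unbiased library each position would sit near 25% for every base; here the first cycle is 100% A, which is the kind of deviation an NVC plot makes visible.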