Genome Variant Analysis

java -jar GenomeAnalysisTK.jar
Function: Perform joint genotyping on gVCF files produced by HaplotypeCaller
Usage: java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant sample1.g.vcf --variant sample2.g.vcf -o output.vcf
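With more than a handful of gVCFs, the repeated --variant arguments can be generated instead of typed by hand. The following is a minimal sketch in POSIX shell. The helper name build_variant_args is hypothetical (not part of GATK), and the script only prints the command as a dry run. It assumes file names without spaces.

```shell
# Hypothetical helper: print one "--variant <file>" pair per gVCF argument.
# Assumes file names contain no whitespace.
build_variant_args() {
    out=""
    for f in "$@"; do
        out="$out --variant $f"
    done
    # Strip the leading space left by the first iteration.
    printf '%s\n' "${out# }"
}

# Dry run: assemble and print the command instead of executing it.
variant_args=$(build_variant_args sample1.g.vcf sample2.g.vcf)
cmd="java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta $variant_args -o output.vcf"
echo "$cmd"
```

Once the printed command looks right, it could be run by replacing the final echo with eval, or by pasting the output into the terminal.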
java -jar GenomeAnalysisTK.jar
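Function: Select a subset of variants from a larger callset: keep samples whose names match a regular expression and, because of -invertSelect, variants that do not satisfy QD > 10.0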
Usage: java -jar GenomeAnalysisTK.jar -R ref.fasta -T SelectVariants --variant input.vcf -o output.vcf -se 'SAMPLE.+PARC' -select "QD > 10.0" -invertSelect
java -jar GenomeAnalysisTK.jar
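Function: Select a subset of variants from a larger callset: for sample mySample, keep variants in hapmap.vcf that were not called in myCalls.vcf (discordance)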
Usage: java -jar GenomeAnalysisTK.jar -T SelectVariants -R reference.fasta -V hapmap.vcf --discordance myCalls.vcf -o output.vcf -sn mySample
java -jar GenomeAnalysisTK.jar
Function: Compare callability statistics
Usage: java -jar GenomeAnalysisTK.jar -R reference.fasta -T CompareCallableLoci -comp1 callable_loci_1.bed -comp2 callable_loci_2.bed [-L input.intervals] -o comparison.table
java -jar GenomeAnalysisTK.jar
Function: Split a BAM file by sample
Usage: java -jar GenomeAnalysisTK.jar -T SplitSamFile -R reference.fasta -I input.bam --outputRoot myproject_
java -jar GenomeAnalysisTK.jar
Function: Collect quality metrics for a set of intervals
Usage: java -jar GenomeAnalysisTK.jar -T QualifyMissingIntervals -R reference.fasta -I input.bam -o output.grp -L input.intervals -cds cds.intervals -targets targets.intervals
vt
Function: Normalize VCF output before comparing callsets. This is especially useful for more complex graphs, which can produce large variant blocks containing many reference bases. Requires [vt](http://genome.sph.umich.edu/wiki/Vt).
Usage: vt decompose_blocksub -a calls.vcf | vt normalize -r FASTA_FILE - > calls.clean.vcf
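Because the pipeline above depends on vt being installed, a guard can avoid ending up with an empty calls.clean.vcf when it is missing. The following is a minimal sketch. The helper name have_tool and the default reference path reference.fasta are assumptions for illustration; FASTA_FILE is the placeholder from the usage line.

```shell
# Hypothetical helper: succeed if the named program is on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Assumed default; set FASTA_FILE to the real reference before running.
FASTA_FILE="${FASTA_FILE:-reference.fasta}"

# Run the normalization only when vt and both input files are available.
if have_tool vt && [ -f calls.vcf ] && [ -f "$FASTA_FILE" ]; then
    vt decompose_blocksub -a calls.vcf | vt normalize -r "$FASTA_FILE" - > calls.clean.vcf
else
    echo "skipping normalization: vt, calls.vcf, or the reference FASTA is missing" >&2
fi
```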
java -jar GenomeAnalysisTK.jar
Function: Calculate genotype posterior likelihoods given panel data
Usage: java -jar GenomeAnalysisTK.jar -T CalculateGenotypePosteriors -R reference.fasta -V NA12878.wgs.HC.vcf -supporting 1000G_EUR.genotypes.combined.vcf -o NA12878.wgs.HC.posteriors.vcf
java -jar GenomeAnalysisTK.jar
Function: Select a subset of variants from a larger callset: exclude samples by name (-xl_sn) and by regular expression (-xl_se)
Usage: java -jar GenomeAnalysisTK.jar -R ref.fasta -T SelectVariants --variant input.vcf -o output.vcf -xl_sn SAMPLE_1_PARC -xl_sn SAMPLE_1_ACTG -xl_se 'SAMPLE.+PARC'
java -jar GenomeAnalysisTK.jar
Function: Annotate variant calls with context information
Usage: java -jar GenomeAnalysisTK.jar -R reference.fasta -T VariantAnnotator -I input.bam -V input.vcf -o output.vcf -A Coverage -L input.vcf --dbsnp dbsnp.vcf
java -jar GenomeAnalysisTK.jar
Function: Genotype concordance between two callsets
Usage: java -jar GenomeAnalysisTK.jar -T GenotypeConcordance -R reference.fasta -eval test_set.vcf -comp truth_set.vcf -o output.grp
gemini region
Function: Extract variants from specific regions
Usage: gemini region --reg chr1:100-200 my.db
java -jar GenomeAnalysisTK.jar
Function: Analyze coverage distribution and validate read mates per interval and per sample
Usage: java -jar GenomeAnalysisTK.jar -T DiagnoseTargets -R reference.fasta -I sample1.bam -I sample2.bam -I sample3.bam -L intervals.interval_list -o output.vcf
java -jar GenomeAnalysisTK.jar
Function: Count contiguous regions in an interval list
Usage: java -jar GenomeAnalysisTK.jar -T CountIntervals -R reference.fasta -o output.txt -check intervals.list
java -jar GenomeAnalysisTK.jar
Function: Combine variant records from different sources
Usage: java -jar GenomeAnalysisTK.jar -T CombineVariants -R reference.fasta --variant:foo input1.vcf --variant:bar input2.vcf -o output.vcf -genotypeMergeOptions PRIORITIZE -priority foo,bar
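In the PRIORITIZE example above, the tags given to --variant:<tag> must match the names listed in -priority. One way to keep the two in sync is to derive both from a single list. The following is a minimal sketch in POSIX shell. The helper name build_combine_args and the tag=file input convention are hypothetical, and the command is only printed, not run.

```shell
# Hypothetical helper: from "tag=file" pairs given in priority order (highest
# first), print the tagged --variant arguments plus the matching -priority list.
# Assumes tags and file names contain no whitespace.
build_combine_args() {
    variants=""
    priority=""
    for pair in "$@"; do
        tag=${pair%%=*}     # text before the first "="
        file=${pair#*=}     # text after the first "="
        variants="$variants --variant:$tag $file"
        priority="${priority:+$priority,}$tag"
    done
    printf '%s -genotypeMergeOptions PRIORITIZE -priority %s\n' "${variants# }" "$priority"
}

# Dry run: print the assembled command.
combine_args=$(build_combine_args foo=input1.vcf bar=input2.vcf)
echo "java -jar GenomeAnalysisTK.jar -T CombineVariants -R reference.fasta $combine_args -o output.vcf"
```

The argument order in the printed command differs from the usage line, but the same options and values are passed.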