Category

Genome Variant Analysis


Usage

java -Xmx4g -jar GenomeAnalysisTK.jar \
    -T VariantRecalibrator \
    -R reference.fasta \
    -input raw_variants.withASannotations.vcf \
    -AS \
    -resource:hapmap,known=false,training=true,truth=true,prior=15.0 hapmap_3.3.b37.sites.vcf \
    -resource:omni,known=false,training=true,truth=false,prior=12.0 1000G_omni2.5.b37.sites.vcf \
    -resource:1000G,known=false,training=true,truth=false,prior=10.0 1000G_phase1.snps.high_confidence.vcf \
    -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 dbsnp_135.b37.vcf \
    -an QD -an MQ -an MQRankSum -an ReadPosRankSum -an FS -an SOR -an InbreedingCoeff \
    -mode SNP \
    -recalFile output.AS.recal \
    -tranchesFile output.AS.tranches \
    -rscriptFile output.plots.AS.R
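
Each -resource argument in the usage line follows the same pattern: a label, three true/false roles (known, training, truth), a prior, and then the path to the sites file. The sketch below assembles the same command in a Bash array, which keeps every argument a single word and makes it easier to add or drop resources and annotations. The helper name resource_tag is hypothetical and not part of GATK, and the script only prints the command (a dry run) rather than executing it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of GATK): format one -resource tag.
# Arguments: label known training truth prior
resource_tag() {
  printf -- '-resource:%s,known=%s,training=%s,truth=%s,prior=%s' "$1" "$2" "$3" "$4" "$5"
}

# Build the command as an array; file names are the ones from the usage line.
cmd=(java -Xmx4g -jar GenomeAnalysisTK.jar -T VariantRecalibrator
     -R reference.fasta
     -input raw_variants.withASannotations.vcf
     -AS
     "$(resource_tag hapmap false true true 15.0)" hapmap_3.3.b37.sites.vcf
     "$(resource_tag omni false true false 12.0)" 1000G_omni2.5.b37.sites.vcf
     "$(resource_tag 1000G false true false 10.0)" 1000G_phase1.snps.high_confidence.vcf
     "$(resource_tag dbsnp true false false 2.0)" dbsnp_135.b37.vcf)

# One -an flag per annotation.
for an in QD MQ MQRankSum ReadPosRankSum FS SOR InbreedingCoeff; do
  cmd+=(-an "$an")
done

cmd+=(-mode SNP
      -recalFile output.AS.recal
      -tranchesFile output.AS.tranches
      -rscriptFile output.plots.AS.R)

# Dry run: print the assembled command instead of running it.
printf '%s ' "${cmd[@]}"; echo
```

To actually run the tool, replace the final printf line with "${cmd[@]}".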


Manual

Argument name(s) | Default value | Summary
Required Inputs
--input | NA | One or more VCFs of raw input variants to be recalibrated
--resource | [] | A list of sites for which to apply a prior probability of being correct but which aren't used by the algorithm (training and truth sets are required to run)
Required Outputs
--recal_file, -recalFile | NA | The output recal file used by ApplyRecalibration
--tranches_file, -tranchesFile | NA | The output tranches file used by ApplyRecalibration
Required Parameters
--mode | SNP | Recalibration mode to employ
--use_annotation, -an | [] | The names of the annotations which should be used for calculations
Optional Inputs
--aggregate | NA | Additional raw input variants to be used in building the model
Optional Outputs
--model_file, -modelFile | stdout | A GATKReport containing the positive and negative model fits
--rscript_file, -rscriptFile | NA | The output R script file generated by the VQSR to aid in visualization of the input data and learned model
Optional Parameters
--ignore_filter, -ignoreFilter | [] | If specified, the variant recalibrator will also use variants marked as filtered by the specified filter name in the input VCF file
--target_titv, -titv | 2.15 | The expected novel Ti/Tv ratio to use when calculating FDR tranches and for display on the optimization curve output figures (approx. 2.15 for whole-genome experiments). ONLY USED FOR PLOTTING PURPOSES!
--TStranche, -tranche | [100.0, 99.9, 99.0, 90.0] | The levels of truth sensitivity at which to slice the data (in percent; that is, 1.0 means 1 percent)
Optional Flags
--ignore_all_filters, -ignoreAllFilters | false | If specified, the variant recalibrator will ignore all input filters. Useful to rerun the VQSR from a filtered output file.
--output_model, -outputModel | false | If specified, the variant recalibrator will output the VQSR model fit to the file specified by -modelFile or to stdout
--useAlleleSpecificAnnotations, -AS | false | If specified, the variant recalibrator will attempt to use the allele-specific versions of the specified annotations.
Advanced Parameters
--badLodCutoff | -5.0 | LOD score cutoff for selecting bad variants
--dirichlet | 0.001 | The Dirichlet parameter in the variational Bayes algorithm.
--max_attempts | 1 | Number of attempts to build a model before failing
--maxGaussians, -mG | 8 | Max number of Gaussians for the positive model
--maxIterations, -mI | 150 | Maximum number of VBEM iterations
--maxNegativeGaussians, -mNG | 2 | Max number of Gaussians for the negative model
--maxNumTrainingData | 2500000 | Maximum number of training data
--minNumBadVariants, -minNumBad | 1000 | Minimum number of bad variants
--MQCapForLogitJitterTransform, -MQCap | 0 | Apply logit transform and jitter to MQ values
--numKMeans, -nKM | 100 | Number of k-means iterations
--priorCounts | 20.0 | The number of prior counts to use in the variational Bayes algorithm.
--shrinkage | 1.0 | The shrinkage parameter in the variational Bayes algorithm.
--stdThreshold, -std | 10.0 | Annotation value divergence threshold (number of standard deviations from the means)
Advanced Flags
--trustAllPolymorphic, -allPoly | false | Trust that all the input training sets' unfiltered records contain only polymorphic sites to drastically speed up the computation.

