
Genome Variant Analysis


java -jar GenomeAnalysisTK.jar -T ContEst -R reference.fasta -I:eval tumor.bam -I:genotype normal.bam --popFile populationAlleleFrequencies.vcf -L populationSites.interval_list [-L targets.interval_list] -isr INTERSECTION -o output.txt
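
The bracketed `-L targets.interval_list` in the usage line is optional. With `-isr INTERSECTION`, the analysis is restricted to positions present in every supplied interval list. As a minimal sketch, the command could be assembled as below. The function name `build_contest_cmd` is a hypothetical helper, the file names are the placeholders from the usage line, and the function only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: prints the ContEst command from the usage line.
# build_contest_cmd is a hypothetical helper; the file names are placeholders.
build_contest_cmd() {
    # Optional first argument: a second interval list (e.g. capture targets).
    targets="${1:-}"
    set -- java -jar GenomeAnalysisTK.jar -T ContEst \
        -R reference.fasta \
        -I:eval tumor.bam \
        -I:genotype normal.bam \
        --popFile populationAlleleFrequencies.vcf \
        -L populationSites.interval_list
    # Add the second -L only when a targets file was given.
    if [ -n "$targets" ]; then
        set -- "$@" -L "$targets"
    fi
    # INTERSECTION keeps only positions covered by all -L lists.
    set -- "$@" -isr INTERSECTION -o output.txt
    echo "$*"
}

build_contest_cmd targets.interval_list
```

Calling `build_contest_cmd` with no argument prints the command without the second `-L`. To run the command rather than print it, the output would have to be executed separately, which assumes a GATK release that still ships the ContEst walker.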


Argument name(s)  Default value   Summary

Required Inputs
                  NA              The variant file containing the population allele frequencies

Optional Inputs
                  none            The genotype information for the sample

Optional Outputs
                  stdout          An output file created by the walker. Will overwrite contents if the file exists

Optional Parameters
                  NA              Where to write a full report about the loci processed
                  0.95            Threshold for p(f>=0.5) to trim
                  HARD_THRESHOLD  Which approach to take to get the genotypes (only in array-free mode)
                  NA              Set to META (default), SAMPLE or READGROUP to produce per-BAM, per-sample or per-lane estimates
                  NA              Write the likelihood values to the specified location
                  20              Threshold for minimum mapping quality score
                  20              Threshold for minimum base quality score
                  500             Minimum number of bases required to call contamination in a lane or sample
                  CEU             Evaluate contamination for just a single contamination population
                  0.1             The degree of precision to which the contamination tool should estimate (e.g. the bin size)
                  unknown         The sample name; used to extract the correct genotypes from multi-sample truth VCFs
                  0.01            At most, what fraction of sites should be trimmed based on BETA_THRESHOLD

Optional Flags
                  false           Whether to verify that the sample name is in the genotypes file

Advanced Parameters
                  NA              Use a constant epsilon (Phred scale) for calculation
                  50              Minimum depth required to call a site in seq genotype mode
                  5.0             The minimum log likelihood for UG to call a genotype
                  0.8             The ratio of alt to other bases required to call a site a hom non-ref variant
                  0               Minimum depth at a site to consider in calculation
                  0.0             Progressively trim from 0 to TRIM_FRACTION by this interval
