Randomly select variant records according to specified options
java -jar GenomeAnalysisTK.jar -T ValidationSiteSelectorWalker -R reference.fasta -V:foo input1.vcf -V:bar input2.vcf --numValidationSites 200 -sf samples.txt -o output.vcf -sampleMode POLY_BASED_ON_GT -freqMode UNIFORM -selectType INDEL
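
When the set of input VCFs varies from run to run, the invocation above can be assembled programmatically. The following is a minimal sketch and is not part of GATK: the helper `build_command` and its parameter names are hypothetical. It uses only the flags that appear in the example and the argument table on this page, and it assumes each input is tagged with a name via `-V:name`, as in the example.

```python
from typing import Dict, List, Sequence


def build_command(reference: str, variants: Dict[str, str], num_sites: int,
                  output: str, select_types: Sequence[str] = (),
                  freq_mode: str = "KEEP_AF_SPECTRUM",
                  sample_mode: str = "NONE") -> List[str]:
    """Assemble an argument list for the tool (hypothetical helper)."""
    cmd = ["java", "-jar", "GenomeAnalysisTK.jar",
           "-T", "ValidationSiteSelectorWalker",
           "-R", reference]
    # -V can be specified multiple times; each input gets its own name tag.
    for name, path in variants.items():
        cmd += [f"-V:{name}", path]
    cmd += ["--numValidationSites", str(num_sites),
            "-o", output,
            "-sampleMode", sample_mode,
            "-freqMode", freq_mode]
    # -selectType can also be repeated, once per variant type.
    for variant_type in select_types:
        cmd += ["-selectType", variant_type]
    return cmd


cmd = build_command("reference.fasta",
                    {"foo": "input1.vcf", "bar": "input2.vcf"},
                    num_sites=200, output="output.vcf",
                    select_types=["INDEL"], freq_mode="UNIFORM",
                    sample_mode="POLY_BASED_ON_GT")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a single string avoids shell quoting problems with file paths.
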
Argument name(s) | Default value | Summary
---|---|---
Required Inputs | |
--variant  -V | NA | Input VCF file. Can be specified multiple times
Required Parameters | |
--numValidationSites  -numSites | 0 | Number of output validation sites
Optional Inputs | |
--sample_file  -sf | NA | File containing a list of samples (one per line) to include. Can be specified multiple times
Optional Outputs | |
--out  -o | stdout | File to which variants should be written
Optional Parameters | |
--frequencySelectionMode  -freqMode | KEEP_AF_SPECTRUM | Allele Frequency selection mode
--sample_expressions  -se | NA | Regular expression to select many samples from the ROD tracks provided. Can be specified multiple times
--sample_name  -sn | [] | Include genotypes from this sample. Can be specified multiple times
--sampleMode | NONE | Sample selection mode
--samplePNonref | 0.99 | GL-based selection mode only: the probability that a site is non-reference in the samples for which to include the site
--selectTypeToInclude  -selectType | [] | Select only a certain type of variants from the input file. Valid types are INDEL, SNP, MIXED, MNP, SYMBOLIC, NO_VARIATION. Can be specified multiple times
Optional Flags | |
--ignoreGenotypes | false | If true, will ignore genotypes in VCF, will take AC,AF from annotations and will make no sample selection
--ignorePolymorphicStatus | false | If true, will ignore polymorphic status in VCF, and will take VCF record directly without pre-selection
--includeFilteredSites  -ifs | false | If true, will include filtered sites in set to choose variants from
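
To make the interplay of a few of these options concrete, here is a conceptual sketch of uniform random site selection. It is an illustration only and does not reproduce GATK's implementation: the function `select_sites` and the record layout (dicts with `type` and `filtered` keys) are hypothetical, an empty type list is assumed to mean "no restriction", and the allele-frequency and sample-based selection modes are not modeled.

```python
import random
from typing import Dict, List, Optional, Sequence


def select_sites(records: Sequence[Dict], num_sites: int,
                 select_types: Sequence[str] = (),
                 include_filtered: bool = False,
                 seed: Optional[int] = None) -> List[Dict]:
    """Pick up to num_sites records uniformly at random (conceptual sketch)."""
    candidates = [
        r for r in records
        # Assumption: an empty type list places no restriction on type.
        if (not select_types or r["type"] in select_types)
        # Filtered records are skipped unless explicitly included.
        and (include_filtered or not r["filtered"])
    ]
    rng = random.Random(seed)
    # Never ask for more sites than there are candidates.
    return rng.sample(candidates, min(num_sites, len(candidates)))


records = [
    {"pos": 100, "type": "SNP", "filtered": False},
    {"pos": 200, "type": "INDEL", "filtered": False},
    {"pos": 300, "type": "INDEL", "filtered": True},
    {"pos": 400, "type": "INDEL", "filtered": False},
    {"pos": 500, "type": "MNP", "filtered": False},
    {"pos": 600, "type": "INDEL", "filtered": False},
]

chosen = select_sites(records, num_sites=2, select_types=["INDEL"], seed=0)
print(sorted(r["pos"] for r in chosen))
```

In this sketch, the type restriction and the filtered-site exclusion are applied first, and the requested number of sites is then drawn from whatever remains, so fewer sites than requested are returned when the candidate pool is small.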