java -jar picard.jar |
java -jar picard.jar CollectRrbsMetrics R=reference_sequence.fasta I=input.bam M=basename_for_metrics_files |
Collects metrics from reduced representation bisulfite sequencing (RRBS) data. |
java -jar picard.jar |
java -jar picard.jar VcfToIntervalList |
Converts a VCF or BCF file to a Picard Interval List. |
bamtools |
bamtools merge -in input_alignments_1.bam -in input_alignments_2.bam -in input_alignments_3.bam -out output_alignments_merged.bam |
Merge multiple BAM files into one |
java -jar picard.jar |
java -jar picard.jar SetNmMDAndUqTags I=sorted.bam O=fixed.bam \ |
Fixes the NM, MD, and UQ tags in a SAM file. This tool takes in a SAM or BAM file (sorted by coordinate) and calculates the NM, MD, and UQ tags by comparing with the reference. This may be needed when MergeBamAlignment was run with a SORT_ORDER other than 'coordinate' and thus could not fix these tags. |
java -jar picard.jar |
java -jar picard.jar PositionBasedDownsampleSam |
Downsamples a BAM file while ensuring that either both ends of a pair are removed or neither is.
In addition, this program uses the read name to extract the position within the tile from which the read came.
The downsampling is based on this position. Running the tool on the exact same input will produce the same results.
Note 1: This is technology and read-name dependent. If your read names do not have coordinate information, or if your
BAM contains reads from multiple technologies (flowcell versions, sequencing machines), this will not work properly.
This has been designed with Illumina MiSeq/HiSeq in mind.
Note 2: The downsampling is not random. It is deterministically dependent on the position of the read within its tile.
Note 3: Downsampling twice with this program is not supported.
Note 4: You should call MarkDuplicates after downsampling.
Finally, the tool is designed to simulate, as accurately as possible, having sequenced less, not to achieve an exact downsampling
fraction. In particular, since the reads may be distributed unevenly within the lanes/tiles, the resulting downsampling
percentage will not be determined exactly by the input argument FRACTION. |
java -jar picard.jar |
java -jar picard.jar RevertSam I=input.bam O=reverted.bam |
Reverts SAM or BAM files to a previous state. This tool removes or restores certain properties of the SAM records, including alignment information, which can be used to produce an unmapped BAM (uBAM) from a previously aligned BAM. It is also capable of restoring the original quality scores of a BAM file that has already undergone base quality score recalibration (BQSR) if the original qualities were retained. |
java -jar picard.jar |
java -jar picard.jar CheckIlluminaDirectory BASECALLS_DIR=/BaseCalls/ READ_STRUCTURE=25T8B25T LANES=1 DATA_TYPES=BaseCalls |
Asserts the validity of the specified Illumina basecalling data. |
java -jar picard.jar |
java -jar picard.jar BamIndexStats I=input.bam O=output |
Generate index statistics from a BAM file. This tool calculates statistics from a BAM index (.bai) file, emulating the behavior of the "samtools idxstats" command. The statistics collected include counts of aligned and unaligned reads as well as all records with no start coordinate. The input to the tool is the BAM file name, but it must be accompanied by a corresponding index file. |
java -jar picard.jar |
java -jar picard.jar CollectWgsMetricsWithNonZeroCoverage I=input.bam O=collect_wgs_metrics.txt CHART=collect_wgs_metrics.pdf R=reference_sequence.fasta |
Collect metrics about coverage and performance of whole genome sequencing (WGS) experiments. This tool collects metrics about the percentages of reads that pass base- and mapping-quality filters as well as coverage (read-depth) levels. Both minimum base- and mapping-quality values as well as the maximum read depths (coverage cap) are user defined. This extends CollectWgsMetrics by including metrics related only to sites with non-zero (>0) coverage. |
samtools targetcut |
samtools targetcut [-Q minBaseQ] [-i inPenalty] [-0 em0] [-1 em1] [-2 em2] [-f ref] <in.bam> |
This command identifies target regions by examining the continuity of read depth, computes haploid consensus sequences of targets and outputs a SAM with each sequence corresponding to a target. When option -f is in use, BAQ will be applied. This command is only designed for cutting fosmid clones from fosmid pool sequencing [Ref. Kitzman et al. (2010)]. |
bamtools |
bamtools filter -in <BAM file> -out <BAM file> -length 100 |
Filters BAM file(s) |
samtools addreplacerg |
samtools addreplacerg [-r rg line | -R rg ID] [-m mode] [-l level] [-o out.bam] <input.bam> |
Adds or replaces read group tags in a file. |
bamtools |
bamtools sort -in input_alignments.bam -out output_alignments_sorted.bam -byname |
The bamtools sort command sorts a BAM file according to the given option. Here, output_alignments_sorted.bam is the resulting file, with alignments sorted by name (-byname). |
java -jar picard.jar |
java -jar picard.jar ScatterIntervalsByNs R=reference_sequence.fasta OT=BOTH O=output.interval_list |
Writes an interval list based on splitting a reference by Ns. This tool identifies positions in a reference where the bases are 'no-calls' and writes out an interval list using the resulting coordinates. This can be used to create an interval list for whole genome sequencing (WGS) data, e.g. for scatter-gather purposes, as an alternative to using fixed-length intervals. The number of contiguous no-calls that can be tolerated before creating a break is adjustable from the command line. |
java -jar picard.jar |
java -jar picard.jar CollectSequencingArtifactMetrics I=input.bam O=artifact_metrics.txt R=reference_sequence.fasta |
Collect metrics to quantify single-base sequencing artifacts. |
java -jar picard.jar |
java -jar picard.jar VcfFormatConverter I=input.vcf O=output.bcf REQUIRE_INDEX=true |
Converts VCF to BCF or BCF to VCF. This tool converts files between the plain-text VCF format and its binary compressed equivalent, BCF. Input and output formats are determined by file extensions specified in the file names. For best results, it is recommended to ensure that an index file is present and set the REQUIRE_INDEX option to true. |
java -jar picard.jar |
java -jar picard.jar CollectQualityYieldMetrics I=input.bam O=quality_yield_metrics.txt \ |
Collect metrics about reads that pass quality thresholds and Illumina-specific filters. This tool evaluates the overall quality of reads within a BAM file containing one read group. The output indicates the total number of bases within a read group that pass a minimum base quality score threshold and (in the case of Illumina data) pass Illumina quality filters as described in the GATK Dictionary entry. |
java -jar picard.jar |
java -jar picard.jar AddOrReplaceReadGroups I=input.bam O=output.bam RGID=4 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=20 |
Replace read groups in a BAM file. This tool enables the user to replace all read groups in the INPUT file with a single new read group and assign all reads to this read group in the OUTPUT BAM file. For more information about read groups, see the GATK Dictionary entry. This tool accepts INPUT BAM and SAM files or URLs from the Global Alliance for Genomics and Health (GA4GH) (see http://ga4gh.org/#/documentation). |
java -jar picard.jar |
java -jar picard.jar CollectWgsMetricsFromSampledSites |
Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing experiments, but only at a set of sampled positions. It is important that the sampled positions be spaced more than a read's length apart; otherwise, you run the risk of double-counting reads in the metrics. If contig-sized intervals are needed, use the INTERVALS argument in CollectWgsMetrics. |
java -jar picard.jar |
java -jar picard.jar CollectWgsMetricsFromQuerySorted |
Computes a number of metrics that are useful for evaluating coverage and performance of sequencing experiments. |