SAM/BAM Manipulation

java -jar picard.jar
Function: Converts VCF to BCF or BCF to VCF. This tool converts files between the plain-text VCF format and its binary compressed equivalent, BCF. Input and output formats are determined by file extensions specified in the file names. For best results, it is recommended to ensure that an index file is present and set the REQUIRE_INDEX option to true.
Usage: java -jar picard.jar VcfFormatConverter I=input.vcf O=output.bcf REQUIRE_INDEX=true
samtools idxstats
Function: Retrieve and print stats in the index file corresponding to the input file. Before calling idxstats, the input BAM file must be indexed by samtools index.
Usage: samtools idxstats aln.sorted.bam
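As a rough illustration of how the idxstats output can be consumed, the sketch below parses its tab-delimited rows (reference name, sequence length, mapped read-segments, unmapped read-segments). The function names and the sample text are hypothetical and only serve the example; the sample numbers are made up.

```python
def parse_idxstats(text):
    """Parse idxstats text into (name, length, mapped, unmapped) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, length, mapped, unmapped = line.split("\t")
        rows.append((name, int(length), int(mapped), int(unmapped)))
    return rows


def total_mapped(rows):
    """Sum the mapped read-segment counts over all references."""
    return sum(row[2] for row in rows)


# Illustrative input only; the trailing "*" row holds reads without a reference.
sample = "chr1\t1000\t90\t2\nchr2\t500\t8\t0\n*\t0\t0\t10\n"
rows = parse_idxstats(sample)
```

In practice the text would come from capturing the standard output of the command shown in the Usage line.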
java -jar picard.jar
Function: Replace read groups in a BAM file. This tool enables the user to replace all read groups in the INPUT file with a single new read group and assign all reads to this read group in the OUTPUT BAM file. For more information about read groups, see the GATK Dictionary entry. This tool accepts INPUT BAM and SAM files or URLs from the Global Alliance for Genomics and Health (GA4GH) (see
Usage: java -jar picard.jar AddOrReplaceReadGroups I=input.bam O=output.bam RGID=4 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=20
java -jar picard.jar
Function: Chart the nucleotide distribution per cycle in a SAM or BAM file. This tool produces a chart of the nucleotide distribution per cycle in a SAM or BAM file in order to enable assessment of systematic errors at specific positions in the reads.
Usage: java -jar picard.jar CollectBaseDistributionByCycle CHART=collect_base_dist_by_cycle.pdf I=input.bam O=output.txt
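To make the idea of a per-cycle nucleotide distribution concrete, here is a minimal sketch that computes the percentage of each base at each read position from plain read strings. It is not Picard's implementation: it ignores strand, pairing, and read filtering, and the function name is hypothetical.

```python
from collections import Counter


def base_distribution_by_cycle(reads):
    """Return one dict per cycle mapping A/C/G/T/N to its percentage."""
    max_len = max((len(read) for read in reads), default=0)
    table = []
    for cycle in range(max_len):
        # Only reads long enough to have a base at this cycle contribute.
        counts = Counter(read[cycle].upper() for read in reads if len(read) > cycle)
        total = sum(counts.values())
        table.append({base: 100.0 * counts.get(base, 0) / total for base in "ACGTN"})
    return table


# Illustrative reads, not real data.
table = base_distribution_by_cycle(["ACGT", "ACGA", "TCG"])
```

A cycle where one base dominates far more than in neighbouring cycles is the kind of systematic error the chart is meant to expose.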
java -jar picard.jar
Function: Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing experiments, but only at a set of sampled positions. The sampled positions should be spaced more than one read length apart; otherwise, you run the risk of double-counting reads in the metrics. If contig-sized intervals are needed, use the INTERVALS argument in CollectWgsMetrics.
Usage: java -jar picard.jar CollectWgsMetricsFromSampledSites
samtools calmd
Function: Generate the MD tag. If the MD tag is already present, this command will give a warning if the MD tag generated is different from the existing tag. Calmd can also read and write CRAM files although in most cases it is pointless as CRAM recalculates MD and NM tags on the fly. The one exception to this case is where both input and output CRAM files have been / are being created with the no_ref option.
Usage: samtools calmd [-eubrAESQ] <aln.bam> <ref.fasta>
Supported input format: BAM, CRAM, SAM
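For readers unfamiliar with the MD tag, the following sketch shows how one can be derived from a CIGAR string, the read sequence, and the reference bases starting at the alignment position. It is a simplified illustration under those assumptions, not the samtools code; the function name and argument layout are hypothetical.

```python
import re


def compute_md(cigar, read, ref):
    """Build an MD string; ref must start at the alignment's leftmost base."""
    parts = []
    run = 0        # length of the current run of matching bases
    r = g = 0      # positions in the read and in the reference
    for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar):
        n = int(length)
        if op in "M=X":
            for _ in range(n):
                if read[r].upper() == ref[g].upper():
                    run += 1
                else:
                    # Mismatch: close the match run, then record the reference base.
                    parts += [str(run), ref[g].upper()]
                    run = 0
                r += 1
                g += 1
        elif op == "D":
            # Deletion: record the deleted reference bases after a caret.
            parts += [str(run), "^" + ref[g:g + n].upper()]
            run = 0
            g += n
        elif op in "IS":
            r += n  # insertions and soft clips consume read bases only
        elif op == "N":
            g += n  # skipped reference region
        # H and P consume neither read nor reference bases
    parts.append(str(run))
    return "".join(parts)
```

Comparing a value computed this way with an existing MD tag mirrors the consistency check that triggers the warning described above.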
java -jar picard.jar
Function: Writes an interval list based on splitting a reference by Ns. This tool identifies positions in a reference where the bases are 'no-calls' and writes out an interval list using the resulting coordinates. This can be used to create an interval list for whole genome sequence (WGS) data, e.g. for scatter-gather purposes, as an alternative to using fixed-length intervals. The number of contiguous no-calls that can be tolerated before creating a break is adjustable from the command line.
Usage: java -jar picard.jar ScatterIntervalsByNs R=reference_sequence.fasta OT=BOTH O=output.interval_list
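The splitting logic can be sketched roughly as follows: runs of N longer than a tolerance become their own intervals, and shorter runs are absorbed into the surrounding called-base interval. This is an illustration of the idea only, operating on an in-memory string; the function name and the max_n_to_merge parameter are hypothetical, and coordinates are 1-based and inclusive.

```python
import re


def split_by_ns(seq, max_n_to_merge=1):
    """Return (type, start, end) intervals, type being "ACGT" or "N"."""
    intervals = []
    pos = 0  # 0-based start of the current called-base stretch
    for match in re.finditer(r"[Nn]+", seq):
        if match.end() - match.start() <= max_n_to_merge:
            continue  # tolerated N run: stays inside the ACGT interval
        if match.start() > pos:
            intervals.append(("ACGT", pos + 1, match.start()))
        intervals.append(("N", match.start() + 1, match.end()))
        pos = match.end()
    if pos < len(seq):
        intervals.append(("ACGT", pos + 1, len(seq)))
    return intervals


# Illustrative sequence: a 4-base N run splits it, the single N does not.
intervals = split_by_ns("ACGTNNNNACNGT", max_n_to_merge=1)
```

Keeping only the "ACGT" entries gives the kind of interval list one would scatter work over.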
java -jar picard.jar
Function: Takes a VCF and a second file that contains a sequence dictionary and updates the VCF with the new sequence dictionary.
Usage: java -jar picard.jar UpdateVcfSequenceDictionary
java -jar picard.jar
Function: Convert a BAM file to a SAM file, or SAM to BAM. Input and output formats are determined by file extension.
Usage: java -jar picard.jar SamFormatConverter
samtools idxstats
Function: It retrieves and prints stats in the index file.
Usage: samtools idxstats in.sam|in.bam|in.cram
Function: Convert alignments in BAM or SAM format into fastq format.
Usage: -i test_SingleEnd_StrandSpecific_hg19.bam -s -o bam2fq_out2
java -jar picard.jar
Function: Replaces the SAMFileHeader in a SAM or BAM file. This tool makes it possible to replace the header of a SAM or BAM file with the header of another file, or a header block that has been edited manually (in a stub SAM file). The sort order (@SO) of the two input files must be the same. Note that validation is minimal, so it is up to the user to ensure that all the elements referred to in the SAMRecords are present in the new header.
Usage: java -jar picard.jar ReplaceSamHeader I=input_1.bam HEADER=input_2.bam O=bam_with_new_head.bam
Function: For a given alignment file (-i) in BAM or SAM format and a reference gene model (-r) in BED format, this program compares detected splice junctions to the reference gene model. Splicing annotation is performed at two levels: the splice event level and the splice junction level.
Usage: -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam -o output -r hg19.refseq.bed12
samtools quickcheck
Function: Quickly check that input files appear to be intact. Checks that the beginning of the file contains a valid header (all formats) containing at least one target sequence, then seeks to the end of the file and checks that an end-of-file (EOF) marker is present and intact (BAM only).
Usage: samtools quickcheck [options] in.sam|in.bam|in.cram [ ... ]
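The EOF part of this check can be approximated in a few lines: a BAM file is BGZF-compressed and should end with a fixed 28-byte empty block. The sketch below tests only for that trailing block and does not validate the header, so it covers a subset of what quickcheck does; the function name is hypothetical.

```python
import os

# The 28-byte empty BGZF block that terminates an intact BAM file.
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff" "0600" "4243" "0200"
    "1b00" "0300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF EOF marker block."""
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Jump to the last 28 bytes and compare them with the marker.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

A file that fails this test was most likely truncated, for example by an interrupted copy or a job that was killed while writing.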
samtools cat
Function: Concatenate BAMs. The sequence dictionary of each input BAM must be identical, although this command does not check this. This command uses a similar trick to reheader which enables fast BAM concatenation.
Usage: samtools cat [-h header.sam] [-o out.bam] <in1.bam> <in2.bam> [ ... ]
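Because the command does not verify that the sequence dictionaries match, it can be worth checking beforehand. The sketch below compares the @SQ lines (name and length, in order) of several SAM header texts, which could be obtained for each input with samtools view -H. The function names are hypothetical and only SN and LN are compared.

```python
def sequence_dictionary(header_text):
    """Return [(SN, LN), ...] in header order from SAM header text."""
    entries = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue  # ignore @HD, @RG, @PG and other header lines
        fields = dict(field.split(":", 1) for field in line.split("\t")[1:])
        entries.append((fields["SN"], int(fields["LN"])))
    return entries


def same_dictionary(headers):
    """Return True if every header has the same ordered @SQ entries."""
    dictionaries = [sequence_dictionary(header) for header in headers]
    return all(d == dictionaries[0] for d in dictionaries[1:])
```

If the function returns False, the inputs should not be concatenated as they are.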